Why Do You Need A High Performance Event Log Reader?
I ran into this issue when trying to figure out how to keep up with Windows’ ability to generate logs out the nose when full auditing is turned on. WMI would bury a CPU and refuse to keep up. The next solution was to write an event log reader in C#, it was barely able to keep up (and fell behind at times), and put a significant strain on the server, there must be an easier way if Windows is able to write logs this quickly, surely I can read them.
The problem is that the format that you consume these logs in through the various methods listed above is not the format Windows writes them to disk in.
Can’t This Be Done In .NET’s Libraries?
The short answer is no. The long answer is “well, yes, but you’re going to be calling C level libraries via PInvoke, and have to redeclare a lot of things defined in Windows.h + WinBase.h”, I will provide all the tools needed to write one in C, at which point you can wrap it in a C++/CLR or PInvoke your way to victory.
Typical Event Log Data Access Via C# And Performance Issues
namespace TestApplication { using System; using System.Diagnostics; class Program { static void Main(string[] args) { EventLog eventLog = new EventLog("System"); EventLogEntry entry = eventLog.Entries[0]; Console.WriteLine("EventID: {0}", entry.EventID); Console.WriteLine("TimeGenerated: {0}", entry.TimeGenerated); Console.WriteLine("Message: {0}", entry.Message); Console.ReadKey(); } } }
Wow that was easy! Well sort of… this requires us to read all of the system event log into from eventLog.Entries, this can take awhile to load if you’re on a system with a large number of event logs. On top of having to read all of the data in at once with no ability to seek (corrected by Dmitry below), we have to process the entry’s message… we’ll get into that when we dive into the C code to get that done. There are additional methods (EventLogQuery/EventLogReader), but these don’t help enough.
How Can We Overcome The Inefficiencies?
Like with most other things: go down to C! There are many benefits to the raw WINAPI methods, those include:
- Only format the event body when we want it, otherwise we can just inspect the string payload.
- We could cache the event log format strings as we need them improving processing performance for repetitive messages.
- Read from the front or the back (struct has length at the beginning and end of it for easy skipping of records).
- C WINAPI calls is about as light weight as you’re going to get it and not go insane (you’ll go a little insane from WINAPI alone).
These of course can be consumed using PInvoke and juggling types into .NET/using marshals/unsafe blocks. I'm not going to get into it as we're exploring these data structures in C, I'll probably be releasing a full managed library later so keep an eye out for it.#include "stdafx.h" #include <iostream> #include <Windows.h> #include <WinBase.h> #include <strsafe.h> using namespace std; #define MAX_TIMESTAMP_LEN 23 + 1 void GetTimestamp(const DWORD Time, WCHAR DisplayString[]); int main(array<System::String ^> ^args) { // Simple enough, we're just opening the System event log from our // local system. // Detailed here: // http://msdn.microsoft.com/en-us/library/windows/desktop/aa363672(v=vs.85).aspx HANDLE hEventLog = OpenEventLog(NULL, L"System"); if(hEventLog == INVALID_HANDLE_VALUE) { cout << "Invalid handle value returned." << endl; } LPVOID lpBuffer = 0; DWORD dwReadFlags = EVENTLOG_SEQUENTIAL_READ | EVENTLOG_FORWARDS_READ; DWORD dwRecordOffset = 0; DWORD nNumerOfBytesToRead = 1024; DWORD pnBytesRead = NULL; DWORD pnMinNumberOfBytesNeeded = NULL; DWORD recordCount = 0; PBYTE lastRecord = 0; PBYTE currentRecord = 0; lpBuffer = (PBYTE)malloc(nNumerOfBytesToRead); if(lpBuffer == NULL) { cout << "Not enough memory." << endl; } // ReadEventLog is detailed here: // http://msdn.microsoft.com/en-us/library/windows/desktop/aa363674(v=vs.85).aspx if(!ReadEventLog( hEventLog, dwReadFlags, dwRecordOffset, lpBuffer, nNumerOfBytesToRead, &pnBytesRead, &pnMinNumberOfBytesNeeded )) { cout << "Error: " << GetLastError() << endl; cout << "Failure to read event log" << endl; } GetNumberOfEventLogRecords(hEventLog, &recordCount); cout << "Records found: " << recordCount << endl; // PEVENTLOGRECORD is detailed here: // http://msdn.microsoft.com/en-us/library/windows/desktop/aa363646(v=vs.85).aspx PEVENTLOGRECORD pRecord = (PEVENTLOGRECORD)lpBuffer; // Breaking EventID down. // http://msdn.microsoft.com/en-us/library/windows/desktop/aa363651(v=vs.85).aspx int code = (pRecord->EventID & 0x0000FFFF); int facility = (pRecord->EventID & 0x0FFF0000) >> 16; int reserved = (pRecord->EventID & 0x10000000) >> 28; int customer = (pRecord->EventID & 0x20000000) >> 29; int sev = (pRecord->EventID & 0xC0000000) >> 30; // The source is the first record // See remarks: // http://msdn.microsoft.com/en-us/library/windows/desktop/aa363646(v=vs.85).aspx // WCHAR SourceName[] <--- We're getting this one. // WCHAR Computername[] <-- We could do a "wcslen(pSource) + 1" to read this. // SID UserSid <-- "wcslen(pComputerName) + 1" // WCHAR Strings[] <-- We get this later using StringOffset. // BYTE Data[] // CHAR Pad[] // DWORD Length <-- This is the length at the end of the record for // reading from the end. LPWSTR pSource = (LPWSTR)(((PBYTE)pRecord) + sizeof(EVENTLOGRECORD)); // We're going to print out some basic information. cout << "EventID: " << pRecord->EventID << endl; cout << "Code: " << code << endl; cout << "facility: " << facility << endl; cout << "reserved: " << reserved << endl; cout << "customer: " << customer << endl; cout << "sev: " << sev << endl; cout << "Number of strings: " << pRecord->NumStrings << endl; wcout << L"Source:" << pSource << endl; // Pull the timestamp. WCHAR TimeStamp[MAX_TIMESTAMP_LEN]; GetTimestamp(pRecord->TimeGenerated, TimeStamp); wcout << L"Time: " << TimeStamp << endl; // This information is found under // HKLM/SYSTEM/CurrentControlSet/services/eventog/System/ // Service Control Manager/EventMessageFile // We know this is from "Service Control Manager" by it's source listed in // the event log. In reality we'd use the source to dig the registry and // load this... HANDLE hResources = LoadLibraryEx( L"c:\\Windows\\system32\\services.exe", NULL, LOAD_LIBRARY_AS_IMAGE_RESOURCE | LOAD_LIBRARY_AS_DATAFILE ); if (NULL == hResources) { cout << "LoadLibrary failed with " << GetLastError() << endl; } // Build out our argument list from the strings stored with the event. DWORD dwFormatFlags = FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_FROM_HMODULE | FORMAT_MESSAGE_ALLOCATE_BUFFER; DWORD_PTR* pArgs = (DWORD_PTR*)malloc(sizeof(DWORD_PTR) * pRecord->NumStrings); LPWSTR pString = (LPWSTR)(((PBYTE)pRecord) + pRecord->StringOffset); // This is only used for arrays, but for a quick PoC we'll just code it // because it's easier to understand, see how Microsoft does it in // http://msdn.microsoft.com/en-us/library/windows/desktop/bb427356(v=vs.85).aspx // under "GetMessageString". dwFormatFlags |= FORMAT_MESSAGE_ARGUMENT_ARRAY; for(DWORD i = 0; i < pRecord->NumStrings; i++) { pArgs[i] = (DWORD_PTR)pString; pString += wcslen(pString) + 1; } // Get our formatted message from our resource file + argument list we built. LPWSTR pMessage = NULL; if(!FormatMessage( dwFormatFlags, hResources, // This is the library that we loaded earlier, it contains // the string format information. pRecord->EventID, 0, (LPWSTR)&pMessage, 0, (va_list*)pArgs)) { cout << "Format failed with: " << GetLastError() << endl; } else { wcout << "strings: " << pMessage << endl; } free(pArgs); cin.ignore(); return 0; } // Taken from http://msdn.microsoft.com/en-us/library/windows/desktop/bb427356(v=vs.85).aspx void GetTimestamp(const DWORD Time, WCHAR DisplayString[]) { ULONGLONG ullTimeStamp = 0; ULONGLONG SecsTo1970 = 116444736000000000; SYSTEMTIME st; FILETIME ft, ftLocal; ullTimeStamp = Int32x32To64(Time, 10000000) + SecsTo1970; ft.dwHighDateTime = (DWORD)((ullTimeStamp >> 32) & 0xFFFFFFFF); ft.dwLowDateTime = (DWORD)(ullTimeStamp & 0xFFFFFFFF); FileTimeToLocalFileTime(&ft, &ftLocal); FileTimeToSystemTime(&ftLocal, &st); StringCchPrintf(DisplayString, MAX_TIMESTAMP_LEN, L"%d/%d/%d %.2d:%.2d:%.2d", st.wMonth, st.wDay, st.wYear, st.wHour, st.wMinute, st.wSecond); }Further Improvements To Be Made
A few things need to be done to this design, first we need to be able to load new resources depending on the source so that we always can pull the formatted strings we need, for that we'll need to dig around the registry. Of course this should be rewritten to spit out managed objects (either via C++/clr or PInvoke) and give a much more developer-friendly interface.
did you ever release the managed library for this?
Shoot, no I did not. Sadly the project I needed this for never got past the prototype stage.
A couple of comment on your article:
Look into the EventLog source code via reflector. It reads records one by one. So in your example (EventLogEntry entry = eventLog.Entries[0];) only the first record will be read.
FormatMessage API is not called unless you request data from EventLogEntry.Message field
Event log uses ReadEventLog under the hood, so PInvoke does not give you much. EVENTLOG_SEEK_READ does not work on logs with big size (documented issue).
If I query remote log and measure performance, I get about 200 events\sec (local computer is Windows 8.1 and remote is Windows Server 2008 R2)
Using new Vista Eventing API (implemented via EventLogReader in C#) I get about 3000 events\sec. This is also not enough speed, I believe for large event log files.
So, from my point of view the only solution to process large event log files with millions of event records is to use C\C++ and call WinApi directly
Hey Dmitry, thanks for the comments!
Shoot, it’s been so long, I don’t remember if I either assumed it was reading the entire log due to the speed, or if I meant to be able to get any information on the log I’d have to pull from the entries one by one.
Right, and unlike the C APIs (if I remember correctly…), I can’t just pull the metadata out in bulk and process the message templates later or easily store data in a more efficient manner. If I can ignore the Message field then I can speed things up a lot. But this project was concerned about basically bulk-dumping all logging information.
I wonder if this is improved at all with .NET Core.
I’ve made some minor corrections basically striking out what is wrong. I appreciate the points you made.