Creating an ETW Provider Step by Step

Event Tracing for Windows is a strongly-typed, high-volume, low-latency tracing framework for Windows. It’s also frustratingly difficult to use. MSDN has a ton of documentation but no simple ‘Hello World’ end-to-end example. Hopefully this post will help you get started writing your own ETW events from your application.

Creating an ETW provider requires the following steps. Each step is described in detail below.

  1. Create the event manifest.
  2. Compile the manifest to produce headers and resources.
  3. Register application as a provider and write events.
  4. Register the event manifest with ETW.

Background

Before diving in, you will need to know a few basics about ETW. First of all ETW is not a very friendly API. It’s fast and extensible but producing and consuming events is challenging. Just getting a program set up to receive events takes almost 100 lines of C code (mostly esoteric flags and parameters). Strangely, producing events is much easier than reading them.

In C# you can use the EventSource and EventListener classes in System.Diagnostics.Tracing but those libraries don’t produce events that are easily read by existing tools like logman or Message Analyzer (which Microsoft deprecated). These libraries aren’t an option from native code so you’re stuck with the C API.

Before you continue, take a look over the following articles to get acquainted with ETW jargon.

Creating the Event Manifest

ETW events are structured – they have a strongly typed schema. This means that you need to plan for what you will be logging before you get started. You could simply define an event with a single string field and use it to log strings (there’s actually a special case for that), but you’d be missing out on one of the best features of ETW.

Define the events

The first thing you need to do is decide what your events should look like. For this tutorial we’ll build a system that logs an event that carries a single integer metric and a string describing the metric.

// simple metric event
struct metric {
    const wchar_t*  metric_name
    int             metric_value;
};

NOTE: this code is not required, this just shows the shape of the data to emit

You can use this event in your code to emit metrics like the number of items processed or the number of milliseconds spent in a call. It’s very simple but it allows you to add arbitrary metrics in new code without recompiling the manifest and you can write a consumer application that can monitor these metrics in real time.

Create the manifest file

Once you know the shape of the data you want to log, you will need to write a manifest. The manifest provides the following information about events.

  • The provider Id – this is unique on the system and is used to identify your provider.
  • The resource file name – this is the absolute path to the location where you will eventually put the binary (DLL or exe) that contains the schema for your events. Yes, you have to include a path to the location where you will put the DLL that contains the compiled version of the manifest in the manifest!
  • The event Id – this Id is unique within the provider
  • The event name and description – these provide human-readable strings that describe the event. These strings can be localized.
  • The event template – this is basically a schema but it can be reused across multiple events. It’s optional if you don’t have any additional data to send.

Or as MSDN puts it:

Metadata fields are combined for each event and defined in an tag that is uniquely associated with an event ID. Event layouts are specified with a tag which describes user-specified context data for each event. The layout can specify data fields such as strings and integers, or other more complex data structures. Template information does not have to be specified for all events; however, if unspecified for a particular event, it will have no user context data.

There are many other things that can be defined in the manifest file but these are the only things we need to know for now.

The manifest file is an XML document that has a pretty complex structure. You can download an XSD and wing it in your favorite editor or you can use ECManGen.exe which provides a questionably (try it, you’ll see) nicer experience for creating and editing manifest files. ECManGen.exe comes with the Windows SDK and on my machine is found here: C:\Program Files (x86)\Windows Kits\8.1\bin\x64\ecmangen.exe. I’m not going to describe how to use this tool – although not beautiful, it’s pretty self-explanatory.

TIP: after creating your provider, create the template first – you will need to reference it when creating the event.

Here’s the finished manifest file. It defines a provider called Hello World Metric Provider with a symbol HELLO_METRIC. The symbol is the name given to the GUID value in the generated headers. The provider has one event called SIMPLE_METRIC with an event Id of 1 and a schema defined by template Simple Metric. The Simple Metric template has a Unicode string field called metric_name and an integer field called metric_value.

Notice the resource and message file names. helloworld.dll will need to be generated and added to the Windows folder so other applications will be able to find the schema.

helloworldmetric.man

<?xml version="1.0" encoding="UTF-16"?>
<instrumentationManifest
    xsi:schemaLocation="http://schemas.microsoft.com/win/2004/08/events eventman.xsd"
    xmlns="http://schemas.microsoft.com/win/2004/08/events"
    xmlns:win="http://manifests.microsoft.com/win/2004/08/windows/events"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:trace="http://schemas.microsoft.com/win/2004/08/events/trace">
    <instrumentation>
        <events>
            <provider
                name="Hello World Metric Provider"
                guid="{06A2EA53-FC6C-42E5-9176-18749AB2CA13}"
                symbol="HELLO_METRIC"
                resourceFileName="c:\Windows\helloworld.dll"
                messageFileName="c:\Windows\helloworld.dll">
                <events>
                    <event
                        symbol="SIMPLE_METRIC"
                        value="1"
                        version="1"
                        template="Simple Metric"
                        message="$(string.Hello World Metric Provider.event.1.message)">
                    </event>
                </events>
                <templates>
                    <template tid="Simple Metric">
                        <data
                            name="metric_name"
                            inType="win:UnicodeString"
                            outType="xs:string">
                        </data>
                        <data
                            name="metric_value"
                            inType="win:Int32"
                            outType="xs:int">
                        </data>
                    </template>
                </templates>
            </provider>
        </events>
    </instrumentation>
    <localization>
        <resources culture="en-US">
            <stringTable>
                <string
                    id="Hello World Metric Provider.event.1.message"
                    value="A simple metric">
                </string>
            </stringTable>
        </resources>
    </localization>
</instrumentationManifest>

Note: the file encoding must match the encoding specified in the XML tag.

Compile the Event Manifest

Once you have a manifest defined you need to compile it. This will produce a header file that has the provider GUID and event descriptors defined as well as a resource that needs to be included in a binary which will be registered with ETW as the source for schemas for schemas for events from your provider. You can also produce C# files that can log your events. The generated code is… functional (read: it’s gross).

To compile the manifest, you need the mc.exe Message Compiler tool. Message compiler is installed with Visual Studio and is found here on my machine: C:\Program Files (x86)\Windows Kits\8.1\bin\x64\mc.exe. Specify the -um parameter which will generate macros for logging your events. This is optional, but without it, you’ll have to manage registering the provider and logging events yourself.

mc.exe -um helloworldmetric.man

This will compile the manifest and generate the resources and headers that you need to include when generating the helloworld.dll.

  • helloworldmetric.h – defines the provider guid and event descriptors. If -um was specified, also contains generated helper macros to register/unregister the provider as well as log events.
  • helloworldmetric.rc – a resource script that includes the compiled bin files.
  • helloworldmetricTEMP.BIN – the compiled schema for your events. This is a binary file that the ETW system will use to unpack the events logged by your provider.
  • MSG00001.bin – contains the localized strings from the manifest.

From Visual Studio

You can include these files directly into a project or generate them during the build. The simplest way is to add a pre-build step that runs mc.exe to generate the files. You can specify parameters to mc that let you control where generated files are written. Then you’ll have to manually modify the project file to include the generated files. Just remember that the output of the build must eventually be written to the path specified in the manifest.

In my sample project, I use a batch file to run mc.exe and rc.exe to produce the files above and compile the resource file. In the linker options for the project, specify the .res file generated by rc.exe as an “Additional Dependency”. Once built, you can verify that your executable has the manifest embedded by opening the .dll or .exe in Visual Studio. In the resources you should see a resource name starting with WEVT_.

From command line

If you don’t want to set up a project, you have everything you need at this point to build the helloworld.dll. You need to compile the resource file using Resource Compiler found in the same directory as mc.exe. Finally, you’ll have to link the compiled resource into a DLL using link.exe (in the Visual Studio bin folder).

From a Visual Studio Developer Prompt (which has the PATH set up already)

mc.exe `helloworldmetric.man`
rc.exe `helloworldmetric.rc`
link.exe  /dll /noentry /machine:x64 helloworldmetric.res /OUT:helloworld.dll

Registering the Provider and Writing Events

Now that you have the generated header, you can register your program as a provider. This basically tells ETW that you’ll be writing events as this provider (based on its GUID). You might not have considered this, but because the program gets to pick which providers and events to log, it means that any program can log as any provider. This makes sense for things like .NET were any program can log the .NET events like garbage collections or context switches. It also means that you can write applications to simulate other providers emitting events which can be useful for testing or spoofing events.

Any program can log events as any provider.

You’ll need to include Windows.h, Evnprov.h and the header from the generated manifest in the code you want to log from.

#include 
#include 
#include "helloworldmetric.h"

void do_work(int i)
{
    // actually log an event
    EventWriteSIMPLE_METRIC(L"test event", i);
}

int main(int argc, char** argv)
{
    // register this program with ETW
    EventRegisterHello_World_Metric_Provider();

    for (int i = 0; i < 10; ++i) {
        do_work(i);
    }

    // unregister this program with ETW
    EventUnregisterHello_World_Metric_Provider(); 

    return 0;
}

Register the Manifest

Now that you have some code that writes an ETW event, you can start to listen to the provider. Run Message Analyzer and listen to the provider by specifying the GUID from the manifest (you’ll have to add a Custom Provider). When you run the test program, you’ll see your events but they won’t have a schema associated with them. This is because the manifest has not been registered with ETW.

To register the manifest, you’ll need to use WevtUtil.exe with the im parameter.

wevtutil.exe im helloworldmetric.man

Remember the helloworld.dll? This is when you actually need the file specified in the manifest to exist. The compiled resources themselves don’t actually contain the path so you can update the paths in the manifest before registering the event, the DLL just has to exist when you run wevtutil.exe.

Now if you run Message Analyzer, you’ll find ‘Hello World Metric Provider’ in the list of system providers. If you listen to that provider, and run the test program, you’ll see the events are received but have a schema now. Huzzah!

If you want to modify the manifest, you must unregister the manifest first using the um parameter before reinstalling.

wevtutil.exe um helloworldmetric.man

Wrap up

Getting set up to log and consume manifest events is complicated, but once you’ve done it a few times it’s not too bad. Using ETW allows you to log events that you can dump to files, the Windows Event Log or even consume the traces in real-time from other processes. You can build powerful watchdog/monitor systems this way or use existing tools like Message Analyzer to collect and analyze trace data.

UPDATE: Microsoft has deprecated the Message Analyzer. Their suggestion is to use something like WireShark. I haven’t tried that, but I have found a few extensions for WireShark which claim to process ETW. Your best bet for visualization may be to write your own tooling with KrabsETW or EventTrace in .NET.


Additional Resources

BDD-Style Testing with MSTest

A way to make BDD-style “specs” using the MSTest framework in Visual Studio.

Why?

BDD tests are a nice, human-understandable way to write tests.
The idea is to write the test code in a way that it can be read and understood
like a functional spec. The benefit of this approach is that you end up with
a set of tests that are self-descriptive and act as a living spec that
can both validate and describe the code that they test.

There are many BDD test frameworks available, but the ones I looked at
lacked good Visual Studio integration. One option would be to write
adapters that allow other frameworks – like nspec
to run with the MSTest runner. The biggest problem with that is that
without significant work, all you end up with is a single pass/fail
for the hosted framework result. Also, debugging through multiple test
harnesses is a pain. So I ended up writing BDD-style in MSTest directly.

Concepts

There are a few core concepts to grok first.

  • Tests should describe expected behavior not implementation details.
    You don’t want the test to fail because some expected API wasn’t called.
    It’s more important that the expected outcome be tested.
  • Tests should only test one behavior at a time. When a test fails, it should
    obvious to the developer what aspect of the behavior is broken. Mondo tests
    with tons of setup and assertions make debugging failures a problem.
  • Tests (and their context) must accurately describe a behavior. If you
    cannot describe the behavior you’re trying to test in a few words, you’re
    trying to test too many things at once. It’s better to break the test
    into multiple tests or add a new context.
  • Test should be written first. With many BDD frameworks it’s simple to write
    a set of basic tests that assert the behavior the code should have. Once in
    place, it’s very useful to have the tests there as a feedback mechanism that
    the code is behaving as expected.

Check out betterspecs.org for more concepts and
best practices. It’s targeted toward Ruby, but the concepts are the same.

Specs, Contexts, and Tests

The basic layout of a test looks something like this:

  • describe_MyClass (a spec)
    • when_some_condition (a context)
      • it_should_behave_like_this (an assertion)

A “spec” describes some piece of code. Typically this will be a container
object in the test framework – a class in C#. The spec contains a set of
“contexts”. Contexts describe some state surrounding a test. This can be
both literal state (of fields, etc.) or it can be a description of an
action that is being taken.

For example “when running” would be an example of a state context and
“when concatenating with a string” would be an example of an action.

Contexts will then have “tests/assertions”. These are the behaviors
that the code is expected to have given the context they are in. Usually
a context will have multiple assertions which both test and describe the
behavior of the code for a given context.

For example in the context of “when running”: “IsRunning should return true”
and “calling start should throw InvalidOperation” would be tests that
explain and test the expected behavior.

BDD in MSTest

With a basic understanding of BDD concepts, lets look at how we can write
BDD-style tests using the MSTest framework built into Visual Studio.

To get started, you’ll need to add a new UnitTest project to your solution.

Namespaces

There are a few ways to organize your test code so that it’s easy to
determine what product code it’s meant to test.

One way is to make a parallel namespace structure:

  • ProductName.TheNamespace
  • ProductName.TheNamespace.Utils
  • ProductName.Test.TheNamespace
  • ProductName.Test.TheNamespace.Utils

The problem with this approach is that you end up with namespace name
collisions which leads to confusing code and usually ends up with
name and type aliasing. It also really confuses Intellisense which makes
authoring the tests a chore.

Another way, and the way I’m recommending, is to have flattened test namespaces.

  • ProductName.TheNamespace
  • ProductName.TheNamespace.Utils
  • ProductName.Test.TestTheNamespace
  • ProductName.Test.TestUtils

This makes it clear that you’re testing things in TheNamespace and also
prevents the annoying namespace collisions. One thing to note is that
because the namespaces are flattened, you should avoid the having namespaces
with the same name in your product code. This isn’t a requirement, but
it makes things saner and is generally a better practice.

Specs

Specs in MSTest are mapped 1:1 with classes. Spec classes should be named
like describe_<ClassName>. The top-level test class should NOT be
marked with the [TestClass] attribute because it itself should not contain
any test code*.

*One exception may be if you want a [TestInitialize] method that
would apply to all of the tests in your spec.

For example let’s write a spec for a class called Base64Utilities. I’ll
build on this example throughout the rest of the doc.

public class describe_Base64Utilities {
  // ... test code ...
}

The only code that should be in the top-level class is member fields
possibly some helper code that is applicable to all of the tests in the spec.

Contexts

We need to have a way to define some test contexts. This could be state, actions,
or in this case, a function. To express the context, use an inner class.

For the example, I’m going to describe the static helper method EnsureValidLength.
To add this context I would define an inner class that inherits from the top-level class.
This class will be marked with [TestClass].

Inheritance gives us access to the helpers and state defined in the top-level class.
Also, inner classes show up in the test explorer as TopLevel.InnerClass which makes
the test explorer and test results more readable.

public class describe_Base64Utilities
{
    [TestClass]
    public class EnsureValidLength : describe_Base64Utilities
    {
      // ... test code ...
    }
}

In the inner class, you can also define a method and mark it as [TestInitialize]
and MSTest will ensure that the code will be run before any test methods.

Tests

Once your context has been written, you can write some tests. The general naming
convention I prefer is something like “given_blah_it_should_return_foobar” or
“it_should_run_the_initializer”. This style is more readable than the same
names in camel or Pascal case. “ForExampleThisIsTheAlternativeAndItIsNotAsNice”.

The test method itself should do very little work itself and it should only
assert the things that it’s explicitly trying to validate. This is true even
if there are other things in the test that you might want to assert; resist the
temptation and create two tests instead. It will make test failures more useful and
will make your spec more comprehensive.

Here are a few example tests.

public class describe_Base64Utilities
{
    [TestClass]
    public class EnsureValidLength : describe_Base64Utilities
    {
          [TestMethod]
          public void given_null_or_empty_it_string_should_return_null()
          {
              Assert.IsNull(Base64Utilities.EnsureValidLength(null));
              Assert.IsNull(Base64Utilities.EnsureValidLength(String.Empty));
          }

          [TestMethod]
          public void given_string_less_than_4_chars_it_should_return_null()
          {
              Assert.IsNull(Base64Utilities.EnsureValidLength("AAA"));
          }

          // ... more tests ...
    }
}

Test Explorer and Test Results

What do you get from writing your tests this way?

First, in the Visual Studio Test Explorer, if you sort by type
the tests are nicely grouped by the inner class names.

  • describe_Base64Utilities.EnsureValidLength
    • given_null_or_empty_it_string_should_return_null
    • given_string_less_than_4_chars_it_should_return_null
    • … other tests
  • describe_Base64Utilities.DecodePartialBase64String
    • given_invalid_base64_it_should_return_null
    • … other tests

When you run the tests the results are more meaningful because it’s
easy to understand what behavior is working and what is not.

Appendix

Gotchas

  • If you inherit from a class “context” that has TestMethods in it
    the derived class will also run the test methods defined in its parents.

    • To deal with that you should either not have test methods defined in parent scopes or you must hide them in the derived class by redefining them with ‘new’.
  • Nested classes need to have different names for their before methods.
    • If they are hidden with new, the base implementations will not be run.

Testing MSTest

Included in case you want to play around with the MSTest behavior yourself.

public class describe_NestedContexts
{
    public string theState = String.Empty;

    [TestInitialize]
    public void before_all()
    {
        theState += "Before Parent";
    }

    [TestClass]
    public class when_running_tests_in_inner_classes : describe_NestedContexts
    {
        [TestInitialize]
        public void before()
        {
            theState += ", Before Child";    // before names must be unique or they will collide
        }

        [TestMethod]
        public void it_should_run_all_test_initializers()
        {
            Assert.AreEqual("Before Parent, Before Child", theState);
        }

        [TestClass]
        public class when_going_deeper : when_running_tests_in_inner_classes
        {
            [TestInitialize]
            public void before_deeper()   // before names must be unique or they will collide
            {
                theState += ", Before Other Child";
            }

            // have to hide the parent's test methods
            public new void it_should_run_all_test_initializers() { }
            public new void it_should_not_be_affected_by_child_classes() { }

            [TestMethod]
            public void it_should_run_all_test_initializers_again()
            {
                Assert.AreEqual("Before Parent, Before Child, Before Other Child", theState);
            }
        }

        [TestMethod]
        public void it_should_not_be_affected_by_child_classes()
        {
            Assert.AreEqual("Before Parent, Before Child", theState);
        }
    }
}

Disabling the Stack Cookie Generation in Visual Studio 2013

I wanted to write a simple program that would let me test out some basic stack exploits. Here’s a very simple example that should crash because of stack corruption.

#include <string.h>

int test_bof() {
  char buffer[10];
  strcpy(buffer, "aaaaaaaaaaaaaaaaaaaaaaaaaaa");
}

int main(int argc, char** args) {
  test_bof();
  return 0;
}

When I ran the program, instead of an access violation, I got this error:

Unhandled exception at 0x00412059 in SimpleStackBof.exe:
Stack cookie instrumentation code detected a stack-based buffer overrun.

Interesting. I enabled the Assembler Output (C++ > Output Files > Assembler Output /FAs) and looked at the generated assembly code.

_TEXT SEGMENT
_buffer$ = -16 ; size = 10
__$ArrayPad$ = -4 ; size = 4
test_bof PROC ; COMDAT

; 5 : {

push ebp
mov ebp, esp
sub esp, 80 ; 00000050H
mov eax, DWORD PTR ___security_cookie
xor eax, ebp
mov DWORD PTR __$ArrayPad$[ebp], eax
push ebx
push esi
push edi

; 6 : char buffer[10];
; 7 : strcpy(buffer, "aaaaaaaaaaaaaaaaaaaaaaaaaaa");

push OFFSET ??_C@_0BM@CINAKCFB@aaaaaaaaaaaaaaaaaaaaaaaaaaa?$AA@
lea eax, DWORD PTR _buffer$[ebp]
push eax
call _strcpy
add esp, 8

; 8 : }

pop edi
pop esi
pop ebx
mov ecx, DWORD PTR __$ArrayPad$[ebp]
xor ecx, ebp
call @__security_check_cookie@4
mov esp, ebp
pop ebp
ret 0
_test_bof ENDP

First thing to notice is that there’s a magic 4-byte stack variable called __$ArrayPad$ that wasn’t in the C code. Also notice that the buffer[10] variable has been aligned and so it actually takes up 12 bytes instead of the 10 I requested. This will be important later when crafting an exploit payload. One thing that struck me was how much stack space the compiler gave itself (sub esp, 80). It’s kind of a weird number too – doesn’t even seem aligned. I don’t know.

The next thing that’s different is that after the usual stack preamble, it’s grabbing a value from the address __security_cookie, xor’ing with the frame base pointer and then saving in __$ArrayPad$. The rest of the code looks pretty normal until just before the return. The value at __$ArrayPad$ is put into ecx and then xor’d with the current frame base pointer value. Then @__security_check_cookie@4 is called. Here’s the disassembled version of @__security_check_cookie@4.

--- f:\dd\vctools\crt\crtw32\misc\i386\secchk.c
004112D0 3B 0D 00 60 41 00 cmp ecx,dword ptr ds:[416000h]
004112D6 75 02 jne failure (04112DAh)
004112D8 F3 C3 rep ret
004112DA E9 6C FD FF FF jmp ___report_gsfailure (041104Bh)

So what’s really going on here. A secret value __security_cookie (ds:[416000h]) is read and xor’d with the value of the frame base pointer. That value is stored on the stack at __$ArrayPad$, right above the saved address of the previous frame pointer which is right above the very important saved eip (instruction pointer) value (important because that’s the address that execution will jump to when ret is executed). The function code runs as it normally would but before returning, a check is added. The value of __$ArrayPad$ is read and xor’d again with the frame base register. The result is in ecx. Then the security check routine runs and compares ecx with the original __security_cookie value. If they are the same then there was no stack frame corruption detected. If the ecx is different than __security_cookie, that means that either ebp was modified or more likely there was a stack buffer overflow that overwrote the original value of __$ArrayPad$. This canary value is checked before the postamble runs and crashes the program if the stored frame pointer or eip may have been compromised.

If you take any number A, and xor it with another number B you’ll get C.
If you then take C and xor it with B, you’ll get A.
E.g. in binary: 101 ^ 011 => 110 and 110 ^ 011 => 101, the original value.


That’s great for security but I’m trying to write an insecure program for demonstration so we need to turn off the stack cookie. In the properties for the project set C++ -> Code Generation -> Security Check to Disable Security Check /GS-. Here’s the regenerated assembly (abridged) with the Security Check disabled.

_buffer$ = -12		; size = 10
_test_bof PROC		; COMDAT

; 5    : {
push	ebp
mov	ebp, esp
sub	esp, 76		; 0000004cH
push	ebx
push	esi
push	edi
...
pop	edi
pop	esi
pop	ebx
mov	esp, ebp
pop	ebp
ret	0

Notice that the magic stack variable is gone and the preamble/postamble no longer contain the security cookie setup or check. Now it should be ready to exploit. Let’s run it.

Unhandled exception at 0x61616161 in SimpleStackBof.exe:
0xC0000005: Access violation reading location 0x61616161.

Error reading location 0x61616161 (ASCII ‘aaaa’) that’s more like it.

Disabling ASLR and DEP in Visual Studio 2013

I’m working my way through some buffer overflow exploits on Windows. The ‘Hello World’ of stack-based buffer overflows is to pack a payload on the stack and ret into it. By default – for very good reason – the VS linker has DEP and ASLR enabled for native projects. ASLR tries to make it so an attacker can’t predict the stack address (or other module addresses) to jump into because the base image will be loaded into a random location. DEP marks the stack section not executable so payloads placed on the stack will not run.

To turn these security features off change the following linker settings on your project

  1. Linker->Advanced set Randomized Base Address to No
    (/DYNAMICBASE:NO)
  2. Linker->Advanced set Data Execution Prevention (DEP) to No
    (/NXCOMPAT:NO)

Rebuild the project and break in VS to examine EBP. It should be the same each time you run.

MASM in Visual Studio 2013

I started playing around with MASM in Visual Studio. Much easier to get started than in used to be and you can even debug!

These pages were useful in getting a simple program set up and compiled

A few tweaks were needed to get started. Once MASM has been enabled in the build configuration, you need to make the following change in the Linker settings (assuming you started with a blank project).

  1. Under Linker->System set the SUBSYSTEM to CONSOLE