Overriding Sealed Methods in C#

Key Takeaways

  • Methods have runtime metadata that we can examine and modify.
  • A method handle can be modified to point to a different method.
  • We can generate machine code in C# and execute it directly.
  • We can override any method this way, including built-in ones.
  • We can use this trick to modify wrappers around the WinAPI (or any other wrappers).

A method is a block of code that contains a series of instructions. It can be declared within a class, structure, or interface by specifying the method signature, which consists of the method’s name, parameters, the return value, and modifiers like its access level, abstract, or sealed.

A method signature must uniquely determine the method used in a given execution context. Depending on the context, the return value may be part of the signature (e.g., when determining the compatibility between a delegate and the method to which it points), or it may be ignored (e.g., when the method is overloaded). When we call a method, we need to provide both the method name and the method parameters. We don’t specify the return type in C#, but we need to do it in Intermediate Language (IL). This way, we can specify which method to call, and the .NET platform takes care of the rest of the process.
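To make the distinction concrete, here is a small sketch. The SignatureDemo class and its members are made up for illustration and do not come from the article's code. Overloads are told apart by their parameters only, while assigning a method to a delegate also checks the return type.

```csharp
using System;

public static class SignatureDemo
{
    // Two overloads: they differ by parameter type only.
    public static string Describe(int value) => $"int:{value}";
    public static string Describe(string value) => $"string:{value}";

    // This would NOT compile: it differs from Describe(int) by return type only.
    // public static int Describe(int value) => value;

    public static string CallThroughDelegate()
    {
        // The delegate type fixes both the parameter (int) and the return type (string),
        // so the compiler picks Describe(int) from the method group.
        Func<int, string> describeInt = Describe;
        return describeInt(42);
    }
}
```

Calling SignatureDemo.CallThroughDelegate() goes through the int overload, because that is the only one compatible with Func<int, string>.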

A method can also be virtual, which adds support for polymorphic execution via a late-binding mechanism (the method can be redefined in derived classes). This is one of the cornerstones of Object-Oriented Programming, and it is widely used in C#. However, not all methods can be virtual and support overriding: static methods, constructors, and operators do not support this mechanism. We can also mark an overriding method as sealed to stop it from being overridden further in subclasses, even though it is virtual in the base class.
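The rules above can be sketched with a tiny class hierarchy. The Animal, Dog, and Puppy types are hypothetical and exist only for this illustration.

```csharp
public class Animal
{
    public virtual string Speak() => "...";
}

public class Dog : Animal
{
    // 'sealed override' overrides Animal.Speak and forbids further overriding.
    public sealed override string Speak() => "Woof";
}

public class Puppy : Dog
{
    // Writing 'override' here would be a compile-time error (CS0239).
    // 'new' only hides the method; it does not take part in late binding.
    public new string Speak() => "Yip";
}

public static class SealedDemo
{
    public static string ViaBaseReference()
    {
        Animal animal = new Puppy();
        return animal.Speak(); // late-bound: ends up in Dog.Speak
    }

    public static string ViaDerivedReference()
    {
        var puppy = new Puppy();
        return puppy.Speak(); // bound at compile time to Puppy.Speak
    }
}
```

Through an Animal reference the call still reaches Dog.Speak, which is exactly the limitation that the techniques in this article try to work around.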

There are, however, ways to modify how a sealed method executes, overriding it to some extent. Before we get into that, we first need to understand how methods are implemented in the .NET platform.

Method internals

A method’s code is typically compiled twice. The C# compiler does the first compilation. This process takes the C# source code as an input and generates intermediate language (IL) code as an output. Later, the IL code is compiled again, typically at runtime by the Just-In-Time (JIT) compiler. It may also be compiled before an application gets executed in an Ahead-Of-Time (AOT) mode with ngen or ReadyToRun (R2R) mechanisms. The second compilation takes the IL code as an input and generates machine code matching the current hardware (CPU) architecture as an output. The machine code can be later executed directly by the CPU, with no help from the .NET platform.
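The output of the first compilation can be inspected from within a running program. The sketch below uses reflection to read the IL bytes of a method; the IlDemo class is a made-up example, and the exact bytes depend on the compiler and the build configuration.

```csharp
using System;
using System.Reflection;

public static class IlDemo
{
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Returns the raw IL bytes that the C# compiler emitted for Add.
    public static byte[] GetAddIl()
    {
        MethodInfo method = typeof(IlDemo).GetMethod(nameof(Add));
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    // Formats the IL as hex, e.g. for printing with Console.WriteLine.
    public static string GetAddIlAsHex() => BitConverter.ToString(GetAddIl());
}
```

Whatever the configuration, the last byte should be 0x2A, which is the IL ret opcode. The machine code produced by the second compilation is not visible here; it only exists after the JIT (or an AOT tool) has processed this IL.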

To call a method at the machine code level, we need to be aware of multiple things that we can safely ignore when writing code in C#. Not only do we need to provide a name and parameters to a method, but we also need to know how to pass values to the method (via registers or the stack), who cleans up the stack after the method finishes (callee or caller), how a value is returned, what the order of parameters is (left-to-right or right-to-left), and many more details. When writing in C#, we ignore these details because the .NET platform takes care of them. However, at the machine code level, we need to adhere to the binary protocol (the calling convention) carefully. Otherwise, we’ll most likely get a segmentation fault or an access violation.

JIT-compilation is a multi-step process, relying both on the internals of the .NET platform and on the specifics of the CPU architecture. It must consider multiple aspects of the runtime:

  1. How to pass parameters to a method? Depending on the architecture, a different set of registers is used. In 32-bit architectures, the first two parameters are passed via the ecx and edx registers, and all the others are passed via the stack. In 64-bit architectures on Windows, the first four parameters are passed via the rcx, rdx, r8, and r9 registers, and the others are passed via the stack (Unix-like systems follow the System V convention, which uses a different set of registers). This, however, can be changed at any time and is not guaranteed to remain the same between compiler versions.

  2. How is the return value returned? Integer values are returned in the eax register (rax in 64-bit code), while floating-point ones are returned through FPU or XMM registers.

  3. What is the order of parameters? Whether parameters are passed left-to-right or right-to-left depends on the architecture and is controlled by the platform.

  4. How is the memory for the machine code allocated? Since the machine code is not available when the application starts, it must be written by the application and stored somewhere. Typically, a new memory page is allocated and is then marked as executable with either the VirtualProtectEx or mprotect functions from the operating system.

  5. Who removes parameters from the stack? If it’s the caller, we risk significant code duplication because every time we call a method, we need to remove its parameters from the stack when it returns. However, if it’s the callee who cleans up, we cannot reliably implement methods with variadic parameters, like printf (which can accept any number of parameters).

  6. Is it worth calling the method? Should it be inlined? Maybe it cannot be inlined because it is either too big or it uses try-catch, which may change the stack trace.

  7. Can we optimize the method? Precalculate constants, remove dead code, reorder instructions?

  8. What’s the endianness? How do we encode instructions and addresses?

We typically don’t need to think about these aspects when writing C# code. They only become important once we start calling methods from other platforms with the P/Invoke mechanism.
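Point 8 becomes relevant later in this article, when addresses are embedded into machine code by hand. As a minimal sketch (the EndiannessDemo class is hypothetical), this is how a 32-bit value is laid out in little-endian order, which is what x86 and x64 instructions expect:

```csharp
using System;
using System.Buffers.Binary;

public static class EndiannessDemo
{
    // Encodes a 32-bit value least significant byte first,
    // regardless of the byte order of the host CPU.
    public static byte[] EncodeLittleEndian(int value)
    {
        var bytes = new byte[4];
        BinaryPrimitives.WriteInt32LittleEndian(bytes, value);
        return bytes;
    }

    // BitConverter uses the host byte order instead,
    // which is reported by BitConverter.IsLittleEndian.
    public static byte[] EncodeHostOrder(int value) => BitConverter.GetBytes(value);
}
```

EncodeLittleEndian(0x11223344) yields the bytes 44 33 22 11. On x86 and x64 hosts, EncodeHostOrder returns the same sequence, which is why the listings in this article can rely on BitConverter.GetBytes.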

Starting with .NET Core 2.1, a method may be JIT-compiled multiple times due to the multi-tiered compilation mechanism. The first compilation is quick and dirty: it generates non-optimized machine code. The second compilation may happen after some time (e.g., when the .NET platform observes that the method is on a hot path and is executed often). The compiler then spends more time on the compilation and produces optimized code. This may result in using fewer registers, removing dead code, or precalculating values and using constants. Multi-tiered compilation is enabled by default starting from .NET Core 3. Effectively, a method always has one instance of IL code, but it may have multiple instances of machine code.
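A method can ask to skip the quick first tier. The sketch below assumes a runtime that supports MethodImplOptions.AggressiveOptimization (documented as available since .NET Core 3.0); the TieringDemo class is hypothetical, and whether the runtime honors the hint in a given scenario is an implementation detail.

```csharp
using System.Reflection;
using System.Runtime.CompilerServices;

public static class TieringDemo
{
    // Regular method: eligible for tiered compilation.
    public static int Sum(int[] values)
    {
        int total = 0;
        foreach (int v in values) total += v;
        return total;
    }

    // Hint for the JIT: produce optimized code on the first compilation
    // instead of starting with the quick, non-optimized tier.
    [MethodImpl(MethodImplOptions.AggressiveOptimization)]
    public static int SumOptimized(int[] values)
    {
        int total = 0;
        foreach (int v in values) total += v;
        return total;
    }

    // Checks through reflection whether a method carries the hint.
    public static bool IsMarkedAggressive(string methodName)
    {
        MethodInfo method = typeof(TieringDemo).GetMethod(methodName);
        return (method.GetMethodImplementationFlags() & MethodImplAttributes.AggressiveOptimization) != 0;
    }
}
```

Both methods return the same result; the difference lies only in how many times, and at which optimization level, the runtime compiles them.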

Reflection also allows the programmer to query details of a method. It can be used to get the method name, parameters, return types, and all other specifiers. Reflection uses method descriptors (metadata) under the hood. Each descriptor is a structure providing a unique method handle (used for calling the method), holding expensive metadata (like method modifiers), and capturing the runtime state of the method. By querying the descriptor, we can determine if a method was already JIT-compiled and where the generated machine code is. We can access the method descriptor via reflection with Type.GetMethod(), and then access its method handle with the MethodHandle property. 
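A minimal sketch of such a query is shown here. The ReflectionDemo class and its Greet method are invented for the example; the reflection calls themselves (GetMethod, GetParameters, ReturnType, MethodHandle) are the standard ones.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class ReflectionDemo
{
    public static string Greet(string name, int times) => name + new string('!', times);

    // Rebuilds a readable signature from the metadata of Greet.
    public static string DescribeGreet()
    {
        MethodInfo method = typeof(ReflectionDemo).GetMethod(nameof(Greet));
        string parameters = string.Join(", ",
            method.GetParameters().Select(p => $"{p.ParameterType.Name} {p.Name}"));
        return $"{method.ReturnType.Name} {method.Name}({parameters})";
    }

    // The handle value is the address of the runtime's internal method descriptor.
    public static IntPtr GetGreetHandle()
    {
        return typeof(ReflectionDemo).GetMethod(nameof(Greet)).MethodHandle.Value;
    }
}
```

DescribeGreet() returns "String Greet(String name, Int32 times)". The value returned by GetGreetHandle() differs between runs, so it should be treated as an opaque address.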

By examining the method handle, we can access the internal structures of a method. For instance, we can find the actual pointer specifying where the machine code of the method is. This gives us a way to point a method to some different code or modify the logic in place, which we can use to override a sealed method. Let’s say we have a non-virtual, non-static method X, and we want to modify it to call method Y instead. If X were virtual, we could inherit the method from the base class, override it with Y, and then use the polymorphic invocation to achieve our goal. However, since X is non-virtual, we need to modify it on a lower level.

Overriding sealed methods with metadata modification

The first approach we can use for overriding sealed methods is based on metadata modification. We want to get the metadata for method X, find the pointer for the machine code of the method, and then modify it to point to some other method with a matching signature. Let’s take the following code as an example:

using System;
using System.Linq;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading;

namespace OverridingSealedMethodNetCore
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine($"Calling StaticString method before hacking:\t{TestClass.StaticString()}");
            HijackMethod(typeof(TestClass), nameof(TestClass.StaticString), typeof(Program), nameof(StaticStringHijacked));
            Console.WriteLine($"Calling StaticString method after hacking:\t{TestClass.StaticString()}");

            Console.WriteLine();

            var instance = new TestClass();
            Console.WriteLine($"Calling InstanceString method before hacking:\t{instance.InstanceString()}");
            HijackMethod(typeof(TestClass), nameof(TestClass.InstanceString), typeof(Program), nameof(InstanceStringHijacked));
            Console.WriteLine($"Calling InstanceString method after hacking:\t{instance.InstanceString()}");

            Console.WriteLine();

            Vector2 v = new Vector2(9.856331f, -2.2437377f);
            for (int i = 1; i <= 35; i++)
            {
                MultiTieredClass.Test(v, i);
                Thread.Sleep(100);
            }

            Console.WriteLine($"Examine MethodDescriptor: {typeof(MultiTieredClass).GetMethod(nameof(MultiTieredClass.Test)).MethodHandle.Value.ToString("X")}");
            Console.ReadLine();
        }

        public static void HijackMethod(Type sourceType, string sourceMethod, Type targetType, string targetMethod)
        {
            // Get methods using reflection
            var source = sourceType.GetMethod(sourceMethod);
            var target = targetType.GetMethod(targetMethod);

            // Prepare methods to get machine code (not needed in this example, though)
            RuntimeHelpers.PrepareMethod(source.MethodHandle);
            RuntimeHelpers.PrepareMethod(target.MethodHandle);

            var sourceMethodDescriptorAddress = source.MethodHandle.Value;
            var targetMethodMachineCodeAddress = target.MethodHandle.GetFunctionPointer();

            // Pointer is two pointers from the beginning of the method descriptor
            Marshal.WriteIntPtr(sourceMethodDescriptorAddress, 2 * IntPtr.Size, targetMethodMachineCodeAddress);
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public static string StaticStringHijacked()
        {
            return "Static string hijacked";
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public string InstanceStringHijacked()
        {
            return "Instance string hijacked";
        }
    }

    class TestClass
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static string StaticString()
        {
            return "Static string";
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public string InstanceString()
        {
            return "Instance string";
        }
    }

    class MultiTieredClass
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static void Test(Vector2 v, int i)
        {
            v = Vector2.Normalize(v);
            Console.WriteLine($"Vector iteration {i:0000}:\t{v}\t{TestClass.StaticString()}");
        }
    }
}

In the example above, we have a class named TestClass with two methods: StaticString and InstanceString. Both of them are non-virtual and return a hard-coded string. Our goal is to hijack these methods so that when calling StaticString the .NET platform executes the StaticStringHijacked method. Similarly, when calling InstanceString, we want InstanceStringHijacked to be called.

The Main method proceeds in the following manner: it first calls the StaticString method and prints its output, then it hijacks it with StaticStringHijacked, and then it calls StaticString again to see if it was successfully overridden. After that, it does the same for the InstanceString method. The whole magic happens in the HijackMethod method.

HijackMethod accepts four parameters. The first two define the source method to be overridden (method X, or StaticString in our example). The last two parameters define the target method (method Y, or StaticStringHijacked in our case). To specify a method we need the Type instance holding the method and the method name. Since this is an example only, we don’t handle situations when there are multiple methods of the same name with different sets of arguments, but the code above can be easily extended to that end.
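One possible way to extend the lookup is sketched below. The MethodLookup helper and the Sample class are hypothetical; the sketch relies on the GetMethod overload that accepts binding flags and parameter types, which also finds non-public methods that the single-argument GetMethod would miss.

```csharp
using System;
using System.Reflection;

public static class MethodLookup
{
    // Finds a method by name and exact parameter types,
    // including non-public, static, and instance methods.
    public static MethodInfo Find(Type type, string name, params Type[] parameterTypes)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic
                                 | BindingFlags.Static | BindingFlags.Instance;

        MethodInfo method = type.GetMethod(name, flags, null, parameterTypes, null);
        if (method == null)
        {
            throw new MissingMethodException(type.FullName, name);
        }

        return method;
    }
}

public class Sample
{
    public static string Format(int value) => "int";
    public static string Format(string value) => "string";
    private string Hidden() => "hidden";
}
```

With this helper, MethodLookup.Find(typeof(Sample), "Format", typeof(string)) selects the string overload, whereas typeof(Sample).GetMethod("Format") would throw an AmbiguousMatchException.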

We start by taking method descriptors of the methods by calling the regular GetMethod function from the reflection mechanism:

// Get methods using reflection
var source = sourceType.GetMethod(sourceMethod);
var target = targetType.GetMethod(targetMethod);

Since the methods may not be JIT-compiled yet, we trigger the compilation manually by calling RuntimeHelpers.PrepareMethod:

// Prepare methods to get machine code (not needed in this example, though)
RuntimeHelpers.PrepareMethod(source.MethodHandle);
RuntimeHelpers.PrepareMethod(target.MethodHandle);

With that, we get actual pointers that we can modify. The first one is the address of the internal method descriptor of the source method. It’s a structure that holds the address of the machine code backing the method. The address is stored two pointers from the beginning of the structure (8 bytes from the beginning in a 32-bit application or 16 bytes from the beginning in a 64-bit application). This depends on the .NET version as the internal representation may change at any time, but it has been consistent from .NET Framework 1 until .NET 5. We get the address of the structure as follows:

var sourceMethodDescriptorAddress = source.MethodHandle.Value;

Next, we get the address of the machine code of the target method. The .NET platform provides a method named GetFunctionPointer that does exactly that, but we could as well extract this value manually by getting the internal descriptor address and then reading the pointer, which is 8 or 16 bytes from the beginning depending on the CPU architecture in use:

var targetMethodMachineCodeAddress = target.MethodHandle.GetFunctionPointer();

To override the method, we take the pointer and write it directly into the internal descriptor structure:

Marshal.WriteIntPtr(sourceMethodDescriptorAddress, 2 * IntPtr.Size, targetMethodMachineCodeAddress);

After this modification, we have effectively changed the pointer of the StaticString method, so now it points to the machine code of StaticStringHijacked. When we call the method, it will execute the latter’s machine code, as we can see in the application output:

Calling StaticString method before hacking:     Static string
Calling StaticString method after hacking:      Static string hijacked

So the behavior of our program is structured as follows:

Before the method hijacking:

  • Start calling StaticString method
  • Get the address of the machine code, which points to the actual code of the StaticString method
  • Execute the StaticString code

After the method hijacking:

  • Start calling StaticString method
  • Get the address of the machine code, which points to the code of the StaticStringHijacked method
  • Execute the StaticStringHijacked code

The same structure applies to the InstanceString method which we hijack afterwards. This hijacking method works in .NET 5.0.102 in Windows 10 x64 and in .NET 5.0.401 in WSL2 Ubuntu 20.04. It works for both Debug and Release configurations, for both x86 and x64.

However, it is not 100% bulletproof and may not work reliably. This is because .NET introduced a code cache infrastructure for multi-tiered compilation, which makes the technique sensitive to timing (the “time effect”). In other words, the hijacked method may be picked up only “shortly after” we change the pointers, and the first couple of calls may still reach the old method. To observe this effect, we will examine the second part of the code.

We have a method named Test in the MultiTieredClass class. It takes a two-dimensional vector as an argument, normalizes it, and prints its value. The details of this mathematical operation are not important here. What is important is the fact that the Vector2.Normalize() method may be highly optimized to use SSE instructions, which we will observe due to multi-tiered compilation. Inside Test, we print the following:

Console.WriteLine($"Vector iteration {i:0000}:\t{v}\t{TestClass.StaticString()}");

So we print the iteration number and the normalized vector, and we call the StaticString method. We call the method Test from the Main method multiple times, as seen in the following loop:

Vector2 v = new Vector2(9.856331f, -2.2437377f);
for (int i = 1; i <= 35; i++)
{
	MultiTieredClass.Test(v, i);
	Thread.Sleep(100);
}

As the initial output of this code fragment we may obtain:

Vector iteration 0001:  <0.9750545, -0.22196561>        Static string
Vector iteration 0002:  <0.9750545, -0.22196561>        Static string
Vector iteration 0003:  <0.9750545, -0.22196561>        Static string
Vector iteration 0004:  <0.9750545, -0.22196561>        Static string
Vector iteration 0005:  <0.9750545, -0.22196561>        Static string hijacked

We can see that even though we already hijacked the StaticString method to point to StaticStringHijacked, the first iterations still call the regular code (not the hijacked one). However, after half a second, we see that the output has changed. This is the code caching effect in practice.

However, things get even more interesting as we continue. Around the 35th iteration, multi-tiered compilation kicks in and recompiles the method. The output we get is:

Vector iteration 0034:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0035:  <0.97505456, -0.22196563>       Static string

Two important things happened here. First, the output of the Vector2.Normalize() method changed. Previously it printed 0.9750545, and now it printed 0.97505456 (note the additional 6 at the end of the value). This shows that the code under the hood was recompiled and that the recompilation actually changed the value. The reason is that the optimized version uses SSE instructions, which compute the result with slightly different rounding.

The second important observation is that in iteration 35 a regular StaticString method was called. This is the effect of method inlining. If we debug the code and observe internal structures, we can see the following:

Method Name:          OverridingSealedMethodNetCore.MultiTieredClass.Test(System.Numerics.Vector2, Int32)
Class:                00007ffa38474978
MethodTable:          00007ffa38464d48
mdToken:              0000000006000009
Module:               00007ffa3843f888
IsJitted:             yes
Current CodeAddr:     00007ffa383aded0
Version History:
  ILCodeVersion:      0000000000000000
  ReJIT ID:           0
  IL Addr:            0000000000000000
     CodeAddr:           00007ffa383aded0  (OptimizedTier1)
     NativeCodeVersion:  0000018E3EF1D140
     CodeAddr:           00007ffa383a7ae0  (QuickJitted)
     NativeCodeVersion:  0000000000000000

So we can see there are two instances of the machine code. The first instance (labeled QuickJitted) calls the method, but the second instance (labeled OptimizedTier1) inlines the string literal. Effectively neither StaticString nor StaticStringHijacked is called.

This technique will not work for all scenarios. It may not work for methods compiled in the AOT manner. It may not support all methods from the standard library as method descriptors differ for them. As we can see, it may break due to multi-tiered compilation or code inlining.

Pros:

  • This technique doesn’t require an understanding of the machine code
  • It doesn’t destroy the original machine code

Cons:

  • It may not be reliable as it is prone to the time effect
  • Results may be reversed due to multi-tiered compilation
  • Not all methods can be modified this way
  • Inlining breaks this technique

Overriding sealed methods with machine code modification

The second technique we’ll use does not modify the runtime metadata. This time we will modify the method’s machine code directly, making it jump to another place. We’ll find the machine code of the source method and then modify it on a binary level to execute a jump instruction to move to the target method.

To generate the machine code, first we need to understand how the jump instruction works. In x86 architectures, the near jump uses one value as a parameter (a signed offset, 1 or 4 bytes long), which specifies how far to move in memory relative to the next instruction (literally jumping to another place). Since it uses an offset instead of an absolute memory address, it’s slightly harder to use as we need to calculate the distance (offset) to jump. However, we can use a trick to move to an absolute address. In 32-bit mode, we can push the address onto the stack and then execute the return instruction, which takes the address from the stack and moves to it. In 64-bit mode, we can’t push the address directly (as there is no instruction to push an 8-byte immediate value onto the stack), so we first move the address to a register and then push the register onto the stack.
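The encodings discussed above can be sketched as pure functions that only build byte arrays and never touch executable memory. The JumpEncoder class is a hypothetical helper written for this illustration; the opcodes used are E9 (jmp rel32), 68 (push imm32), 48 B8 (mov rax, imm64), 50 (push rax), and C3 (ret).

```csharp
using System;
using System.Buffers.Binary;

public static class JumpEncoder
{
    // Relative jump: E9 <rel32>. The offset is measured from the END
    // of the 5-byte instruction, hence the "+ 5".
    public static byte[] RelativeJump(long sourceAddress, long targetAddress)
    {
        long offset = targetAddress - (sourceAddress + 5);
        if (offset < int.MinValue || offset > int.MaxValue)
        {
            throw new ArgumentOutOfRangeException(nameof(targetAddress), "Target is out of rel32 range.");
        }

        var code = new byte[5];
        code[0] = 0xE9;
        BinaryPrimitives.WriteInt32LittleEndian(code.AsSpan(1), (int)offset);
        return code;
    }

    // 32-bit absolute jump: push <imm32>; ret
    public static byte[] AbsoluteJump32(int targetAddress)
    {
        var code = new byte[6];
        code[0] = 0x68;
        BinaryPrimitives.WriteInt32LittleEndian(code.AsSpan(1), targetAddress);
        code[5] = 0xC3;
        return code;
    }

    // 64-bit absolute jump: mov rax, <imm64>; push rax; ret
    // Note that this sequence overwrites the rax register.
    public static byte[] AbsoluteJump64(long targetAddress)
    {
        var code = new byte[12];
        code[0] = 0x48;
        code[1] = 0xB8;
        BinaryPrimitives.WriteInt64LittleEndian(code.AsSpan(2), targetAddress);
        code[10] = 0x50;
        code[11] = 0xC3;
        return code;
    }
}
```

For example, RelativeJump(0x1000, 0x2000) computes the offset 0x2000 - 0x1005 = 0xFFB and returns E9 FB 0F 00 00. The relative form only works when the two addresses are less than 2 GB apart, which is one reason the push-and-return trick is convenient in 64-bit processes.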

We want to generate this code and put it at the beginning of the StaticString method. Effectively, every call will still enter the source method, but its first instructions will immediately jump to the target.

Let’s take the following code as an example:

using System;
using System.Linq;
using System.Numerics;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading;

namespace MethodHijackerNetCore
{
    public class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine($"Calling StaticString method before hacking:\t{TestClass.StaticString()}");
            HijackMethod(typeof(TestClass), nameof(TestClass.StaticString), typeof(Program), nameof(StaticStringHijacked));
            Console.WriteLine($"Calling StaticString method after hacking:\t{TestClass.StaticString()}");

            Console.WriteLine();

            var instance = new TestClass();
            Console.WriteLine($"Calling InstanceString method before hacking:\t{instance.InstanceString()}");
            HijackMethod(typeof(TestClass), nameof(TestClass.InstanceString), typeof(Program), nameof(InstanceStringHijacked));
            Console.WriteLine($"Calling InstanceString method after hacking:\t{instance.InstanceString()}");

            Console.WriteLine();

            Vector2 v = new Vector2(9.856331f, -2.2437377f);
            for (int i = 1; i <= 35; i++)
            {
                MultiTieredClass.Test(v, i);
                Thread.Sleep(100);
            }

            Console.WriteLine($"Examine MethodDescriptor: {typeof(MultiTieredClass).GetMethod(nameof(MultiTieredClass.Test)).MethodHandle.Value.ToString("X")}");
            Console.ReadLine();
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public static string StaticStringHijacked()
        {
            return "Static string hijacked";
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public string InstanceStringHijacked()
        {
            return "Instance string hijacked";
        }

        public static void HijackMethod(Type sourceType, string sourceMethod, Type targetType, string targetMethod)
        {
            var source = sourceType.GetMethod(sourceMethod);
            var target = targetType.GetMethod(targetMethod);

            HijackMethod(source, target);
        }

        public static void HijackMethod(MethodBase source, MethodBase target)
        {
            RuntimeHelpers.PrepareMethod(source.MethodHandle);
            RuntimeHelpers.PrepareMethod(target.MethodHandle);


            var offset = 2 * IntPtr.Size;
            IntPtr sourceAddress = Marshal.ReadIntPtr(source.MethodHandle.Value, offset);
            IntPtr targetAddress = Marshal.ReadIntPtr(target.MethodHandle.Value, offset);

            var is32Bit = IntPtr.Size == 4;
            byte[] instruction;

            if (is32Bit)
            {
                instruction = new byte[] {
                    0x68, // push <value>
                }
                 .Concat(BitConverter.GetBytes((int)targetAddress))
                 .Concat(new byte[] {
                    0xC3 //ret
                 }).ToArray();
            }
            else
            {
                instruction = new byte[] {
                    0x48, 0xB8 // mov rax <value>
                }
                .Concat(BitConverter.GetBytes((long)targetAddress))
                .Concat(new byte[] {
                    0x50, // push rax
                    0xC3  // ret
                }).ToArray();
            }

            Marshal.Copy(instruction, 0, sourceAddress, instruction.Length);
        }
    }

    class TestClass
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static string StaticString()
        {
            return "Static string";
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public string InstanceString()
        {
            return "Instance string";
        }
    }

    class MultiTieredClass
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static void Test(Vector2 v, int i)
        {
            v = Vector2.Normalize(v);
            Console.WriteLine($"Vector iteration {i:0000}:\t{v}\t{TestClass.StaticString()}");
        }
    }
}

The important part happens in the HijackMethod overload that accepts two MethodBase parameters. We first compile both the source and the target methods to generate the machine code.

We then get the machine code address of both the source and the target methods. To do that, we read the address from the method descriptor:

var offset = 2 * IntPtr.Size;
IntPtr sourceAddress = Marshal.ReadIntPtr(source.MethodHandle.Value, offset);
IntPtr targetAddress = Marshal.ReadIntPtr(target.MethodHandle.Value, offset);

If we run this on a 32-bit platform, the code pushes the address onto the stack and returns. You can see that 0x68 is the opcode of the instruction that pushes an immediate value onto the stack. After that, we cast the address to an integer (which is 4 bytes long, but we know that we are on a 32-bit platform) and convert it to bytes. The last instruction is 0xC3 (ret), which takes the address from the stack, removes it, and jumps to it.

instruction = new byte[] {
	0x68, // push <value>
}
 .Concat(BitConverter.GetBytes((int)targetAddress))
 .Concat(new byte[] {
	0xC3 //ret
 }).ToArray();

On a 64-bit platform, we can’t push the value directly. We first need to move it to the rax register, then push the register onto the stack and return. Notice that this time we cast the address to a long instead of an integer (since on 64-bit platforms the addresses are 8 bytes long):

instruction = new byte[] {
    0x48, 0xB8 // mov rax, <value>
}
.Concat(BitConverter.GetBytes((long)targetAddress))
.Concat(new byte[] {
    0x50, // push rax
    0xC3  // ret
}).ToArray();

Finally, we copy the code to the beginning of the source method’s machine code.

Marshal.Copy(instruction, 0, sourceAddress, instruction.Length);

So the behavior of our program with this technique is as follows:

Before the hijacking:

  • Start calling StaticString method
  • Get the address of the machine code which points to the actual code of the StaticString method
  • Execute the StaticString code
  • When StaticString finishes, return to the caller

After the hijacking:

  • Start calling StaticString method
  • Get the address of the machine code which points to the same code address as before hijacking
  • Execute the StaticString code
  • The first part of the StaticString is a jump to the StaticStringHijacked, so we jump to the other method
  • Execute the StaticStringHijacked code
  • When StaticStringHijacked finishes, return to the caller directly (because the return address on the stack is the one used when calling StaticString)

This method is not prone to the time effect as we don’t modify the metadata (we only modify the code that gets executed). However, it is still prone to inlining, as we can see when multi-tiered compilation kicks in and inlines the original method directly:

Vector iteration 0033:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0034:  <0.97505456, -0.22196563>       Static string

Also, this technique works for any method as long as we can get the address of the machine code. It may be harder for AOT-compiled methods (as the address is not stored directly in the 8th/16th byte of the method descriptor) or external native code called with P/Invoke (as the code may not be writable and we’ll need to call VirtualProtectEx or mprotect to modify it) but conceptually it works for all the cases.

Pros:

  • It is not prone to the time effect
  • It works for all methods.

Cons:

  • It destroys the original machine code
  • Results may be reversed due to multi-tiered compilation
  • Inlining breaks this technique
  • It requires a better understanding of the machine code and operating system
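The first drawback can be softened by saving the overwritten bytes before patching, so that they can be written back later. The sketch below is not part of the original listing: MemoryPatch is a hypothetical helper, and the demo deliberately runs against a plain unmanaged buffer rather than real machine code, so it is safe to execute anywhere. Restoring bytes while another thread executes the patched method would still be racy.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class MemoryPatch
{
    private readonly IntPtr _address;
    private readonly byte[] _original;

    private MemoryPatch(IntPtr address, byte[] original)
    {
        _address = address;
        _original = original;
    }

    // Saves the bytes at 'address', then overwrites them with 'replacement'.
    public static MemoryPatch Apply(IntPtr address, byte[] replacement)
    {
        var original = new byte[replacement.Length];
        Marshal.Copy(address, original, 0, original.Length);
        Marshal.Copy(replacement, 0, address, replacement.Length);
        return new MemoryPatch(address, original);
    }

    // Writes the saved bytes back.
    public void Revert() => Marshal.Copy(_original, 0, _address, _original.Length);
}

public static class MemoryPatchDemo
{
    // Returns the first byte while patched, followed by the buffer after reverting.
    public static byte[] Run()
    {
        IntPtr buffer = Marshal.AllocHGlobal(4);
        try
        {
            Marshal.Copy(new byte[] { 1, 2, 3, 4 }, 0, buffer, 4);

            MemoryPatch patch = MemoryPatch.Apply(buffer, new byte[] { 0xAA, 0xBB });
            byte firstByteWhilePatched = Marshal.ReadByte(buffer);
            patch.Revert();

            var result = new byte[5];
            result[0] = firstByteWhilePatched;
            Marshal.Copy(buffer, result, 1, 4);
            return result;
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

In the hijacking scenario, the address passed to Apply would be the sourceAddress read from the method descriptor, and the replacement would be the generated jump instruction.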

Practical application - modifying the wrapper for WinAPI process creation

There are multiple situations in which we can use method hijacking to provide business value. Examples from my experience (deployed into production) include:

  • Handling StackOverflowException via the Vectored Exception Handling (VEH) mechanism to keep the test suite from being killed
  • Injecting try-catch blocks for new thread creation to avoid getting the process terminated due to an unhandled exception
  • Modifying the WinAPI wrapper to be able to execute the process on a different desktop

Let’s examine the last example in detail.

Windows supports multiple desktops to isolate applications. This mechanism has been in place for over 20 years, but it was never exposed in the UI. There is an application called Desktops that allows us to control multiple desktops and switch between them. This may be useful when we need to automate applications that capture user input or steal focus (e.g., automated UI tests with Puppeteer in a headful mode).

In order to run an application on a different desktop, we need to specify the lpDesktop field in the STARTUPINFO structure. 
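For orientation, here is a hand-written sketch of the structure, following the field order of the Win32 STARTUPINFO definition. The StartupInfo type below is an illustration only; it is not the declaration used inside the .NET libraries, and string fields are kept as raw pointers to keep the struct blittable.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct StartupInfo
{
    public int cb;              // size of the structure in bytes
    public IntPtr lpReserved;
    public IntPtr lpDesktop;    // pointer to the desktop name, or null for the default
    public IntPtr lpTitle;
    public int dwX;
    public int dwY;
    public int dwXSize;
    public int dwYSize;
    public int dwXCountChars;
    public int dwYCountChars;
    public int dwFillAttribute;
    public int dwFlags;
    public short wShowWindow;
    public short cbReserved2;
    public IntPtr lpReserved2;
    public IntPtr hStdInput;
    public IntPtr hStdOutput;
    public IntPtr hStdError;
}

public static class StartupInfoDemo
{
    // Creates a structure with cb filled in and the desktop pointer set.
    public static StartupInfo Create(IntPtr desktopName)
    {
        return new StartupInfo
        {
            cb = Marshal.SizeOf<StartupInfo>(),
            lpDesktop = desktopName
        };
    }
}
```

With this layout, the structure is 104 bytes in a 64-bit process and 68 bytes in a 32-bit one, which is the value that cb has to carry.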

However, in C#, we don’t call this API directly: we use the wrapping code provided by the standard library. Unfortunately, if we examine the .NET Framework code, it doesn’t allow us to set the value of lpDesktop and always initializes it to a null value.

If we want to run the application on a different desktop in C#, there are a few solutions:

  • We can call the WinAPI directly, but then we lose the support of the .NET API and need to control processes on our own (including marshaling and input/output redirection)
  • We can copy the wrapping code on the side and modify it, but then we need to maintain it and keep updated when it changes in the standard library.
  • We can modify the code directly in place to inject the lpDesktop value. To do that, we can override sealed methods with the technique described earlier.

In order to hijack the code, we need to find a way to inject some of our code before the STARTUPINFO structure gets passed to the WinAPI but after it is created. We can use the constructor for that.

First, we get the method descriptor of the constructor:

var matchingType = AppDomain.CurrentDomain.GetAssemblies().SelectMany(a => a.GetTypes()).First(t => t.Name.Contains("STARTUPINFO"));
var constructor = matchingType.GetConstructor(new Type[0]);
var newConstructor = typeof(Program).GetMethod(nameof(NewConstructor), BindingFlags.Static | BindingFlags.Public);

Now, we hijack it with the following constructor replacement:

public static void NewConstructor(object startupInfo)
{
	startupInfo.GetType().GetField("cb", BindingFlags.Instance | BindingFlags.Public).SetValue(startupInfo, Marshal.SizeOf(startupInfo));
	startupInfo.GetType().GetField("lpDesktop", BindingFlags.Instance | BindingFlags.Public).SetValue(startupInfo, desktopNameStringHandle.AddrOfPinnedObject());
}

The original constructor sets the cb field to the correct value directly. Our replacement constructor sets the cb field via reflection and also sets the lpDesktop field to point to the name of the desktop we want to use.
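The article does not show how desktopNameStringHandle is created. One possible way, given here purely as an assumption, is to pin the managed string so that its address stays valid while the native call uses it. The PinnedDesktopName wrapper is hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class PinnedDesktopName : IDisposable
{
    private GCHandle _handle;

    public PinnedDesktopName(string desktopName)
    {
        // Pinning prevents the garbage collector from moving the string,
        // so the raw address remains valid until the handle is freed.
        _handle = GCHandle.Alloc(desktopName, GCHandleType.Pinned);
    }

    // Address of the first character of the pinned UTF-16 string.
    public IntPtr Address => _handle.AddrOfPinnedObject();

    public void Dispose()
    {
        if (_handle.IsAllocated)
        {
            _handle.Free();
        }
    }
}
```

The wrapper must stay alive for as long as the process creation code may read the pointer, and it should be disposed afterwards to unpin the string.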

So the code after hijacking works in the following way:

  • We call the Process.Start() method
  • Process.Start() creates an instance of the STARTUPINFO structure 
  • Instead of a regular constructor being called, our custom one gets executed, and we set the field values via reflection

I have been using this technique with .NET Framework and Windows Server 2012/2016 for many years.

Summary

We can see that by getting our hands on the internal structures, we can change the behavior of the platform. It requires an understanding of code generation, Operating System mechanisms and internals of the .NET platform. However, at the end of the day, these are just bytes that we can modify to suit our needs.

About the author

Adam Furmanek is a professional software engineer with over a decade of experience. In his career he worked with all layers of software engineering and multiple types of applications, including logistics, e-commerce, machine learning, data analysis and database management. He is always interested in digging deeper, exploring machine code and going through implementation details to better understand the internals of the technologies he uses every day. That's why he likes debugging, decompiling and disassembling the code to understand memory models, concurrency problems and other details hidden deeply inside. In his free time he plays ping-pong, watches Woody Allen's movies and blogs stuff.


 
