Welcome to our deep dive into the world of .NET malware reverse engineering. As a security researcher or analyst, you’re likely aware that the .NET framework, famed for its ability to enable rapid and robust application development, is a double-edged sword. The same features that make it attractive to legitimate developers also make it a favorite among malware authors.
So why invest time and effort in unraveling .NET malware? Simply put, the cyber threat environment is filled with malware built using .NET frameworks, and countering these threats necessitates a deep understanding of their underlying code structures and the ability to analyze their operational intricacies. Armed with knowledge and reverse engineering skills, you’ll be able to unpack the mechanics of malware, uncover its attack vectors, and strengthen defenses against these threats.
This blog aims to demystify the process of .NET reverse engineering, making it more approachable and comprehensible. Our journey will equip you with the essential knowledge to analyze .NET files and gain insights into their harmful functionality and behaviors. We’ll start with an in-depth exploration of the .NET executable format to provide you with a solid understanding of the structure of these files. This will be followed by an extensive review of the tools and techniques available for reverse engineering .NET malware, ensuring you are well-prepared to tackle these threats.
The .NET Framework
The .NET software development framework and ecosystem, developed by Microsoft, was first released in 2002. It is designed to provide a controlled programming environment where software can be developed, installed, and executed on Windows-based operating systems. With the introduction of .NET Core, now succeeded by .NET 5 and later versions, it has expanded to offer cross-platform support for Linux and macOS. Unlike a traditional programming language, .NET is not a language but a framework supporting multiple managed languages, including C#, VB.NET, and F#.
The .NET framework includes a large class library known as the Framework Class Library (FCL), which affords developers a comprehensive range of ready-to-use, tested, and optimized functionality ranging from data access to encryption to XML parsing. This extensive support and the managed execution environment make .NET distinctive from languages and frameworks that require more manual intervention for such tasks. Combining these features creates a productive environment favored for a range of applications, from web to desktop to mobile.
.NET Threat Landscape
Malware developers might prefer the .NET framework over C/C++ due to its user-friendly development process, rich feature set, and smooth integration with Windows. However, tools like dnSpy make it simpler to reverse engineer .NET-based malware, prompting these creators to employ obfuscation methods to make analysis harder. Additionally, runtime features such as reflection and dynamic code loading let .NET malware change its behavior or hide its payload, which makes it harder to detect and reverse-engineer. While C/C++ allows for finer control over system resources and could lead to more discreet and efficient malware, it requires a more in-depth knowledge of the system’s inner workings. This makes .NET a more appealing choice for those looking for a balance between development speed and capability.
The .NET threat landscape continually evolves, with attackers regularly exploiting the adaptable and widely adopted .NET framework to craft and deploy a diverse array of sophisticated threats. This framework underpins many cyber threats, like the notorious ransomware Locky and Killnet. Credential stealers, including RedLine Stealer, and banking trojans, such as CryptoClippy, are also .NET-based threats. Additionally, destructive wipers written in .NET are emerging, with examples like DoubleZero and the more recently discovered Hatef Wiper illustrating this trend.
Moreover, .NET has also been instrumental in creating remote access trojans (RATs). Examples include QuasarRAT and NanoCore, praised in underground circles for their rich feature sets and the ease with which they can be modified and obfuscated. Additionally, .NET is a common tool for creating malware loaders, which discreetly install and execute other types of malware.
The Process of .NET Compilation and Runtime
Compilation – Managed Code
The execution of managed languages (C#, F#, or VB.NET) is controlled by the runtime. When the appropriate language compiler compiles the source code, the output is Intermediate Language (IL), also known as MSIL (Microsoft Intermediate Language) or Common Intermediate Language (CIL). Code in this form is called managed code.
For example, when C# code is compiled in the .NET framework, the output of the C# compiler (csc.exe) is a .NET assembly. This assembly can be an executable file (EXE) for standalone programs or a dynamic-link library (DLL) for reusable libraries.
This is unlike the compilation of unmanaged languages like C/C++, where the source code is translated directly to machine code. The managed code is then packaged into an assembly accompanied by a manifest that contains the necessary metadata.
The beauty of managed code in this process is its portability and flexibility; the same assembly can run on any platform supported by .NET without recompilation. Additionally, managed code allows for cross-language inheritance and code access security. It also provides late-binding support, whereby method calls can be resolved at runtime rather than compile time. This level of abstraction provided by the managed code and assembly structure is a cornerstone of the versatility and strength of the .NET framework, enabling developers to create applications that are more secure, manageable, and adaptable to change.
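Because a compiled assembly is still an ordinary PE file (EXE or DLL), a common first triage step is to check whether the file carries a CLR runtime header at all. The sketch below is a minimal illustration using only the Python standard library, not a full PE parser. It relies on the PE convention that data directory entry 14 points to the CLR header. The function names clr_directory and looks_like_dotnet are hypothetical helpers introduced here for illustration.

```python
import struct

# Data directory index reserved for the CLR runtime header
# (IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR in the PE format).
CLR_DIRECTORY_INDEX = 14


def clr_directory(data):
    """Return (rva, size) of the CLR header directory, or (0, 0) if absent.

    Hypothetical helper: assumes `data` holds the raw bytes of a PE file.
    """
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32
        dirs = opt + 96
    elif magic == 0x20B:    # PE32+
        dirs = opt + 112
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes sits directly before the data directory array.
    (count,) = struct.unpack_from("<I", data, dirs - 4)
    if count <= CLR_DIRECTORY_INDEX:
        return (0, 0)
    # Each directory entry is 8 bytes: a 4-byte RVA and a 4-byte size.
    return struct.unpack_from("<II", data, dirs + CLR_DIRECTORY_INDEX * 8)


def looks_like_dotnet(data):
    """True when the CLR header directory entry is populated."""
    rva, size = clr_directory(data)
    return rva != 0 and size != 0
```

A populated entry only suggests a managed assembly; packed or deliberately malformed samples may need a closer look in a tool such as dnSpy.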
The illustration below demonstrates the compilation and execution process in .NET.
We demonstrate this using C#, but it applies to all .NET languages.
Alongside the IL, every assembly carries metadata. This includes essential information for debugging, garbage collection, security attributes, and other details the runtime needs to manage the code. The metadata includes the assembly’s name, version, culture, and potentially a strong name for unique identification. The file also contains the runtime header, which guides the Common Language Runtime (CLR) in executing the assembly properly.
The metadata header organizes streams such as the #~ (Tilde) Stream, #Strings Stream, #US (User Strings) Stream, #GUID Stream, #Blob Stream, and #Pdb Stream, which store information about types, methods, fields, and other elements defined in the assembly. The #~ stream holds the metadata tables, #Strings holds identifier names, #US holds string literals used by the code, #GUID holds GUIDs, #Blob holds signatures and other binary data, and #Pdb holds debugging information. Together, these streams give the runtime and tools like dnSpy and ILDasm what they need to access and manage the code.
The information about the types defined in the code (classes, interfaces, enums, etc.), member definitions (methods, properties, fields, events), references to other types and members, and the assembly itself is stored in Metadata Tables within the PE file. These tables contain various types of information, such as:
1. Definition Tables:
– TypeDef Table: Contains details of classes or interfaces, including name, visibility, base type, and methods or properties.
– MethodDef Table: Details of methods, including name, signature, and the RVA of the IL code.
– Field Table: Details of fields (class variables), including name and type.
2. Reference Tables:
– TypeRef Table: Information about types defined in other assemblies.
– MemberRef Table: Descriptions of members defined in other modules or assemblies.
3. Manifest Metadata Tables:
– Assembly Table: Information about the assembly, such as name, version, culture, and strong name signature.
– AssemblyRef Table: Details of other assemblies that this assembly depends on.
4. Other Metadata Tables:
– Module Table: Information about the current module.
– CustomAttribute Table: Details of custom attributes applied to elements within the assembly.
– Event Table and Property Table: Describe events and properties.
– Param Table: Information about method parameters.
– Constant Table: Stores constants defined in the code.
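To see which metadata streams an assembly actually contains, the metadata root can be walked directly. The sketch below assumes the layout given in ECMA-335 (a "BSJB" signature, a version string, then one header per stream) and assumes the caller has already located the metadata root and passes a buffer that starts at it. The name parse_metadata_root is a hypothetical helper, and error handling is kept minimal.

```python
import struct

METADATA_SIGNATURE = 0x424A5342  # the bytes "BSJB" read as a little-endian integer


def parse_metadata_root(blob):
    """Return (version, [(stream_name, offset, size), ...]).

    Assumes `blob` begins at the metadata root; stream offsets are
    relative to that same starting point.
    """
    sig, _major, _minor, _reserved, version_len = struct.unpack_from("<IHHII", blob, 0)
    if sig != METADATA_SIGNATURE:
        raise ValueError("metadata root signature (BSJB) not found")
    pos = 16
    # The version string is null-terminated and padded to a 4-byte boundary;
    # version_len already includes that padding.
    version = blob[pos:pos + version_len].split(b"\0", 1)[0].decode("utf-8")
    pos += version_len
    _flags, stream_count = struct.unpack_from("<HH", blob, pos)
    pos += 4
    streams = []
    for _ in range(stream_count):
        offset, size = struct.unpack_from("<II", blob, pos)
        pos += 8
        end = blob.index(b"\0", pos)       # stream names are null-terminated ASCII
        name = blob[pos:end].decode("ascii")
        pos = (end + 1 + 3) & ~3           # the name field is padded to 4 bytes
        streams.append((name, offset, size))
    return version, streams
```

An unusual or missing stream name in the result is worth a second look, since some obfuscators are known to tamper with stream headers.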
Metadata tokens are unique identifiers assigned to metadata elements within these tables, enabling the CLR to efficiently reference and access them. A token is a 4-byte value: the high byte indicates the metadata table, and the remaining three bytes hold the row index of the specific entry. For example, the token 0x06000001 refers to the first row of the MethodDef table (table 0x06). The location data lives in the table rows rather than in the token itself: each MethodDef row stores the RVA of the method’s body, which the runtime uses to locate and execute the IL code, and analysis tools convert that RVA to a file offset within the assembly file for analysis or manipulation. For instance, the JIT compiler utilizes metadata tokens to access method signatures, type information, and other details when compiling IL to native code. This metadata token system also supports dynamic CLR features like reflection, enabling the effective execution of runtime services such as type safety, security checks, and cross-language interoperability.
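The token layout described above (table in the high byte, row index in the low three bytes) can be decoded with a shift and a mask. The sketch below is a minimal illustration. The decode_token name is hypothetical, and the lookup covers only the tables listed in this post, using the table numbers from ECMA-335.

```python
# Table numbers from ECMA-335 for the tables discussed in this post.
# 0x70 is not a table: it marks an offset into the #US (user strings) heap.
TABLE_NAMES = {
    0x00: "Module",
    0x01: "TypeRef",
    0x02: "TypeDef",
    0x04: "Field",
    0x06: "MethodDef",
    0x08: "Param",
    0x0A: "MemberRef",
    0x0B: "Constant",
    0x0C: "CustomAttribute",
    0x14: "Event",
    0x17: "Property",
    0x20: "Assembly",
    0x23: "AssemblyRef",
    0x70: "UserString",
}


def decode_token(token):
    """Split a 32-bit metadata token into (table_name, row_index)."""
    if not 0 <= token <= 0xFFFFFFFF:
        raise ValueError("a metadata token is a 32-bit unsigned value")
    table = token >> 24            # high byte: which table
    row = token & 0x00FFFFFF       # low three bytes: 1-based row index
    name = TABLE_NAMES.get(table, "Unknown(0x%02X)" % table)
    return name, row
```

For example, decode_token(0x06000001) yields the MethodDef table and row 1, which matches the table number and entry number that dnSpy shows for a method.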
An example of the metadata tables can be seen in the start function of the SolarWinds malware. By examining the metadata of the start method, we can identify the metadata table number and the entry number in that table. This information helps in understanding the structure and organization of the assembly.
Furthermore, the .NET manifest plays a crucial role in describing the hierarchy and relationships within an assembly. It contains the data the assembly needs to function: the assembly name, version number, culture information, security identity, the list of files in the assembly, type reference information, and information on referenced assemblies. This makes the assembly self-descriptive and helps manage dependencies and versioning effectively. The common language runtime relies on this information to enforce version policies, ensure the integrity and completeness of the assembly, and maintain type safety and correctness.
Additionally, in .NET, each method body begins with a header encoded in one of two formats: the “Tiny” format and the “Fat” format. The Tiny header is a single byte and is used for small methods that meet specific criteria: the IL code is shorter than 64 bytes, the method has no local variables or exception handlers, and the operand stack never needs more than 8 items. The Fat header is 12 bytes long and is used for larger, more complex methods that exceed the limitations of the Tiny header. The compiler chooses between the two based on the method’s complexity and requirements.
In the Fat header, the Flags field is followed by a header size value that describes the header only, not the IL code that follows it; it is expressed in 4-byte units, so it is normally 3. The MaxStack field indicates the maximum number of items on the operand stack during method execution, while the CodeSize field specifies the size of the method body’s IL code in bytes. The LocalVarSigTok field contains the metadata token for the signature of local variables and is zero if the method has none. Additional sections, such as exception handling clauses, follow the IL code if the corresponding bit is set in the Flags.
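The two header formats can be told apart from the low two bits of the first byte. The sketch below follows the bit layout from ECMA-335 and is meant as an illustration rather than a complete method body parser: it does not read the IL code or any extra sections. The parse_method_header name and the dictionary keys are hypothetical choices made for this example.

```python
import struct

TINY_FORMAT = 0x2   # low two bits of the first byte for a Tiny header
FAT_FORMAT = 0x3    # low two bits of the first byte for a Fat header
MORE_SECTS = 0x8    # Fat flag: extra sections (e.g. exception clauses) follow the IL
INIT_LOCALS = 0x10  # Fat flag: local variables are zero-initialized


def parse_method_header(body):
    """Parse the header at the start of a method body into a dict."""
    first = body[0]
    kind = first & 0x3
    if kind == TINY_FORMAT:
        # One byte: the upper six bits are the IL size; MaxStack is implicitly 8.
        return {
            "format": "tiny",
            "header_size": 1,
            "max_stack": 8,
            "code_size": first >> 2,
            "local_var_sig_tok": 0,
            "more_sects": False,
            "init_locals": False,
        }
    if kind == FAT_FORMAT:
        flags_and_size, max_stack, code_size, local_tok = struct.unpack_from(
            "<HHII", body, 0
        )
        flags = flags_and_size & 0x0FFF           # lower 12 bits: flags
        header_size = (flags_and_size >> 12) * 4  # upper 4 bits: size in 4-byte units
        return {
            "format": "fat",
            "header_size": header_size,
            "max_stack": max_stack,
            "code_size": code_size,
            "local_var_sig_tok": local_tok,
            "more_sects": bool(flags & MORE_SECTS),
            "init_locals": bool(flags & INIT_LOCALS),
        }
    raise ValueError("first byte does not start a Tiny or Fat method header")
```

Given the header bytes copied from a hex view, the returned code_size tells you how many bytes of IL follow the header, which helps when carving a method body out by hand.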
For a more in-depth look at the structure of these headers, you can refer to this resource.
An example of a method body is provided with the function DeleteDiscoveryProfileInternal, with its header highlighted in the HEX view window. By clicking on the offset or RVA, you can navigate to the header of the method. dnSpy highlights the header fields when you hover over the bytes in the header, and the bytes that follow the header are the function’s instructions. To further analyze the instructions’ values (opcodes), IDA can be used to view the malware in more detail.
In conclusion, this exploration of .NET executable file structures sheds light on the complexities of the .NET framework and its dual use for both legitimate development and malware creation. By understanding the compilation process, runtime execution, metadata, and assemblies, one can effectively analyze and counteract .NET-based threats through reverse engineering.
The appendix includes a table of token types with their respective values and descriptions, offering a comprehensive reference for understanding the different types of tokens in .NET files.