Category: Forensics

Unpacking StrelaStealer

I was digging into a new version of StrelaStealer the other day and I figured it may help someone if I wrote a quick blog post about it. This post is not an in-depth analysis of the packer. It’s just one method of quickly getting to the Strela payload.

Here is the sample I am analysing (SHA256:3b1b5dfb8c3605227c131e388379ad19d2ad6d240e69beb858d5ea50a7d506f9). Before proceeding, make sure to disable ASLR for the Strela executable by setting its DLL characteristics. Ok, let’s dig in.
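If you’d rather script the ASLR step than click through a PE editor, here is a minimal sketch. It assumes that “disabling ASLR” means clearing the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE bit (0x0040) in the DllCharacteristics field of the PE optional header. The function name and the file names in the usage comment are my own, not part of any tool.

```python
import struct

# Flag that tells the Windows loader the image may be relocated (ASLR).
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040
# DllCharacteristics sits 70 bytes into the optional header (PE32 and PE32+).
DLL_CHARACTERISTICS_OFFSET = 70


def clear_dynamic_base(pe_bytes: bytes) -> bytes:
    """Return a copy of the PE image with the DYNAMIC_BASE flag cleared."""
    data = bytearray(pe_bytes)
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C) points at the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    field = e_lfanew + 4 + 20 + DLL_CHARACTERISTICS_OFFSET
    (flags,) = struct.unpack_from("<H", data, field)
    struct.pack_into("<H", data, field, flags & ~IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)
    return bytes(data)


# Usage (hypothetical file names):
# patched = clear_dynamic_base(open("strela.exe", "rb").read())
# open("strela_noaslr.exe", "wb").write(patched)
```

Work on a copy of the sample, since this rewrites a header field.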

A quick assessment of the executable in PEStudio reveals a few interesting things that I’ve highlighted. Note the TLS storage (callbacks). When the sample first executes, two TLS callbacks run before the entry point, as we’ll see in a bit.

Viewing the strings in PEStudio reveals several large strings with high-entropy data. These strings are part of the packed payload.
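To put a number on “high-entropy”, here is a small sketch that computes the Shannon entropy of a byte string. This is a generic measure, not what PEStudio does internally (I don’t know its exact method). Values close to 8 bits per byte usually point to compressed or encrypted data.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Usage (hypothetical file name):
# print(shannon_entropy(open("strela.exe", "rb").read()))
```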

Let’s open the file in a disassembler to investigate further. I’ll be using IDA Pro for this analysis. If we inspect the “histogram” at the top of the IDA Pro window, we can see a large olive green segment which indicates data or code that IDA can’t make sense of. IDA Pro calls this data blob unk_14012A010:

As we saw in the strings earlier, this is likely the packed payload. I’ll rename this blob in IDA Pro to obfuscated_payload_blob. If we view the cross-references to this blob (ctrl+x in IDA), we can see several references:

Double-click one of these (I’ll select the 2nd one from the bottom), and you’ll see the following:

It seems our blob is being loaded into register rdx (lea rdx, obfuscated_payload_blob), and a few instructions later there is a call instruction to the function sub_140096BA0. Inspect the code of this function and you may notice quite a few mathematical instructions (such as add and sub), as well as lots of mov instructions and a loop. All of this suggests that this is very likely a deobfuscation routine. Let’s rename this function deobfuscate_data. We won’t be analysing the unpacking code in depth, but if you wish to do so, you should rename the functions you analyse in a similar manner to help you make sense of the code.

If we then get the cross-references to the deobfuscate_data function, we’ll see similar output to the cross-references for the obfuscated payload blob:

Inspect these more closely and you’ll see that the obfuscated blob is almost always being loaded into a register followed by a call to the deobfuscate_data function. This malware is unpacking its payload in multiple stages.

If we walk backwards to identify the “parent” function of all this decryption code, we should eventually spot a call to a qword address (0x14008978D) followed by a return instruction. This call looks like a good place to put a breakpoint as this is likely the end of the deobfuscation routine (given that there is also a return instruction that will take us back to the main code):

Let’s test this theory by launching the malware in a debugger (I’ll be using x64dbg). When you run the malware, you’ll hit two TLS callbacks (remember I mentioned those earlier?), like the below:

Just run past these. TLS callbacks are normally worth investigating in malware but in this case, we are just trying to unpack the payload here and will not investigate these further. You’ll eventually get to the PE entry point:

Put a breakpoint on the call instruction at 0x14008978D (using the command bp 14008978D) and run the malware. You should break on that call instruction:

If we step into this call instruction, we’ll get to the OEP (original entry point) of the payload! Inspect the Memory Map and you’ll see a new region of memory with protection class ERW (Execute-Read-Write):

This new memory segment (highlighted in gray in the image above) contains our payload. Don’t believe me? Dump it from memory (right-click -> Dump to file) and take a look at the strings. You should see something like the following:

You’ll spot some interesting data like an IP address, a user agent string, registry keys, and so on. If you don’t see any cleartext strings, you likely dumped the payload too early (before the malware deobfuscated all the data in this memory region), or too late, after the malware cleared its memory. Start reading this blog post again and try again ☺

Let’s open this dump file in IDA. After opening the file in IDA, be sure to rebase it (Edit -> Segments -> Rebase Program) to match it to the memory address in x64dbg:

After opening this dumped payload in IDA, you’ll see some inconsistencies, however:

See the problem? Some call instructions are not resolved to function names. However, in x64dbg, these functions are labeled properly:

This is because in x64dbg, these function names are being resolved to addresses in memory. In our IDA dump, they are not mapped properly.

Normally, what I would do next is try to get my IDA database as close as possible to the code in x64dbg. We could spend more time analysing the unpacking code to identify where the malware is resolving its imports, and this may help us get a better dump of the payload. Or, we could automate this by writing a Python script to export all function names from x64dbg and import them into IDA. But why spend 1 hour automating something when we can spend 2 hours doing it manually? 🙂

We can manually fix this up in IDA by cross-referencing each unknown function with the function name in x64dbg. For example, at address 0x1B1042 there is a call to InternetOpenA (according to our x64dbg output), and at address 0x1B107B there is a call to InternetConnectA.
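If the manual route gets tedious, here is a rough sketch of a halfway option. You type the call-site addresses and API names you read off x64dbg into a dictionary, and the function below generates an IDAPython script that adds each name as a comment at the call site. The dictionary and the function name are my own invention. I am not assuming any particular x64dbg export format here, and the addresses must match your rebased IDA database.

```python
def build_idapython_comments(call_sites: dict[int, str]) -> str:
    """Build IDAPython source that comments each call site with its API name."""
    lines = ["import idc", ""]
    for address, api_name in sorted(call_sites.items()):
        # idc.set_cmt(ea, comment, repeatable) adds a regular comment in IDA.
        lines.append(f'idc.set_cmt(0x{address:X}, "{api_name}", 0)')
    return "\n".join(lines)


# The two call sites mentioned above, entered by hand.
call_sites = {0x1B1042: "InternetOpenA", 0x1B107B: "InternetConnectA"}
print(build_idapython_comments(call_sites))
```

Save the printed output to a file and run it in IDA via File -> Script file.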

And now, we have something a lot more readable in IDA:

After you spend a bit of time manually renaming the unknown functions in your IDA database file, you should have some fairly readable code. Congrats! You unpacked Strela’s payload. Spend some time analysing the payload and see what you can learn about this sample.

Happy reversing! 🙂

— d4rksystem

Creating Quick and Effective Yara Rules: Working with Strings

This is a quick post to outline a few ways to extract and identify useful strings for creating quality Yara rules. This post focuses on Windows executable files, but can be adapted to other file types. Let’s start with an overview of the types of strings we are interested in when developing Yara rules.

tl;dr

In this post, you will learn:

  • How to extract ASCII and Encoded strings from malware samples.
  • How to analyse strings from a malware sample set and choose strings for your Yara rule.
  • Tips and other tools to assist in Yara rule creation.

ASCII vs. Encoded Strings

Windows executables normally contain both ASCII and encoded strings. A “string” typically refers to a sequence of alphanumeric and special characters arranged in a specific order. Strings are used to represent various types of data, including file names, paths, URLs, and other content within files. ASCII and encoded strings refer to different ways of representing those characters as bytes.

ASCII is a character encoding standard that uses numeric codes to represent characters, and an ASCII string is simply text stored one byte per character in that encoding. ASCII is straightforward, but it has limitations when it comes to representing characters from other languages or special symbols. Encoded strings generally refer to text represented with a different character encoding scheme, such as Unicode. In Windows executables this usually means UTF-16 (16-bit Unicode Transformation Format), whose strings are sometimes referred to as “wide” strings. When writing Yara rules for Windows executables, we normally want to focus on both ASCII and Unicode strings. So, how do we extract these strings from an executable file? Glad you asked.

Extracting Strings

The simplest way to extract ASCII strings is using the strings tool in Linux/Unix (also available in Windows and MacOS). Execute the command on your malware executable target, and save the output to a text file like so:

strings -n 4 malware.exe > malware-ascii-strings.txt

Encoded strings are also easy to extract:

strings -n 4 -e l malware.exe > malware-encoded-strings.txt
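If you want to do the same thing from a script, here is a minimal Python sketch that roughly mirrors the two commands above. It is an approximation rather than a reimplementation of the strings tool: it only treats bytes 0x20 to 0x7E as printable, and the “wide” pattern only catches UTF-16LE text made of those same characters. The function names are mine.

```python
import re

# Runs of 4+ printable ASCII bytes (similar to: strings -n 4).
ASCII_RE = re.compile(rb"[\x20-\x7e]{4,}")
# Runs of 4+ printable characters, each followed by a null byte
# (similar to: strings -n 4 -e l, i.e. 16-bit little-endian).
WIDE_RE = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")


def ascii_strings(data: bytes) -> list[str]:
    return [m.group().decode("ascii") for m in ASCII_RE.finditer(data)]


def wide_strings(data: bytes) -> list[str]:
    return [m.group().decode("utf-16-le") for m in WIDE_RE.finditer(data)]


# Usage (hypothetical file name):
# data = open("malware.exe", "rb").read()
# print("\n".join(ascii_strings(data) + wide_strings(data)))
```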

Once we have our strings, let’s dump them into a Yara rule, shall we? Heh… Not so fast, cowboy. We have some strings analysis work to do first.

Analyzing Strings

One of the challenges with using strings for detecting malware is that there are so.. many.. strings. A single executable file could have thousands. How do we know the good strings, from the bad strings, from the ugly strings? How can we know which to include in our Yara rule?

If you have a single malware executable, you’ll have lots of strings to dig through (depending on the size of the executable file, of course). The trick is to identify the strings that are likely related to the malware itself, while filtering out the strings we are not interested in (such as compiler data and code, common strings that also reside in benign files, etc.). It takes experience to know what to look for and what to ignore.

If you have a number of files of the same malware family, this process can be a bit more efficient. What we need to do is gather our malware sample set, extract all strings from these samples, and compare these strings to identify the strings we should zero in on for our Yara rule.

This malware sample set must meet the following requirements:

  • The malware samples should be from the same malware family. For example, if you are developing a Yara rule for Ryuk ransomware, all samples should be Ryuk ransomware, otherwise bad samples/strings will taint your Yara rule.
  • The malware samples should be unpacked/deobfuscated. If the samples are packed, encrypted, obfuscated, etc., you are no longer writing a Yara rule for the malware itself, but rather for the packer/obfuscator. If this is your intention, that’s perfectly fine, as there are valid use cases for this as well!
  • The malware samples should be of the same file type. It’s not a good idea to mix Windows executables with MS Office documents, for example.
  • The more malware samples you have in your set, the more accurate your Yara rule could be.

We can extract and analyse all strings in a malware sample set with a one-liner command. First, make sure you have your malware samples together in one directory called “samples”. (I am assuming you are on a *Nix system here, but the following command can be adapted for Windows as well with a bit of work):

for file in ./samples/*; do strings -n 4 "$file" | sort -u; done | sort | uniq -c | sort -rn > count_malware_strings.txt

In the above command, we create a for loop that iterates over all files in our samples directory (“samples”). Each file’s strings are extracted, sorted, and de-duplicated, so a string is counted at most once per sample. Finally, we prefix each string with a “count” value and save the result to a text file, “count_malware_strings.txt”. Here is a screenshot of the result:

You may be able to spot some interesting strings. The number “9” next to each line denotes the number of samples this string resides in. My sample set consists of 9 samples, so each string with a 9 next to it means that this string resides in all my malware samples!
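The same counting logic can be sketched in Python if you are not on a *Nix box. This is my own rough equivalent of the one-liner, not a drop-in replacement: it only looks at printable ASCII runs of 4 or more bytes, and the function name and the “samples” dictionary are made up for illustration.

```python
import re
from collections import Counter

ASCII_RE = re.compile(rb"[\x20-\x7e]{4,}")


def count_strings_across_samples(samples: dict[str, bytes]) -> Counter:
    """Map each string to the number of samples it appears in."""
    counts = Counter()
    for data in samples.values():
        # set() ensures a string is counted once per sample, like sort | uniq.
        counts.update(set(ASCII_RE.findall(data)))
    return counts


# Usage (hypothetical directory layout):
# from pathlib import Path
# samples = {p.name: p.read_bytes() for p in Path("samples").iterdir() if p.is_file()}
# for text, n in count_strings_across_samples(samples).most_common(50):
#     print(n, text.decode("ascii"))
```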

We should also run this same command, but for encoded strings:

for file in ./samples/*; do strings -n 4 -e l "$file" | sort -u; done | sort | uniq -c | sort -rn > count_malware_strings_encoded.txt

Here is the result:

See any interesting strings here? Perhaps the references to WMI (SELECT * …), the sandbox-related strings (“sandbox”), and strings such as “Running Processes.txt”?

Selecting Strings for the Yara Rule

So, now we have a much better idea of what strings to use in our Yara file. Ideally, we’ll want to select strings that are in all or most of the sample set. Selecting strings that are in only one file may result in lots of false-positives (depending on what type of rule you are creating and what your objectives are, of course). However, selecting only strings that appear in all files may result in your Yara rule being too specific. Again, this will depend on your objectives for the rule.

Consider also that even though you are dealing with malware, there will be “benign” strings (sometimes called “goodware strings”) in these files that are not part of the malware’s code or functionalities. You’ll likely want to weed these out. Optionally, you could create a goodware strings database or list that simply contains strings you wish to exclude from your Yara rules. But this is a topic for another day.

Creating our Yara Rule

Based on the strings I observed in the strings text files I created previously, I chose the following strings and created my basic Yara rule:

Notice how I added the “wide” attribute to some of the strings. This tells Yara that these are UTF-16 encoded strings. For the conditions at the bottom, I am specifically looking for samples that have the header bytes 0x5A4D (the “MZ” signature found at the start of Windows PE files), and the sample must contain 15 or more of these strings. Lowering this number will result in more of a “hunting” rule, where you may catch additional malware (with a wider net) but have more false positives. Increasing this number will create a higher-fidelity rule, but one that may be too specific.
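To make the condition logic concrete, here is a small Python sketch that mimics it. This is not Yara and not my actual rule. It just emulates the idea of “MZ header plus at least N of these strings”, checking each candidate in both its ASCII and wide (UTF-16LE) forms. The function name and the candidate list in the usage comment are placeholders.

```python
def matches_rule(data: bytes, candidates: list[str], minimum: int) -> bool:
    """Emulate 'MZ header and at least `minimum` of the candidate strings'."""
    # 0x5A4D read as a little-endian 16-bit value is the byte pair "MZ".
    if data[:2] != b"MZ":
        return False
    hits = 0
    for text in candidates:
        ascii_form = text.encode("ascii")
        wide_form = text.encode("utf-16-le")  # what the "wide" modifier looks for
        if ascii_form in data or wide_form in data:
            hits += 1
    return hits >= minimum


# Usage (hypothetical candidate list and file name):
# strings_of_interest = ["sandbox", "Running Processes.txt"]
# print(matches_rule(open("sample.exe", "rb").read(), strings_of_interest, 2))
```

Playing with the minimum value on your own sample set is a quick way to get a feel for the hunting vs. high-fidelity trade-off before you commit it to a rule.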

Other Tools and Tips

Here are a few other random tips/tricks for dealing with strings in Yara rules:

PE Studio – PE Studio is a great PE executable file analysis tool that also has a nice “goodware” and “malware” strings database built-in. You can open an executable file in PE Studio and the tool will provide you with some hints on which strings may be interesting.

Strings-Sifter – A tool created by Mandiant, it can “sift” through strings and sort them based on how unique or “malicious” they are. This is very useful for quickly identifying the interesting strings.

Yargen – A full-on cheatmode for Yara rules. Yargen is a tool from Florian Roth that takes an input sample set and automatically generates Yara rules based on interesting strings or code in the files. This is a great tool if you are pressed for time or if you have lots of rules to create. However, nothing beats a well-tuned, manually-written rule (in my humble, old-school, boomer opinion). Also, if you are new to Yara and/or malware analysis, stay away from the automatic tools and just do it manually, please 🙂

Conclusion

I hope this short post helps you create better Yara rules! If you have further suggestions or ideas, send them to me and I may include them in this post or in future posts!

@d4rksystem

How Malware Abuses the Zone Identifier to Circumvent Detection and Analysis

I was investigating a malware sample that uses an interesting trick to circumvent sandboxes and endpoint defenses by simply deleting its zone identifier attribute. This led me on a tangent where I began to research more about zone identifiers (which, embarrassingly enough, I had little knowledge of prior). Here are the results of my research.

The Zone.Identifier is a file metadata attribute in Windows that indicates the security zone where a file originated. It is used to indicate a level of trustworthiness for a file when it is accessed, and helps Windows determine the security restrictions that may apply to the file. For example, if a file was downloaded from the Internet, the zone identifier will indicate this, and extra security restrictions will be applied to this file in comparison to a file that originated locally on the host.

The zone identifier is stored in an alternate data stream (ADS) attached to the file, rather than in the file’s main content. There are five possible zone identifier values that can be applied to a file, represented as numerical values of 0 to 4:

  • Zone identifier “0”: Indicates that the file originates on the local machine. This file will have the least security restrictions.
  • Zone identifier “1”: Indicates that the file originated on the local Intranet (local network). Both zone identifier 0 and 1 indicate a high level of trust.
  • Zone identifier “2”: Indicates that the file was downloaded from a trusted site, such as an organization’s internal website.
  • Zone identifier “3”: Indicates that the file was downloaded from the Internet and that the file is generally untrusted.
  • Zone identifier “4”: Indicates that the file came from a likely unsafe source. This zone is reserved for files that must be treated with extra caution as they may contain malicious content or pose a security risk.

You can use the following PowerShell command to check if a file has a zone identifier ADS:

Get-Item <file_path> -Stream zone*

An example of this output can be seen below. Notice the highlighted area that denotes the ADS stream (“Zone.Identifier”) and its length. Also note that if no data is returned after running this command, the file likely does not have a zone identifier stream.

To view this file’s zone identifier stream, you can use the following PowerShell one-liner:

Get-Content <file_path> -Stream Zone.Identifier

An example of this can be seen below:

A zone identifier file will look something like this:

[ZoneTransfer]
ZoneId=3
ReferrerUrl=https://www.evil.com
HostUrl=https://download.evil.com/malware.doc

In this example, this Zone.Identifier indicates that the associated file originates from “zone 3”, which typically corresponds to the Internet zone. The ReferrerUrl denotes the domain of the webpage where the file was downloaded from or potentially the referrer domain, and the HostUrl specifies the precise location where the file was downloaded from.
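Since the stream content is laid out like an INI file, it is easy to parse in a script. Here is a minimal sketch using Python’s configparser. The function name and the returned dictionary keys are my own choices, and it assumes you have already read the stream text (for example with the Get-Content command above).

```python
import configparser


def parse_zone_identifier(stream_text: str) -> dict:
    """Parse the text of a Zone.Identifier stream into a small dictionary."""
    # interpolation=None so that '%' characters in URLs are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(stream_text)
    section = parser["ZoneTransfer"]
    return {
        "zone_id": section.getint("ZoneId"),
        "referrer_url": section.get("ReferrerUrl"),  # None if not present
        "host_url": section.get("HostUrl"),          # None if not present
    }


example = """[ZoneTransfer]
ZoneId=3
ReferrerUrl=https://www.evil.com
HostUrl=https://download.evil.com/malware.doc
"""
print(parse_zone_identifier(example))
```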

The zone identifier is also referred to as the Mark of the Web (MoTW). Any file that originates from Zone 3 or Zone 4, for example, is said to have the mark of the web.

Malware can abuse the zone identifier in a few different ways, with a couple different goals:

Defense Evasion

Malware can manipulate the zone identifier value to spoof the trust level of a file. By assigning a lower security zone to a malicious file, the malware can trick Windows and defense controls into treating the file as if it came from a trusted source.

To accomplish this, malware can simply modify its files’ zone identifiers. Here is how this can be accomplished via PowerShell:

Set-Content file.exe -Stream Zone.Identifier -Value "[ZoneTransfer]`nZoneId=1"

This PowerShell one-liner modifies a file’s zone identifier to be a certain value (in this case, setting the zone ID to “1”). This may help the malware slip past certain defensive controls like anti-malware and EDR, and may make the malware look less suspicious to an end user.

Or, the zone identifier stream can simply be deleted, which may trick some defense controls. In order to attempt to bypass defenses, a variant of the malware family SmokeLoader does exactly this. SmokeLoader calls the Windows API function DeleteFile (see code below) to delete its file’s zone identifier stream. You can investigate this for yourself in a SmokeLoader analysis report from JoeSandbox (SHA256: 86533589ed7705b7bb28f85f19e45d9519023bcc53422f33d13b6023bab7ab21).

DeleteFileW (C:\Users\user\AppData\Roaming\ichffhi:Zone.Identifier)

Alternatively, malware authors can wrap their malware in a container such as an IMG or ISO file. Files extracted from or run out of these containers often do not carry a zone identifier. Red Canary has a great example in this report.

Anti-Analysis and Sandbox Evasion

Malware may inspect the zone identifier of a file to circumvent analysis. Malicious files that are submitted to an analysis sandbox or are being analysed by a reverse engineer may have a different zone identifier than the original identifier the malware author intended. When the malware file is submitted to a sandbox, the zone identifier may be erroneously set to 0, when the original value is 3. If malware detects an anomalous zone identifier, it may cease to execute correctly in the sandbox or lab environment.

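The Python sketch below demonstrates the logic of how malware may check its file’s zone identifier:

import re
import sys

# Path of the running file; its zone identifier lives in an alternate data stream
current_file_path = sys.argv[0]
zone_identifier_path = current_file_path + ":Zone.Identifier"

try:
    with open(zone_identifier_path, "r") as f:
        zone_info = f.read()
except OSError:
    # No stream at all, which is also unexpected for a downloaded file
    zone_info = ""

match = re.search(r"ZoneId=(\d+)", zone_info)

# Check if the zone is the Internet zone (zone ID 3 or higher)
if match and int(match.group(1)) > 2:
    # File is from the Internet zone (as expected), continue running
    pass
else:
    # File may be running in a sandbox or analysis lab!
    sys.exit(1)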

If you are craving more information on this topic, other good resources are here and here.

— Kyle Cucci (d4rksystem)

Hunting BlackEnergy3 in Memory

I was recently investigating a memory dump from a host infected with BlackEnergy3. BlackEnergy3, which is a modified version of the original BlackEnergy malware family, was used in the attacks on the Ukrainian power grid in 2015. BlackEnergy3 is similar to its version 2 counterpart, but has been modified with additional modules that serve multiple purposes such as extraction of credentials, keystroke logging, and destruction capabilities.

This post is a sort of a step-by-step methodology for investigating BlackEnergy3 infections, and more generally, rootkit behavior in memory. I will be using Volatility as my primary tool for this investigation.

Edit: One reader asked which sample I used for this investigation. This write-up is from a memory image provided by SANS and was included with the Advanced Memory Forensics and Threat Detection course. (This course is highly recommended if you are interested in memory forensics and hunting advanced malware!). I don’t know exactly which sample was used on the infected system, but I found a possible similar sample on VirusTotal here.

Investigating Userland

I always start a memory forensics investigation by inspecting the processes that were running on the system before the memory was extracted. The Volatility “Pstree” command provides an output of processes in a nice tree-based form:

vol.py -f memdump.img --profile=Win7SP1x64 pstree

What we should be looking for here are strange process parent/child relationships, orphaned processes (processes with no parent), and processes that seem out of place, such as strange or misspelled process names. We see no clear evidence of any of this type of activity:

Output from pstree command.

Let’s dig a bit deeper. One of my go-to Volatility modules for quick wins is “malfind”. “Malfind” enumerates the Virtual Address Descriptor (VAD) tables for each process running on the system, and attempts to find anomalies and possible evidence of code injection.

vol.py -f memdump.img --profile=Win7SP1x64 malfind

After running “malfind”, we can see an anomaly right off the bat – possible code injection into “svchost.exe” (PID 1468) process:

Output of malfind command.

We can see above that the memory permission for this region is “PAGE_EXECUTE_READWRITE”, which means that this area of memory possibly contains executable code. We can also see the “MZ” header synonymous with Windows PE files, so this is highly likely malicious code injection. For closer inspection, let’s dump out this region of memory into a file using “vaddump”:

vol.py -f memdump.img --profile=Win7SP1x64 vaddump -p 1468
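The two clues that made this region stand out (read-write-execute protection and an “MZ” header at the start) can be sketched as a simple filter. This is only an illustration of the reasoning, not how malfind is implemented, and the MemoryRegion record is a made-up structure rather than anything Volatility exposes.

```python
from dataclasses import dataclass


@dataclass
class MemoryRegion:
    """Hypothetical record describing one memory region of a process."""
    pid: int
    process: str
    start: int
    protection: str
    data: bytes


def suspicious_regions(regions: list[MemoryRegion]) -> list[MemoryRegion]:
    """Keep regions that are RWX and begin with a PE 'MZ' header."""
    return [
        r for r in regions
        if r.protection == "PAGE_EXECUTE_READWRITE" and r.data.startswith(b"MZ")
    ]


regions = [
    MemoryRegion(1468, "svchost.exe", 0x1A0000, "PAGE_EXECUTE_READWRITE", b"MZ\x90\x00"),
    MemoryRegion(1468, "svchost.exe", 0x2B0000, "PAGE_READWRITE", b"MZ\x90\x00"),
    MemoryRegion(900, "explorer.exe", 0x3C0000, "PAGE_EXECUTE_READWRITE", b"\x00\x00\x00\x00"),
]
for region in suspicious_regions(regions):
    print(region.process, region.pid, hex(region.start))
```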

We can now inspect this area of memory by simply running the “strings” command on the dumped memory region we are interested in (“0x1a0000”):

strings -n8 svchost.exe.7e4aa060.0x00000000001a0000-0x00000000001affff.dmp | less

Output of strings command.

There are several interesting strings here. There is a reference to “aPLib”, which is a library for compressing and packing executable files. This means that the injected malicious code was likely packed, which is definitely a red flag and out of place in a process such as “svchost.exe”. Also, there are references to a user agent string, references to DLL files and a DAT file, and several references to possible API function calls.

A quick Google search shows that many of these strings are actually part of the Command & Control functionality of BlackEnergy3:

  • DownloadFile – Retrieves a file from the Internet.
  • RkLoadKernelImage – Used to load code into kernel memory address space.
  • RkLoadKernelObject – Used to load a new driver module into kernel memory from userland memory.
  • SrvAddRequestBinaryData – Used to append binary data to the C2 HTTP POST data (for C2 communication and payload download).
  • Srv* – These commands are used for C2 communication.
  • “main.dll” – The internal name of BlackEnergy’s primary DLL file.

The presence of these kernel-related functions signals that we are dealing with a rootkit.
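If you have several dumped regions to triage, a quick scripted check for the names discussed above can save some scrolling. Here is a minimal sketch. The indicator list is just the strings mentioned in this post, the function name is mine, and a hit is a lead to follow up on, not proof of BlackEnergy3.

```python
# Strings discussed above, as raw bytes to search for in a dumped region.
INDICATORS = [
    b"aPLib",
    b"DownloadFile",
    b"RkLoadKernelImage",
    b"RkLoadKernelObject",
    b"SrvAddRequestBinaryData",
    b"main.dll",
]


def find_indicators(dump: bytes) -> list[str]:
    """Return the indicator strings that occur anywhere in the dump."""
    return [name.decode("ascii") for name in INDICATORS if name in dump]


# Usage (file name taken from the vaddump output above):
# path = "svchost.exe.7e4aa060.0x00000000001a0000-0x00000000001affff.dmp"
# print(find_indicators(open(path, "rb").read()))
```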

Hunting for Rootkits

After our brief analysis of the code injected into svchost.exe, we know we are dealing with some sort of rootkit behavior. Rootkits typically will load a kernel module or driver into kernel memory space. Let’s hunt for this.

“Modscan” is able to scan kernel memory for loaded drivers and modules, and is the perfect command to use here:

vol.py -f memdump.img --profile=Win7SP1x64 modscan

Output of modscan command.

There are a few potentially suspicious modules listed here, but one in particular stands out: “adp94xx.sys”. I was able to determine that this module is out of place by Googling the other good, benign modules. The only way to know what is not normal is to know what is normal, so it’s good to do some Googling or have a list of normal drivers handy 😉 Let’s dump this kernel driver from memory, using the base address listed above:

vol.py -f memdump.img --profile=Win7SP1x64 moddump -D ./ --base=0xfffff88003fbf000
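The “know what is normal” step is basically a set comparison, so if you keep a baseline list of drivers from a clean build, a few lines of Python will do the first pass for you. This is a minimal sketch: the function name is mine, the baseline in the usage comment is a placeholder, and you would still need to pull the module names out of the modscan output yourself.

```python
def unexpected_modules(seen: list[str], baseline: set[str]) -> list[str]:
    """Return module names from `seen` that are not in the known-good baseline."""
    # Compare case-insensitively, since driver names vary in case.
    known = {name.lower() for name in baseline}
    return sorted({name for name in seen if name.lower() not in known})


# Usage (hypothetical baseline taken from a clean system of the same build):
# baseline = {"ntoskrnl.exe", "hal.dll", "tcpip.sys"}
# print(unexpected_modules(names_from_modscan, baseline))
```

Anything this returns still needs a human look, since a clean baseline will never cover every legitimate third-party driver.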

Once again, I use the strings command to run a quick inspection of this file:

strings driver.fffff88003fbf000.sys | less

Output of strings command.

We can see several kernel function calls here. Running the same strings command but for Wide strings (16-bit little-endian) encoding, we can see a bit more:

strings -e l driver.fffff88003fbf000.sys | less

Output of encoded strings.

A few items stick out here. The most obvious is that this driver file appears to be published by Microsoft and is called the “AMD IDE driver”. In addition, we can see several Windows API functions. Another interesting string is “SeImpersonatePrivilege”, which is a Windows privilege that allows a process to impersonate the access tokens of other users, and is abused in some rootkits and privilege escalation exploits. This string is just a clue into the functionality of this driver. Finally, we see a reference to “svchost.exe”, which is the process we saw earlier in malfind!

A quick Google search for “AMD IDE driver” and “adp94xx.sys” reveals a few discrepancies. First, “AMD IDE driver” is a real driver name, but does not seem to relate to the file name “adp94xx.sys”. Second, “adp94xx.sys” could be a legitimate driver, but it is related to Adaptec, and not to AMD IDE drivers. This discrepancy shows that hunting for kernel rootkits is a lot about knowing what is and what is not normal, and knowing how to Google 😉

We can dump the imports from this module as well:

vol.py -f memdump.img --profile=Win7SP1x64 impscan --base=0xfffff88003fbf000

Output of impscan.

There are a few imports we should focus on here. One function of interest is “KeStackAttachProcess”. According to Microsoft, KeStackAttachProcess attaches the current thread to the address space of a target process. This functionality can be used to run code from the kernel module rootkit in the context of a userland process, which essentially serves as a very stealthy way to run code.

As a quick tip, we can also extract the imports in a format that can be imported into IDA for later analysis:

vol.py -f memdump.img --profile=Win7SP1x64 impscan --base=0xfffff88003fbf000 --output=idc >> module-imports-ida.idc

Later, we can look at this module in IDA or another disassembler in order to better understand it. This is out of the scope of this post, but this is something that should be done during an investigation.

Wrapping Up

From the investigation above, we can make several inferences (or at least, educated guesses) from this data. 

  • Malicious code was injected into the “svchost.exe” process. Once executed, this code likely downloads an additional module from the Internet (using the DownloadFile function).
  • The malware may have executed its rootkit behavior by leveraging its “RkLoadKernelObject” function, which allows code execution in the context of the kernel.
  • Once in kernel memory, the rootkit is able to hide on the system, and inject additional malicious code into other userland processes, further embedding itself in the system in a stealthy way.

This of course is not a complete investigation of BlackEnergy3, but shows what can be done to quickly triage rootkit behaviors. You can likely see that hunting rootkits, and memory hunting in general, takes a combined approach of cross-referencing the output of multiple tools, Googling things, and understanding what is and what is not normal Windows behavior.

Bonus: For sticking with me this long, you may have noticed 2 “iexplore” processes in the “pstree” output:

Internet Explorer processes from pslist output.

These are actually the product of a special module that BlackEnergy3 is able to deploy called the “Ibank” module. This module injects itself into Internet Explorer processes and is able to steal banking credentials from its victims 🙂

As always, thanks for reading.

— @d4rksystem

Hiding Virtual Machines from Malware – Introducing VMwareCloak & VBoxCloak

Many malware families are still using fairly trivial techniques for the detection of virtual machine environments. Once malware detects that it may be running in a virtual machine, it may terminate itself, or worse, execute code that will cause a diversion and potentially lead the malware analyst down the wrong paths :O

Malware often uses the following techniques for virtual machine detection:

Registry Enumeration

Registry enumeration is one of the most common techniques that evasive malware may use to determine if it is running in a VM. Malware may look for registry keys containing hardware information, system BIOS information, and any other keys and values that contain references to hypervisors such as VMware Workstation and VirtualBox. Many of these registry keys can be renamed or removed without heavily affecting the performance or usability of the VM!

File & Directory Enumeration

Malware may enumerate files and directories on the system to get an understanding of the environment it is running in. Malware may look for files and directories that reference common hypervisors, such as “VMware” or “VBox” directories in the “C:\Program Files” directories. Malware may also enumerate the “C:\Windows” directory, typically looking for hypervisor-related drivers and system files. An interesting fact is that many of these files (even system and driver files!) can be removed or renamed without affecting the VM, since these files are loaded into memory and not often accessed from the disk!
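To give a feel for how simple this check is, here is a minimal sketch of the kind of file lookup described above. The paths are just a few well-known guest driver files I picked as examples, not a list taken from any specific malware family, and the function name is mine. You can also run it inside your own VM to see what is still visible after cleaning up.

```python
import os

# Illustrative examples of guest driver files (not an exhaustive list).
VM_DRIVER_FILES = [
    r"C:\Windows\System32\drivers\VBoxGuest.sys",
    r"C:\Windows\System32\drivers\VBoxMouse.sys",
    r"C:\Windows\System32\drivers\vmhgfs.sys",
    r"C:\Windows\System32\drivers\vmmouse.sys",
]


def find_vm_files(paths=VM_DRIVER_FILES, exists=os.path.exists):
    """Return the hypervisor-related paths that exist on this system."""
    # `exists` is injectable so the logic can be tested off-Windows.
    return [p for p in paths if exists(p)]


print(find_vm_files())
```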

Process Enumeration

Finally, malware often enumerates the running processes on the system to determine if any hypervisor-related processes are running. Typically, hypervisors such as VirtualBox and VMware have processes running that are used to enable “helper” related functionalities such as drag-and-drop, clipboard sharing, and shared drives. These processes are often not required for the general functionality of the VM, so they can be safely killed in order to better hide the VM from malware.
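The process check is just as trivial. Here is a minimal sketch that takes a list of running process names (however you collect them) and flags the hypervisor helpers. The set of names is a small illustrative sample of common VirtualBox and VMware guest processes, and the function name is mine.

```python
# Illustrative examples of guest helper processes (lowercase for comparison).
VM_HELPER_PROCESSES = {
    "vboxservice.exe",
    "vboxtray.exe",
    "vmtoolsd.exe",
    "vmwaretray.exe",
    "vmwareuser.exe",
}


def find_vm_processes(running: list[str]) -> list[str]:
    """Return the running process names that match known VM helper processes."""
    return sorted(name for name in running if name.lower() in VM_HELPER_PROCESSES)


running = ["explorer.exe", "VBoxTray.exe", "svchost.exe", "vmtoolsd.exe"]
print(find_vm_processes(running))
```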

Because these detection techniques are fairly trivial, we as malware analysts can also use trivial methods to bypass them! I wrote VMwareCloak (for VMware Workstation) and VBoxCloak (for VirtualBox) for just this reason. These tools are PowerShell scripts that are designed to sanitize your Windows sandbox VMs. The scripts kill processes, and remove or rename registry keys, files, and directories that may lead malware to believe it is running in a virtualized environment.

You can download the scripts here:

For VirtualBox: https://github.com/d4rksystem/VBoxCloak
For VMware: https://github.com/d4rksystem/VMwareCloak

To run the scripts, simply execute your chosen script as an Admin on your Windows VM:

If all goes well, your VM will be sanitized and the evasive malware may now run as if it was not in a VM! (I have tested this script with several malware families. However, these scripts will not work for all malware, especially more advanced variants that are, for example, using hardware detection or timing-based detection techniques.)

A bit more information can be found in my writeup here:

Enjoy! Feel free to yell at me when you inevitably find bugs in the script 😉

Extracting Malware from Memory with Hollows_Hunter

Sometimes I come across a tool that makes me stop and think what I have been doing all my life without it. This is how I feel about hollows_hunter. Hollows_hunter is essentially a tool for automatic extraction of evil objects and malicious code from memory. The tool is able to hunt for things such as process injection and replacement, process hollowing, DLL injection, and malicious shellcode, and has a host of other uses. Hollows_hunter is based on pe-sieve, written by the same author, hasherezade.

Once a malicious object is found in memory (for example, malicious code in a running process), the tool will auto-magically extract the process from memory, and attempt to rebuild the headers and Import Address Table (IAT). In many cases, the tool is able to completely fix up the dumped file and rebuild the IAT so I could load it into IDA for further analysis. This is great because some of the malware samples I attempt to unpack and extract end up as warped shells of their once-glorious selves, un-analyzable in IDA. Hollows_hunter has saved me a lot of time recently, and seems to not get as much attention as it deserves. This drove me to write a quick tutorial post for it.

Let’s dive into hollows_hunter to see why it is so useful. Hollows_hunter gives a wealth of options that we can enable to help us better extract the data we are looking for:

hollows_hunter command line parameters.

For the following post, I’ll be using the sample with the SHA256 hash:

f8d281ee13bd7bde9b236a21966e7868836c452f1b2b10ad7c6dd1c395fbf149

You can find this file on VirusTotal or elsewhere online, in case you want to follow along.

First, I’m going to run the sample in a Windows 7 VM sandbox. Immediately after executing the sample, I run hollows_hunter:

Running hollows_hunter.

The command line options I used above are “hooks”, “shellc”, “data”, and “imp”.

  • “hooks” essentially tells hollows_hunter to look for malicious hooks and patches in memory.
  • “shellc” instructs the tool to look for shellcode (this sometimes produces false-positives, so handle with care.)
  • “data” tells hollows_hunter to inspect non-executable memory segments. This is sometimes important because malware may write data to non-executable areas of memory. This data may be useful strings, or it may be code that the malware will later try to execute.
  • Finally, “imp 1” instructs hollows_hunter to attempt an automated rebuild of the Import Address Table (IAT). This is super helpful, as we will see in a minute.

I ran hollows_hunter as Administrator so that it can scan system processes. I could also have run hollows_hunter with the “loop” parameter. This parameter instructs hollows_hunter to keep running on repeat, constantly looking for malicious activity.
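If you want to script this step, a small Python wrapper could look something like the sketch below. Treat the exact option syntax as an assumption: I am passing the options named above with a leading “/”, and some releases may expect a value after options such as “data”, so check the help output of your version first. The function names are hypothetical.

```python
import subprocess


def build_hollows_hunter_command(exe="hollows_hunter.exe", loop=False):
    """Build the argument list for the scan options described above.

    Assumption: options are passed with a leading "/" (for example
    "/hooks"). Confirm the spelling and any required values against
    the help output of your hollows_hunter version.
    """
    command = [exe, "/hooks", "/shellc", "/data", "/imp", "1"]
    if loop:
        command.append("/loop")  # keep scanning until interrupted
    return command


def run_hollows_hunter(loop=False):
    """Run the scan; start this from an elevated prompt on the analysis VM."""
    return subprocess.run(build_hollows_hunter_command(loop=loop), check=False)
```

Keeping the command in a list makes it easy to add or drop options per sample without worrying about shell quoting.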

After running hollows_hunter, we can see that it scanned all running processes, found some malicious activity in memory, and dumped those memory regions:

hollows_hunter summary.

PIDs 2892 and 2944 are, in my case, false positives. I know that because hollows_hunter seems to always detect malicious activity in svchost.exe and wmpnetwk.exe when running as Administrator. Powershell.exe (PID 2520) is the PowerShell process I am running hollows_hunter in, so this is also a false positive.

EDIT: After messing around with the latest version of HollowsHunter (HollowsHunter Portable 0.2.4), I have confirmed that the issues with most of these false positives have been fixed. Obviously, there will always be false positives with any tool, but @hasherezade did a great job at identifying the common FPs.

We can see that the malware sample has spawned two strange processes (WhatsAppWeb.exe, PID 2480/2088), and hollows_hunter was able to dump those processes from memory.

If we navigate to the directory that we ran hollows_hunter from, we can see new directories, each named after the PID of a process that was dumped. In my case, the directories I am interested in are “process_2480” and “process_2088”. Let’s see what hollows_hunter dumped for us in the “process_2480” directory:

hollows_hunter dumped process.

Here we can see the dumped executable from memory, the list of imports that the sample is likely utilizing, and a report of the findings in JSON format. Now let’s inspect the “WhatsAppWeb.exe” here. I will open this file in IDA, my disassembler of choice.
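When hollows_hunter dumps a lot of processes, it can help to get a quick inventory of the output. Here is a minimal Python sketch that only relies on the “process_<PID>” directory naming shown above; list_dumps is a hypothetical helper name, and the sketch makes no assumptions about the file names inside each directory.

```python
import os
import re

# Output directories are named process_<PID>, as seen above.
PROCESS_DIR = re.compile(r"^process_(\d+)$")


def list_dumps(output_dir):
    """Map each dumped PID to the sorted file names in its directory."""
    dumps = {}
    for entry in os.listdir(output_dir):
        match = PROCESS_DIR.match(entry)
        path = os.path.join(output_dir, entry)
        if match and os.path.isdir(path):
            dumps[int(match.group(1))] = sorted(os.listdir(path))
    return dumps
```

Running list_dumps(".") from the hollows_hunter directory returns a dictionary keyed by PID, so in my case I would expect entries for 2480 and 2088 among the results.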

Interesting strings in "WhatsAppWeb.exe".

Looking at the strings, we can immediately see some of the capabilities of this malware sample. This sample seems to be utilizing keylogging functionalities, and possibly attempting to disable anti-virus software.
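If you don't have IDA handy, you can get a rough first look at the strings in a dumped file with a few lines of Python. This is a simplified sketch that only handles printable ASCII (Windows binaries often contain UTF-16 strings too, which it will miss), and the function names and keyword list are whatever you choose; nothing here is specific to this sample.

```python
import re


def extract_strings(data, min_length=5):
    """Return printable ASCII strings of at least `min_length` characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def flag_strings(strings, keywords):
    """Return the strings that contain any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [s for s in strings if any(k in s.lower() for k in lowered)]
```

To use it, read the dumped file in binary mode, pass the bytes to extract_strings, and then filter the result with flag_strings and a list of keywords you care about.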

"WhatsAppWeb.exe" imports.

If we look at the Imports section, we can see all of the imports are listed here. This means that the IAT was successfully rebuilt! Rebuilding the IAT is always a huge pain in my %@$, so hollows_hunter really helped here!

Because the sample has been unpacked, the IAT and PE structure have been rebuilt, and the strings appear readable, this malware should be easy to analyze further in IDA. Whoot.

Analyzing "WhatsAppWeb.exe" in IDA.

Well, that’s all for now. I hope you learned a bit about the usefulness of hollows_hunter. Thanks to hasherezade for putting in the work to create such a useful tool!

If you enjoyed this post, follow me on Twitter (@d4rksystem) 🙂