Author: d4rksystem

Deceiving the Deceivers: A Review of Deception Pro

TL;DR: This is my personal experience and a quick review of the Deception Pro sandbox. Deception Pro is a specialized sandbox for long-duration analysis of malware and threat actor behavior. Overall, it is a promising product and fills a niche gap in the malware sandbox market.

One challenge facing malware analysts, reverse engineers, and threat intelligence analysts is understanding how malware behaves over a longer period of time. Capturing behavior in a traditional sandbox for 3, 5, or even 20 minutes is possible, and analysts can also run samples in custom virtual machines or on a bare-metal analysis system to watch what they do. But there are still key challenges, such as:

  • It’s difficult to make the environment realistic enough to “convince” malware to continue execution, and even more difficult to capture follow-on actions such as commands issued to a bot or hands-on-keyboard events. Advanced malware and actors are looking for Active Directory environments or corporate networks, for example, and this can be difficult to simulate or maintain.
  • Even if an analyst can create a realistic enough environment to capture meaningful actor activity, it’s difficult to randomize the environment enough to not be fingerprinted. If an actor sees the same hostname, IP, or environment configurations over and over again, the analysis machine can easily be tracked and/or blocklisted.
  • Scalability, especially in bare-metal setups, is always an issue. On my bare-metal analysis workstation, I can’t detonate multiple malware samples at a time (while preventing cross-contamination), for example, and I can’t easily add snapshots for reverting after detonation.

Introducing Deception Pro

I was introduced to Deception Pro by a colleague who spoke highly of Paul Burbage’s team and the work they’ve done on other products (like Malbeacon). After reaching out to Paul, he was kind enough to offer me a demo account to help me understand the product and how it could fit into my threat research workflow. So without further ado, here’s my disclaimer:

Disclaimer: Paul and the Deception Pro team provided me with a free demo license to evaluate the product and see if it meets my needs. I’m not being paid for this review, and Paul and the team did not ask me to write one. This review is entirely my own doing.

In this post, I’ll be covering what Deception Pro is, how it can fit into a malware analysis and reverse engineering workflow, and some of its features.

Overview

Deception Pro is what I’d call a “long-term observability sandbox.” Essentially, it’s a malware sandbox designed to run malware for extended periods – several hours or even days – while also fooling the malware into thinking it’s running in a legitimate corporate environment. Long-term observation can be beneficial for a couple reasons, most notably:

  • Advanced malware often “sleeps” for long periods, waiting for an elapsed period of time before continuing execution or downloading additional payloads. 
  • When the analyst wants to observe additional payload drops (for example, in a loader scenario) or hopes to catch hands-on-keyboard actions or follow-up objectives the attackers are trying to execute.

Pretend for a moment I’m a malware analyst (which I am, so there’s not much stretch of the imagination here). I detonated an unknown malware sample in my own virtual machines as well as several commercial sandboxes. Using publicly available commercial and free sandboxes, I determined that the malware belongs to a popular loader family. (Loaders are a class of malware that download additional payloads. They typically perform sandbox detection and other evasion techniques to ensure the target system is “clean” before executing the payload.)

I know this malware is a loader, but I want to understand what payload it ultimately drops. This behavior isn’t observable in the other sandboxes I’ve tried. I suspect that’s because the malware only communicates with its C2 and deploys its payload after a long period of time. I then submit the malware sample to Deception Pro.

When starting a new Deception Pro session, you’re greeted by an “Initiate Deception Operation” menu, which is a cool, spy-like way of saying, “start a new sandbox run.” James Bond would approve.

In this menu, we can choose from one of three randomly generated profiles, or “replicas,” for the user account in your sandbox – essentially, your “target.” This person works for one of the randomly generated company names and is even assigned a fancy title. Deception Pro then generates fake data to populate the sandbox environment, and this replica acts as a starting point or seed. I chose Mr. Markus Watts, a Supply Chain Data Scientist at the company Pixel Growth. Looks legit to me.

In the next menu, we’re prompted to upload our malware sample and choose additional details about the runtime environment. The two primary options are “Detonate Payload” and “Stage Environment Only.” Detonate Payload does what you’d expect and immediately detonates the payload once the environment spins up. Stage Environment Only allows the operator (you) to manually interact with the analysis environment. I haven’t experimented with this option.

The final menu before the sandbox starts is the Settings menu. Here, we can select the detonation runtime (days, hours, minutes), the egress VPN country, some additional settings, and most importantly, the desktop wallpaper of the user environment. I’ll choose a relaxing beach wallpaper for Mr. Watts. He probably needs a nice beach vacation after all the work he does at Pixel Growth.

As Deception Pro is designed for long-term observation, it’s best to set a longer duration for the run. Typically, I set it to 5–8 hours, depending on my goals, and I’ve had good results with this.

After clicking the Submit button, the analysis environment is set up and populated with random dummy data, such as fake files, documents, and other artifacts, as well as an entire fake domain network. This creates a realistic and believable environment for the malware to detonate in.

Deception Pro - Generating environment

Behavioral and Network Analysis

Fast-forward eight hours, and our analysis is complete. I’m excited to see what behaviors were captured. We’ll start with the Reports → Detections menu.

The Detections menu shows key events that occurred during malware detonation. There are a few interesting entries here, including suspicious usage of Invoke-WebRequest and other PowerShell activity. Clicking on these events provides additional details:

In the Network tab, we can view network connections such as HTTP and DNS traffic, along with related alerts:

In the screenshot above, you may notice several web requests as well as a network traffic alert for a “FormBook C2 Check-in.” This run was indeed a FormBook sample, and I was able to capture eight hours of FormBook traffic during this specific run.

I was also able to capture payload downloads in another run:

In this run (which was a loader), a 336 KB payload was delivered roughly five hours into execution. This highlights the fact that some loaders delay payload delivery for long periods of time.

The Artifacts menu allows analysts to download artifacts from the analysis, such as PCAPs, dropped files, and additional downloaded payloads:

Regarding PCAPs, there is currently no TLS decryption available, which is a drawback; more on this in the conclusions below.

Conclusions

It’s important to remember that Deception Pro is a specialized sandbox. I don’t believe it needs to have all the features of a traditional malware sandbox, as that could cause it to become too generalized and lose its primary strength: creating believable target users and lightweight environments while enabling long-term observation of malware and follow-on actions. Here are some of the benefits I noticed when using Deception Pro, and some potential room for improvement:

Benefits

  • Generates operating environments that simulate very realistic enterprise networks. This can expose additional malware and threat actor activities that other sandboxes may miss, like pivoting or network reconnaissance.
  • Allows users to specify long detonation runtimes (hours to days) to observe full attack chains, from initial infection to command and control, data exfiltration, and additional module and payload drops.
  • Captures key events, behaviors, and network traffic of interest for investigators and researchers.
  • Allows interaction with the running sample and environment.

Room for Improvement

  • PCAP decryption is currently missing (though this is reportedly coming)
  • Behavioral output is somewhat limited in its current state. This wasn’t too detrimental for my use case, as I primarily used Deception Pro as a long-term detonation environment rather than a full-fledged analysis sandbox. I rely on other tools and sandboxes for deeper analysis.
  • Currently no memory dump capabilities or configuration extraction

Also, note that the operating system environment is randomly generated, which limits customization (such as usernames, company names, etc.). This will rarely be an issue, but could matter when attempting to detonate highly targeted malware.

Overall though, I think the team behind Deception Pro is well on its way to creating a solid specialty sandbox, and I’m excited to see where it goes. Big thanks to Paul and the team for letting me spam their servers with malware.

Elephant in the Sandbox: Analyzing DBatLoader’s Sandbox Evasion Techniques

This blog post is mostly a copy/paste from two talks I recently gave at the security conferences BotConf 2025 and DEF CON (Malware Village). You can find the original slides here along with more detailed information.

Disclaimer: I used an LLM to generate some of this blog post, but the original slides were written entirely by me 😉

Sandbox evasion and anti-analysis techniques in malware are surely not new. But every once in a while, I run across a malware sample or family that just… well… boggles the mind. In this case, it’s DBatLoader. In this post, we’ll talk about DBatLoader’s strange design decisions: how it works, what makes it distinctive, and what defenders can do about it.

What is DBatLoader?

DBatLoader goes by a few different alternative names, including NatsoLoader and ModiLoader. It is primarily a loader/downloader: its job is to fetch and deploy other malware (RATs, stealers, etc.). Some examples of payloads it delivers are Remcos, AveMaria, and Formbook/XLoader.

DBatLoader uses multi-stage infection chains, such as:

LNK File → Embedded PowerShell → DBatLoader


JavaScript File → Embedded BAT script → DBatLoader


The DBatLoader payloads are often stored on legitimate cloud services (OneDrive, Google Drive, sometimes Discord), which helps it hide or avoid suspicion.

Anti-Sandbox Tricks Up Its Sleeve

Here are several of the more interesting (and messy) techniques that DBatLoader pulls off to try to detect, frustrate, or overwhelm sandbox environments:

Each evasion technique below is followed by my take on why it helps (or doesn’t):

  • Massive memory allocations: The malware tries to allocate multiple large chunks of memory (on the order of ~500 MB or more), more than many sandboxes can comfortably provide. If the sandbox lacks enough RAM, the payload isn’t executed and the malware fails to detonate. Why it helps (or doesn’t): if the sandbox can’t fulfill the request, the malware might detect that something’s wrong and abort or delay. But large allocations are noisy and may themselves raise flags.
  • Invalid memory protection changes: The malware attempts to change memory protection on addresses it doesn’t necessarily have permission to modify, which often causes PAGE_NOACCESS errors. Why it helps (or doesn’t): it’s a kind of stress test. If the sandbox is strict, these operations will error out, and the malware can observe the behavior and change its execution path (e.g., stop or never reveal the full payload).
  • Writes to and frees of unowned memory: If direct memory protection changes fail, DBatLoader tries to write to and free memory it doesn’t fully own. This leads to access violations or “partial copy” errors. Why it helps (or doesn’t): these errors may disrupt sandbox detonation or detection tools, or the malware may abort in the presence of certain behaviors, avoiding handing the sandbox a full behavior trace, similarly to the technique above.
  • Sloppy AMSI patching: AMSI (the Anti-Malware Scan Interface) is a Windows component that lets code analysis tools scan scripts or code at runtime. Malware often tries to patch (disable) AMSI to avoid detection. DBatLoader does this, but in a sloppy way: some pointer references or the patching method are wrong, so it doesn’t work as designed. Why it helps (or doesn’t): it’s not fully clear why DBatLoader uses this flawed AMSI patching technique, which begs the question: is this a bug or a feature? I suspect it’s intentionally noisy, trying to trigger anti-malware and EDR into detecting the loader. If the loader is detected early on in the attack chain, the attack is surely disrupted, but the payload (and the payload’s C2 addresses and other IOCs) is protected. More on this below.

Design Choices: Why “YOLO” Over Stealth?

Rather than being subtle, DBatLoader takes an aggressive posture: lots of junk code, lots of large, risky memory operations, etc. But why? Here are my theories:

Sandbox smashing: The malware seems more interested in physically denying the sandbox a chance to observe, rather than going undetected.

Detect & Abort: If something in the environment doesn’t match what it expects (e.g. low memory, missing permissions), DBatLoader likely will not execute fully (or at all). Better to evade being fully analyzed than be caught in full operation.


Intentional Detection: In an enterprise environment, DBatLoader wants to be detected as quickly as possible. But why would it wish to achieve this goal? If anti-malware or EDR detects the loader stage, the payload that DBatLoader is trying to deliver will not be executed, thus protecting the payload and its valuable C2 addresses and other potential IOCs. Why risk burning a C2 when the loader is detected anyway?

How Defenders Can Respond

Even though DBatLoader’s tactics are somewhat brute‑force, there are multiple points where defenders (sandbox architects, malware analysts, endpoint security) can fight back:

  1. Monitor / hook memory APIs
    • Watch for calls like NtAllocateVirtualMemory, NtProtectVirtualMemory, NtWriteVirtualMemory. Large allocations or unusual protection changes should raise alerts.

    • If sandbox or detection systems stub, hook, or limit those calls, they can interfere with DBatLoader’s attempts.

  2. Observe failed operations
    • For example, if memory protection changes fail or access violations are frequent, that suggests attempts to do invalid or unauthorized memory operations.

    • These errors themselves can be a clue.

  3. Strengthen AMSI implementation
    • Ensure AMSI patching attempts are more difficult, or monitor code that attempts to locate/modify AMSI functions.

  4. Resource constraints / threshold detection
    • Have reasonable resource quotas in sandbox: e.g. limit memory, but monitor when malware tries to allocate more.

    • Compare expected behavior: many real benign applications don’t try to allocate huge amounts of memory at startup.

  5. Behavioural signatures
    • Because DBatLoader uses multi-stage infection chains, with intermediary BAT, JS, PowerShell, etc., guard at multiple layers (e.g. flag suspicious .lnk → PowerShell chains, or cloud-storage downloads of payloads).


Go Big or Go Home (and Other Terrible Go Puns): Tips for Analyzing GoLang Malware

A few days ago, Dr. Josh Stroschein invited me on his livestream channel to talk about Golang malware. I wanted to get a quick blog post up before I forget everything we talked about. So, here is a summary of the key points we discussed in the livestream. You can also just watch the livestream here if you’re feeling lazy. Ok, let’s get on with it.

Go (or Golang) has gained some traction over the past few years, not just among developers, but increasingly among malware authors looking for flexibility and portability. As reverse engineers and malware analysts, that means we need to get more comfortable navigating Go binaries, understanding how they’re structured, and knowing what makes them different from traditional malware written in C, C++, or Delphi (vomit face).

What Is Go, and Why Does It Matter?

Go is a statically typed, compiled programming language developed by Google. It’s designed to be simple, fast to compile, and efficient. Some of the things Go does well:

  • Cross-compilation: Go makes it easy to build binaries for different operating systems and architectures.
  • Static linking: Most Go binaries are self-contained, meaning no external dependencies.
  • Built-in concurrency: Go’s goroutines make it easy to write efficient networked applications.

From a developer’s perspective, it’s efficient and practical. From a malware analyst’s perspective, it presents some interesting challenges.

Why Use Go for Malware?

Go offers several advantages that make it appealing to malware authors:

  • Portability: Malware authors can compile a single codebase for multiple platforms (Windows, Linux, macOS, ARM, etc.).
  • Self-contained binaries: Go binaries include everything they need to run, which results in some seriously HUGE executable file sizes, but more on that later.
  • Less tooling: Traditional reverse engineering tools aren’t as well-optimized for Go binaries, especially compared to C/C++ (but this is changing quickly).
  • Rapid development: Go is relatively easy to write and maintain, which makes it efficient for malware development.
  • Evasion by obscurity: Go binaries look different from typical malware, especially in static analysis, which may help them avoid basic detections (this is also changing rapidly).

Common Pitfalls in Analyzing Go Malware

1. Large Binary Sizes

Even a simple Go program can compile into a binary tens of megabytes in size, as you’ll see in a moment. This is due to static linking of the Go runtime and standard libraries. For analysts, this means more to sift through, as it’s often not immediately obvious where the actual malicious code begins or ends.

2. Excess of Legitimate Code

Go’s standard library is extensive, and malware often makes use of common packages like net, os, crypto, and io. Most of the code in the binary is likely benign. The challenge is identifying the small percentage of custom or malicious logic within all the legitimate functionality. Your classic needle-in-a-pile-of-needles problem.

3. Obfuscation (Garble and Others)

Go malware is increasingly using obfuscation tools like Garble, which strip or randomize symbol names, re-order packages, and break common static analysis workflows. These techniques don’t necessarily make the malware more sophisticated, but they do add complexity to the reversing process.

Other common obfuscation techniques may include:

  • Encrypted or encoded strings
  • Control flow obfuscation
  • Packing or compression

Let’s analyze a very basic Go binary. The best way to do this is to write our own code.

Analyzing a Basic Go Program

Go code is fairly straightforward and simple to write. Here is literally the most basic Go application you can write, printing our favorite “Hello World” (in this case, “Hello Earth”) string:

When compiled (using the go build command), the binary is a fairly large executable (2MB+). Since Go ships a lot of library code into each compiled executable, even this simple Hello World binary is substantial.

Let’s open this up in IDA, my disassembler of choice for Golang. Newer versions of IDA (I think version 8+) are good at identifying Go standard library code. IDA nicely groups these libraries in “folders”, as you can see in the screenshot below:

Each of these folders represents a library. For example, “internal”, “runtime”, and “math” are all libraries being imported into this Go program. IDA is able to recognize these libraries and functions and name them appropriately. If your disassembler is not designed for Golang use, you’ll see a bunch of generic names for these functions, which makes analysis of Go binaries a lot more difficult. One tool (GoReSym) can help identify these functions, and its output can then be re-imported into some disassemblers like Ghidra.

Most of the time, in unobfuscated Golang binaries, the main functionality of the program will reside in the function main.main (or main_main, as IDA displays it), which IDA identified for us:

Tip: Whenever I’m analyzing a Go binary, I first always look for main_main or other functions that contain the name “main_*”.

Inside main_main we can see our Hello World code. You may be able to spot the “Hello Earth!” string in the code below:

Notice that the “Hello Earth!” string is lumped together with a bunch of other junk. These are other strings in the binary. One challenge when analyzing Golang code is that strings are not null-terminated like they are in C programs. Each string is actually a structure that contains a pointer to the string data and an integer representing the string’s length. I provided some terrible pseudocode for visualization of this:

struct string {
    value  = "Hello Earth!"
    length = 12
}

In this case, IDA didn’t know that “Hello Earth!” is a separate string from “152587…” and the others. This is one thing you’ll need to take into account when analyzing Golang.

Ok, Hello World apps are cool and all, but let’s take it up a notch. Many malware binaries written in Go will be obfuscated. Garble is one such obfuscator. Garble… well… garbles the metadata of the Go binary. It does this by stripping symbols, function names, module and build information, and other metadata from the binary during compile-time.

If we open the same Hello World binary in IDA, but “Garbled” during compilation, it looks a lot different:

All our nice, beautiful Golang function names have been replaced with ugly, generic IDA function names (“sub_xxxxxx”). So how do we find our main function code now? We can’t – Golang won. Time to pack up and Go home.

No, just kidding. We just have to work a bit harder. I’ve found that Golang requires several critical libraries to function correctly, and one of those is the “runtime” library, which contains a lot of Go’s runtime code. Oftentimes, the runtime library names are not obfuscated, as is the case in my binary compiled with Garble. (Note: I think Garble can also strip the module names from “runtime”, but I didn’t test this. In any case, the “runtime” module names are often left unobfuscated.) This means we can find cross-references to runtime functions in the code and trace those back to the program’s main function! Let’s try this.

If we search the function list in IDA for “runtime”, we get the following:

One common runtime function is runtime_unlockOSThread. We can double-click on this function and press CTRL+X to see cross-references to it. Taking a look through all the cross-referenced functions will lead you to a block of code that looks like this:

When you spot functionality that contains a lot of “runtime” functions, you may be near the location of the program’s main code. In this case, our main code is not far away, in sub_49A9E0. You may be wondering: “Kyle, how are you so smart that you found that so fast?”. Well, intelligence aside, it was a lot of hunting around the code. No crazy tricks here.

And here we have our main code at sub_49A9E0:

Tip: Garble and other obfuscators can also obfuscate strings, not just the function names. I used the default Garble settings for this binary. The analysis methodology is the same, however.

Additional Resources

A few more resources on Golang I find extremely helpful:

  • Ivan Kwiatkowski’s YouTube videos on GoLang analysis.
  • Josh Stroschein’s PluralSight course on GoLang malware analysis. In this course, Josh covers the OT malware FrostyGoop.

Key Takeaways

Go malware is becoming more common, and it’s likely here to stay. While it presents some unique challenges, many of the same principles from other forms of reverse engineering still apply. You just need to adjust your approach and tools.

5 Tips for Reversing Go Malware

  1. Start with main.main (main_main) – This is (nearly) always the entry point for a Go binary and can give you a foothold into the rest of the logic.
  2. Use the right tooling – IDA, Ghidra with GoReSym (other disassemblers probably work too, but I haven’t tested them), and de-obfuscators like the appropriately named UnGarbler.
  3. Ignore the noise – Skip most of the standard library code unless it’s directly involved in malicious behavior.
  4. Look for key APIs – Even with obfuscation, patterns like “net.Dial”, “os/exec”, or “http.Get” can help narrow down suspicious areas.
  5. Combine static and dynamic analysis – Especially with obfuscated binaries, dynamic tracing or debugging can be the fastest way to understand real behavior. Ivan Kwiatkowski has some great tips on debugging Golang in this video.

“Beeeeeeeeep!”. How Malware Uses the Beep WinAPI Function for Anti-Analysis

I was recently analyzing a malware sample that abuses the Beep function as an interesting evasion tactic. The Beep function plays an audible tone notification for the user. It accepts a dwDuration parameter, which is the number of milliseconds to play the beep sound, and the calling program’s thread that executed the function will “pause” execution for the duration of the beep. This can make for an interesting anti-sandbox and anti-debugging technique. Let’s take a deeper look at the Beep function.

How Beep Works

When a program invokes the Beep function, it ultimately calls into a function called NtDelayExecution, which does exactly what its name suggests: it delays execution of the calling program’s running thread. The below image illustrates how this essentially works:

The calling program (in this case, the malware) calls Beep, which in turn calls into NtDelayExecution. Once the beep duration has elapsed, control flow is passed back to the malware.

Here is a function trace from API Monitor showing the same thing. Notice how Beep invokes several other lower-level functions, including DeviceIoControl (to play the audible beep sound via hardware) and the call to NtDelayExecution:

As a side note, since the Beep function was originally intended to play an audible “beeeep!” when executed, it also accepts a parameter called dwFreq, which denotes the frequency of the beep sound in hertz. This means the calling program can decide the pitch of the tone that is played when Beep executes. This particular malware doesn’t play a tone when calling Beep, but I think this would be a funny technique for malware to use: annoy the victim (or malware analyst). You may also wonder why the malware can’t just call NtDelayExecution directly. That would also work, but it may appear more obvious to malware analysts and researchers. Anyway, it’s much more fun to use Beep than to call NtDelayExecution directly.

The Malware

The malware I was investigating calls the Beep function with a duration of 65000 milliseconds (which will stall analysis for just over a minute). It also calls Beep multiple times for added delay. This can cause a sandbox to stall for potentially long periods of time. If the malware is being debugged, the analyst will temporarily lose control of the malware while the thread is “paused”. Here is an excerpt of this code in IDA Pro:

Sandbox and Debugger Mitigations

To mitigate this technique, the sandbox should hook the NtDelayExecution function and modify the DelayInterval parameter to artificially decrease any delay. In a debugger, the malware analyst can set a breakpoint on NtDelayExecution or the Beep function and modify the DelayInterval parameter in the same way.

Other References

While researching this malware, I ran across an article from Minerva Labs researcher Natalie Zargarov, who wrote about this same technique, used by a different malware family, in 2023.

Thanks for reading!

— Kyle Cucci (@d4rksystem)

https://thehackernews.com/2023/02/experts-warn-of-beep-new-evasive.html

Unpacking StrelaStealer

I was digging into a new version of StrelaStealer the other day and I figured it may help someone if I wrote a quick blog post about it. This post is not an in-depth analysis of the packer. It’s just one method of quickly getting to the Strela payload.

Here is the sample I am analysing (SHA256:3b1b5dfb8c3605227c131e388379ad19d2ad6d240e69beb858d5ea50a7d506f9). Before proceeding, make sure to disable ASLR for the Strela executable by setting its DLL characteristics. Ok, let’s dig in.

A quick assessment of the executable in PEStudio reveals a few interesting things that I’ve highlighted. Note the TLS storage (callbacks). When the sample first executes, it triggers two TLS callbacks, as we’ll see in a bit.

Viewing the strings in PEStudio reveals several large strings with high-entropy data. These strings are part of the packed payload.

Let’s open the file in a disassembler to investigate further. I’ll be using IDA Pro for this analysis. If we inspect the “histogram” at the top of the IDA Pro window, we can see a large olive green segment which indicates data or code that IDA can’t make sense of. IDA Pro calls this data blob unk_14012A010:

As we saw in the strings earlier, this is likely the packed payload. I’ll rename this blob in IDA Pro to obfuscated_payload_blob. If we view the cross-references to this blob (ctrl+x in IDA), we can see several references:

Double-click one of these (I’ll select the 2nd one from the bottom), and you’ll see the following:

It seems our blob is being loaded into register rdx (lea rdx, obfuscated_payload_blob), and a few instructions later there is a call instruction to the function sub_140096BA0. Inspect the code of this function and you may notice there are quite a few mathematical instructions (such as add and sub), as well as lots of mov instructions and a loop. This all indicates that this is very likely a deobfuscation routine. Let’s rename this function deobfuscate_data. We won’t be analysing the unpacking code in depth, but if you wish to do so, you should rename the functions you analyse in a similar manner to better help you make sense of the code.

If we then get the cross-references to the deobfuscate_data function, we’ll see similar output to the cross-references for the obfuscated payload blob:

Inspect these more closely and you’ll see that the obfuscated blob is almost always being loaded into a register followed by a call to the deobfuscate_data function. This malware is unpacking its payload in multiple stages.

If we walk backwards to identify the “parent” function of all this decryption code, we should eventually spot a call to a qword address (0x14008978D) followed by a return instruction. This call looks like a good place to put a breakpoint as this is likely the end of the deobfuscation routine (given that there is also a return instruction that will take us back to the main code):

Let’s test this theory by launching the malware in a debugger (I’ll be using x64dbg). When you run the malware, you’ll hit two TLS callbacks (remember I mentioned those earlier?), like the below:

Just run past these. TLS callbacks are normally worth investigating in malware but in this case, we are just trying to unpack the payload here and will not investigate these further. You’ll eventually get to the PE entry point:

Put a breakpoint on the call instruction at 0x14008978D (using the command bp 14008978D) and run the malware. You should break on that call instruction:

If we step into this call instruction, we’ll get to the OEP (original entry point) of the payload! Inspect the Memory Map and you’ll see a new region of memory with protection class ERW (Execute-Read-Write):

This new memory segment (highlighted in gray in the image above) contains our payload. Don’t believe me? Dump it from memory (right-click -> Dump to file) and take a look at the strings. You should see something like the following:

You’ll spot some interesting data like an IP address, a user agent string, registry keys, and so on. If you don’t see any cleartext strings, you likely dumped the payload too early (before the malware deobfuscated all the data in this memory region), or too late, after the malware cleared its memory. Start reading this blog post again and try again ☺

Let’s open this dump file in IDA. After opening the file in IDA, be sure to rebase it (Edit -> Segments -> Rebase Program) to match it to the memory address in x64dbg:

Once the dumped payload is loaded in IDA, however, you’ll notice some inconsistencies:

See the problem? Some call instructions are not resolved to function names. However, in x64dbg, these functions are labeled properly:

This is because in x64dbg, these function names are being resolved to addresses in memory. In our IDA dump, they are not mapped properly.

Normally, what I would do next is try to get my IDA database as close as possible to the code in x64dbg. We could spend more time analysing the unpacking code to identify where the malware resolves its imports, which may help us get a better dump of the payload. Or we could automate this by writing a Python script to export all function names from x64dbg and import them into IDA. But why spend 1 hour automating something when we can spend 2 hours doing it manually? 🙂
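For the curious, here’s a rough sketch of what that automation could look like. It assumes a hypothetical text export of x64dbg labels in the form “&lt;hex address&gt; &lt;name&gt;” (one per line; the exact export format is an assumption) and emits an IDC script that applies the names in IDA via its set_name function:

```python
import re

def parse_labels(text: str) -> dict:
    """Parse a hypothetical 'address name' label export into {addr: name}."""
    labels = {}
    for line in text.splitlines():
        m = re.match(r"([0-9A-Fa-f]{6,16})\s+(\S+)", line.strip())
        if m:
            labels[int(m.group(1), 16)] = m.group(2)
    return labels

def to_idc(labels: dict) -> str:
    """Emit an IDC script that renames each address inside IDA."""
    body = "\n".join(
        '    set_name(0x%X, "%s");' % (addr, name)
        for addr, name in sorted(labels.items())
    )
    return "static main() {\n%s\n}" % body
```

Run the generated script from IDA (File -> Script file) to apply the names in bulk.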

We can manually fix this up in IDA by cross-referencing each unknown function with the function name in x64dbg. For example, at address 0x1B1042 there is a call to InternetOpenA (according to our x64dbg output), and at address 0x1B107B there is a call to InternetConnectA.

And now, we have something a lot more readable in IDA:

After you spend a bit of time manually renaming the unknown functions in your IDA database file, you should have some fairly readable code. Congrats! You unpacked Strela’s payload. Spend some time analysing the payload and see what you can learn about this sample.

Happy reversing! 🙂

— d4rksystem

Creating Quick and Effective Yara Rules: Working with Strings

Creating Quick and Effective Yara Rules: Working with Strings

This is a quick post to outline a few ways to extract and identify useful strings for creating quality Yara rules. This post focuses on Windows executable files, but can be adapted to other file types. Let’s start with an overview of the types of strings we are interested in when developing Yara rules.

tl;dr

In this post, you will learn:

  • How to extract ASCII and Encoded strings from malware samples.
  • How to analyse strings from a malware sample set and choose strings for your Yara rule.
  • Tips and other tools to assist in Yara rule creation.

ASCII vs. Encoded Strings

Windows executables normally contain both ASCII and encoded strings. A “string” typically refers to a sequence of alphanumeric and special characters arranged in a specific order. Strings are used to represent various types of data, including file names, paths, URLs, and other content within files. ASCII and encoded strings refer to different ways of representing characters.

ASCII is a character encoding standard that uses numeric codes to represent characters, so an ASCII string is simply text stored in that encoding. ASCII is straightforward, but it has limitations when it comes to representing characters from other languages or special symbols. Encoded strings generally refer to text represented using a specific character encoding scheme, such as Unicode (16-bit Unicode Transformation Format, or UTF-16, sometimes referred to as “wide” strings), which is standard in Windows executable files. When writing Yara rules for Windows executables, we normally want to focus on both ASCII and Unicode strings. So, how do we extract these strings from an executable file? Glad you asked.

Extracting Strings

The simplest way to extract ASCII strings is using the strings tool in Linux/Unix (also available in Windows and MacOS). Execute the command on your malware executable target, and save the output to a text file like so:

strings -n 4 malware.exe > malware-ascii-strings.txt

Encoded strings are also easy to extract:

strings -n 4 -e l malware.exe > malware-encoded-strings.txt
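If you’d rather script the extraction, both commands can be approximated in a few lines of Python (a simplified take on what strings does; it only handles printable ASCII and UTF-16LE):

```python
import re

def ascii_strings(data: bytes, min_len: int = 4):
    """Runs of printable ASCII characters, like `strings -n 4`."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]

def utf16le_strings(data: bytes, min_len: int = 4):
    """Printable characters with NUL padding, like `strings -n 4 -e l`."""
    pattern = rb"(?:[\x20-\x7e]\x00){%d,}" % min_len
    return [m.group().decode("utf-16-le") for m in re.finditer(pattern, data)]
```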

Once we have our strings, let’s dump them into a Yara rule, shall we? Heh… Not so fast, cowboy. We have some strings analysis work to do first.

Analyzing Strings

One of the challenges with using strings for detecting malware is that there are so.. many.. strings. A single executable file could have thousands. How do we know the good strings, from the bad strings, from the ugly strings? How can we know which to include in our Yara rule?

If you have a single malware executable, you’ll have lots of strings to dig through (depending on the size of the executable file, of course). The trick is to identify the strings that are likely related to the malware itself, while filtering out the strings that are not (such as compiler data and code, common strings that also reside in benign files, etc.). It takes experience to know what to look for and what to ignore.

If you have a number of files of the same malware family, this process can be a bit more efficient. What we need to do is gather our malware sample set, extract all strings from these samples, and compare these strings to identify the strings we should zero in on for our Yara rule.

This malware sample set must meet the following requirements:

  • The malware samples should be part of the same malware family. For example, if you are developing a Yara rule for Ryuk ransomware, all samples should be Ryuk ransomware, otherwise bad samples/strings will taint your Yara rule.
  • The malware samples should be unpacked/deobfuscated. If the samples are packed, encrypted, obfuscated, etc., you are no longer writing a Yara rule for the malware itself, but rather for the packer/obfuscator. If this is your intention, that’s perfectly fine, as there are valid use cases for this as well!
  • The malware samples should be of the same file type. It’s not a good idea to mix Windows executables with MS Office documents, for example.
  • The more malware samples you have in your set, the more accurate your Yara rule could be.

We can extract and analyse all strings in a malware sample set with a one-liner command. First, make sure you have your malware samples together in one directory called “samples”. (I am assuming you are on a *Nix system here, but the following command can be adapted for Windows as well with a bit of work):

for file in ./samples/*; do strings -n 4 "$file" | sort -u; done | sort | uniq -c | sort -rn > count_malware_strings.txt

In the above command, we create a for loop that iterates over all files in our samples directory (“samples”). Each file’s strings are extracted, sorted, and deduplicated, and finally a “count” value is prepended to each string before the result is saved to the text file “count_malware_strings.txt”. Here is a screenshot of the result:

You may be able to spot some interesting strings. The number “9” next to each line denotes the number of samples the string resides in. My sample set consists of 9 samples, so each string with a 9 next to it resides in all of my malware samples!

We should also run this same command, but for encoded strings:

for file in ./samples/*; do strings -n 4 -e l "$file" | sort -u; done | sort | uniq -c | sort -rn > count_malware_strings_encoded.txt

Here is the result:

See any interesting strings here? Perhaps the references to WMI (SELECT * …), the sandbox-related strings (“sandbox”), and strings such as “Running Processes.txt”?
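The shell pipelines above can also be reproduced in pure Python, which makes the per-sample deduplication explicit. This simplified version only extracts ASCII strings, and the minimum length is just a default:

```python
import re
from collections import Counter
from pathlib import Path

def count_strings_across_samples(sample_dir: str, min_len: int = 4):
    """For each ASCII string, count how many samples contain it
    (each sample counted at most once), most common first."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    counts = Counter()
    for path in sorted(Path(sample_dir).iterdir()):
        data = path.read_bytes()
        # Using a set ensures each string is counted once per sample
        counts.update({m.group().decode("ascii")
                       for m in re.finditer(pattern, data)})
    return counts.most_common()
```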

Selecting Strings for the Yara Rule

So, now we have a much better idea of what strings to use in our Yara file. Ideally, we’ll want to select strings that are in all or most of the sample set. Selecting strings that are in only one file may result in lots of false-positives (depending on what type of rule you are creating and what your objectives are, of course). However, selecting only strings that appear in all files may result in your Yara rule being too specific. Again, this will depend on your objectives for the rule.

Consider also that even though you are dealing with malware, there will be “benign” strings (sometimes called “goodware strings”) in these files that are not part of the malware’s code or functionalities. You’ll likely want to weed these out. Optionally, you could create a goodware strings database or list that simply contains strings you wish to exclude from your Yara rules. But this is a topic for another day.

Creating our Yara Rule

Based on the strings I observed in the strings text files I created previously, I chose the following strings and created my basic Yara rule:

Notice how I added the “wide” attribute to some of the strings. This tells Yara that these are encoded (UTF-16) strings. For the conditions at the bottom, I am specifically looking for samples that begin with the header bytes 0x5A4D (the “MZ” magic of a Windows PE file) and that contain 15 or more of these strings. Lowering this number will result in more of a “hunting” rule, where you may catch additional malware (with a wider net) but have more false positives. Increasing this number will create a higher-fidelity rule, but may be too specific.
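A rule with that structure looks roughly like the following. The string values here are illustrative stand-ins (inspired by the kinds of strings we spotted earlier), not my exact rule, and the threshold is lowered to match the shortened string list:

```yara
rule Example_Malware_Family
{
    strings:
        $s1 = "SELECT * FROM Win32_Process" ascii  // illustrative WMI query
        $s2 = "sandbox" ascii
        $s3 = "Running Processes.txt" wide         // encoded (UTF-16) string
        // ...more strings from the sample set...
    condition:
        uint16(0) == 0x5A4D and 2 of them
}
```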

Other Tools and Tips

Here are a few other random tips/tricks for dealing with strings in Yara rules:

PE Studio – PE Studio is a great PE executable file analysis tool that also has a nice “goodware” and “malware” strings database built-in. You can open an executable file in PE Studio and the tool will provide you with some hints on which strings may be interesting.

StringSifter – A tool created by Mandiant that can “sift” through strings and sort them based on how unique or “malicious” they are. This is very useful for quickly identifying the interesting strings.

yarGen – A full-on cheat mode for Yara rules. yarGen is a tool from Florian Roth that takes an input sample set and automatically generates Yara rules based on interesting strings or code in the files. This is a great tool if you are pressed for time or have lots of rules to create. However, nothing beats a well-tuned, manually written rule (in my humble, old-school, boomer opinion). Also, if you are new to Yara and/or malware analysis, stay away from the automatic tools and just do it manually, please 🙂

Conclusion

I hope this short post helps you create better Yara rules! If you have further suggestions or ideas, send them to me and I may include them in this post or in future posts!

@d4rksystem

Analysis of the NATO Summit 2023 Lure: A Step-by-Step Approach

Analysis of the NATO Summit 2023 Lure: A Step-by-Step Approach

Author: @d4rksystem

It has been a while since I’ve touched a malicious RTF document and I’ve been itching to refresh my knowledge in this area. The tricky part was finding a maldoc worth investigating. Well, my luck recently changed – along came a maldoc lure that targeted guests of the 2023 NATO Summit in Lithuania in July. I had found a maldoc worthy of my time.

BlackBerry wrote a great post on the analysis of the entire attack chain, but glossed over the analysis of the first-stage lure, which is what prompted me to analyze it further. Note that I’ll only be covering the first-stage lure in this post. The filename of the file I am investigating is “Overview_of_UWCs_UkraineInNATO_campaign.docx”. The document is available on VirusTotal:

SHA256: a61b2eafcf39715031357df6b01e85e0d1ea2e8ee1dfec241b114e18f7a1163f

Analysis of the Lure

Upon initial inspection, this MS Word document does indeed appear to be quite targeted:

To begin my analysis, I first executed the document in a Windows 10 VM while capturing network traffic in Fiddler. The screenshot below shows connections to two IP addresses:

The first connection seems to be an HTTP OPTIONS request to 104.234.239.26.

Edit 1: I was informed by a reader (@k0ck4) that the malware is also making SMB connections to the remote server. This is true – the malware attempts to connect to the remote server via SMB and, following this, makes an HTTP OPTIONS request. I was not able to get the malware to connect to the server (likely the server is offline), but according to the strings in the RTF document objects, it attempts to download a file (more on this later!). The following screenshot from Wireshark shows the SMB connections:

The second connection is to another IP, 74.50.94.156, and appears to download a file (start.xml). For fun, I queried these IPs in Shodan to see if there was anything interesting. Fun fact: the 74.50.94.156 IP is running WinRM and other services and has some interesting data exposed. (I blurred the data out, but you can check it out on Shodan if interested):

Let’s dig deeper into this document file. I switched over to my REMnux VM and used the tool zipdump to get an idea of this file’s contents.

There is definitely something in this document: an embedded RTF file (index 13) which seems to be titled “afchunk.rtf”! Let’s extract it:

(Since this command is a bit hard to read, here it is in text):

zipdump.py -d -s 13 <target_file_name> >> afchunk.rtf

Analysis of “Afchunk.rtf”

Let’s switch over to the rtfdump tool to see what is inside this RTF file:

It looks like we have three potential embedded objects. The first object (index 147) has a size of 0 bytes… interesting. The second object (index 152) appears to be an “OLE2LINK” object. And the third object (index 161) has the designation “SAXXMLReader”.

While rtfdump, rtfobj, and similar tools are extremely valuable, they are reliant on malware authors behaving properly. Some RTF malware may be able to hide objects from these tools or otherwise obfuscate the data inside. For this reason, I almost always look into the raw data of the file to make sure my findings align. To start, I ran the strings tool on the afchunk.rtf file (command: strings afchunk.rtf). A few things pop out:

There appear to be two objects embedded in this RTF file, denoted by the highlighted “objdata” tags. The first objdata tag is followed by a blob of hex data. If we copied this hex and transformed it to ASCII, we would see some interesting things – but we’ll extract the object in a moment. This objdata tag is preceded by the string “Word.Document.8”, which suggests that this may be an embedded Word document. However, the standard OLE magic bytes (“D0CF11E0”) are missing from the hex data. This object seems to be malformed – possibly on purpose, to mislead analysts and automated tools.

The second objdata tag contains another hex blob, but this time we see the “D0CF11E0” magic bytes, which denote an embedded document file or OLE (Object Linking & Embedding) object.

OLE is a way for different programs to exchange data between them. Imagine you have a document in MS Word, and you want to include a chart or a spreadsheet from MS Excel. OLE enables this. The chart or spreadsheet becomes an OLE object. You can learn more about OLE here. In the case of maldocs, malware authors often link or embed malicious objects into otherwise benign RTF documents as a way to hide them and stealthily execute evil activity.
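If you want to decode those objdata blobs manually rather than rely on tooling, a quick-and-dirty sketch looks like the following. Note this is not a real RTF parser, and even simple obfuscation will defeat it:

```python
import re

def extract_objdata_blobs(rtf_bytes: bytes):
    """Pull the hex runs that follow \\objdata tags and decode them to
    bytes. A rough sketch only -- real-world RTF obfuscation defeats it."""
    blobs = []
    for m in re.finditer(rb"\\objdata\b([0-9a-fA-F\s]+)", rtf_bytes):
        hexdata = re.sub(rb"\s", b"", m.group(1))
        if len(hexdata) % 2:
            hexdata = hexdata[:-1]  # drop a dangling half byte
        blobs.append(bytes.fromhex(hexdata.decode("ascii")))
    return blobs
```

Checking the decoded blobs for the D0CF11E0 magic bytes is a quick way to confirm which objects are well-formed OLE files.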

Let’s dump the objects we discovered to disk. The rtfobj tool can help with this:

This command displays the objects inside this RTF file, and dumps them to separate files so we can analyze them. As we suspected, the first object (ID 0 in this output) states that the object is “Not a well-formed OLE object”. The second object (ID 1) has a class name of “OLE2LINK”, a type of OLE object. As a fun homework assignment, Google “OLE2LINK” – the first thing you’ll see is a list of vulnerabilities affecting this object type.

So, let’s take a look at the embedded objects we just extracted.

Analysis of Embedded Object 1

Viewed in a hex editor, Object 1 contains some interesting strings, notably the IP “104.234.239.26” and the UNC path “\share1\MSHTML_C7\file001.url”. When the afchunk.rtf file executes, this embedded object also executes, forcing MS Word to send a request to this remote server. We’ll discuss this more in a moment.

Edit 2: As described in Edit 1, this document makes an SMB connection as well as the HTTP request. You can tell this is SMB by the Windows-style backslashes (“\\” and “\”) in the paths.

Analysis of Embedded Object 2

Similarly to Object 1, Object 2 can be viewed in a hex editor:

Viewing the second object in a hex editor reveals another interesting string, “74.50.94.156”, as well as a URI path, “/MSHTML_C7/start.xml”. This is the other IP we saw in our Fiddler traffic. As with the first embedded object, this second embedded OLE object also executes when afchunk.rtf executes, and similarly tricks MS Word into contacting a remote web server. How does this work? I am glad you asked.

These embedded objects seem to be taking advantage of an older, known vulnerability: CVE-2017-0199. According to Microsoft, this vulnerability “exists in the way that Microsoft Office and WordPad parse specially crafted files. An attacker who successfully exploited this vulnerability could take control of an affected system. An attacker could then install programs; view, change, or delete data; or create new accounts with full user rights.” Sounds quite dangerous… and very generic.

Digging deeper, I found publicly available exploit code for this vulnerability. If you compare the exploit’s payload code to the strings in the RTF document, you can see some similarities. The malware authors perhaps even re-used some of this exploit code for their own maldoc. For example, the following strings from the original RTF exploit code were present in this document:

The Next Stages

At the time of my analysis, I could not contact the IPs directly, so I could not obtain the files the way they were meant to be downloaded (via exploitation of the RTF document and MS Word). However, I was able to obtain them from VirusTotal.

The file hosted on 104.234.239.26 is another MS Word file that renders an iframe in preparation for the next stage of the attack. The file hosted on 74.50.94.156 is an XML file containing a weaponized iframe that is then loaded into MS Word. This malicious iframe exploits the CVE-2022-30190 (“Follina”) vulnerability and sets up the later stages of this attack.

Since the goal of this blog post was simply to show one methodology for analyzing an RTF file, I won’t go into detail on the later stages of this attack. You can read about them in the BlackBerry blog post.

For further reading, I found a good older article from the researchers at Nviso. Additionally, McAfee researchers posted a great article on malicious RTF documents and how they work.

I hope you enjoyed! If you see any inconsistencies or errors in this post, please let me know! Also, if you have additional techniques, I am always happy to learn new ways of malware analysis! 🙂

@d4rksystem

How Malware Abuses the Zone Identifier to Circumvent Detection and Analysis

How Malware Abuses the Zone Identifier to Circumvent Detection and Analysis

I was investigating a malware sample that uses an interesting trick to circumvent sandboxes and endpoint defenses: simply deleting its zone identifier attribute. This led me on a tangent where I began to research zone identifiers (which, embarrassingly enough, I had little prior knowledge of). Here are the results of my research.

The Zone.Identifier is a file metadata attribute in Windows that indicates the security zone where a file originated. It is used to indicate a level of trustworthiness for a file when it is accessed, and helps Windows determine the security restrictions that may apply to the file. For example, if a file was downloaded from the Internet, the zone identifier will indicate this, and extra security restrictions will be applied to this file in comparison to a file that originated locally on the host.

The zone identifier is stored in an alternate data stream (ADS) named “Zone.Identifier”, a feature of the NTFS file system. There are five possible zone identifier values that can be applied to a file, represented as numerical values of 0 to 4:

  • Zone identifier “0”: Indicates that the file originates on the local machine. This file will have the least security restrictions.
  • Zone identifier “1”: Indicates that the file originated on the local Intranet (local network). Both zone identifier 0 and 1 indicate a high level of trust.
  • Zone identifier “2”: Indicates that the file was downloaded from a trusted site, such as an organization’s internal website.
  • Zone identifier “3”: Indicates that the file was downloaded from the Internet and that the file is generally untrusted.
  • Zone identifier “4”: Indicates that the file came from a likely unsafe source. This zone is reserved for files that must be treated with extra caution, as they may contain malicious content or pose a security risk.

You can use the following PowerShell command to check if a file has a zone identifier ADS:

Get-Item <file_path> -Stream zone*

An example of this output can be seen below. Notice the highlighted area that denotes the ADS stream (“Zone.Identifier”) and its length. Also note that if no data is returned after running this command, the file likely does not have a zone identifier stream.

To view this file’s zone identifier stream, you can use the following PowerShell one-liner:

Get-Content <file_path> -Stream Zone.Identifier

An example of this can be seen below:

A zone identifier stream will look something like this:

[ZoneTransfer]
ZoneId=3
ReferrerUrl=https://www.evil.com
HostUrl=https://download.evil.com/malware.doc

In this example, the Zone.Identifier indicates that the associated file originates from “zone 3”, which typically corresponds to the Internet zone. The ReferrerUrl denotes the webpage the file was downloaded from (or potentially the referrer domain), and the HostUrl specifies the precise location the file was downloaded from.

These zones are also referred to as the Mark of the Web (MoTW). Any file that originates from Zone 3 or Zone 4, for example, is said to have the mark of the web.

Malware can abuse the zone identifier in a few different ways, with a couple different goals:

Defense Evasion

Malware can manipulate the zone identifier value to spoof the trust level of a file. By assigning a lower security zone to a malicious file, the malware can trick Windows and defense controls into treating the file as if it came from a trusted source.

To accomplish this, malware can simply modify its files’ zone identifiers. Here is how this can be accomplished via PowerShell:

Set-Content file.exe -Stream Zone.Identifier -Value "[ZoneTransfer]`nZoneId=1"

This PowerShell one-liner modifies a file’s zone identifier to be a certain value (in this case, setting the zone ID to “1”). This may help the malware slip past certain defensive controls like anti-malware and EDR, and may make the malware look less suspicious to an end user.

Or, the zone identifier stream can simply be deleted, which may trick some defense controls. A variant of the malware family SmokeLoader does exactly this in an attempt to bypass defenses: it calls the Windows API function DeleteFileW (see below) to delete its file’s zone identifier stream. You can investigate this for yourself in a SmokeLoader analysis report from JoeSandbox (SHA256: 86533589ed7705b7bb28f85f19e45d9519023bcc53422f33d13b6023bab7ab21).

DeleteFileW (C:\Users\user\AppData\Roaming\ichffhi:Zone.Identifier)
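On Windows, alternate data streams can be addressed through the regular file APIs by appending :StreamName to the path, so you can reproduce (or test detection of) the same deletion with a couple of lines of Python (the file path is just an example):

```python
import os

def delete_motw(path: str) -> bool:
    """Delete a file's Zone.Identifier stream (Windows/NTFS only).
    Returns True if a stream was removed, False if none existed."""
    try:
        os.remove(path + ":Zone.Identifier")
        return True
    except OSError:  # no such stream, or not on an NTFS volume
        return False
```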

Alternatively, malware authors can wrap their malware in a container such as an IMG or ISO file; these container formats do not typically propagate zone identifier attributes to the files inside them. Red Canary has a great example in this report.

Anti-Analysis and Sandbox Evasion

Malware may inspect the zone identifier of a file to circumvent analysis. Malicious files that are submitted to an analysis sandbox or are being analysed by a reverse engineer may have a different zone identifier than the original identifier the malware author intended. When the malware file is submitted to a sandbox, the zone identifier may be erroneously set to 0, when the original value is 3. If malware detects an anomalous zone identifier, it may cease to execute correctly in the sandbox or lab environment.

The pseudocode below (written as runnable Python) demonstrates the logic of how malware may check its file’s zone identifier:

import re
import sys

# Path to this file's Zone.Identifier alternate data stream
zone_identifier_path = sys.argv[0] + ":Zone.Identifier"

try:
    with open(zone_identifier_path, "r") as f:
        zone_info = f.read()
except OSError:
    # No zone identifier stream -- possibly copied into a sandbox
    sys.exit()

# Check if the zone is the Internet zone (zone ID 3 or higher)
match = re.search(r"ZoneId=(\d+)", zone_info)
if match and int(match.group(1)) >= 3:
    # File is from the Internet zone (as expected); continue running
    pass
else:
    # File may be running in a sandbox or analysis lab!
    sys.exit()

If you are craving more information on this topic, other good resources are here and here.

— Kyle Cucci (d4rksystem)

Book Summary – “Evasive Malware: Understanding Deceptive and Self-Defending Threats”

Book Summary – “Evasive Malware: Understanding Deceptive and Self-Defending Threats”

Since my new book “Evasive Malware: Understanding Deceptive and Self-Defending Threats” pre-order just launched, I wanted to write up a quick summary of the book, including what you’ll learn, the book’s target audience, and a breakdown of each section in the book. Let’s get started!

What is this book about?

“Evasive Malware: Understanding Deceptive and Self-Defending Threats” is a book about the fascinating and terrifying world of malicious software designed to avoid detection. The book is full of practical information, real-world examples, and cutting-edge techniques for discovering, reverse-engineering, and analyzing state-of-the-art malware, specifically malware that uses evasion techniques.

Beginning with foundational knowledge about malware analysis in the context of the Windows OS, you’ll learn about the evasive maneuvers malware uses to determine whether it’s being analyzed and the tricks it employs to avoid detection. You’ll explore the ways malware circumvents security controls, such as network or endpoint defense bypasses, anti-forensics techniques, and data and code obfuscation. At the end of the book, you’ll learn methods and tools to tune your own analysis lab and make it resistant to malware’s evasive techniques.

What will you learn?

  • Modern malware threats and the ways they avoid detection
  • Anti-analysis techniques used in malware
  • How malware bypasses and circumvents security controls
  • How malware uses victim targeting and profiling techniques
  • How malware uses anti-forensics and file-less techniques
  • How to perform malware analysis and reverse engineering on evasive programs

Who is this book for?

This book primarily targets readers who already have at least a basic understanding of and skill set in analyzing malware and reverse-engineering malicious code. This book is not a beginner course in malware analysis, and some prior knowledge of the topic is assumed. But have no fear – the first three chapters consist of a crash course in malware analysis and code analysis techniques.

Here are some of the practical applications of this book:

  • Malware Analysts and Researchers – Learn how modern and advanced malware uses evasion techniques to circumvent your malware lab and analysis tools.
  • Incident Responders and Forensicators – Learn how advanced malware uses techniques like anti-forensics to hide its artifacts on a host. Understanding these techniques will help improve incident response and forensics skills.
  • Threat Intelligence Analysts – Learn how bespoke, targeted, and cybercrime malware uses evasion techniques to hide and blend into its target environment.
  • Security Engineers / Security Architects – Learn how malware evades the host and network defenses that you design, engineer, and implement.
  • Students and Hobbyists – Learn how modern, advanced malware operates. If you read and actually enjoy this book, then you now know that you should pursue a job in malware research 😉

This book consists of five sections (parts), each consisting of three or more chapters. Let’s take a brief look at each of these.

Part 1: The Fundamentals

Part 1 contains the foundational concepts you’ll need to know before digging into the rest of the book. The topics include the fundamentals of how the Windows operating system works and the basics of malware analysis, from sandbox and behavioral analysis to static and dynamic code analysis.

Chapters in Part 1:

  • Chapter 1: Windows Foundational Concepts
  • Chapter 2: A Crash Course in Malware Triage and Behavioral Analysis
  • Chapter 3: A Crash Course in Static and Dynamic Code Analysis

What you’ll learn:

  • What evasive malware is and why malware authors use evasion techniques in their malware.
  • The fundamentals of Windows OS internals.
  • A crash course in malware analysis and reverse engineering, covering the basics of malware sandbox analysis and behavioral analysis, and static and dynamic code analysis. 

Part 2: Context-Awareness and Sandbox Evasion

Part 2 starts getting into the good stuff: how malware is able to detect sandboxes, virtual machines, and hypervisors, and circumvent and disrupt analysis.

Chapters in Part 2:

  • Chapter 4: Enumerating Operating System Artifacts
  • Chapter 5: User Environment and Interaction Detection
  • Chapter 6: Enumerating Hardware and Network Configurations
  • Chapter 7: Runtime Environment and Virtual Processor Anomalies
  • Chapter 8: Evading Sandboxes and Disrupting Analysis

What you’ll learn:

  • How malware detects hypervisors by inspecting operating system artifacts.
  • How malware detects virtual machines by looking for runtime anomalies.
  • How malware tries to detect a real end user in order to identify if it’s running in a sandbox.
  • How malware actively circumvents analysis by exploiting weaknesses in sandboxes or directly interfering or tampering with the analyst’s tooling. 

Part 3: Anti-Reversing

Part 3 covers the many techniques malware may use to prevent or impede reverse-engineering of its code, such as complicating code analysis, disrupting debuggers, and causing confusion and misdirection.

Chapters in Part 3:

  • Chapter 9: Anti-disassembly
  • Chapter 10: Anti-debugging
  • Chapter 11: Covert Code Execution and Misdirection

What you’ll learn:

  • How malware authors implement anti-disassembly techniques and how you can overcome them. 
  • How anti-debugging techniques work, and how to identify these techniques while analyzing malware. 
  • How malware utilizes covert code execution and misdirection techniques to confuse malware analysts and slow down the reversing process.

Part 4: Defense Evasion

Chapters in Part 4:

  • Chapter 12: Process Injection, Manipulation, and Hooking
  • Chapter 13: Evading Network and Endpoint Defenses
  • Chapter 14: An Introduction to Rootkits
  • Chapter 15: Fileless Malware and Anti-forensics

What you’ll learn:

  • How malware implements modern process injection and manipulation techniques to circumvent defenses.
  • How malware actively and passively circumvents and bypasses modern endpoint and network defenses like EDR/XDR.
  • The basics of rootkits and how they evade defenses. 
  • How malware uses living-off-the-land techniques to remain undetected and blend into the environment.
  • Anti-forensics techniques and how advanced malware hides from forensics tooling and investigators.

Part 5: Other Topics

Finally, Part 5 covers additional techniques and topics that did not fit well into the other chapters. This section covers obfuscating malware and malicious behaviors via encoding and encryption, how packers work and how to unpack malware, and how to make your malware analysis lab a bit more resilient to evasive malware.

Chapters in Part 5:

  • Chapter 16: Encoding and Encryption
  • Chapter 17: Packers and Unpacking Malware
  • Chapter 18: Tips for Building an Anti-evasion Analysis Lab

What you’ll learn:

  • How malware implements obfuscation and encryption to complicate analysis and hide malicious activity, and how to analyze obfuscated code. 
  • How malware uses packers and crypters, and how to analyze packed malware. 
  • How to configure and tune your analysis lab to help streamline analysis of malware that may be detecting your lab environment.

Pre-Order the Book!

If you decide to legally purchase my book (instead of pirating it), it would be much appreciated. I need to buy beer, a new gaming PC, feed my family, you know, important stuff.

How to pre-order:

  • You can order the book directly from the No Starch Press publisher website. If you order from No Starch, you also get access to an Early Access version of the book, as well as the finished book!
  • You can order on Amazon. Sometimes Amazon has deals and this may be cheaper, but you do not get access to the Early Access version. Amazon ships to many places in the world, so this is an advantage.
  • There are other sites you can order from as well, such as local bookstores. Just Google “Evasive Malware book”.

If you decide to pre-order the Early Access version of my book, I would love your feedback! If you spot technical errors, spelling and grammar errors, or even if you just want to tell me “It’s amazing!” or “It sucks!”, I want to hear your feedback 🙂 Feel free to contact me via Twitter or LinkedIn.

A lot of love for the infosec community went into this book, so I hope you enjoy it! 🙂

Malware Analysis in 5 Minutes: Identifying Evasion and Guardrail Techniques with CAPA


Modern malware has gotten better and better at detecting sandbox and analysis environments, and at evading them. Malware can circumvent defenses, sandboxes, and analysts by using techniques such as VM detection, process injection, and guardrails.

In particular, guardrails are one or more artifacts that malware looks for on the host before executing its payload. These artifacts may be specific registry keys, files, directories, network configurations, etc. If these specific artifacts do not exist on the host, the malware may assume it is running in an analysis lab, or is otherwise not the right target for infection.
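To make this concrete, here is a minimal Python sketch of how a guardrail check might work. Everything here is hypothetical for illustration: the marker file path, the expected hostname, and the use of MD5 are my assumptions, and a real sample hardcodes its own artifacts and may use any hash algorithm.

```python
import hashlib
import os

# Hypothetical guardrail values -- a real sample hardcodes its own.
MARKER_FILE = r"C:\ProgramData\corp_agent.cfg"   # artifact expected on the target
EXPECTED_HOSTNAME_HASH = hashlib.md5(b"CORP-DC01").hexdigest()

def guardrails_pass(hostname, marker_file=MARKER_FILE):
    """Return True only if the host looks like the intended target."""
    # Guardrail 1: a specific file or directory must exist on the host.
    if not os.path.exists(marker_file):
        return False
    # Guardrail 2: the hash of the hostname must match a hardcoded value.
    return hashlib.md5(hostname.encode()).hexdigest() == EXPECTED_HOSTNAME_HASH
```

If either check fails, the malware assumes it is in an analysis lab (or on the wrong target) and simply exits without detonating its payload.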

One of the most tedious processes when investigating malware that is evading your sandboxes or tooling is figuring out what techniques the malware is using for this, and where in the code this occurs. CAPA can help automate this process.

CAPA is a tool written by the FireEye/Mandiant FLARE team that can be used to quickly triage and assess the capabilities of a malware sample.

For this example, I have a sample that will not run in my sandboxes or in my analysis VMs, and I am trying to figure out why. Let’s throw this sample into CAPA:

capa path/to/sample.exe

CAPA provides a nice summary of the potential ATT&CK techniques the malware is using, along with its identified capabilities. This assessment can help in many malware analysis situations, but here the focus is on evasion techniques.

Based on this initial analysis, we can see several possible techniques being used, such as:

  • Executing anti-VM instructions
  • Hashing and data encoding (could be used to hide strings)
  • Checking if a certain file exists (could be used for creating guardrails)
  • Getting the hostname (could also be used for guardrails)
  • Multiple process injection techniques

We can get additional information from CAPA by using the verbose mode:

capa path/to/sample.exe -vvv

Now we can focus on a few of these techniques and where they reside in code:

[Image: capa-anti-vm-instructions.png (CAPA verbose output flagging anti-VM instructions)]

CAPA identified two uses of the CPUID instruction, which can be used to identify a virtual machine environment. We can now throw this sample into a disassembler and locate this code by jumping to the addresses listed in CAPA:

If we wanted to bypass this detection technique, we could NOP out (remove) the CPUID instructions, or modify their return values. More about the CPUID instruction can be seen here and here.
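As a sketch of the NOP-patching approach, the snippet below overwrites the two-byte CPUID opcode (0F A2) with two NOPs (90 90) at given file offsets. The offsets in the usage comment are placeholders, not values from this sample; you would use the file offsets corresponding to the addresses CAPA reported (converting RVAs to file offsets with a PE library if needed).

```python
# Patch CPUID instructions (opcode 0F A2) to NOPs (90 90) at known file offsets.
CPUID_OPCODE = b"\x0f\xa2"
NOP_NOP = b"\x90\x90"

def nop_cpuid(data, offsets):
    """Return a copy of the binary with CPUID replaced by NOPs at each offset."""
    patched = bytearray(data)
    for off in offsets:
        # Sanity check: only patch if the expected opcode is actually there.
        if patched[off:off + 2] != CPUID_OPCODE:
            raise ValueError(f"no CPUID instruction at offset {off:#x}")
        patched[off:off + 2] = NOP_NOP
    return bytes(patched)

# Usage (hypothetical offsets):
# data = open("sample.exe", "rb").read()
# open("sample_patched.exe", "wb").write(nop_cpuid(data, [0x12C1, 0x1020]))
```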

Additionally, CAPA identified the addresses in the binary where process injection behaviors may be occurring:

With this information, along with the offset addresses provided, we can set breakpoints on these addresses or instructions for analysis in a debugger. For more info on these process injection techniques, this write-up is old but still very relevant.

Finally, I suspect this sample is using some sort of guardrails. Guardrails are a technique used by malware to prevent sandbox analysis, hamper manual analysis, evade host defenses, and prevent unnecessary “spreading” of the malware.

As previously identified by CAPA, this sample may be using the system hostname and files/directories as guardrails. It is also likely that it has hardcoded hashes of those guardrail values, making it difficult for analysts to spot what the malware is specifically looking for:

CAPA identified that this sample is checking for a specific file at function offset 0x1400012C1, and the hostname at 0x140001020. Let’s inspect the hostname query in the sample in a disassembler. Once Ghidra disassembles this function, this is what is displayed:

In Ghidra, we can see that the sample is calling GetComputerNameA in order to get the hostname (the NetBIOS computer name) of the victim. It then hashes this hostname (CryptCreateHash, CryptHashData) and compares it to a hardcoded hash using memcmp (memory compare).

[Image: memcmp-comparing-hashes.png (the memcmp call comparing the hostname hash to the hardcoded hash)]

This memcmp call compares DAT_target_hash (the hash of the hostname that the malware is expecting) to hashed_domain_name (the hash of the victim’s actual hostname). If these hashes do not match, the sample will terminate itself.

Since the target hash is hardcoded in the binary and will never appear “un-hashed” in memory, we don’t actually know what hostname this malware sample is looking for. Our best option here is to brute-force the hash using a rainbow table or wordlist.
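A minimal wordlist brute-force might look like the Python sketch below. MD5 is assumed here purely for illustration; in practice you would substitute whatever algorithm the sample actually requests via CryptCreateHash, and the wordlist would be candidate hostnames (naming-convention guesses, leaked inventory, etc.).

```python
import hashlib

def bruteforce_hostname(target_digest, wordlist):
    """Hash each candidate hostname and compare against the hardcoded digest.

    target_digest: raw hash bytes extracted from the binary.
    Returns the matching hostname, or None if nothing in the wordlist matches.
    """
    for candidate in wordlist:
        if hashlib.md5(candidate.encode()).digest() == target_digest:
            return candidate
    return None

# Usage (hypothetical digest and wordlist file):
# target = bytes.fromhex("<digest carved from the binary>")
# print(bruteforce_hostname(target, open("hostnames.txt").read().split()))
```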

Or… we can simply bypass this hash checking functionality altogether. With this information from CAPA, we can now patch the binary (in a disassembler or in a debugger) in order to completely bypass these VM detection and guardrail techniques, and allow our sample to run in our VM. We can do this by NOP’ing out instructions, modifying the function return values, or skipping the code altogether by jumping over the suspect code.

Happy reversing!

Kyle Cucci