📚 Part of the Shellcode Analysis for Security Researchers: A Complete Guide series.

Malware Analysis & Reverse Engineering: A Comprehensive Toolkit Workflow

In today's threat landscape, the ability to analyze and reverse engineer malicious software is critical for security teams, incident responders, and threat researchers. This comprehensive guide walks through a systematic malware analysis workflow that combines static analysis, dynamic analysis, and behavioral examination to identify threats, extract indicators of compromise (IOCs), and correlate findings with threat intelligence.

Whether you're a SOC analyst investigating a suspicious file, a security researcher tracking emerging threats, or an incident responder containing an active breach, this workflow provides the methodology and tools needed to thoroughly analyze malware samples from initial triage through final reporting.

Understanding the Analysis Landscape

Malware analysis exists on a spectrum from rapid automated triage to deep manual reverse engineering. Modern threats employ sophisticated evasion techniques including polymorphism, anti-debugging measures, virtual machine detection, and multi-stage payload delivery. A comprehensive analysis workflow must address all these challenges while maintaining analyst safety and operational security.

The workflow presented here spans approximately 5-15 hours for a complete analysis, though simple samples may be triaged in under an hour while advanced persistent threats (APTs) may require weeks of investigation. The key is understanding when to escalate from automated analysis to manual reverse engineering.

Time Investment by Complexity:

Simple droppers/scripts: 1-3 hours
Packed trojans: 3-6 hours
Sophisticated ransomware: 6-12 hours
APT toolkits: 12+ hours to weeks

Prerequisites and Safety Requirements

Before beginning any malware analysis, establish these critical safety measures:

Analysis Environment Isolation

Never analyze malware on production systems. Use fully isolated virtual machines with the following configuration:

Network Isolation: Disconnect from production networks or use isolated virtual networks
Snapshot Management: Take VM snapshots before each analysis stage
Disposable Systems: Delete and rebuild analysis VMs after each sample
Host-Only Networking: If network simulation is needed, use FakeNet-NG or INetSim
Clipboard Isolation: Disable clipboard sharing between host and guest

Recommended Analysis Platforms

REMnux Linux Distribution: Ubuntu-based distro with 100+ malware analysis tools pre-installed. Ideal for static analysis, script deobfuscation, and network simulation.

FlareVM: Windows-based malware analysis environment from FireEye/Mandiant. Contains IDA Free, x64dbg, PE analysis tools, and monitoring utilities. Use for analyzing Windows malware in its native environment.

Hybrid Approach: Use REMnux for initial triage and static analysis, then move to FlareVM for dynamic analysis and Windows-specific reverse engineering.

Legal and Ethical Considerations

Only analyze malware obtained through legitimate channels (incident response, threat feeds, malware repositories)
Respect intellectual property and licensing restrictions on analysis tools
Follow responsible disclosure practices when sharing findings
Coordinate with legal/compliance before sharing IOCs externally
Document chain of custody for forensic investigations

Stage 1: Initial Triage & Safe Environment Setup

Duration: 15-30 minutes

Initial triage establishes the foundation for your analysis. The goal is to quickly classify the sample, determine its risk level, and decide on the appropriate analysis approach without executing the malware.

Step 1: File Type Validation with Magic Number Analysis

Malware commonly uses file extension spoofing to bypass basic filters. A file named invoice.pdf might actually be a Windows executable. The File Magic Number Checker examines the file's actual binary signature to reveal its true type.

Magic Number Examples:

PE executables: 4D 5A (MZ header)
ELF binaries: 7F 45 4C 46 (.ELF)
ZIP archives: 50 4B 03 04 (PK..)
PDF documents: 25 50 44 46 (%PDF)
JPEG images: FF D8 FF E0 or FF D8 FF E1

Polyglot Detection: Some sophisticated malware combines multiple file formats (e.g., valid PDF + embedded PE) to evade analysis. Because the leading magic number only identifies the first format, scanning for additional signatures past offset 0 helps reveal these dual-format files.

Practical Example:

File: "Resume.pdf"
Extension suggests: PDF Document
Magic number: 4D 5A 90 00
Actual type: PE32 executable (Windows)
Verdict: SUSPICIOUS - Extension spoofing detected

This immediate mismatch flags the file for deeper investigation and prevents accidental execution.
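The check above can be sketched in a few lines of Python. This is a minimal illustration, not the File Magic Number Checker's actual implementation: the signature table only covers the magic numbers listed in this step, and the names MAGIC_SIGNATURES, EXPECTED_BY_EXTENSION, identify_magic, and check_extension_spoofing are hypothetical.

```python
from pathlib import Path

# Signatures from the list above: (leading bytes, label).
MAGIC_SIGNATURES = [
    (bytes.fromhex("4D5A"), "PE executable"),
    (bytes.fromhex("7F454C46"), "ELF binary"),
    (bytes.fromhex("504B0304"), "ZIP archive"),
    (bytes.fromhex("25504446"), "PDF document"),
    (bytes.fromhex("FFD8FF"), "JPEG image"),
]

# Hypothetical mapping from a claimed extension to the label it should produce.
EXPECTED_BY_EXTENSION = {
    ".exe": "PE executable",
    ".dll": "PE executable",
    ".zip": "ZIP archive",
    ".pdf": "PDF document",
    ".jpg": "JPEG image",
    ".jpeg": "JPEG image",
}


def identify_magic(header: bytes) -> str:
    """Return the label of the first signature the header starts with."""
    for magic, label in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def check_extension_spoofing(filename: str, header: bytes) -> tuple[str, bool]:
    """Return (actual type, True if the extension claims a different type)."""
    actual = identify_magic(header)
    expected = EXPECTED_BY_EXTENSION.get(Path(filename).suffix.lower())
    mismatch = expected is not None and actual != expected
    return actual, mismatch


if __name__ == "__main__":
    # Mirrors the Resume.pdf example: PDF extension, MZ header.
    print(check_extension_spoofing("Resume.pdf", bytes.fromhex("4D5A9000")))
```

Only the first few bytes of the sample are read, so the file is never parsed or executed during this check.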

Step 2: Hash Generation and Reputation Checking

Generate cryptographic hashes to create unique identifiers for the sample and check against known malware databases using the Hash Generator.

Hash Types and Use Cases:

MD5 (128-bit): Fast but collision-prone. Still widely used in threat intel feeds.

MD5: 44d88612fea8a8f36de82e1278abb02f

SHA-1 (160-bit): More collision-resistant than MD5, but practical collisions have been demonstrated since 2017. Still common in VirusTotal lookups.

SHA-1: 3395856ce81f2b7382dee72602f798b642f14140

SHA-256 (256-bit): Industry standard for file integrity. No known collisions.

SHA-256: 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92
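For analysts working outside the Hash Generator, all three digests can be computed with Python's standard hashlib module. The sketch below is one possible approach; the function names hash_bytes and hash_file and the 1 MiB chunk size are illustrative choices, not part of any tool described here.

```python
import hashlib

ALGORITHMS = ("md5", "sha1", "sha256")


def hash_bytes(data: bytes) -> dict[str, str]:
    """Hash an in-memory buffer with MD5, SHA-1, and SHA-256."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


def hash_file(path: str, chunk_size: int = 1 << 20) -> dict[str, str]:
    """Hash a file in chunks so large samples are never fully loaded into memory."""
    hashers = {name: hashlib.new(name) for name in ALGORITHMS}
    with open(path, "rb") as handle:
        # Read in binary mode only; the sample is never parsed or executed.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


if __name__ == "__main__":
    for name, digest in hash_bytes(b"").items():
        print(f"{name}: {digest}")
```

Computing all three in one pass is convenient because different threat intelligence sources index samples by different hash types.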

VirusTotal Integration: Submit hashes (never the file itself during initial triage) to check if the sample has been previously analyzed. A detection ratio like "45/70" indicates 45 of 70 antivirus engines flagged it as malicious.

Threat Intelligence Correlation: Use Hash Lookup to query multiple threat intelligence sources:

AlienVault OTX (Open Threat Exchange)
Abuse.ch MalwareBazaar
Hybrid Analysis
MISP communities

Known Sample Example:

SHA-256: 2c21b4b39df46e38868a7a5329b8faa8...
VirusTotal: 67/71 detections
Family: Emotet (epoch 5)
First seen: 2021-03-15
Campaign: TA542 phishing

If the sample is well-known, you can often skip detailed analysis and jump to remediation using existing IOCs and YARA rules.

Step 3: Entropy Analysis for Packing Detection

Entropy measures the randomness of data within a file. High entropy (7.0-8.0) typically indicates compression, encryption, or packing—techniques malware uses to hide its true payload from antivirus signatures.

Use the Entropy Analyzer to calculate entropy across the entire file and within individual sections.

Entropy Interpretation:

0.0-3.0: Plain text, highly structured data
3.0-6.0: Compiled code (normal executables; .text sections typically fall in 4.0-6.0)
6.0-7.0: Compressed data or mixed content
7.0-8.0: Encrypted or packed data (SUSPICIOUS)

Practical Example - Packed Malware:

File: malware.exe (245 KB)
Overall entropy: 7.82

Section Analysis:
.text:  entropy 7.91 (SUSPICIOUS - executable code should be 4-6)
.data:  entropy 7.88 (SUSPICIOUS - encrypted data)
.rsrc:  entropy 3.21 (normal)

Verdict: Highly likely packed with UPX, ASPack, or custom packer
Action: Proceed to Stage 3 (Unpacking) before analysis

Entropy Sections in PE Files:

.text section: Should be 4.0-6.0 (normal x86 code)
.data section: Varies based on content (3.0-6.0 typical)
.rsrc section: Often low entropy (bitmaps, icons)

If the .text section shows entropy above 7.0, the executable code is almost certainly packed or encrypted.
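The entropy figures discussed in this step are Shannon entropy measured in bits per byte, which ranges from 0.0 to 8.0. The following sketch shows the calculation under that assumption; shannon_entropy and looks_packed are hypothetical helper names, and the 7.0 threshold is simply the heuristic used in this guide.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) to 8.0 (uniformly random)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value 0-255
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Apply this guide's heuristic: entropy above the threshold is suspicious."""
    return shannon_entropy(data) > threshold


if __name__ == "__main__":
    print(shannon_entropy(b"A" * 1024))          # constant data
    print(shannon_entropy(bytes(range(256))))    # every byte value once
```

Running the same function over each section's raw bytes, rather than the whole file, gives the per-section view described above.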

Step 4: VM Snapshot and Environment Preparation

Before proceeding to dynamic analysis stages, prepare your analysis environment:

Snapshot Creation:

# VMware Workstation
vmrun snapshot "C:\VMs\MalwareAnalysis.vmx" "Pre-Analysis-Clean"

# VirtualBox
VBoxManage snapshot "MalwareAnalysis" take "Pre-Analysis-Clean"

Network Configuration:

Set VM to "Host-Only" or "Internal Network" mode
Launch FakeNet-NG (Windows) or INetSim (Linux) for network simulation
Configure DNS to redirect all lookups to local analysis server

Monitoring Tool Setup (for later dynamic analysis):

Process Monitor (procmon.exe) - File/registry/process activity
Process Hacker - Process tree and memory inspection
Wireshark - Network packet capture
Regshot - Registry snapshot comparison

Stage 1 Deliverables

At the end of initial triage, you should have:

File Classification Report
- True file type (vs. claimed extension)
- File size and structural anomalies
Hash Fingerprints
- MD5, SHA-1, SHA-256 values
- VirusTotal detection ratio
- Known malware family identification (if available)
Packing Assessment
- Overall entropy score
- Section-by-section entropy analysis
- Packer identification (UPX, ASPack, etc.)
Initial Threat Level
- HIGH: Known malware family, high entropy, suspicious file type
- MEDIUM: Unknown sample, moderate entropy, requires deeper analysis
- LOW: Known benign, normal entropy, legitimate provenance

This triage determines whether to proceed with full analysis or escalate to specialized teams.

Stage 2: Static Analysis & File Structure Examination

Duration: 30-60 minutes

Static analysis examines the malware without executing it, extracting valuable intelligence from embedded strings, file structure, and metadata. This stage is safe (no code execution) and often reveals significant IOCs.

Step 1: String Extraction and Analysis

Malware contains hardcoded strings—URLs, IP addresses, file paths, encryption keys, and debug messages. Extracting these provides immediate intelligence about capabilities and infrastructure.

Use the String Extractor to pull both ASCII and Unicode strings from the binary.

String Categories to Identify:

1. Network Indicators:

http://malicious-c2.example.com/gate.php
192.168.1.100:8080
api.telegram.org
smtp.gmail.com

2. File System Paths:

C:\Users\Public\svchost.exe
%APPDATA%\Microsoft\Windows\Templates\
\\.\pipe\namedpipe
C:\Windows\System32\drivers\malware.sys

3. Registry Keys:

HKLM\Software\Microsoft\Windows\CurrentVersion\Run
HKCU\Software\Classes\exefile\shell\open\command

4. Debugging Artifacts:

[+] Connecting to C2...
[DEBUG] Payload decrypted successfully
Failed to inject into process
C:\Projects\Malware\Release\trojan.pdb

5. Encryption/Encoding Indicators:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
(Base64 alphabet - indicates encoding/decoding)

Mozilla/5.0 (Windows NT 10.0; Win64; x64)
(User-agent strings for HTTP communication)

Real-World Example - Emotet String Analysis:

Extracted strings (sample): 4,892 total

Network IOCs:
- hxxp://185.157.82[.]211/wp-content/Dh9aQqY/
- hxxp://195.154.146[.]35/cgi-bin/Y3wd1/
- hxxp://217.182.143[.]207/images/k8wKN/

File Paths:
- %LOCALAPPDATA%\Microsoft\Windows\
- C:\Windows\SysWOW64\

Registry:
- Software\Microsoft\Windows\CurrentVersion\Run

Artifacts:
- service.exe
- regsvr32.exe /s

These URLs immediately identify C2 infrastructure to block. The registry key reveals the persistence mechanism, and the file paths reveal likely drop locations.
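String extraction of the kind described in this step can be approximated with two regular expressions, one for printable ASCII runs and one for UTF-16LE runs (the encoding Windows binaries typically use for "Unicode" strings). This is a rough sketch in the spirit of the classic strings utility, not the String Extractor's actual logic; extract_strings and its min_len default of 4 are assumptions.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII strings, then UTF-16LE strings, of at least min_len characters."""
    # Runs of printable ASCII bytes (space through tilde).
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII characters each followed by a null byte (UTF-16LE).
    utf16_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_re.finditer(data)]
    return found


if __name__ == "__main__":
    blob = (
        b"\x00\x01http://malicious-c2.example.com/gate.php\x00\xff"
        + "cmd.exe".encode("utf-16-le")
        + b"\x00\x00ab"
    )
    for s in extract_strings(blob):
        print(s)
```

The extracted list can then be sorted into the categories above by simple pattern matching on each string.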

Step 2: Binary Structure Analysis with Hex Editor

The Hex Editor allows direct examination of the binary file structure, revealing details hidden from high-level analysis tools.

PE File Structure Components:

DOS Header (offset 0x00):

00000000: 4D 5A 90 00 03 00 00 00  MZ......

The "MZ" signature identifies Windows executables. The DOS stub is mostly vestigial but malware sometimes hides data here.

PE Header (offset stored in the e_lfanew field at 0x3C of the DOS header; typically 0x80-0x100):

000000D0: 50 45 00 00 4C 01 06 00  PE..L...

"PE\0\0" marks the PE header. Following bytes indicate:

Machine type (0x014C = x86, stored little-endian on disk as 4C 01; 0x8664 = x64)
Number of sections
Timestamp (often forged by malware)
Optional header size
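To make the header layout concrete, the sketch below locates the PE signature through the e_lfanew field at offset 0x3C and unpacks the 20-byte COFF file header that follows it. Field order and the machine values (0x014C for x86, 0x8664 for x64) follow the PE/COFF specification; the function name parse_pe_header and the returned dictionary keys are illustrative. In practice a library such as pefile would normally handle this.

```python
import struct

# Machine values per the PE/COFF specification (only two shown here).
MACHINE_TYPES = {0x014C: "x86", 0x8664: "x64"}


def parse_pe_header(data: bytes) -> dict:
    """Parse the DOS header pointer and the COFF file header of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: little-endian 32-bit offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    machine, sections, timestamp, _symtab, _nsyms, opt_size, _chars = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4
    )
    return {
        "pe_offset": e_lfanew,
        "machine": MACHINE_TYPES.get(machine, hex(machine)),
        "sections": sections,
        "timestamp": timestamp,
        "optional_header_size": opt_size,
    }


if __name__ == "__main__":
    # Synthetic header matching the hex dump above: PE signature at 0xD0, x86, 6 sections.
    image = bytearray(0x100)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0xD0)
    image[0xD0:0xD4] = b"PE\x00\x00"
    struct.pack_into("<HHIIIHH", image, 0xD4, 0x014C, 6, 0, 0, 0, 0xE0, 0x0102)
    print(parse_pe_header(bytes(image)))
```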

Section Table:

Each section has characteristics flags:

IMAGE_SCN_MEM_EXECUTE (0x20000000): Executable code
IMAGE_SCN_MEM_WRITE (0x80000000): Writable data
IMAGE_SCN_MEM_READ (0x40000000): Readable

Suspicious Pattern - Writable + Executable Section:

Section: .text
Virtual Size: 0x00045000
Raw Size: 0x00045200
Characteristics: 0xE0000020 (READ | WRITE | EXECUTE)

VERDICT: SUSPICIOUS
Normal executables have RX (read+execute) .text sections.
Writable executable sections enable self-modifying code.
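The characteristics value is a bitmask, so the writable-plus-executable check reduces to two bitwise tests. A minimal sketch using the flag values listed above; decode_section_flags and is_writable_and_executable are hypothetical helper names.

```python
# Section memory flags, as listed above.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def decode_section_flags(characteristics: int) -> list[str]:
    """Translate the memory-permission bits of a section's characteristics field."""
    names = []
    if characteristics & IMAGE_SCN_MEM_READ:
        names.append("READ")
    if characteristics & IMAGE_SCN_MEM_WRITE:
        names.append("WRITE")
    if characteristics & IMAGE_SCN_MEM_EXECUTE:
        names.append("EXECUTE")
    return names


def is_writable_and_executable(characteristics: int) -> bool:
    """True when a section is both writable and executable (self-modifying code is possible)."""
    wx = IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE
    return characteristics & wx == wx


if __name__ == "__main__":
    value = 0xE0000020  # the .text section from the example above
    print(" | ".join(decode_section_flags(value)), is_writable_and_executable(value))
```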

Import Address Table (IAT) Analysis:

The IAT lists external functions the malware calls. Suspicious APIs include:

Process Manipulation:

CreateRemoteThread - Code injection
WriteProcessMemory - Memory manipulation
VirtualAllocEx - Allocate memory in other processes

Anti-Debugging:

IsDebuggerPresent - Debugger detection
CheckRemoteDebuggerPresent
NtQueryInformationProcess

Persistence:

RegSetValueEx - Registry modification
CreateService - Service installation
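NetScheduleJobAdd - Scheduled task creation (legacy API; newer samples typically use the Task Scheduler COM interfaces or schtasks.exe, which do not show up as a single import)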

Network Communication:

InternetOpenA/W - HTTP communication
HttpSendRequestA/W - HTTP requests
WSAStartup - Winsock initialization (precedes socket communication)

Example IAT from Ransomware:

KERNEL32.dll:
- CreateFileW (file access)
- WriteFile (file modification)
- FindFirstFileW (directory enumeration)

ADVAPI32.dll:
- CryptEncrypt (encryption)
- RegSetValueExW (persistence)
- LookupPrivilegeValueW (privilege escalation)

WININET.dll:
- InternetOpenA (C2 communication)
- HttpSendRequestA (data exfiltration)

This IAT signature immediately suggests ransomware behavior: file enumeration + encryption + persistence + C2 communication.
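Once the import names have been dumped by a PE parser, flagging them is a simple lookup. The sketch below covers a subset of the API names from the categories above and treats the A/W suffix as optional; SUSPICIOUS_APIS and categorize_imports are illustrative names, and the list is a starting point rather than a complete detection set.

```python
# A subset of the API names from the categories above, without A/W suffixes.
SUSPICIOUS_APIS = {
    "process manipulation": {"CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"},
    "anti-debugging": {"IsDebuggerPresent", "CheckRemoteDebuggerPresent", "NtQueryInformationProcess"},
    "persistence": {"RegSetValueEx", "CreateService"},
    "network communication": {"InternetOpen", "HttpSendRequest", "WSAStartup"},
}


def categorize_imports(imports: list[str]) -> dict[str, list[str]]:
    """Group imported function names by suspicious category, ignoring an A/W suffix."""
    hits: dict[str, list[str]] = {}
    for name in imports:
        # Win32 APIs often come in ANSI (A) and wide (W) variants.
        candidates = {name}
        if name.endswith(("A", "W")):
            candidates.add(name[:-1])
        for category, apis in SUSPICIOUS_APIS.items():
            if candidates & apis:
                hits.setdefault(category, []).append(name)
    return hits


if __name__ == "__main__":
    sample = ["CreateFileW", "WriteFile", "RegSetValueExW", "InternetOpenA", "HttpSendRequestA"]
    for category, names in categorize_imports(sample).items():
        print(f"{category}: {', '.join(names)}")
```

A hit in a single category is rarely conclusive on its own; as the ransomware example shows, it is the combination of categories that suggests a behavior profile.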

Step 3: IOC Extraction and Cataloging

Use the IOC Extractor to automatically parse the sample and extract indicators:

IOC Categories:

Network Indicators:

IPv4 addresses: 192.168.1.1, 10.0.0.0/8
IPv6 addresses: 2001:0db8:85a3::8a2e:0370:7334
Domain names: malware.com, c2.evil.net
URLs: http://attacker.com/payload.exe
Email addresses: attacker@evil.com

Host Indicators:

File paths: C:\Windows\Temp\malware.exe
Registry keys: HKLM\Software\Malware
Mutex names: Global\MalwareMutex
Service names: MalwareService

Cryptocurrency Indicators:

Bitcoin addresses: 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
Monero addresses: 4AdUndXHHZ6cfufTMvppY6JwXNouMBzSkbLYfpAV5Usx3skxNgYeYTRj5UzqtReoS44qo9mtmXCqY45DJ852K5Jv2684Rge
Ethereum addresses: 0x742d35Cc6634C0532925a3b844Bc9e7595f0bEb

Practical IOC Extraction Example:

Analysis of: invoice.exe (SHA-256: 8d969eef...)

Network IOCs (12 found):
- 185.157.82.211
- 195.154.146.35
- 217.182.143.207
- update.adobe-flash-player.com (spoofed domain)
- hxxp://malware[.]cc/gate.php

File IOCs (8 found):
- %APPDATA%\Microsoft\svchost.exe
- C:\Users\Public\readme.txt
- C:\Windows\Temp\payload.dll

Registry IOCs (3 found):
- HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate
- HKLM\System\CurrentControlSet\Services\MalwareService

Cryptocurrency (1 found):
- 1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2 (Bitcoin ransom address)
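A simplified version of this kind of extraction can be built from regular expressions run over extracted strings or report text. The patterns below are deliberately conservative assumptions, not the IOC Extractor's real rules: they cover IPv4 addresses, http/https URLs, and MD5/SHA-256 hashes only, and the name extract_iocs is hypothetical. Bare domain matching is left out because a naive pattern would also match file names such as payload.exe.

```python
import re
from urllib.parse import urlsplit

# Each octet is limited to 0-255.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(rf"\b(?:{_OCTET}\.){{3}}{_OCTET}\b")
URL_RE = re.compile(r"\bhttps?://[^\s\"'<>]+", re.IGNORECASE)
MD5_RE = re.compile(r"\b[a-fA-F0-9]{32}\b")
SHA256_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")


def extract_iocs(text: str) -> dict[str, list[str]]:
    """Return sorted, de-duplicated indicators found in free text."""
    urls = sorted(set(URL_RE.findall(text)))
    # Host names are taken from the URLs rather than matched on their own.
    hosts = sorted({h for h in (urlsplit(u).hostname for u in urls) if h})
    return {
        "ipv4": sorted(set(IPV4_RE.findall(text))),
        "urls": urls,
        "url_hosts": hosts,
        "md5": sorted(set(MD5_RE.findall(text))),
        "sha256": sorted(set(SHA256_RE.findall(text))),
    }


if __name__ == "__main__":
    report = (
        "Beacon to http://attacker.com/payload.exe from 185.157.82.211; "
        "dropper MD5 44d88612fea8a8f36de82e1278abb02f"
    )
    for kind, values in extract_iocs(report).items():
        print(kind, values)
```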

Step 4: Defanging IOCs for Safe Sharing

Before sharing IOCs with team members or external threat intelligence platforms, defang them to prevent accidental clicks or execution.

Use the URL Defanger to convert IOCs to safe formats:

Defanging Transformations:

Original: http://malware.com/payload.exe
Defanged: hxxp://malware[.]com/payload.exe

Original: https://evil.net/gate.php?id=123
Defanged: hxxps://evil[.]net/gate.php?id=123

Original: 192.168.1.100
Defanged: 192[.]168[.]1[.]100

Original: attacker@evil.com
Defanged: attacker[@]evil[.]com

This prevents accidental navigation while keeping the IOC readable and searchable.
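Defanging is straightforward to script when IOCs need to be exported in bulk. The sketch below is one possible convention and not necessarily what the URL Defanger does: for URLs it rewrites the scheme and brackets the dots in the host only, leaving the path and query untouched, since breaking the scheme and host is enough to stop a link from resolving. The function name defang is an assumption.

```python
def defang(ioc: str) -> str:
    """Defang a URL, email address, IP address, or domain so it cannot be clicked."""
    scheme, sep, rest = ioc.partition("://")
    if sep:
        # URL: http -> hxxp, https -> hxxps; bracket dots in the host only.
        host, slash, path = rest.partition("/")
        safe_scheme = scheme.lower().replace("tt", "xx")
        return f"{safe_scheme}://{host.replace('.', '[.]')}{slash}{path}"
    if "@" in ioc:
        # Email: bracket the @ and the dots in the domain part.
        user, _, domain = ioc.rpartition("@")
        return f"{user}[@]{domain.replace('.', '[.]')}"
    # Bare IP address or domain: bracket every dot.
    return ioc.replace(".", "[.]")


if __name__ == "__main__":
    for item in ("http://malware.com/payload.exe", "192.168.1.100", "attacker@evil.com"):
        print(defang(item))
```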

Step 5: PE Structure Deep Dive

Export Analysis:

If the malware is a DLL, examine its export table for exposed functions:

Exported Functions (malware.dll):
- ServiceMain (masquerading as legitimate service)
- DllRegisterServer (COM registration - persistence)
- InstallHook (keylogger installation)
- SendData (exfiltration function)

Legitimate DLLs have meaningful export names. Malware often uses generic names or mimics system DLLs.

Resource Section Analysis:

The .rsrc section contains embedded resources like icons, bitmaps, configuration files, and secondary payloads.

Use Resource Hacker or PE-bear to extract:

Resources found:
- ICON (mimics PDF icon - social engineering)
- RCDATA/CONFIG (encrypted C2 configuration)
- BINARY/PAYLOAD (embedded DLL or shellcode)

Timestamp Analysis:

PE headers contain compilation timestamps, though malware often forges these:

Compilation time: 1970-01-01 00:00:00 (Unix epoch)
Analysis: Timestamp zeroed - malware is hiding build time

vs.

Compilation time: 2024-12-15 03:47:22
Analysis: Recent build - potentially active campaign

Cross-reference timestamps with first-seen dates in threat intelligence to detect forgery.
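The PE compilation timestamp is a 32-bit count of seconds since the Unix epoch, so the checks above can be expressed as a few comparisons. This is a hedged sketch: assess_pe_timestamp and its verdict strings are hypothetical, and the first-seen comparison assumes the caller supplies a timezone-aware date taken from threat intelligence.

```python
from datetime import datetime, timezone
from typing import Optional


def assess_pe_timestamp(
    timestamp: int,
    first_seen: Optional[datetime] = None,
    now: Optional[datetime] = None,
) -> tuple[datetime, str]:
    """Convert a PE TimeDateStamp to UTC and give a rough plausibility verdict.

    first_seen and now, when provided, must be timezone-aware datetimes.
    """
    compiled = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    if timestamp == 0:
        verdict = "zeroed: build time hidden"
    elif compiled > now:
        verdict = "in the future: forged"
    elif first_seen is not None and compiled > first_seen:
        # A sample cannot have been observed before it was compiled.
        verdict = "later than first-seen date: likely forged"
    else:
        verdict = "plausible: corroborate with threat intelligence"
    return compiled, verdict


if __name__ == "__main__":
    print(assess_pe_timestamp(0))
```

A plausible timestamp is not proof of authenticity, which is why the final verdict still points back to threat intelligence for corroboration.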

Stage 2 Deliverables

Extracted Strings Report
- Categorized strings (network, filesystem, registry, crypto)
- Suspicious patterns identified
- Debug artifacts and PDB paths
Comprehensive IOC List
- IP addresses and domains (defanged)
- File paths and registry keys
- Mutex names and service identifiers
- Cryptocurrency addresses
PE Structure Analysis
- Section table with entropy per section
- Import Address Table (suspicious APIs highlighted)
- Export table analysis (if DLL)
- Resource extraction results
Anti-Analysis Technique Identification
- Debugger detection APIs
- VM detection patterns
- Timing checks (RDTSC)
- Obfuscation indicators

This guide was condensed for readability; deep-dive specifics live in the related guides above.
