{"slug": "from-packed-binary-to-readable-code-a-hands-on-walkthrough-of-unpacking-analysis", "title": "From Packed Binary to Readable Code: A Hands-On Walkthrough of Unpacking, Shellcode Analysis, and Memory Forensics", "summary": "A developer documented a full malware analysis lab session, unpacking and analyzing packed binaries, shellcode, and memory forensics. The walkthrough covers static analysis, manual unpacking with a debugger, multi-stage shellcode extraction, code injection patterns, API hooking, and memory forensics with Volatility. The exercise was conducted on isolated virtual machines using teaching specimens.", "body_md": "A few weeks ago I spent a full lab session doing something that sounds simple on paper and is genuinely satisfying in practice: taking a packed, obfuscated piece of malware and peeling back every layer until I could see what it actually does.\n\nThis post is my write-up of that session. It's long, because the lab itself covered a lot of ground — static analysis, manual unpacking with a debugger, multi-stage shellcode extraction, code injection patterns, API hooking, and finally memory forensics with Volatility. I'm documenting it the way I wish more \"intro to malware analysis\" posts were documented: with the actual commands, the actual reasoning behind each step, and the dead ends along the way.\n\nIf you're getting into reverse engineering or malware analysis, this should give you a realistic feel for what a packed-malware investigation actually looks like end to end — not just the highlight reel.\n\n**A quick but important note:** this entire exercise was done on isolated, throwaway virtual machines (a Windows analysis VM with no network access beyond an internal isolated segment, plus a REMnux Linux VM) using known teaching specimens. Never run unknown executables, run unpackers, or experiment with shellcode on a machine connected to a real network or containing real data. 
Everything here assumes a fully isolated, snapshot-able VM setup.\n\nModern malware rarely ships as a plain, readable executable. Authors wrap their code in **packers** (like UPX) to shrink the file and make static analysis harder, and they layer in techniques like shellcode, code injection, and API hooking to evade detection and persist on a system. As an analyst, your job is to work through a chain of questions about what the sample hides and what it actually does.\n\nThis walkthrough tackles that chain using three teaching specimens: a UPX-packed sample (`brbbot.exe`), a multi-technology dropper that chains JavaScript → PowerShell → shellcode (`PDFXCview.exe`), and a code-injecting, API-hooking sample analyzed both statically and via a memory image (`great.exe` / `great.vmem`).\n\nTwo VMs, both reverted to clean snapshots before starting: a Windows analysis VM and a REMnux Linux VM. The REMnux command-line tools used along the way were `pescanner.py`, `diec`, `strings`, SpiderMonkey (`js`), `base64dump.py`, and Volatility (`vol.py`).\n\nBoth VMs were on an isolated internal network segment so the Windows VM and REMnux VM could talk to each other (for file transfer and the JavaScript dropper's local web server) without any route to the internet.\n\nFirst pass: load the suspicious binary in **PeStudio** and check three things: imports, section names, and strings.\n\nA packed file typically shows very few imports, very few readable strings, and non-standard section names: instead of `.text`, `.rdata`, `.data`, you'll see something like `UPX0` and `UPX1`.\n\nAll three were true here, and the `UPX` naming convention in the section headers was a strong hint about which packer was used.\n\nOn REMnux, `pescanner.py` measures the entropy of each section.
High entropy (close to random) is a hallmark of compressed or encrypted data:\n\n```\npescanner.py brbbot.exe | more\n```\n\nThe tool flagged two sections as \"SUSPICIOUS\": one for unusually high entropy (consistent with packed/compressed code), and one with an entropy of exactly 0 (its raw size was 0, which is also anomalous for a legitimate section).\n\n```\ndiec brbbot.exe\n```\n\n`diec` (Detect It Easy, command-line version) reported UPX as the most likely packer, confirming the hint from the section names.\n\nThe obvious next step is UPX's own decompression option:\n\n```\nupx -d %AppData%\\brbbot.exe\n```\n\nThis is always worth trying, but it commonly fails on malware, because authors deliberately corrupt the UPX header/footer to block the standard unpacker while leaving the actual UPX decompression stub intact:\n\n```\nCantUnpackException: file is possibly modified/hacked/protected; take care!\n```\n\nSince the automated route was blocked, the next move is **manual dumping**: let the malware unpack *itself* in memory at runtime, then dump that unpacked memory image to disk.\n\nBefore debugging, disable ASLR for the sample:\n\n```\nsetdllcharacteristics -d %AppData%\\brbbot.exe\n```\n\nThis clears the `DYNAMIC_BASE` flag in the PE header (from 1 to 0). Without this, the binary would load at a randomized base address every run, which makes it harder to find a stable breakpoint address across debugging sessions.\n\nWith the sample running (via a desktop shortcut set to \"Run as administrator\"), attach **Scylla x64** to the process and click **Dump**. This grabs the in-memory, already-unpacked version of the code.\n\nBut a raw memory dump alone usually isn't runnable: the **Import Address Table (IAT)** is broken, because imports get resolved dynamically and the dump doesn't capture that resolution cleanly.
So the imports have to be rebuilt: in Scylla, run **IAT Autosearch**, then **Get Imports**, then **Fix Dump** against the dumped file.\n\nScylla writes a new file with `_SCY` appended to the name (e.g., `brbbot-dumped_SCY.exe`). This is the \"fixed\" version with a repaired import table.\n\nLoading the fixed dump back into PeStudio showed *more* imports than the packed original, which is a good sign. But running the fixed dump directly produced a different outcome than expected (it exited immediately, without dropping the configuration file the real malware drops). **This is a useful and realistic lesson**: successfully fixing the IAT doesn't guarantee a perfectly runnable standalone binary. Sometimes further reconstruction is needed. Don't take \"it loads more imports now\" as proof the unpacking job is fully done; verify behavior too.\n\nScylla's automatic dump-and-fix approach doesn't always work cleanly, so it's worth knowing the manual debugger-based path too.\n\nLoad the packed binary in **x64dbg**. Scroll through the disassembly until the unpacking stub's instructions end and you hit a long run of zero bytes. That boundary is usually right where the final jump sits:\n\n```\njmp brbbot.140003F94\n```\n\nThat `140003F94` target address is the **OEP** (original entry point), the address where the *real*, unpacked program logic begins.\n\nSet a breakpoint on the `JMP` instruction, then run (`F9`). The process will execute all the unpacking logic and pause right at that breakpoint, immediately before transferring control to the unpacked code.\n\nSingle-step through the jump (`F7` or `F8`) to land at the OEP. Execution is now paused inside the *unpacked* code.\n\nDon't just trust the address; verify it. Right-click in the CPU view and run:\n\nBoth showing up is good confirmation you're looking at genuinely unpacked code.\n\nFrom x64dbg's Plugins menu: **OllyDumpEx → Dump process**.
Key details: select the `UPX1` section row and enable the `MEM_WRITE` characteristic flag before dumping (without write permission flagged, some dumpers won't capture the section properly), then save the dump under a recognizable name (e.g., `brbbot_dump_64.exe`).\n\nThen repair the imports in Scylla. Same logic as before: **IAT Autosearch → Get Imports → Fix Dump**, pointed at the OllyDumpEx output. Result: a `_SCY`-suffixed file with a repaired import table.\n\nSometimes you don't want to fully unpack a sample. You just want to watch a specific operation happen, like decryption of an embedded configuration.\n\nRun the packed binary in x64dbg with no breakpoints set (`F9`). It unpacks itself into memory and continues normally.\n\nIn the **Memory Map** tab, look for memory regions that don't belong to a Windows DLL and have **\"E\" (execute)** in the Protection column. In this sample, two regions matched that profile: the unpacker code region and a second region holding the freshly unpacked code. Right-click the latter and choose **Follow in Disassembler**.\n\nRight-click → **Search for → Current Region → Intermodular calls**, then filter the results by typing a keyword (e.g. `Crypt`) in the search box. This surfaced a call to `CryptDecrypt`, a strong signal that the malware decrypts an embedded configuration at runtime.\n\nSelect the instruction *right after* the `CryptDecrypt` call (the result-checking instruction), and set a **hardware breakpoint on execution**. A hardware breakpoint is the safer choice here because a software breakpoint's `INT3` byte would likely be overwritten when the sample unpacks itself into that region again. Then restart the process (`Ctrl+F2`) and run again (`F9`).\n\nWhy restart rather than just continuing?
Because the process may have already executed past this point once. Restarting guarantees you hit the breakpoint fresh, from the actual entry point, so register/stack state is consistent with a real first-run analysis.\n\nOnce paused there, the decrypted configuration data is sitting in memory (commonly reachable via the stack), ready for inspection, just as it would be when analyzing the unpacked version of the same family of malware.\n\nThis is where things get more interesting: a single executable that chains together several different technologies to avoid writing an obviously malicious file to disk.\n\nStart **Process Monitor** capturing, then run the sample. Watch the process tree in **Process Hacker**: the initial process spawns `mshta.exe` and `powershell.exe`, then after roughly a minute or two, spawns a couple of `regsvr32.exe` processes. Once those appear, terminate the process tree and pause Process Monitor capture. You don't need to let it run indefinitely; you just need enough activity captured to reconstruct the chain.\n\nExport the Process Monitor log as CSV, then load it into **ProcDOT** and select the initial malicious process as the starting point. ProcDOT generates a visual graph of what touched what: registry keys created, files dropped, and a persistence entry added under the `Run` autostart key. It also revealed that the malware created files with an unusual, randomly generated extension and a batch file, plus matching registry entries describing how Windows should handle that custom file extension.\n\nIn **Regedit**, navigate to:\n\n```\nHKEY_CURRENT_USER\\Software\\Classes\\.<random-extension>\n```\n\nThe `(Default)` value there points to another key (a random-looking hex string), which under `shell\\open\\command` contains the actual command Windows runs.
In this lab it looked roughly like:\n\n```\n\"C:\\WINDOWS\\system32\\mshta.exe\" \"javascript:...eval(IV2u4L)...\"\n```\n\nThis is a classic fileless technique: rather than dropping a `.js` file, the script content lives in a registry value, and `mshta.exe` is abused to execute inline JavaScript that reads that value and passes it to `eval()`.\n\nExport the script from that registry value to a file:\n\n```\nreg_export HKCU\\software\\<random-key> <random-value> script.js\n```\n\nTransfer it to REMnux with WinSCP, then try SpiderMonkey directly:\n\n```\njs -f /usr/share/remnux/objects.js -f script.js\n```\n\nThis threw an \"illegal character\" error: the script was UTF-16 encoded, which SpiderMonkey can't parse directly. Fix the encoding first:\n\n```\nstrings --encoding=l script.js > script2.js\n```\n\n(The `l` in `--encoding=l` is a lowercase **L**, not the number 1, and selects 16-bit little-endian characters. It's an easy typo to make.)\n\nThen deobfuscate properly:\n\n```\njs -f /usr/share/remnux/objects.js -f script2.js > script3.js\nscite script3.js &\n```\n\nThe deobfuscated script revealed a call resembling `[Convert]::FromBase64String`, with the decoded result handed off to `powershell.exe`, meaning the JavaScript's whole job was to decode and launch a Base64-encoded PowerShell stage.\n\n```\nbase64dump.py script3.js\n```\n\nThis lists every candidate Base64 blob found, each with an ID. Look in the **Decoded** column for the largest entry that decodes into readable ASCII. That's almost always the real payload, as opposed to short incidental Base64-looking noise.\n\n```\nbase64dump.py script3.js -s 10 -d > script.ps1\n```\n\n(`-s 10` selects that specific entry's ID, and `-d` dumps its decoded content. Your ID will likely be a different number.)\n\nTransfer `script.ps1` back to Windows and open it in Notepad++.
The pattern here is a textbook shellcode loader:\n\n- A variable (`$sc32`) holds hex-encoded shellcode\n- `VirtualAlloc` allocates memory with `PAGE_EXECUTE_READWRITE`\n- `CreateThread` is called, pointing at the shellcode's address, to execute it\n\nRecognizing this pattern is genuinely useful. It shows up constantly across unrelated malware families because it's the simplest way to run raw shellcode from a scripting language.\n\nTo extract the shellcode, open the script in the PowerShell ISE debugger:\n\n```\npowershell_ise script.ps1\n```\n\nSet a breakpoint on the line right after `$sc32` is assigned (before `$pr` gets defined), run to it (Debug → Run/Continue), then once paused, dump the variable's contents to a raw binary file:\n\n```\n[io.file]::WriteAllBytes('sc32.bin',$sc32)\n```\n\nNow you have the raw shellcode isolated in its own file, ready for dedicated shellcode analysis tools.\n\n```\nscdbg.exe -f sc32.bin\n```\n\n(Or via the GUI: load the file, leave default options, click Launch.) `scdbg` emulates the shellcode and reports the API calls it attempts, without executing it natively. Here it showed the code loading `advapi32.dll` and calling `RegOpenKeyExA` against both `HKEY_LOCAL_MACHINE` and `HKEY_CURRENT_USER`. That's useful, but it didn't reveal which specific registry keys were targeted.\n\nTo go further, run the shellcode for real under a debugger:\n\n```\njmp2it sc32.bin 0x0 pause\n```\n\n`0x0` means the shellcode starts at offset zero in the file. The `pause` argument makes `jmp2it` insert an infinite loop *before* jumping into the shellcode, buying you time to attach a debugger before anything actually runs.\n\nAttach **x32dbg** to the `jmp2it` process, run briefly, then pause. You'll land inside the infinite loop `jmp2it` created. The shellcode in this case expected a parameter (its own memory address) to be pushed onto the stack before it starts, mimicking how the PowerShell loader called it via `CreateThread`.
Since `jmp2it` happens to store that address in the `EDI` register, you can satisfy that expectation by patching the infinite-loop instruction:\n\n```\npush edi\n```\n\nThis single patched instruction is what lets the shellcode run as if it had been called the same way the original loader called it.\n\nNext, set a breakpoint on the API that `scdbg` reported:\n\n```\nSetBPX advapi32.RegOpenKeyExA\n```\n\nRun (`F9`) to hit it, then check the **Call Stack** tab for the first frame that isn't inside a Windows DLL. That's the shellcode's own calling code. Following that call stack entry back into the disassembler showed, a short distance later, a call to `VirtualAlloc`, a strong hint that this shellcode unpacks *another* payload into memory, just like the outer executable did.\n\nSet a breakpoint on `VirtualAlloc` itself:\n\n```\nSetBPX VirtualAlloc\n```\n\nThe pattern that emerged from hitting this breakpoint multiple times:\n\nEach time, once the call has returned (for example with Execute till return, `Ctrl+F9`), right-clicking `EAX` (which then holds the returned memory address) → **Follow in Dump → Dump 1/Dump 2** lets you watch that specific memory region fill in over successive breakpoint hits.\n\nOnce the third allocation showed clear PE-file characteristics, right-click that dump pane → **Follow in Memory Map**, then right-click the corresponding row → **Dump Memory to File**. That gives you a final extracted executable, ready to load into PeStudio to confirm it's a structurally valid PE file with imports and strings.\n\nSwitching specimens here: a sample that injects code into other running processes.\n\nIn IDA, jump to the **Imports** tab and locate `CreateRemoteThread`. Double-click it, then in the disassembler view, select it and press `x` to bring up cross-references. This shows every place in the code that calls this function.\n\n`CreateRemoteThread` takes a process handle (`hProcess`) as a parameter. Tracing the register that holds that handle backward through the disassembly led to a call to `OpenProcess`, the function that obtains a handle to an existing process by PID.
This is the classic injection setup: get a handle to a target process, then create a thread inside it.\n\nA separate function call (visible just before the `CreateRemoteThread` call, taking the same process handle as a parameter) turned out to contain calls to `WriteProcessMemory`, the actual mechanism for placing code into another process's address space.\n\nA useful shortcut here: rather than manually walking every function called from that one, IDA's **View → Graphs → Xrefs from** generates a call graph showing everything reachable from a given function. That graph surfaced exactly which sub-function calls `VirtualAllocEx`, the memory allocation step that has to happen in the *target* process before you can write to it.\n\nAt the `VirtualAllocEx` call site, the `flProtect` parameter being pushed was `0x40`. Right-clicking that value in IDA and choosing **\"Use standard symbolic constant\"** reveals it as `PAGE_EXECUTE_READWRITE`: memory that can be written to *and* executed. That combination, allocated in someone else's process, is the textbook signature of code injection intent.\n\nWalking back further up the call chain (using IDA's back-arrow navigation) led to a function that calls `CreateToolhelp32Snapshot`, which, combined with `Process32FirstW`/`Process32NextW`, is the standard Windows API trio for enumerating every running process.
That's the malware searching for a suitable target before injecting into it.\n\n| Function called | Role in the injection chain |\n|---|---|\n| `CreateToolhelp32Snapshot` + `Process32FirstW/NextW` | Enumerate running processes to pick a target |\n| `OpenProcess` | Get a handle to the chosen target process |\n| `VirtualAllocEx` | Allocate executable+writable memory inside the target |\n| `WriteProcessMemory` | Write the payload into that allocated memory |\n| `CreateRemoteThread` | Start execution of the injected code |\n\nSame specimen, different capability: modifying other functions in memory so calls to them get redirected.\n\nFollowing cross-references to `ReadProcessMemory` (same Imports-tab → xrefs approach as before) led to a function that reads memory from a target process. That is almost always the first step before *overwriting* something, since you typically want to preserve the original bytes you're about to clobber.\n\nThe same function later calls `WriteProcessMemory` twice, with two different byte patterns:\n\n- `0xE9`: the opcode for a relative `JMP` instruction\n- `0x68` (the opcode for a `PUSH` of a 32-bit immediate), paired with a `0xC3` (`RET`) written five bytes later\n\nThe second pattern, `PUSH` followed by `RET`, is a sneakier alternative to a plain `JMP` for redirecting execution: the `PUSH` places the destination address on the stack and the `RET` immediately pops it into the instruction pointer, so it doesn't look like an obvious jump instruction at a glance.\n\nWalking the call chain upward (xrefs again) eventually reaches a function that builds a table of function addresses, saving various API addresses into memory, one after another, to be passed as the list of functions to hook.
The functions referenced there were largely browser-related, suggesting the malware's actual goal: intercepting and observing the victim's web browsing activity.\n\nFinal piece: instead of analyzing a live process or a static file, this works from a memory snapshot (`.vmem`) captured from an already-infected machine.\n\n```\nvol.py -f great.vmem kdbgscan | more\n```\n\nThis suggests one or more candidate OS profiles. The *first* suggestion isn't guaranteed to be correct. Try it, and if Volatility throws errors like `\"need base\"` or `\"No Base Address Space\"`, that profile doesn't match and you move to the next candidate:\n\n```\nvol.py -f great.vmem --profile=Win10x86 pslist\n```\n\nOnce a profile returns clean, readable process output instead of errors, lock it in for the rest of the session:\n\n```\nexport VOLATILITY_PROFILE=Win10x86\n```\n\nThe `pslist` output itself is worth scanning closely here: a process with an unusual, non-standard-looking name stood out immediately as worth investigating further.\n\n```\nvol.py -f great.vmem cmdline | more\n```\n\nThis surfaced a `cmd.exe` invocation running a batch file out of `%Temp%` with a randomized filename. Code running from the Temp folder with a random name is a strong red flag on its own.\n\n```\nvol.py -f great.vmem memdump -p <PID> -D /tmp\nstrings /tmp/<PID>.dmp | grep -B3 -A3 <batch-filename>\n```\n\nThe surrounding strings matched typical batch-file syntax, consistent with a self-deleting cleanup script (delete the dropped executable, then delete itself), a very common malware self-cleanup pattern.\n\n```\nvol.py -f great.vmem malfind -D /tmp > malfind.txt\nscite malfind.txt &\n```\n\n`malfind` scans the entire memory image for telltale signs of injected code (executable memory regions with suspicious characteristics, frequently starting with the `MZ` signature of a PE header) and dumps each one it finds.
In this case it flagged several legitimate-looking processes (`explorer.exe` and a couple of others) as containing injected PE content, each at a different memory address. Several of the dumped files were exactly the same size, hinting they're likely the same payload injected repeatedly into different processes.\n\nA quick static check on one of those dumped files with a couple of additional command-line tools (string extraction, automated triage) turned up the same suspicious indicators seen earlier: references to a known risky DLL associated with silent file downloads, and string patterns matching the cleanup batch file extracted earlier. That overlap is good corroborating evidence that this is the same malware family operating across multiple injected processes.\n\n```\nvol.py -f great.vmem apihooks -p <PID> --skip-kernel > apihooks.txt\n```\n\nScrolling past the IAT-based entries (commonly false positives) to the first **Inline/Trampoline** entry revealed a hooked `ntdll.dll!LdrLoadDll`, patched with the same `PUSH`/`RET` redirection technique identified earlier via static analysis. That confirms that what was theorized from the binary alone is actually happening at runtime, in memory.\n\nThe hook redirected execution to a small address range.
Using `pslist` again to find the *virtual offset* of the specific process being investigated (`malfind` names each dumped file after the owning process's offset and the region's start address) let me narrow down, among all the files `malfind` had extracted, which one's address range actually encompassed that hook target, confirming exactly which extracted memory dump contains the code the hijacked function jumps into.\n\nA checklist for confirming each major milestone in this kind of analysis:\n\n- Emulate shellcode (`scdbg`) first before live-running it, even in an isolated VM\n- Cross-check static hooking findings against memory forensics (`apihooks`) when both are available\n- Volatility plugin results (`malfind`, `apihooks`, `pslist`) should agree with each other\n\nA few things stuck with me after this session:\n\n| Mistake | Why It Happens | How to Avoid It |\n|---|---|---|\n| Assuming UPX (or any packer) can always be unpacked with the standard tool | Authors deliberately corrupt headers to break generic unpackers | Always have a manual debugger-based fallback ready |\n| Forgetting to disable ASLR before debugging | Default behavior on modern Windows | Run `setdllcharacteristics -d` (or equivalent) before setting breakpoints by address |\n| Trusting a dumped file just because PeStudio shows more imports | More imports indicate partial success, not full functional correctness | Actually try running the dumped/fixed binary, and watch for expected side effects (dropped files, registry changes) |\n| Using `strings` without `--encoding=l` on UTF-16 obfuscated scripts | Many obfuscation toolkits output UTF-16 by default | If a deobfuscator throws an encoding/illegal-character error, check the source encoding first |\n| Picking the wrong Base64 blob from a dump tool's output | Obfuscated scripts often contain several short, irrelevant Base64-looking strings | Sort by decoded size and check for actual readable ASCII content in the decode preview |\n| Trying the first Volatility profile suggestion and giving up if it errors | `kdbgscan` often suggests multiple plausible profiles | Treat profile errors as informative, not blocking; try the next suggested profile |\n| Treating IAT hook entries from `apihooks` as real hooks | IAT-style entries are common false positives in Volatility's hook detection | Specifically look for \"Inline/Trampoline\" hook type entries, which are far more reliable indicators |\n| Analyzing shellcode by directly running it without emulating first | Skips a safe verification step | Run through `scdbg` (emulation) before live execution, even in an isolated VM |\n\nGoing from \"this file is packed\" to \"I understand exactly how it injects code, hooks APIs, and what it left behind in memory\" took a genuinely long chain of tools and techniques, and that's honestly the most realistic takeaway here. Real malware analysis is rarely a single tool giving you a single clean answer. It's PeStudio pointing you toward a hypothesis, a debugger confirming it, IDA explaining the *why*, and Volatility proving it actually happened on a real system.\n\nIf you're working through similar material, my biggest piece of advice is: don't skip the verification steps.
It's tempting to declare victory the moment a tool produces *some* output, but the real confidence comes from cross-checking — static findings against dynamic behavior, debugger observations against memory forensics, one tool's output against another's.\n\nIf you found this useful, I'm planning to keep documenting more of this kind of hands-on analysis work — let me know in the comments if there's a specific technique here you'd like a deeper dive into.", "url": "https://wpnews.pro/news/from-packed-binary-to-readable-code-a-hands-on-walkthrough-of-unpacking-analysis", "canonical_source": "https://dev.to/almahmudkhalif/from-packed-binary-to-readable-code-a-hands-on-walkthrough-of-unpacking-shellcode-analysis-and-27bd", "published_at": "2026-06-26 17:05:26+00:00", "updated_at": "2026-06-26 17:34:00.325956+00:00", "lang": "en", "topics": ["ai-safety", "developer-tools"], "entities": ["PeStudio", "REMnux", "Volatility", "UPX", "brbbot.exe", "PDFXCview.exe", "great.exe", "great.vmem"], "alternates": {"html": "https://wpnews.pro/news/from-packed-binary-to-readable-code-a-hands-on-walkthrough-of-unpacking-analysis", "markdown": "https://wpnews.pro/news/from-packed-binary-to-readable-code-a-hands-on-walkthrough-of-unpacking-analysis.md", "text": "https://wpnews.pro/news/from-packed-binary-to-readable-code-a-hands-on-walkthrough-of-unpacking-analysis.txt", "jsonld": "https://wpnews.pro/news/from-packed-binary-to-readable-code-a-hands-on-walkthrough-of-unpacking-analysis.jsonld"}}