dd in CTF: Disk Image Extraction and Common Challenge Patterns

dd in CTF forensics looks like a backup utility until the moment you realize it’s the most direct way to reach inside a disk image at the exact byte offset you need. I learned this on a CyLab Security Academy (formerly picoCTF) disk forensics challenge called “Disk, disk, sleuth! II”, after spending two hours on tools that were all working at the wrong abstraction level.


Two hours before I tried dd

The challenge gave a file called disko-2.dd. The initial checks looked promising:

$ file disko-2.dd
disko-2.dd: DOS/MBR boot sector; partition 1 : ID=0x83, start-CHS (0x0,32,33),
end-CHS (0x3,80,13), startsector 2048, 51200 sectors; partition 2 : ID=0xb,
start-CHS (0x3,80,14), end-CHS (0x7,100,29), startsector 53248, 65536 sectors

$ strings disko-2.dd | grep -i "pico\|flag\|ctf"
(no output)

Two partitions. No flag in plain strings. I ran binwalk -e, which extracted some files but not the flag. I opened the image in a hex editor and scrolled through the MBR sector for twenty minutes. I tried Autopsy. I tried foremost, which returned a list of false positives from filesystem metadata. Every tool was working correctly at the filesystem level. The problem was that the flag wasn’t at the filesystem level: it was embedded inside the raw partition data, and filesystem tools only show what the filesystem’s own structures point to.

The breakthrough came from reading the file output more carefully: “startsector 2048” and “startsector 53248.” Those are partition start positions counted in 512-byte sectors, so multiplying by 512 gives the byte offset. dd can extract from sector offsets directly. That’s the connection I missed for two hours.
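The sector-to-byte conversion is a single multiplication. A quick sketch using the start sectors from the file output above, assuming 512-byte sectors (the variable names are only illustrative):

```shell
# Sector size assumed to be 512 bytes for this image.
sector_size=512

# Start sectors copied from the `file` output for disko-2.dd.
part1_start=2048
part2_start=53248

# Byte offset = start sector * sector size.
part1_offset=$((part1_start * sector_size))
part2_offset=$((part2_start * sector_size))

echo "partition 1 begins at byte $part1_offset"   # 1048576
echo "partition 2 begins at byte $part2_offset"   # 27262976
```

Either number can be handed to dd: the sector value with bs=512, or the byte value with bs=1.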


fdisk revealed the structure; dd extracted the content

Running fdisk -l gives the partition table in a more readable format:

$ fdisk -l disko-2.dd
Disk disko-2.dd: 100 MiB, 104857600 bytes, 204800 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes

Device        Boot  Start    End  Sectors  Size  Id  Type
disko-2.dd1         2048  53247    51200   25M  83  Linux
disko-2.dd2        53248 118783    65536   32M   b  W95 FAT32

Partition 2 starts at sector 53248, runs for 65536 sectors, and is typed as W95 FAT32 in the partition table. To extract it with dd:

$ dd if=disko-2.dd of=partition2.img bs=512 skip=53248 count=65536
65536+0 records in
65536+0 records out
33554432 bytes (34 MB, 32 MiB) copied, 0.193468 s, 173 MB/s

When bs=512, the skip and count parameters map directly to the sector numbers from fdisk. Skip 53248 sectors to reach the partition start. Read 65536 sectors (the partition size). Done in 0.19 seconds.
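To convince yourself that sector-based and byte-based addressing reach the same data, here is a small sketch on a throwaway image. The file names (toy.img and friends) and the marker text are made up for the demonstration; nothing here touches the challenge file.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Build a 16-sector synthetic image (8 KiB of zeros).
dd if=/dev/zero of=toy.img bs=512 count=16 2>/dev/null

# Plant a marker at the start of sector 5 without truncating the image.
printf 'SECTOR5' | dd of=toy.img bs=512 seek=5 conv=notrunc 2>/dev/null

# Sector-based extraction: skip 5 sectors, read 2 sectors.
dd if=toy.img of=by_sector.bin bs=512 skip=5 count=2 2>/dev/null

# Byte-based extraction of the same region: 5*512 = 2560, 2*512 = 1024.
dd if=toy.img of=by_byte.bin bs=1 skip=2560 count=1024 2>/dev/null

# cmp exits 0 when the two files are identical.
cmp by_sector.bin by_byte.bin && echo "identical"

# The first 7 bytes of the extracted region are the planted marker.
head -c 7 by_sector.bin; echo
```

Note that skip= applies to the input and seek= applies to the output; mixing them up is an easy mistake when planting or extracting data.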

$ file partition2.img
partition2.img: DOS/MBR boot sector, OEM-ID "mkfs.fat", FAT (16 bit)

$ strings partition2.img | grep -i "picoctf"
picoCTF{4_P4Rt_1t_i5_a93c3ba0}

The flag was embedded in the raw partition content: not inside a named file, but sitting in the filesystem’s data sectors. strings on the raw partition image found it immediately, whereas my earlier strings run on the full disko-2.dd had returned nothing; isolating the partition took the MBR and the other partition out of the search. (file reports FAT16 here even though the partition table says FAT32. The type byte in the MBR is only a label; the filesystem’s own boot sector is what counts.)


dd syntax: the four parameters that actually matter

dd if=<input> of=<output> bs=<block_size> skip=<n> count=<n>
  • if= — input file (disk image, raw file, /dev/sda)
  • of= — output file. Never use the input filename — accidental overwrite destroys the evidence.
  • bs= — block size. Use bs=512 when working from fdisk sector numbers. Use bs=1 for byte-precise extraction from binwalk offsets.
  • skip= — blocks to skip before reading. With bs=512: sector number. With bs=1: byte offset.
  • count= — blocks to read. With bs=512: sector count. With bs=1: byte count.

The conv=notrunc option matters for a specific case: repairing a corrupted file header without destroying the rest of the file. Without notrunc, dd truncates the output file when it opens it, so everything beyond the bytes you write is lost.

# Repair corrupted PNG header (first 8 bytes) without destroying the file
$ printf '\x89\x50\x4e\x47\x0d\x0a\x1a\x0a' | dd of=broken.png bs=1 count=8 conv=notrunc
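The difference is easiest to see side by side. The sketch below builds a fake “PNG” (eight wrong bytes plus a short text payload, purely hypothetical), then writes the real signature once with conv=notrunc and once without. It uses octal escapes because \x escapes are not guaranteed in every shell’s printf.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Fake file: 8 wrong header bytes followed by a short payload.
printf 'XXXXXXXXpayload-bytes' > broken.png
cp broken.png truncated.png
size_before=$(wc -c < broken.png)

# PNG signature 89 50 4e 47 0d 0a 1a 0a, written with portable octal escapes.
# With conv=notrunc only the first 8 bytes change; the payload survives.
printf '\211PNG\r\n\032\n' | dd of=broken.png bs=1 count=8 conv=notrunc 2>/dev/null

# Without it, dd truncates the output first, so only the 8 new bytes remain.
printf '\211PNG\r\n\032\n' | dd of=truncated.png bs=1 count=8 2>/dev/null

size_repaired=$(wc -c < broken.png)
size_truncated=$(wc -c < truncated.png)
echo "before=$size_before repaired=$size_repaired truncated=$size_truncated"

# Show the repaired header as hex bytes.
head -c 8 broken.png | od -An -tx1
```

Working on a copy, as done here with truncated.png, is a cheap habit that keeps the original file intact if a flag is forgotten.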

Six CTF patterns where dd is the right call

Pattern 1: Partition extraction from a disk image

As in the “Disk, disk, sleuth! II” challenge above. Run fdisk -l to get sector numbers, then use bs=512 with skip and count matching the partition start and size.

Pattern 2: Embedded file at a known byte offset

binwalk gives byte offsets directly. When binwalk -e auto-extraction fails or extracts the wrong thing, use bs=1 for byte-precise extraction:

$ binwalk mystery.img
1048576    0x100000    PNG image, 640 x 480, 8-bit/color RGB
1183744    0x121000    End of Zip archive

# count = end_offset - start_offset = 1183744 - 1048576 = 135168
$ dd if=mystery.img of=extracted.png bs=1 skip=1048576 count=135168
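One caveat with bs=1 is speed: dd issues one read and one write per byte, which gets slow on multi-megabyte ranges. If your dd is the GNU coreutils version (an assumption; BSD and BusyBox builds may not accept these flags), iflag=skip_bytes,count_bytes keeps a large block size while still addressing in bytes. A sketch on a synthetic file with made-up offsets:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Synthetic container: 1000 bytes of padding, a 26-byte "embedded object",
# then 500 more bytes of padding.
head -c 1000 /dev/zero > mystery.bin
printf 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' >> mystery.bin
head -c 500 /dev/zero >> mystery.bin

start=1000                 # offset where the object begins
end=1026                   # offset where the next object begins
count=$((end - start))     # bytes to extract

# Byte-at-a-time version: correct, but slow on large ranges.
dd if=mystery.bin of=slow.bin bs=1 skip="$start" count="$count" 2>/dev/null

# GNU dd only: large block size, with skip and count interpreted as bytes.
dd if=mystery.bin of=fast.bin bs=64K iflag=skip_bytes,count_bytes \
   skip="$start" count="$count" 2>/dev/null

# Both methods should produce the same 26 bytes.
cmp slow.bin fast.bin && cat fast.bin; echo
```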

Pattern 3: Flag hidden after a marker string

$ grep -boa "END_HEADER" data.bin
204800:END_HEADER

# Add the marker length (10 bytes) to get past it
$ dd if=data.bin of=after_marker.bin bs=1 skip=204810
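The offset arithmetic can be scripted so the marker length is never counted by hand. A minimal sketch on a synthetic file; the payload text and padding size are invented for the example, and it relies on grep printing the offset of the match itself when -b and -o are combined, as GNU grep does.

```shell
workdir=$(mktemp -d)
cd "$workdir"

marker="END_HEADER"

# Synthetic file: 300 bytes of padding, the marker, then the hidden payload.
head -c 300 /dev/zero > data.bin
printf '%s' "$marker" >> data.bin
printf 'secret-after-marker' >> data.bin

# -b prints the byte offset, -o only the match, -a treats binary data as text.
# Keep the first hit and strip everything after the colon.
offset=$(grep -boa "$marker" data.bin | head -n 1 | cut -d: -f1)

# Skip past the marker itself; ${#marker} is its length (10 here).
skip=$((offset + ${#marker}))

dd if=data.bin of=after_marker.bin bs=1 skip="$skip" 2>/dev/null
cat after_marker.bin; echo
```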

Pattern 4: Corrupted file header repair

Write correct magic bytes to the start of a file without touching the rest. conv=notrunc is mandatory here.

Pattern 5: Splitting multiple embedded objects

Extract each binwalk-identified file independently using calculated start and count values. binwalk’s auto-extraction sometimes merges adjacent objects incorrectly; manual dd gives you exact boundaries.
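When there are several objects, the start of each one doubles as the end of the previous one, so a short loop can compute every count. This is a sketch under stated assumptions: the blob and the offsets list are hypothetical stand-ins for what a scanner such as binwalk would report.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Synthetic blob holding three back-to-back "objects" of 4, 6 and 5 bytes.
printf 'AAAABBBBBBCCCCC' > blob.bin
total=$(wc -c < blob.bin)

# Start offsets of each object, in ascending order (hypothetical values).
offsets="0 4 10"

# Walk the offsets; each object ends where the next begins, or at end of file.
index=0
set -- $offsets
while [ "$#" -gt 0 ]; do
    start=$1
    shift
    end=${1:-$total}
    dd if=blob.bin of="object_$index.bin" bs=1 \
       skip="$start" count=$((end - start)) 2>/dev/null
    index=$((index + 1))
done

cat object_0.bin; echo
cat object_1.bin; echo
cat object_2.bin; echo
```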

Pattern 6: Disk imaging for offline analysis

# Create a forensic copy of a device for offline analysis
$ dd if=/dev/sdb of=evidence.img bs=4M status=progress
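After imaging, it is common practice to hash both the source and the copy to confirm they match. The sketch below uses a regular file of random data as a stand-in for the device (source.raw is a made-up name) and assumes sha256sum from GNU coreutils is available.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for a device: a 1 MiB file of random data.
dd if=/dev/urandom of=source.raw bs=1024 count=1024 2>/dev/null

# Image it the same way you would image a real device.
dd if=source.raw of=evidence.img bs=4M 2>/dev/null

# Hash both sides; matching digests mean the copy is bit-for-bit identical.
source_hash=$(sha256sum source.raw | cut -d' ' -f1)
image_hash=$(sha256sum evidence.img | cut -d' ' -f1)

if [ "$source_hash" = "$image_hash" ]; then
    echo "image verified: $image_hash"
else
    echo "hash mismatch" >&2
fi
```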

Full trial process: what I actually tried on disko-2.dd

| Step | Command | Result | Time spent | Why it failed / succeeded |
|---|---|---|---|---|
| 1 | file disko-2.dd | DOS/MBR, 2 partitions visible | 10s | Gave sector numbers; I didn’t use them yet |
| 2 | strings disko-2.dd \| grep picoctf | No output | 30s | Full disk has too much noise; flag buried in one partition |
| 3 | binwalk -e disko-2.dd | Extracted filesystem metadata, not flag | 15 min | Auto-extraction works at filesystem layer; raw embedded content missed |
| 4 | Hex editor (xxd disko-2.dd \| head -200) | MBR boot sector data | 20 min | Scrolling through 100MB manually is not a strategy |
| 5 | Autopsy (GUI) | No flag found in file browser | 25 min | Flag wasn’t a named file; it was scattered in raw data sectors |
| 6 | foremost disko-2.dd | False positives from metadata | 15 min | File carving found fragments, not the target |
| 7 | fdisk -l disko-2.dd | Partition 2: sector 53248, size 65536 | 10s | Breakthrough: sector numbers map directly to dd parameters |
| 8 | dd bs=512 skip=53248 count=65536 | 34 MB extracted in 0.19s | 30s | Isolated partition 2 completely |
| 9 | strings partition2.img \| grep picoctf | picoCTF{4_P4Rt_1t_i5_a93c3ba0} | 5s | Flag found immediately; noise eliminated by isolating one partition |

Total time: steps 3–6 account for about 75 minutes of the roughly two hours I lost, while steps 7–9 took forty-five seconds. The ratio inverted completely once I understood that fdisk sector numbers are dd skip values.


dd vs other extraction tools: how I actually decide

| Situation | First choice | Why not dd? |
|---|---|---|
| Known offset from fdisk or binwalk | dd | n/a |
| Need to find offsets first | binwalk → then dd | dd requires prior knowledge of where to start |
| Carving many unknown files automatically | foremost or PhotoRec | Manual offset calculation per file is too slow |
| Full filesystem investigation (file names, timestamps) | Autopsy / Sleuth Kit | dd gives raw bytes; Autopsy parses the filesystem structure |
| Quick first pass to find what’s inside | binwalk -e | Auto-extract is faster for initial recon; switch to dd when precision matters |
| Disk imaging for offline analysis | dd | n/a |

The pattern that matters for CTF: binwalk and fdisk tell you where something is. dd gets it out. They’re a matched pair. Running dd blind — without an offset from one of these tools — is guessing.


Why byte-level access matters beyond CTF

The flag in “Disk, disk, sleuth! II” wasn’t stored as a file. It was embedded in raw data sectors that the FAT filesystem had marked as free space. Filesystem tools (Autopsy, any file browser) only show you what the filesystem knows about. Data in unallocated sectors, slack space, or deliberately off-partition regions is invisible to them.

This is the same reason that digital forensics investigations use raw disk images rather than filesystem copies: evidence can be hidden in places that filesystem tools skip. dd reads everything — it doesn’t know about filesystems, so it doesn’t skip anything. That’s its advantage and why it stays relevant alongside modern GUI forensics tools.


My current workflow for disk image challenges

# Step 1: What are we dealing with?
file target.img
strings target.img | grep -i "flag\|ctf\|pico"

# Step 2: Get the structure
fdisk -l target.img        # for disk images with partitions
binwalk target.img         # for images with embedded files

# Step 3: Extract with dd (sector-based for partitions)
dd if=target.img of=partition.img bs=512 skip=<start_sector> count=<sector_count>

# Step 4: Extract with dd (byte-based for embedded files)
dd if=target.img of=extracted bs=1 skip=<byte_offset> count=<byte_count>

# Step 5: Check the extracted content
file extracted
strings extracted | grep -i "flag\|ctf\|pico"
binwalk extracted          # check for further nesting
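
Steps 3 and 5 can be folded into one pipeline when you only want to know which partition holds the flag. The helper below is a hypothetical sketch, not part of any tool: search_region streams a sector range out of the image and prints any printable text starting with the given pattern. It uses grep in place of strings so it runs where binutils is missing, and it is exercised on a toy image with an invented flag and layout.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Toy image: 32 zeroed sectors with a fake flag planted in sector 20.
dd if=/dev/zero of=target.img bs=512 count=32 2>/dev/null
printf 'picoCTF{example_only}' | dd of=target.img bs=512 seek=20 conv=notrunc 2>/dev/null

# search_region IMAGE START_SECTOR SECTOR_COUNT PATTERN
# Streams the region to stdout and prints the pattern plus any printable
# ASCII characters that follow it (case-insensitive).
search_region() {
    dd if="$1" bs=512 skip="$2" count="$3" 2>/dev/null |
        LC_ALL=C grep -a -o -i "$4[ -~]*"
}

# Hypothetical layout: "partition 1" = sectors 0-15, "partition 2" = sectors 16-31.
first=$(search_region target.img 0 16 "pico" || true)
second=$(search_region target.img 16 16 "pico" || true)

echo "partition 1: ${first:-nothing}"
echo "partition 2: ${second:-nothing}"
```

For real challenges I would still write the partition to a file first, since the extracted image is usually needed again for file, binwalk, or mounting.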

Further Reading

For a broader overview of forensics tools and when to use each one, CTF Forensics Tools: The Ultimate Guide for Beginners covers dd alongside the full toolkit used in disk forensics challenges.

The “Disk, disk, sleuth! II” challenge used fdisk to reveal the partition layout. The fdisk in CTF guide covers how to read partition tables and calculate sector-based extraction parameters in detail.

When binwalk identifies embedded files inside a disk image, the binwalk guide covers how to interpret the offset output and decide when to use auto-extraction versus manual dd extraction.

For the disko-1 challenge — where the flag was hidden in raw disk sectors and recovered with strings rather than dd — the Disk, disk, sleuth! writeup shows how the two approaches complement each other.
