CTF Forensics Tools: The Ultimate Guide for Beginners

When I first started picoCTF forensics challenges, I had a folder full of installed tools and no idea which one to open first. Every challenge felt like staring at a locked box with twenty keys on the table. The problem wasn’t a lack of tools; it was not knowing the decision process behind picking the right one. I’ve since worked through 20+ picoCTF forensics challenges, including disk images (DISKO 1, Disk, disk, sleuth!), audio (Morse Code), PNG steganography (RED, Matryoshka Doll, Scan Surprise), network captures (Ph4nt0m 1ntrud3r), and more. The same mistakes kept appearing at the start: wrong tool for the file type, skipping the metadata check, mounting a disk image before reading its partition table. This page maps the decision process I’ve built from those failures. It is not a feature list, but a guide to when to reach for each tool and when to put it down.

Step zero: identify what you’re dealing with

Before touching any specialized tool, run these two commands on every unknown file:

$ file challenge.bin
challenge.bin: Zip archive data, at least v2.0 to extract

$ xxd challenge.bin | head -5
00000000: 504b 0304 1400 0000 0800 ...  PK...........

file reads magic bytes and reports the actual format regardless of the file extension. I’ve lost count of how many times a file named data.png turned out to be a ZIP or a disk image: the magic bytes 50 4B 03 04 (PK) at the start of the hex output are an immediate giveaway for ZIP, whatever the filename says. If file says “data” or reports something unexpected, treat that as a first clue that the header may have been deliberately corrupted, and compare the first bytes against the signature of the format the challenge implies. See the Corrupted File writeup for a real example where a broken PNG magic byte was the entire challenge.
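As a rough sketch of what that check looks like when scripted, the function below compares a file’s first bytes against a few well-known signatures. The name sniff_magic and the short list of formats are my own choices for illustration; file knows far more signatures than this.

```shell
# sniff_magic FILE: print a short format name based on the file's first bytes.
# Hypothetical helper for illustration; only four signatures are covered.
sniff_magic() {
  # First 8 bytes as lowercase hex with no separators.
  sig=$(od -An -tx1 -N8 "$1" | tr -d ' \t\n')
  case "$sig" in
    89504e470d0a1a0a) echo "png" ;;
    504b0304*)        echo "zip" ;;
    ffd8ff*)          echo "jpeg" ;;
    1f8b*)            echo "gzip" ;;
    *)                echo "unknown ($sig)" ;;
  esac
}
```

Run against a ZIP that has been renamed to data.png, this prints zip. A PNG with one damaged signature byte falls through to unknown, along with the hex that was actually found, which is usually enough to spot the byte that needs repairing.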

Two commands to always run first: strings and binwalk

Regardless of file type, I run these two passes before reaching for anything else:

# Pass 1: look for plaintext flag
strings challenge.* | grep -i "picoctf\|flag{"

# Pass 2: look for embedded files
binwalk challenge.*

strings sounds too simple to be useful — but I’ve found flags in plaintext embedded in binary files more than once. The strings in CTF article covers a picoCTF challenge where the flag was sitting unencoded inside a compiled binary, extractable in under 10 seconds. Never skip the strings pass before reaching for a specialized tool.

binwalk scans for signatures of embedded files — ZIP archives, gzip streams, file systems, certificate data — appended or injected into the challenge file. The binwalk in CTF article covers the Matryoshka Doll challenge (4 nested ZIPs inside a PNG) and one important caveat: binwalk sometimes reports “TIFF image data” inside PNG files as a false positive. That hit me on a challenge where every layer of a multi-file PNG triggered a fake TIFF detection — it’s an ICC color profile at a specific offset that happens to match TIFF magic bytes, not a real embedded image.
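When binwalk’s automatic extraction misbehaves, the decimal offset it prints is enough to carve by hand. The sketch below finds ZIP local-file-header signatures and copies everything from a given offset into a new file. Both helper names are hypothetical, and the offset lookup assumes grep’s -b and -o options behave as they do in GNU grep.

```shell
# zip_offsets FILE: print the byte offset of each ZIP signature (PK\x03\x04).
# Assumes GNU grep semantics for -b with -o (offset of the match itself).
zip_offsets() {
  LC_ALL=C grep -a -b -o -F "$(printf 'PK\003\004')" "$1" | cut -d: -f1
}

# carve_from FILE OFFSET OUT: copy FILE from byte OFFSET (0-based) into OUT.
carve_from() {
  # tail -c +N counts from 1, hence the +1.
  tail -c +"$(( $2 + 1 ))" "$1" > "$3"
}
```

A typical use would be carve_from challenge.png "$(zip_offsets challenge.png | head -1)" inner.zip, then running file on inner.zip to confirm the carve landed on a real archive.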

By file type: which tool to reach for

Disk images (.img, .dd, raw)

Disk image challenges are where picking the wrong tool first wastes the most time. The order that works:

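fdisk: read the partition table first with fdisk -l. It tells you how many partitions exist, plus each one’s start sector and the sector size (usually 512 bytes).
dd: carve out individual partitions for closer inspection, using the start sector and sector count from fdisk.
mount: only after fdisk gives you the start sector. Use mount -o loop,offset=X, where X is the byte offset (start sector × sector size).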

The rabbit hole I fell into early on: jumping straight to mount without checking the partition table. Without an offset, mount reads from the very start of the image, so on a multi-partition image it either fails or leaves you looking at the wrong filesystem. In picoCTF “Disk, disk, sleuth! II,” the flag was in a later partition; mounting the first one showed an empty ext4 filesystem. fdisk showed two partitions; carving the second with dd and then mounting it took 35 seconds total. Trying to find the flag without fdisk had me confused for two hours.
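The offset arithmetic is the step that is easy to get wrong: fdisk -l reports start positions in sectors, while mount’s offset option wants bytes. A minimal sketch, with partition_byte_offset as a hypothetical helper and 206848 as a made-up start sector:

```shell
# partition_byte_offset START_SECTOR [SECTOR_SIZE]
# Convert a start sector from `fdisk -l` into the byte offset mount expects.
# SECTOR_SIZE defaults to 512; confirm it on the "Units" line of fdisk's output.
partition_byte_offset() {
  echo $(( $1 * ${2:-512} ))
}

# Example use (disk.img, /mnt/part2, and the sector numbers are placeholders):
#   fdisk -l disk.img
#   sudo mount -o loop,ro,offset="$(partition_byte_offset 206848)" disk.img /mnt/part2
# Or carve the partition out first, with bs matching the sector size:
#   dd if=disk.img of=part2.img bs=512 skip=206848 count=<sector count from fdisk>
```

Mounting read-only (ro) is a habit worth keeping so the evidence is not modified while you look around.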

Audio files (.wav, .mp3, .flac)

Audio forensics challenges almost always hide data in one of three places: the spectrogram, the waveform encoding (LSB or Morse), or the metadata. Start with the spectrogram:

Audacity — open the file and switch to spectrogram view immediately. If there’s a visual message hidden in the frequency domain, you’ll see it in seconds. If the spectrogram looks like noise, zoom into the 1–4 kHz range — flags are sometimes hidden in a narrow frequency band invisible at default scale. I used Audacity’s spectrogram to solve the picoCTF Morse Code challenge: the Morse dots and dashes appeared as vertical bars in the spectrogram long before I thought to listen to the audio.
SoX — when scripted analysis or speed/pitch manipulation is needed. The sox stat and soxi commands give you duration, sample rate, and frequency stats quickly. For DTMF/touch-tone challenges, SoX’s rough frequency output can identify the carrier tone.
FFmpeg — for video files, or when a file won’t open in Audacity due to codec issues. ffprobe reads stream metadata and codec details without decoding.
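Once the dots and dashes have been read off the spectrogram by eye, decoding them is mechanical. A minimal sketch, assuming you have typed the symbols out with spaces between letters and / between words (decode_morse is a hypothetical helper, not part of Audacity or SoX):

```shell
# decode_morse "SYMBOLS": letters separated by spaces, words by " / ".
# Hypothetical helper; covers A-Z and 0-9 only, anything else prints "?".
decode_morse() {
  printf '%s\n' "$1" | tr -s ' ' '\n' | while IFS= read -r sym; do
    case "$sym" in
      "") continue ;;
      /) c=" " ;;
      .-) c=A ;;   -...) c=B ;; -.-.) c=C ;; -..) c=D ;;  .) c=E ;;
      ..-.) c=F ;; --.) c=G ;;  ....) c=H ;; ..) c=I ;;   .---) c=J ;;
      -.-) c=K ;;  .-..) c=L ;; --) c=M ;;   -.) c=N ;;   ---) c=O ;;
      .--.) c=P ;; --.-) c=Q ;; .-.) c=R ;;  ...) c=S ;;  -) c=T ;;
      ..-) c=U ;;  ...-) c=V ;; .--) c=W ;;  -..-) c=X ;; -.--) c=Y ;;
      --..) c=Z ;;
      -----) c=0 ;; .----) c=1 ;; ..---) c=2 ;; ...--) c=3 ;; ....-) c=4 ;;
      .....) c=5 ;; -....) c=6 ;; --...) c=7 ;; ---..) c=8 ;; ----.) c=9 ;;
      *) c="?" ;;
    esac
    printf '%s' "$c"
  done
  printf '\n'
}
```

For example, decode_morse '.--. .. -.-. ---' prints PICO. Any symbol group that decodes to ? is worth re-reading on the spectrogram, since a misjudged dash length is the usual cause.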

Image files (.png, .jpg, .bmp)

Image steganography is one of the most common forensics categories. Always run exiftool before any steganography tool:

exiftool: metadata check on any image. The picoCTF RED challenge hid its entire hint (the acrostic “CHECKLSB”) inside a custom Poem metadata field. Running steghide and binwalk first wasted five minutes; exiftool in the first pass would have pointed directly at the LSB encoding technique.
pngcheck: run this on any PNG before using other tools. A PNG with a corrupted chunk can fail in binwalk and zsteg without a useful explanation; pngcheck tells you which chunk is malformed.
zsteg: for LSB steganography analysis in PNG and BMP. It tries every combination of bit planes, channels, and read order. The output is verbose, but the key entry is usually b1,rgba,lsb,xy; other entries are often false positives (OpenPGP key detections are a common artifact in single-color PNGs).
steghide: for JPEG and BMP files when the challenge hints at a passphrase. Important: steghide doesn’t work on PNG. Also, “wrong passphrase” and “no data embedded” produce the same error message (steghide: could not extract any data with that passphrase!), which is a common confusion point.
binwalk: when the image is suspiciously large. Scans for and extracts embedded ZIP archives, firmware, or compressed data appended after the image data ends.
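To make the pngcheck step less of a black box: a PNG is an 8-byte signature followed by chunks, each laid out as a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC. The sketch below walks that layout and flags a chunk whose declared length runs past the end of the file. It is a rough illustration with a hypothetical function name, it does not verify CRCs, and it is not a substitute for pngcheck.

```shell
# png_chunks FILE: print "offset type length" for each chunk in a PNG.
# Hypothetical helper; walks the chunk layout only and does not verify CRCs.
png_chunks() {
  size=$(( $(wc -c < "$1") ))
  off=8                                   # skip the 8-byte PNG signature
  while [ "$off" -lt "$size" ]; do
    # 4-byte big-endian length as 8 hex digits
    hex=$(dd if="$1" bs=1 skip="$off" count=4 2>/dev/null | od -An -tx1 | tr -d ' \t\n')
    [ "${#hex}" -eq 8 ] || { echo "$off truncated chunk header"; break; }
    len=$(( 0x$hex ))
    # 4-byte chunk type; anything that is not a letter is shown as "?"
    type=$(dd if="$1" bs=1 skip=$(( off + 4 )) count=4 2>/dev/null | tr -c 'A-Za-z' '?')
    if [ $(( off + 12 + len )) -gt "$size" ]; then
      echo "$off $type $len (length runs past end of file)"
      break
    fi
    echo "$off $type $len"
    off=$(( off + 12 + len ))             # 4 length + 4 type + data + 4 CRC
  done
}
```

On a healthy file the listing starts with IHDR and ends with IEND. A type full of ? characters or a length that overshoots the file points at the chunk to inspect in a hex editor.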

Network captures (.pcap, .pcapng)

For packet capture challenges, Wireshark and tshark are the core tools. The biggest pitfall is assuming that “Follow TCP Stream” gives you data in the right order — packets may arrive out of order intentionally. The Ph4nt0m 1ntrud3r writeup covers a challenge where Base64 flag fragments were sent in the correct timestamp order but captured in a different arrival order. Sorting by microsecond timestamp rather than arrival index was the entire solve.
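The reordering itself is short once the fragments are out of the capture. A sketch, assuming you have already exported one line per packet containing an epoch timestamp and that packet’s Base64 fragment. The two-column input format and the function name are my assumptions, each fragment is assumed to be independently Base64-encoded, and base64 -d is the GNU coreutils spelling of the decode flag.

```shell
# reassemble_by_time: read "TIMESTAMP FRAGMENT" lines on stdin, sort them by
# timestamp, and decode each Base64 fragment in that order.
# Hypothetical helper; the input format is an assumption, not tshark output.
reassemble_by_time() {
  LC_ALL=C sort -n -k1,1 | while read -r _ts frag; do
    printf '%s' "$frag" | base64 -d
  done
}
```

Feeding it the three lines "1700000000.000300 b2t9", "1700000000.000100 cGljbw==", and "1700000000.000200 Q1RGew==" yields picoCTF{ok}, even though the last fragment appears first in the input.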

Archives (.zip, .tar, .7z)

For password-protected ZIPs, check the encryption type first — this is the step most beginners skip:

$ 7z l -slt challenge.zip | grep Method
Method = ZipCrypto Deflate   ← legacy encryption; zip2john + hashcat gets through a wordlist quickly
Method = AES-256             ← slow to attack; look for the actual password first

ZipCrypto is the weak case: zip2john extracts a hash that hashcat or John the Ripper can test against a wordlist very quickly. AES-256 (WinZip AES) uses a deliberately slow key derivation, so wordlist attacks crawl by comparison, and in my experience the challenge provides the password in another form. I spent an hour on a wordlist attack against an AES-256 ZIP before checking the encryption method.
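That check is easy to fold into a script. A small sketch, assuming the 7z l -slt output format shown above; zip_encryption is a hypothetical helper and it only looks at the first Method line:

```shell
# zip_encryption: read `7z l -slt ARCHIVE` output on stdin and label the
# encryption named on the first "Method = " line. Hypothetical helper.
zip_encryption() {
  method=$(grep -m1 '^Method = ' | sed 's/^Method = //')
  case "$method" in
    *ZipCrypto*) echo "zipcrypto: try zip2john with a wordlist" ;;
    *AES*)       echo "aes: look for the password elsewhere first" ;;
    "")          echo "no Method line found" ;;
    *)           echo "other: $method" ;;
  esac
}

# Usage: 7z l -slt challenge.zip | zip_encryption
```

Archives can mix methods across entries, so if the first entry looks unencrypted it is still worth scanning the full listing by eye.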

QR codes and barcodes

zbarimg is the fastest CLI decoder for QR and barcodes. If it returns “0 barcodes detected,” two fixes cover 90% of failures:

# Fix 1: color inversion (QR on dark background)
convert flag.png -negate inverted.png && zbarimg inverted.png

# Fix 2: resolution too low (-filter is a setting, so it must come before -resize to take effect)
convert flag.png -filter point -resize 400% upscaled.png && zbarimg upscaled.png

My first-pass workflow

When I get a new forensics challenge, this is the sequence I actually follow:

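# 1. What is this file?
file challenge.*
# xxd takes one input file (a second argument is treated as the output file), so loop over matches
for f in challenge.*; do xxd "$f" | head -30; done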

# 2. Anything in plain text or embedded?
strings challenge.* | grep -i "picoctf\|flag{"
binwalk challenge.*

# 3. If image: metadata first
exiftool challenge.*

# 4. Branch by file type:
# → disk image:    fdisk → dd → mount
# → audio:         Audacity spectrogram → sox stat → ffmpeg
# → PNG:           pngcheck → zsteg → exiftool (if not done)
# → JPEG/BMP:      steghide → binwalk
# → ZIP:           7z l -slt (check method) → zip2john (fast on ZipCrypto; AES-256 is slow, look for the password first)
# → QR/barcode:    zbarimg → convert -negate if 0 results
# → pcap:          tshark with tcp.len>0 filter → sort by timestamp

Common rabbit holes in forensics CTF

These are the specific mistakes that cost me the most time across 20+ picoCTF forensics challenges:

Wrong file type assumption: the extension lies. Always check magic bytes with file and xxd before anything else.
steghide on PNG: among image formats, steghide supports only JPEG and BMP (plus WAV and AU audio). Running it on a PNG produces a format error and wastes time; reach for zsteg instead.
Mounting without reading partition offsets: mount with no offset parameter reads from the start of the image, not from the partition you care about. In multi-partition images, that’s often not where the flag is. Always run fdisk first to get the offsets.
AES-256 ZIP + wordlist attack: AES-256 encrypted ZIPs are far slower to attack than ZipCrypto, and a wordlist run is rarely the intended path. If the method is AES-256, look for the actual password first. Checking the encryption method first saves hours.
binwalk TIFF false positive in PNG: binwalk sometimes reports “TIFF image data” inside PNG files because an ICC color profile at a specific offset matches TIFF magic bytes. If the same offset appears repeatedly across multiple files, it’s a structural artifact, not real embedded data.
Packet arrival order vs. timestamp order: reading payloads in capture order, whether from the packet list or from “Follow TCP Stream,” assumes the capture preserved the sending order. When packets are intentionally recorded out of order (as in covert channel challenges), this produces garbled output. Sort by the packet’s own timestamp, not by capture position.
Spectrogram at wrong scale: if Audacity’s spectrogram looks like noise, zoom into the 1–4 kHz range. Flags are sometimes hidden in a narrow band invisible at default zoom.

Tool reference index

Tool	File type	When to use
file / xxd	Any	Always run first; confirm actual format from magic bytes
strings	Any binary	Run before any specialized tool; flags in plaintext are common
binwalk	Any binary	Detect and extract embedded files; watch for ICC profile false positives in PNG
exiftool	Any image/media	Metadata check before steganography tools; hints are sometimes in custom fields
fdisk	Disk image	Read partition table before anything else
dd	Disk image	Carve partitions using the offsets fdisk reports
Audacity	Audio	Spectrogram analysis; open first for any audio file
SoX	Audio	Scripted analysis, stat output, frequency identification
FFmpeg	Audio/Video	Format conversion, codec issues, video forensics
pngcheck	PNG	Validate chunk integrity before any other PNG tool
zsteg	PNG/BMP	LSB steganography analysis; check the b1,rgba,lsb,xy entry first
steghide	JPEG/BMP	Passphrase-protected embedded data; does not support PNG
zbarimg	QR/Barcode	Fastest CLI decoder; try color inversion if 0 results
zip2john	ZIP archive	Hash extraction for password cracking; fast on ZipCrypto, slow on AES-256, so check the method first with 7z
