pdfdumper in CTF: Extracting PDF Content and Common Challenge Patterns


What Is pdfdumper?

pdfdumper is a command-line tool designed to extract embedded objects and streams from PDF files. In CTF challenges, flags are often hidden in:

  • Embedded images
  • JavaScript scripts
  • Attachments
  • Encoded streams

pdfdumper automates the process of dumping these objects for further analysis.


Basic Usage

pdfdumper file.pdf

By default, pdfdumper will:

  • List all objects in the PDF
  • Extract streams (images, JavaScript, attachments) into files
  • Provide object numbers and types

Example Output

Object 5: /Type /Page
Object 12: /EmbeddedFile stream -> saved as obj12.bin
Object 15: /XObject /Image -> saved as obj15.png
Object 22: /JS stream -> saved as obj22.js
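
To make concrete what "extracting a stream" involves, here is a minimal Python sketch that pulls stream bodies out of the raw PDF bytes and inflates the ones compressed with FlateDecode (zlib). It is not pdfdumper's implementation; the function name extract_streams is hypothetical, and the sketch ignores other filters, object streams, and encryption.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def extract_streams(pdf_bytes: bytes) -> list[bytes]:
    """Return every stream body, zlib-inflated when possible, raw otherwise.

    Hypothetical helper for illustration only.
    """
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        body = match.group(1)
        try:
            results.append(zlib.decompress(body))
        except zlib.error:
            # Not Flate-compressed, or encoded with a filter this sketch ignores.
            results.append(body)
    return results
```

Calling extract_streams(open("challenge.pdf", "rb").read()) returns a list of byte strings that can be written to disk or searched directly.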

How pdfdumper Is Used in CTF Challenges

1. Extract Embedded Files

PDFs may contain attachments or hidden files.

pdfdumper challenge.pdf

Check for:

  • objXX.bin → could contain flag data
  • objXX.png → images with steganography
  • objXX.js → scripts that reveal or decode flags
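
A dumped .bin file is often a ZIP archive, an image, or even another PDF with the wrong extension. The sketch below guesses the type from well-known leading magic bytes; the table is deliberately short, and the name sniff_type is hypothetical.

```python
# Well-known file signatures (magic bytes) mapped to a short type name.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"PK\x03\x04": "zip",
    b"%PDF-": "pdf",
    b"\x1f\x8b": "gzip",
}


def sniff_type(data: bytes) -> str:
    """Return a type name based on the leading bytes, or "unknown"."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"
```

For example, sniff_type(open("obj12.bin", "rb").read()) suggests whether to unzip the file, open it as an image, or run it through the PDF workflow again.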

2. Analyze Embedded JavaScript

Some PDFs contain JavaScript that, when executed, generates the flag.
After extraction, read the script:

cat obj22.js

In the script:

  • Look for base64-encoded strings
  • Look for function calls that build or reveal secrets (for example String.fromCharCode, unescape, or eval)
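
Spotting base64 by eye is error-prone in long scripts. As a rough sketch, the Python below collects base64-looking runs from the extracted JavaScript and decodes the ones that are valid. The 16-character minimum and the name decode_base64_candidates are assumptions chosen for illustration; short or split strings will be missed.

```python
import base64
import binascii
import re

# Runs of base64 alphabet characters, optionally padded with "=".
B64_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decode_base64_candidates(js_source: str) -> list[bytes]:
    """Decode every substring that looks like valid, padded base64."""
    decoded = []
    for candidate in B64_RE.findall(js_source):
        if len(candidate) % 4 != 0:
            continue  # valid padded base64 is always a multiple of 4 long
        try:
            decoded.append(base64.b64decode(candidate, validate=True))
        except (binascii.Error, ValueError):
            continue
    return decoded
```

Running decode_base64_candidates(open("obj22.js").read()) gives a list of decoded byte strings to check for flag text or a further layer of encoding.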

3. Image Analysis

PDFs often hide flags in images (image XObjects). Dumped images can be analyzed with:

  • zsteg for LSB steganography in PNG and BMP files
  • stegsolve for bit-plane analysis
  • exiftool for metadata
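
Before reaching for stego tools, it is worth checking the PNG container itself: text chunks and data appended after the IEND chunk are common hiding places. The following sketch walks the chunk list using the standard PNG layout (4-byte length, 4-byte type, payload, 4-byte CRC). The name inspect_png is hypothetical, and CRCs are not verified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def inspect_png(data: bytes) -> tuple[list[tuple[str, bytes]], bytes]:
    """Return ([(chunk_type, payload), ...], bytes_after_IEND)."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = []
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk starts with a big-endian length and a 4-byte type.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        chunks.append((ctype.decode("latin-1"), payload))
        pos += 12 + length  # length + type + payload + CRC
        if ctype == b"IEND":
            break
    return chunks, data[pos:]
```

Chunks of type tEXt, zTXt, or iTXt, and any non-empty trailing bytes, deserve a closer look.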

4. Stream Inspection

Streams can contain:

  • Base64-encoded flags
  • Hidden text
  • Hex-encoded messages

Use:

xxd objXX.bin | less
strings objXX.bin | grep -i flag
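
When strings turns up long runs of hex digits rather than readable text, a second decoding pass is needed. The sketch below mimics strings for printable ASCII and then attempts a hex decode on each result. Both function names are hypothetical, and the 4-character minimum mirrors the usual strings default.

```python
import re
from typing import Optional


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len characters long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def try_hex(text: str) -> Optional[bytes]:
    """Decode text as hex, ignoring whitespace; return None if it is not hex."""
    cleaned = "".join(text.split())
    if len(cleaned) < 2 or len(cleaned) % 2:
        return None
    try:
        return bytes.fromhex(cleaned)
    except ValueError:
        return None
```

A typical use is to loop over printable_strings(data) and print any entry for which try_hex returns something readable.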

Common Patterns in CTF PDF Challenges

Pattern | Description | How pdfdumper Helps
Flag in embedded file | Flag is in an attachment | Extract attachments automatically
Flag in image | Image inside PDF contains hidden data | Dump images and analyze with stego tools
Flag in JavaScript | PDF script generates or encodes flag | Extract and inspect JS streams
Base64/Hex-encoded streams | Streams contain hidden data | Extract streams and decode
Multi-layer hiding | Combination of objects, images, and streams | Dump everything for inspection

Recommended Workflow in CTF

  1. Dump all PDF objects and streams
     pdfdumper challenge.pdf
  2. Check for embedded files
     • Images → objXX.png
     • Attachments → objXX.bin
  3. Analyze JavaScript streams
     • Decode base64 or other encodings
  4. Inspect images for hidden data
     • Steganography or metadata
  5. Decode any extracted binary streams
     • strings, xxd, binwalk, zsteg
  6. Combine clues
     • Sometimes the flag requires multiple objects to reconstruct
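
As a quick first pass over everything that was dumped, a short script can search each file for flag-shaped text. This is only a sketch: the flag{...} format is an assumption (many CTFs use their own prefix), the name scan_dump_dir is hypothetical, and only flags stored in plain form are found, so encoded or split flags still need the manual steps above.

```python
import re
from pathlib import Path

# Assumed flag format: "flag{...}" in any letter case. Adjust per competition.
FLAG_RE = re.compile(rb"(?i)flag\{[^}]{1,100}\}")


def scan_dump_dir(directory: str) -> list[tuple[str, bytes]]:
    """Return (file_name, match) pairs for every flag-like hit in a directory."""
    hits = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        for match in FLAG_RE.finditer(path.read_bytes()):
            hits.append((path.name, match.group(0)))
    return hits
```

Calling scan_dump_dir(".") in the directory that holds the dumped obj files lists each hit together with the file it came from.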

pdfdumper is a powerful first step for analyzing PDF challenges in CTFs, especially when flags are hidden in embedded streams or files.
