📝 Challenge Overview
In this challenge you download a PDF named confidential.pdf and inspect it. The PDF text appears redacted (black bars), but the file’s metadata contains a Base64 string in the Author field. Decoding that Base64 reveals the hidden flag. This is a classic metadata-forensics task — always check file metadata, not just visible content.
📁 Step 1: Download and identify the file
- Download confidential.pdf to your working directory.
- Verify the file type with the file command:
$ file confidential.pdf
confidential.pdf: PDF document, version 1.7, 1 page(s)
📝 Explanation: file quickly tells you the file format so you know what tools to use next. Even if contents look redacted visually, metadata may still hold clues.
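If the file command is not available, the same basic check can be approximated in Python. This is a minimal sketch, not how file itself works: it only assumes that a PDF starts with the bytes %PDF- followed by a version number, and the helper name pdf_version is made up for illustration.

```python
from typing import Optional

PDF_MAGIC = b"%PDF-"


def pdf_version(header: bytes) -> Optional[str]:
    """Return the version from a PDF header, or None if the magic bytes are missing."""
    if not header.startswith(PDF_MAGIC):
        return None
    # The version follows the magic bytes on the first line, e.g. b"%PDF-1.7".
    first_line = header.splitlines()[0]
    return first_line[len(PDF_MAGIC):].decode("ascii", errors="replace").strip()


# In-memory demo with a hand-written header (not the real challenge file).
print(pdf_version(b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n"))
print(pdf_version(b"PK\x03\x04"))
```

To try it on the challenge file, read the first few bytes with open("confidential.pdf", "rb").read(16) and pass them to pdf_version.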
🔎 Step 2: Inspect PDF metadata with pdfinfo
- Run pdfinfo to list PDF metadata:
$ pdfinfo confidential.pdf
Author: cGljb0NURntwdXp6bDNkX20zdGFkYXRhX2YwdW5kIV9lZTQ1NDk1MH0=
Producer: PyPDF2
Pages: 1
Encrypted: no
File size: 182705 bytes
...
- Notice the Author: field contains a Base64-looking string (ending with = padding).
📝 Explanation: PDF metadata fields (Author, Title, Subject, Keywords, etc.) often contain remnants of text or intentionally hidden data. pdfinfo extracts these metadata values for inspection.
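To illustrate where that Author value lives, here is a rough Python sketch that searches raw PDF bytes for an /Author entry. It rests on assumptions that may not hold for every file: the Info dictionary is stored uncompressed, and the value is a literal string in parentheses with no nested or escaped parentheses. The name find_author and the sample bytes are hypothetical. For real work, pdfinfo or exiftool is the safer choice.

```python
import re
from typing import Optional

# Matches a literal-string entry such as: /Author (some text)
# Assumption: no nested or escaped parentheses inside the value.
AUTHOR_RE = re.compile(rb"/Author\s*\(([^()\\]*)\)")


def find_author(pdf_bytes: bytes) -> Optional[str]:
    """Return the first /Author literal string found in the raw bytes, if any."""
    match = AUTHOR_RE.search(pdf_bytes)
    if match is None:
        return None
    # latin-1 maps every byte to a character, so decoding cannot fail.
    return match.group(1).decode("latin-1")


# Hand-written sample object, not taken from the challenge file.
sample = b"1 0 obj\n<< /Producer (PyPDF2) /Author (aGVsbG8=) >>\nendobj\n"
print(find_author(sample))
```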
🧩 Step 3: Decode the Base64 string with Python
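- Create a small Python script decode.py:
import base64
# Base64 string copied from the Author field
cipher = "cGljb0NURntwdXp6bDNkX20zdGFkYXRhX2YwdW5kIV9lZTQ1NDk1MH0="
# b64decode returns bytes; decode() converts them to text
plain = base64.b64decode(cipher).decode()
print(plain)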
- Run it:
$ python3 decode.py
picoCTF{puzzl3d_m3tadata_f0und!_ee454950}
📝 Explanation: Base64 is a common encoding for embedding binary or text data inside text fields. A trailing = is padding and a strong hint that a string is Base64, although not every Base64 string ends with padding. Decoding turns the metadata value into the readable flag.
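When several metadata fields need checking, a small helper can test whether a value is plausibly Base64 before decoding it. The sketch below is a heuristic, and try_base64 is a made-up name: it accepts only the standard alphabet, a length that is a multiple of 4, and output that is valid UTF-8. Short ordinary words can occasionally pass such a check, so treat a match as a hint rather than proof.

```python
import base64
import binascii
import re
from typing import Optional

# Standard Base64 alphabet with up to two '=' padding characters at the end.
B64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def try_base64(value: str) -> Optional[str]:
    """Return the decoded text if value looks like padded Base64, else None."""
    if len(value) % 4 != 0 or not B64_RE.fullmatch(value):
        return None
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        raw = base64.b64decode(value, validate=True)
        return raw.decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None


# Field values as shown in the pdfinfo output above.
for field in ["PyPDF2", "cGljb0NURntwdXp6bDNkX20zdGFkYXRhX2YwdW5kIV9lZTQ1NDk1MH0="]:
    print(field, "->", try_base64(field))
```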
🏁 Capture the Flag
🎉 The decoded metadata contains the flag: picoCTF{puzzl3d_m3tadata_f0und!_ee454950}
📊 Summary
| Step | Command / Action | Purpose | Key Result |
|---|---|---|---|
| 1 | file confidential.pdf | Identify file type | PDF confirmed (v1.7, 1 page) |
| 2 | pdfinfo confidential.pdf | Extract PDF metadata | Found Base64 string in Author field |
| 3 | Python Base64 decode script | Decode metadata to plaintext | Flag revealed: picoCTF{...} |
💡 Beginner Tips
- 🔎 Always check file metadata (PDF, images, Office docs) — flags and hints are frequently hidden there.
- 🧰 Useful tools: file, pdfinfo, exiftool (for many file types), and CyberChef for quick Base64 decoding.
- 🧪 If Base64 decoding fails, ensure proper padding (=) or try tolerant decoders.
- 🧾 Prefer local decoding (Python, base64 -d) over unknown online services for safety.
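The padding tip can be sketched in a few lines of Python. This assumes the only problem is that trailing = characters were stripped. The function name decode_unpadded is hypothetical, and a string whose length leaves a remainder of 1 when divided by 4 is not valid Base64 and will still raise an error.

```python
import base64


def decode_unpadded(value: str) -> bytes:
    """Decode Base64 text whose trailing '=' padding may have been stripped."""
    value = value.strip()
    # Base64 text length must be a multiple of 4; restore the missing '='.
    missing = -len(value) % 4
    return base64.b64decode(value + "=" * missing)


print(decode_unpadded("aGVsbG8"))   # padding stripped
print(decode_unpadded("aGVsbG8="))  # padding intact
```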
🎓 What you learn (takeaways)
- File metadata can leak sensitive or intentional hidden information even when visible content is redacted.
- pdfinfo and similar tools are your first step in file forensics.
- Base64 is a simple but common encoding; recognizing its padding (=) helps you spot it quickly.
- Always inspect both visible content and metadata when analyzing challenge files.
⚡ Short explanations for commands / techniques used
- file <filename>
  - What: Reports a file's type and basic info.
  - Why: Confirms format and suggests next tools to use.
- pdfinfo <file>
  - What: Prints PDF metadata (Author, Title, Producer, pages, etc.).
  - Why: Reveals hidden or embedded info stored in metadata fields.
- Base64 detection / padding (= / ==)
  - What: Base64 encodings often end with = padding; missing padding can cause decode errors.
  - Why: Spotting = helps you identify Base64 at a glance.
- Python base64.b64decode()
  - What: Standard library function to decode Base64 strings to bytes/text.
  - Why: Quick and scriptable way to convert encoded metadata to plaintext.
