strings Command in CTF: Hidden Data Guide

In CTFs, the strings command is the first tool to reach for when you don’t know what’s inside a file. It runs in a few seconds and has saved me hours of digging with the wrong tools. It also has a deceptive side: output that looks like a flag isn’t always one, and a file where strings comes up empty may still contain a flag. This guide covers both sides, using real picoCTF challenges where strings either solved everything instantly or sent me down a dead end.

What This Article Covers

After reading this, you’ll know when to reach for strings, which flags to use depending on the file type, how to filter noise from a 10,000-line output, and — critically — when strings is pointing you at a fake flag. I’ll walk through two picoCTF challenges where strings played a central role: one where it cracked the problem in 30 seconds after an hour of wasted effort, and one where it surfaced the flag immediately but I kept going anyway to understand the actual vulnerability.

Introduction

The strings command extracts printable character sequences from any file (binaries, disk images, memory dumps, encrypted blobs) regardless of format. In CTF forensics and reverse engineering, it’s the fastest way to answer the question: “is there anything human-readable in here?” On most Linux distributions it is already installed as part of GNU binutils, so there is usually nothing to install or configure. Just point it at a file.

I first ran into strings during picoCTF’s DISKO 1 challenge. The problem gave me a raw disk image and no other hints. I spent close to an hour mounting it, running binwalk, trying Sleuth Kit — none of it worked because the flag wasn’t inside the filesystem at all. It was written directly to a raw disk sector. One command found it immediately:

$ strings disko-1.dd | grep "pico"
picoCTF{1t5_ju5t_4_5tr1n9_be6031da}

The challenge name was literally “DISKO 1” and the flag contained “5tr1n9” — both were telling me the answer from the start. I just wasn’t listening.

What Is strings? (And What It Isn’t)

strings scans a file byte by byte and prints every run of printable ASCII characters that is at least a minimum length (default: 4 characters). That’s it. It doesn’t parse file formats, doesn’t understand binary structure, doesn’t decompress anything. It just looks for readable text, wherever it might be hiding.

This simplicity is also its biggest limitation. strings cannot:

Find flags that are XOR-encoded or AES-encrypted
Extract embedded files (use binwalk for that)
Decode LSB steganography (use zsteg)
Parse Unicode unless you tell it to

The most common beginner mistake is running strings, seeing no flag, and concluding there’s nothing there. A blank output means “no readable ASCII sequences of 4+ characters” — not “no flag.” The flag might be encoded, compressed, or hidden in a format strings doesn’t cover.
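
The scanning rule described above can be approximated in a few lines of Python. This is a minimal sketch, not how GNU strings is actually implemented: the function name extract_strings and the sample bytes are made up for illustration, and "printable" is assumed to mean bytes 0x20 through 0x7e plus tab.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    # Assumption: "printable" = 0x20 (space) through 0x7e (~), plus tab.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hypothetical sample: binary junk around two readable runs and one short run.
sample = b"\x00\x01picoCTF{demo}\x00ab\x00JFIF\x00"
print(extract_strings(sample))             # "ab" is dropped: shorter than 4
print(extract_strings(sample, min_len=2))  # "ab" now appears
```

An empty list from a function like this only says that no printable run reached the minimum length, which is exactly why blank output is not proof that a file holds no flag.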

When to Use strings in CTF

Reach for strings first in these situations:

Unknown binary or file with no obvious format
Raw disk images (strings bypasses the filesystem entirely)
Memory dumps from forensics challenges
Executables in reverse engineering problems (look for embedded strings, build metadata, hard-coded keys)
Any file where you’re doing initial recon before committing to a heavier tool

In picoCTF’s Corrupted File challenge, the file opened with what looked like garbage bytes. Before reaching for a hex editor, I ran strings to check if there was anything readable left:

$ strings corrupted_file | head -20
JFIF
Exif
picoCTF{r3st0r1ng_th3_by73s_b67c1558}

The flag was right there in the first 20 lines. The JFIF and Exif markers also told me this was a JPEG with a corrupted header — information I used to fix the file properly afterward. Strings gave me a quick win and pointed me toward the actual problem in under 10 seconds.

Don’t use strings when:

You already know the file is encrypted (the output will be random garbage)
You’re dealing with LSB image steganography (no readable text is stored that way)
The challenge is pure cryptography or needs protocol-level network analysis. strings can still give a first look at a capture file, but it won’t reassemble streams or break a cipher

Basic Usage

Default scan with grep

strings challenge.bin | grep -i "ctf{"

Start here. The -i flag makes grep case-insensitive — useful when the flag format uses mixed case or the author got creative.

Scan for Unicode (Windows binaries)

strings -e l challenge.exe | grep -i "ctf{"

Windows executables store many strings as UTF-16 Little Endian, not ASCII. The default strings command skips these entirely. The -e l flag switches to UTF-16 LE mode. I learned this the hard way on a reverse engineering problem where strings found nothing, but strings -e l surfaced the flag in one line. If the challenge involves a Windows binary, always run both.

Show file offset

$ strings -t x challenge.bin | grep "pico"
 29800 picoCTF{1t5_ju5t_4_5tr1n9_be6031da}

The -t x flag prints the hex offset of each string in the file. Useful when you need to locate the flag’s position to understand how it was embedded — or when you need to patch bytes around it. In DISKO 1, this confirmed the flag was at raw offset 0x29800, well outside any filesystem structure.
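
To make the offset idea concrete, here is a rough Python sketch that records where each printable run starts and converts an offset into a sector number. The helper names, the sample image, and the 512-byte sector size are all assumptions for illustration; the article does not state the sector size of the DISKO 1 image.

```python
import re

SECTOR_SIZE = 512  # assumption: the article does not state the sector size

def strings_with_offsets(data: bytes, min_len: int = 4) -> list[tuple[int, str]]:
    """Return (byte offset, text) for each printable ASCII run."""
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]

def sector_of(offset: int, sector_size: int = SECTOR_SIZE) -> tuple[int, int]:
    """Split a byte offset into (sector number, offset inside that sector)."""
    return divmod(offset, sector_size)

# Hypothetical image: one empty sector, then a made-up flag, then padding.
image = bytes(0x200) + b"picoCTF{demo}" + bytes(16)
for offset, text in strings_with_offsets(image):
    print(f"{offset:7x} {text}")

print(sector_of(0x29800))  # (332, 0)
```

Under the 512-byte assumption, offset 0x29800 lands exactly on the first byte of sector 332, which fits the description of a flag written straight to a raw sector.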

Raise the minimum length to cut noise

strings -n 8 challenge.bin | grep -i "flag\|ctf\|key\|pass"

Default output from a compiled binary can run to tens of thousands of lines: compiler metadata, library names, stray byte fragments. Raising the minimum string length with -n 8 removes most of the noise. Be careful not to go too high: picoCTF flags tend to be long, so -n 8 is usually safe, but if you’re looking for short keys or passwords, stay at the default of 4.
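
The trade-off between a higher minimum length and short secrets can be sketched in Python. The function filter_strings and the candidate list are hypothetical; the keyword set simply mirrors the grep pattern shown above.

```python
import re

def filter_strings(strings, min_len=8, keywords=("flag", "ctf", "key", "pass")):
    """Keep strings that are long enough and mention one of the keywords."""
    wanted = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [s for s in strings if len(s) >= min_len and wanted.search(s)]

# Hypothetical strings output from a small binary.
candidates = ["GLIBC_2.2.5", ".text", "picoCTF{demo}", "key", "Enter password:"]
print(filter_strings(candidates))             # the bare "key" entry is lost
print(filter_strings(candidates, min_len=3))  # lowering the minimum brings it back
```

The first call drops the three-character "key" entry, which is the kind of short secret the higher minimum can hide.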

Common Mistakes and Rabbit Holes

Trusting a fake flag

In reverse engineering challenges, authors sometimes plant decoy flag strings in the binary. I submitted a strings result on a picoCTF reversing problem and got “Incorrect” — the binary had a hard-coded picoCTF{fake_flag_here} that was printed on the wrong password path. The real flag was assembled from parts at runtime.

How to check: look at the strings surrounding the flag candidate. If you see “Congratulations!” or “Access granted” nearby, it’s likely real. If it’s sandwiched between “Wrong password” or other error paths, treat it as a decoy and keep going.
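
One way to apply that check is to pull the neighbours of each flag candidate out of the strings output. The sketch below is only an illustration of the heuristic: the function names, the error-marker list, and the sample output are all invented.

```python
ERROR_MARKERS = ("wrong", "incorrect", "denied")  # hypothetical marker list

def with_context(strings, needle, radius=2):
    """Yield (candidate, neighbours) for each string containing needle."""
    for i, s in enumerate(strings):
        if needle in s:
            before = strings[max(0, i - radius):i]
            after = strings[i + 1:i + 1 + radius]
            yield s, before + after

def looks_like_decoy(neighbours):
    """True if any neighbouring string reads like an error path."""
    return any(m in n.lower() for n in neighbours for m in ERROR_MARKERS)

# Hypothetical strings output from a reversing challenge.
output = ["Enter password:", "Wrong password", "picoCTF{fake_flag_here}",
          "Try again", "GLIBC_2.2.5"]
for candidate, neighbours in with_context(output, "picoCTF{", radius=1):
    print(candidate, neighbours, looks_like_decoy(neighbours))
```

Adjacency in strings output reflects where the bytes sit in the file, not the program's control flow, so treat the result as a hint rather than proof.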

Missing Unicode strings

Forgetting -e l on Windows binaries is the most common reason strings comes up empty when there’s clearly something there. UTF-16 LE stores each ASCII character as two bytes: the character itself followed by a null byte. In ASCII mode, strings therefore sees single printable characters separated by nulls, so every run is one character long and falls below the 4-character minimum.
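
The byte layout is easy to see in a short Python sketch. Both scanners below are simplified stand-ins written for this example, not the real strings implementation, and the flag text is made up.

```python
import re

def ascii_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Runs of single-byte printable ASCII, similar to the default scan."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def utf16le_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Runs of printable ASCII characters that are each followed by a null byte."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]

text = "picoCTF{demo}"  # hypothetical flag
blob = b"\x01\x02" + text.encode("utf-16-le") + b"\xff\xff"
print(blob[2:10])          # b'p\x00i\x00c\x00o\x00'
print(ascii_runs(blob))    # []
print(utf16le_runs(blob))  # ['picoCTF{demo}']
```

The single-byte scan never sees more than one printable byte in a row, while the two-byte pattern recovers the whole string.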

Reaching for complex tools before running strings

On DISKO 1, I spent an hour with mount, binwalk, and Sleuth Kit before trying strings. The flag was sitting in a raw disk sector the whole time. Now my rule is: strings runs first, regardless of what the challenge looks like. It costs 3 seconds and has paid off enough times to make it non-negotiable.

Command and Option Reference

Command / Option	Purpose	When to Use
`strings file`	Default ASCII scan, minimum 4 chars	First recon on any unknown file
`strings -n 8 file`	Raise minimum length to 8	Large binaries with noisy output
`strings -e l file`	UTF-16 Little Endian mode	Windows executables (.exe, .dll)
`strings -t x file`	Show hex offset for each string	Locating flag position in raw binary
`strings file | grep -i "ctf{"`	Filter for flag format	When you know the flag prefix
`strings file | grep -i "flag\|key\|pass"`	Broad keyword filter	When flag format is unknown

Short Explanations for Key Options

-n (minimum length)

Controls how short a string can be and still appear in output. Default is 4. Compiled binaries contain thousands of 4-character sequences that are just compiler noise. Raising to 8 or 10 removes most of that clutter. Lower it back to 4 if you’re hunting for short keys or passwords.

-e (encoding)

Selects the character encoding to search for. The relevant options for CTF are -e s (default, single 7-bit characters, i.e. ASCII) and -e l (16-bit little-endian, which covers the UTF-16 LE text in Windows PE files). There’s also -e b for 16-bit big-endian and -e S for single 8-bit characters if you’re dealing with non-standard targets.

-t (offset format)

Prints each string with its byte offset in the file. Use -t x for hex offset (standard for binary analysis), -t d for decimal. The offset tells you where in the file the string physically lives — useful for understanding file structure or patching.

When strings Finds Nothing

Empty strings output doesn’t mean the file is empty. Work through this checklist:

Try -e l — if the target is a Windows binary, ASCII mode will miss everything
Lower -n to 3 — some flags or keys are shorter than 4 characters
Check for embedded files — run binwalk to see if there’s a ZIP or image buried inside
Check for encryption or encoding — XOR-obfuscated binaries and AES-encrypted blobs won’t have readable text; you need to reverse the encryption layer first
Try a hex dump — xxd file | head -40 shows the raw bytes and any near-ASCII patterns that strings might have missed
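
Parts of this checklist can be automated. The sketch below tries a plain ASCII search, then a 16-bit search, then a brute force over single-byte XOR keys. It rests on several assumptions: the flag shape in FLAG is hypothetical, the XOR step only handles a one-byte key, and nothing here addresses compression, embedded files, or real encryption.

```python
import re

# Hypothetical flag shape: a short prefix ending in "CTF", then {...}.
FLAG = re.compile(rb"[A-Za-z0-9_]{0,16}CTF\{[\x21-\x7c\x7e]{1,80}\}", re.IGNORECASE)

def find_flag(data: bytes):
    """Return (method, flag) for the first approach that works, else None."""
    # 1. Plain ASCII, as the default strings scan would see it.
    m = FLAG.search(data)
    if m:
        return "ascii", m.group().decode("ascii")
    # 2. 16-bit text: taking every second byte collapses "p\0i\0c\0o\0"
    #    back to "pico". Both alignments are tried; the skipped bytes are
    #    not verified to be nulls, so this can produce false positives.
    for start in (0, 1):
        m = FLAG.search(data[start::2])
        if m:
            return "16-bit", m.group().decode("ascii")
    # 3. Single-byte XOR: decode with every key and look for the flag shape.
    for key in range(1, 256):
        table = bytes(b ^ key for b in range(256))
        m = FLAG.search(data.translate(table))
        if m:
            return f"xor 0x{key:02x}", m.group().decode("ascii")
    return None

secret = b"picoCTF{demo}"  # made-up flag for the demonstration
print(find_flag(b"\x00" + secret + b"\x00"))
print(find_flag(b"\x01" + secret.decode().encode("utf-16-le")))
print(find_flag(bytes(b ^ 0x5A for b in secret)))
```

A None result still leaves the binwalk and hex dump steps from the list, since this sketch only covers the text-level cases.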

Beginner Tips

Make strings your first command on every unknown file, before anything else. It costs about 3 seconds and occasionally solves the challenge outright.
Always pipe through grep. Raw strings output from a compiled binary is unreadable without filtering.
On Windows challenge files, run both strings file and strings -e l file — skip one and you might miss the flag entirely.
If you get a hit that looks like a flag, check the surrounding output before submitting. Decoys exist.
The offset flag (-t x) is underused by beginners. Once you know where a string lives in a file, you can usually understand why it’s there.

What You Learn from This Tool

Using strings consistently teaches you to think about files as sequences of bytes rather than opaque objects. You start recognizing format markers near the start of a file: JFIF for JPEG shows up directly in strings output, while shorter signatures such as PK for ZIP and ELF for Linux binaries fall below the default 4-character minimum and only appear with a lower -n or in a hex dump. That pattern recognition carries over to every forensics and reversing challenge you encounter.
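
The same markers can be checked programmatically. This is a small sketch with a deliberately short signature table; the function names are invented, and the JFIF check assumes the usual layout where the identifier sits at byte offset 6 of a JFIF-style JPEG.

```python
# Signature bytes at the very start of each format.
MAGIC = {
    b"\xff\xd8\xff": "JPEG",
    b"PK\x03\x04": "ZIP",
    b"\x7fELF": "ELF",
}

def identify(data: bytes):
    """Return a format name if the file starts with a known signature."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return None

def has_jfif_marker(data: bytes) -> bool:
    """True if "JFIF" sits at offset 6, even when the first bytes are damaged."""
    return data[6:10] == b"JFIF"

intact = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
damaged = b"\x00\x00\x00\x00\x00\x10JFIF\x00"  # hypothetical corrupted header
print(identify(intact), has_jfif_marker(intact))
print(identify(damaged), has_jfif_marker(damaged))
```

The damaged sample fails the signature check but still carries the JFIF identifier, which mirrors how the readable marker in the Corrupted File challenge pointed at a JPEG with a broken header.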

The deeper lesson from DISKO 1 is about tool selection order. Complex tools aren’t always better — they’re just more work. Spending an hour with mount and Sleuth Kit before trying a one-liner is a category of mistake that CTF experience gradually eliminates. The habit strings builds is: “what’s the simplest thing I can try first?”

In real security work, strings is used for exactly the same purpose: rapid triage of malware samples, checking binary artifacts for embedded configuration or C2 addresses, and identifying file format anomalies in incident response. The skill is directly transferable.