Analyzing ZIP Encryption: When to Act

zip2john is the right tool in a CTF only when you’re looking at the right kind of encryption. I learned this the hard way at TJCTF 2024, where I spent thirty minutes running wordlists against a ZipCrypto archive before realizing that the password wasn’t going to fall to a dictionary attack and that a completely different approach was needed. The encryption type tells you which weapon to reach for. Getting that wrong wastes your time on tools that are guaranteed to fail.


Two types of ZIP encryption, two completely different attack paths

Most CTF ZIP challenges use one of two encryption schemes, and confusing them is the single most common mistake:

| Encryption | Indicator in zipinfo | Attack with zip2john? | Typical CTF approach |
| --- | --- | --- | --- |
| ZipCrypto (legacy) | minimum version: 2.0 | Yes, generates crackable hash | zip2john → john/hashcat + wordlist |
| AES-256 (WinZip) | minimum version: 5.1, or 0x33 in local header | Yes, but hash type differs | zip2john → hashcat mode 13600 |

The command to tell them apart:

$ zipinfo -v secret.zip | grep -E "version|method|security"
  version of encoding software:                   3.0
  minimum software version required to extract:   2.0
  compression method:                             deflated
  file security status:                           encrypted

Minimum version 2.0 with “encrypted” status means ZipCrypto. If you see 5.1 or higher, it’s AES. Run this before touching any cracking tool.
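
The same check can be scripted when zipinfo isn’t at hand. This is a minimal sketch using Python’s standard zipfile module; the names classify and scan are made up for this example. It relies on three facts from the ZIP format: bit 0 of the general-purpose flags marks an entry as encrypted, bit 6 marks PKWARE strong encryption, and WinZip AES entries store 99 in the compression-method field. I’m assuming zipfile can list the central directory of an AES archive without the password, which it should, since it only rejects unsupported methods when you try to read the data.

```python
import zipfile

AES_METHOD = 99  # WinZip AES marker stored in the compression-method field


def classify(flag_bits: int, compress_type: int) -> str:
    """Map raw header fields of one entry to an encryption family."""
    if not flag_bits & 0x1:          # bit 0 clear: entry is not encrypted
        return "none"
    if compress_type == AES_METHOD:  # method 99: WinZip AES wrapper
        return "winzip-aes"
    if flag_bits & 0x40:             # bit 6: PKWARE strong encryption
        return "strong-encryption"
    return "zipcrypto"               # encrypted, nothing else set: legacy cipher


def scan(source) -> dict:
    """Return {entry name: encryption family} for a path or file-like object."""
    with zipfile.ZipFile(source) as zf:
        return {info.filename: classify(info.flag_bits, info.compress_type)
                for info in zf.infolist()}
```

Calling scan("secret.zip") returns one label per entry, which is worth having because a single archive can mix encrypted and unencrypted files.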


TJCTF 2024 “nothing-to-see”: the challenge that broke my wordlist habit

The challenge gave a PNG file. Running binwalk and then foremost on it extracted a ZIP archive buried inside the image data:

$ foremost nothing.png
Foremost version 1.5.7 by Jesse Kornblum, Kris Kendall, and Nick Mikus

File: nothing.png  Length: 14 KB (14979 bytes)

Num    Name (bs=512)          Size     File Offset    Comment
0:     00000028.zip            228 B      14751
1:     00000000.png             14 KB         0       (560 x 450)

2 FILES EXTRACTED  zip:= 1  png:= 1
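
What the carving step does can be approximated in a few lines. This is a rough sketch, not how foremost or binwalk are implemented: it assumes the ZIP was simply appended after the PNG’s IEND chunk and that everything from the first local-file header onward belongs to the archive. The function name carve_zip is hypothetical.

```python
PNG_IEND = b"IEND\xae\x42\x60\x82"   # IEND chunk type followed by its fixed CRC
ZIP_LOCAL_HEADER = b"PK\x03\x04"     # signature that starts every local file header


def carve_zip(blob: bytes) -> bytes:
    """Return the bytes from the first ZIP local header found after the PNG ends."""
    end_of_png = blob.find(PNG_IEND)
    # If there is no IEND marker, fall back to searching from the start.
    start = blob.find(ZIP_LOCAL_HEADER, max(end_of_png, 0))
    if start == -1:
        raise ValueError("no ZIP local-file header found")
    return blob[start:]
```

Reading nothing.png in binary mode, passing the bytes to carve_zip, and writing the result to disk should give the same archive the tools extracted, provided the appended-data assumption holds.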

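The extracted ZIP: 228 bytes total, containing a single flag.txt (39 bytes uncompressed). Encrypted with ZipCrypto. The 399F.zip used from here on is the same archive as carved by binwalk, which names extracted files after their hex offset (0x399F = 14751, the offset foremost reports above).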

$ zipinfo 399F.zip
Archive:  399F.zip  Zip file size: 228 bytes, number of entries: 1
-rw-r--r--  3.0 unx       39 TX defN 23-May-26 09:34 flag.txt
1 file, 39 bytes uncompressed, 34 bytes compressed: 12.8%

ZipCrypto means zip2john should work. I extracted the hash and ran it against rockyou.txt:

$ zip2john 399F.zip > flag.hash
$ john --wordlist=/usr/share/wordlists/rockyou.txt flag.hash
Using default input encoding: UTF-8
Loaded 1 password hash (PKZIP [32/64])
Press 'q' or Ctrl-C to abort, almost any other key for status
0g 0:00:02:43 DONE (2024-05-26 21:18) 0g/s 4849Kp/s 4849Kc/s 4849KC/s
Session completed. (0 passwords cracked)

Nothing. rockyou.txt failed completely. I ran --rules=Single and --rules=Jumbo. Still nothing. The password wasn’t in any standard wordlist.

This is the rabbit hole that wordlist-focused guides don’t prepare you for: ZipCrypto is crackable in principle, but a dictionary attack only works if the password is weak enough to appear in a wordlist. If the challenge author chose a random password, a dictionary attack is a dead end, and that’s exactly what happened here.

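The correct path for this challenge was bkcrack, a known-plaintext attack tool. ZipCrypto has a cryptographic weakness: if you know at least 12 bytes of an entry’s plaintext (bkcrack needs at least 8 of them to be contiguous), you can recover the internal encryption keys regardless of password complexity. A flag.txt file in a CTF ZIP almost certainly starts with a known prefix like tjctf{, but that prefix is only 6 bytes, so on its own it falls short and has to be combined with other bytes you can predict. Because this entry is deflated, the known plaintext also has to match the compressed stream rather than the raw text.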

$ bkcrack -C 399F.zip -c flag.txt -p known_plaintext.txt
bkcrack 1.5.0 - 2022-07-07
[...]
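# If keys are found, decipher the entry with them.
# For a deflated entry, the file written by -d is still compressed and must be inflated afterwards.
$ bkcrack -C 399F.zip -c flag.txt -k <key1> <key2> <key3> -d flag.txt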

The known-plaintext attack works at the cryptographic level — it bypasses the password entirely. The TJCTF challenge was designed around this technique, not zip2john + wordlist. I recognized this too late, after wasting thirty minutes on dictionary attacks that were never going to succeed.
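
One detail that is easy to miss: the zipinfo listing above shows the entry as defN (deflated), and bkcrack’s -d output for a compressed entry is the raw deflate stream, not the text. The bkcrack repository ships a helper script for inflating it; the sketch below does the equivalent with Python’s zlib, with inflate_raw as a name chosen for this example.

```python
import zlib


def inflate_raw(data: bytes) -> bytes:
    """Inflate a raw deflate stream, as stored inside a ZIP entry."""
    # wbits=-15 selects raw deflate: no zlib or gzip header is expected.
    # decompressobj is used so that trailing bytes do not raise an error.
    return zlib.decompressobj(-15).decompress(data)
```

Reading the deciphered file in binary mode and passing its contents to inflate_raw should yield the flag text, assuming the keys were correct and the entry was deflated.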


When zip2john does work: picoCTF archive_madness

The CyLab Security Academy (formerly picoCTF) challenge “archive_madness” was the opposite scenario: AES-256 encryption, but a weak, context-predictable password.

$ zipinfo -v secret_archive.zip | grep -E "version|security|method"
  minimum software version required to extract:   5.1
  compression method:                             deflated
  file security status:                           encrypted (AES)

AES-256 means zip2john generates a different hash type: the $zip2$ format in john, which corresponds to mode 13600 in hashcat. Brute-forcing the AES key itself is not feasible, but the password can still fall to a targeted wordlist if the challenge author left a hint.
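
If you are unsure which family a zip2john output line belongs to, the tag at the start of the hash tells you. The sketch below is an illustration, with describe as a made-up name: it pulls the bare hash out of a line and labels it. It assumes the usual zip2john tags, $zip2$ for WinZip AES and $pkzip$ or $pkzip2$ for ZipCrypto, and maps them to hashcat’s 13600 mode and its 172xx PKZIP modes respectively.

```python
import re

# Matches "$zip2$...$/zip2$", "$pkzip$...$/pkzip$" or "$pkzip2$...$/pkzip2$".
HASH_RE = re.compile(r"\$(pkzip2?|zip2)\$.*?\$/\1\$")


def describe(line: str) -> tuple:
    """Return (family label, bare hash) for one line of zip2john output."""
    match = HASH_RE.search(line)
    if match is None:
        raise ValueError("no ZIP hash found in line")
    if match.group(1) == "zip2":
        family = "WinZip AES (hashcat mode 13600)"
    else:
        family = "ZipCrypto / PKZIP (hashcat 172xx modes)"
    return family, match.group(0)
```

The bare hash is the part hashcat expects; the filename prefix and trailing fields that zip2john adds are for john.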

The challenge name was “archive_madness” and the file was named secret_archive.zip. The combination suggested the password might be archive-related. I built a small targeted wordlist from the challenge context (words like “archive”, “madness”, “secret”) and ran it with mangling rules so that suffixes such as the year would be tried:

$ zip2john secret_archive.zip > archive.hash
$ cat custom.txt
archive
madness
secret
pico
ctf
$ john --wordlist=custom.txt --rules=Single archive.hash
Using default input encoding: UTF-8
Loaded 1 password hash (ZIP, WinZip [PBKDF2-SHA1 256/256 AVX2 8x])
archive2024       (secret_archive.zip/flag.txt)
1g 0:00:00:03 DONE (2024-03-15 14:22) 0.3333g/s 213.3p/s 213.3c/s
Session completed.

$ unzip -P archive2024 secret_archive.zip
Archive:  secret_archive.zip
  inflating: flag.txt
$ cat flag.txt
picoCTF{...}

The password was archive2024 — directly derived from the challenge name and year. The custom wordlist cracked it in seconds when rockyou.txt would have taken hours and potentially missed it entirely.

The key lesson: AES-256 ZIP encryption is cryptographically strong, but it’s only as secure as the password. PBKDF2-HMAC-SHA1 key derivation means the hash is slow to crack — which is why a targeted wordlist matters more here than with ZipCrypto. Trying rockyou.txt’s 14 million entries against an AES-256 ZIP hash is slow enough that you want high-probability candidates first.
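
To see where the cost comes from, here is a sketch of the per-candidate work for a WinZip AES-256 entry, based on my reading of the WinZip AES format: PBKDF2-HMAC-SHA1 with 1000 iterations over a 16-byte salt, producing a 32-byte cipher key, a 32-byte authentication key, and a 2-byte password verifier. The function names are invented for this example, and real crackers are far faster than Python, so treat the measured rate as relative only.

```python
import hashlib
import time


def derive(password: bytes, salt: bytes) -> bytes:
    """Derive 66 bytes: 32-byte AES key + 32-byte HMAC key + 2-byte verifier."""
    return hashlib.pbkdf2_hmac("sha1", password, salt, 1000, 66)


def verifier_matches(password: bytes, salt: bytes, stored_verifier: bytes) -> bool:
    """Quick check a cracker can do before verifying the authentication code."""
    return derive(password, salt)[-2:] == stored_verifier


def rate(samples: int = 200) -> float:
    """Rough single-thread candidates per second on this machine."""
    salt = bytes(16)
    start = time.perf_counter()
    for i in range(samples):
        derive(b"candidate%d" % i, salt)
    return samples / (time.perf_counter() - start)
```

Because the verifier is only two bytes, roughly one wrong password in 65,536 passes the quick check, so a match still has to be confirmed against the authentication code. Every candidate pays for the full 1000 iterations either way, which is the reason to put high-probability guesses first.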


Full diagnostic workflow

| Step | Command | What to look for | Next action |
| --- | --- | --- | --- |
| 1. Identify file | file secret.zip | Zip archive data | Continue |
| 2. Check encryption | zipinfo -v secret.zip \| grep version | 2.0 = ZipCrypto, 5.1 = AES | Choose attack path |
| 3. List contents | zipinfo secret.zip | Filenames, sizes, compression | If filenames hint at password, build targeted wordlist |
| 4. Extract hash | zip2john secret.zip > out.hash | Hash string generated | Continue |
| 5a. Targeted wordlist | john --wordlist=custom.txt --rules=Single out.hash | Password from context clues | If found → unzip -P |
| 5b. General wordlist | john --wordlist=rockyou.txt out.hash | Common passwords | If nothing → check encryption type again |
| 6. Known-plaintext (ZipCrypto only) | bkcrack -C secret.zip -c file.txt -p known.txt | Key recovery, bypasses password | If 12+ bytes of plaintext are known |
| 7. Extract | unzip -P <password> secret.zip (or 7z x secret.zip if unzip rejects the AES entry) | Decrypted files | Done |

zip2john vs fcrackzip vs bkcrack vs hashcat: how I actually decide

| Tool | Works with | Use when | Don’t use when |
| --- | --- | --- | --- |
| zip2john + john | ZipCrypto, AES-256 | You have a wordlist candidate and time; good for AES with targeted wordlist | Password is random and not in any wordlist |
| fcrackzip | ZipCrypto only | Quick brute-force of short numeric passwords on ZipCrypto | AES-256: it will silently fail or give wrong results |
| bkcrack | ZipCrypto only | You know 12+ bytes of plaintext content (e.g., flag prefix) | AES-256: the ZipCrypto cryptographic weakness doesn’t apply |
| hashcat | Both, via hash | GPU available; large wordlist; AES-256 mode 13600 | You don’t know the encryption type; run zipinfo first |

Reaching for fcrackzip is the most common mistake in CTF ZIP challenges. It doesn’t support AES-256. If you run it against an AES-256 file, it will either report no password found or, worse, report an incorrect result. The TJCTF nothing-to-see challenge used ZipCrypto, so fcrackzip would have run, but the password wasn’t weak enough to find, which is why the wordlist approach still failed.
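
The way I order these options can be condensed into a small function. This is just my own decision rule written down, following the workflow table above; the function name, the labels, and the 12-byte threshold as a single number are simplifications for illustration.

```python
def attack_plan(encryption: str, context_hint: bool,
                known_plaintext_bytes: int = 0) -> list:
    """Return the attacks worth trying, in order, for one encrypted ZIP."""
    if encryption not in ("zipcrypto", "aes"):
        raise ValueError("run zipinfo first: encryption must be 'zipcrypto' or 'aes'")
    plan = []
    if context_hint:
        # Challenge name, filenames, or year suggest a guessable password.
        plan.append("zip2john + john, targeted wordlist")
    plan.append("zip2john + john/hashcat, general wordlist")
    if encryption == "zipcrypto" and known_plaintext_bytes >= 12:
        # The known-plaintext attack only applies to the legacy cipher.
        plan.append("bkcrack known-plaintext attack")
    return plan
```

For an AES archive the plan never includes bkcrack, whatever plaintext you know; for ZipCrypto it appears as soon as enough plaintext is available.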


Building a targeted wordlist from challenge context

Most CTF ZIP passwords are weak by design — the challenge is about recognizing the encryption type and the cracking technique, not about cracking a genuinely strong password. The password is usually hidden in plain sight: the challenge name, the filenames inside the ZIP, the competition name, or the year.

# Build a wordlist from context clues
$ cat custom.txt
challengename
filename
ctfname
2024
picoctf
pico

# Add mangling rules (john generates variations: Challengename, CHALLENGENAME, challengename1, etc.)
$ john --wordlist=custom.txt --rules=Single archive.hash

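# If that fails, try a pattern-based list (here: 4 digits followed by 4 lowercase letters).
# With -t, crunch needs min and max to equal the pattern length.
# This pattern expands to about 4.6 billion lines, so narrow it if you can.
$ crunch 8 8 -t %%%%@@@@ > numeric_wordlist.txt
$ john --wordlist=numeric_wordlist.txt archive.hash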

The --rules=Single flag applies mangling rules that generate capitalization variants, number suffixes, and character substitutions from your wordlist entries. A 5-word custom wordlist with Single rules becomes several hundred candidates — enough to catch “archive2024” from “archive” in seconds.
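
If you would rather not depend on which suffixes a rule set happens to generate, you can spell the candidates out yourself. The sketch below is one way to do it; the function name, the default years, and the separators are my own choices, not anything john requires.

```python
from itertools import product


def candidates(words, years=("2023", "2024"), separators=("", "_")):
    """Yield each word in three capitalizations, bare and with a year suffix."""
    seen = set()
    for word, year, sep in product(words, years, separators):
        for form in (word, word.capitalize(), word.upper()):
            for cand in (form, form + sep + year):
                if cand not in seen:   # skip duplicates across loop iterations
                    seen.add(cand)
                    yield cand
```

Writing the output one candidate per line to a file and passing it to john with --wordlist gives a list that explicitly contains entries like archive2024 and Archive_2023.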


Why ZipCrypto’s weakness matters beyond CTF

ZipCrypto was designed in the early 1990s. Its known-plaintext vulnerability, the basis of bkcrack, was published by Biham and Kocher in 1994. bkcrack needs at least 12 bytes of known plaintext, at least 8 of them contiguous, which in practice means knowing a file’s magic bytes, a fixed header, or a long enough text prefix inside the archive. For a compressed entry, the known bytes have to come from the compressed stream.

This is why security-conscious ZIP tools moved to AES-256 (WinZip AES, 7-Zip). If you see ZipCrypto on a real-world file claiming to be “encrypted,” the encryption is effectively broken, not by brute force but by cryptanalysis. bkcrack can recover the decryption keys from a ZipCrypto archive as soon as you know enough plaintext of one entry, and the recovered keys then decrypt every entry protected by the same password.
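
Before spending time on bkcrack, it helps to check whether the plaintext you think you know actually meets that requirement. A minimal sketch, assuming the rule as bkcrack documents it (at least 12 known bytes, at least 8 of them contiguous); the function name and the offsets-only input are simplifications of mine.

```python
def enough_for_bkcrack(known_offsets) -> bool:
    """Check the plaintext requirement given the offsets of the known bytes."""
    offsets = sorted(set(known_offsets))
    longest = run = 0
    prev = None
    for off in offsets:
        # Extend the current run if this offset directly follows the previous one.
        run = run + 1 if prev is not None and off == prev + 1 else 1
        longest = max(longest, run)
        prev = off
    return len(offsets) >= 12 and longest >= 8
```

A bare tjctf{ prefix covers offsets 0 through 5 and fails the check; a prefix of 8 bytes plus 4 more known bytes elsewhere in the entry passes.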

In CTF terms: if the challenge uses ZipCrypto and your wordlist fails, don’t keep running larger wordlists. Switch to bkcrack. The TJCTF challenge was specifically designed around this — the wordlist was never the intended solution.


Further Reading

For a broader overview of forensics tools used in CTF challenges, CTF Forensics Tools: The Ultimate Guide for Beginners covers zip2john alongside the full forensics toolkit.

The TJCTF nothing-to-see challenge involved extracting the ZIP from inside a PNG using binwalk. The binwalk guide covers how to scan for embedded files and interpret the offset output.

For challenges where the ZIP contents are disk images rather than flag files, the dd in CTF guide covers byte-level extraction once the archive is open.
