Summary
This challenge revolves around extracting a bunch of layers from a tar
archive, and then rearranging those layers into other files. In short, someone at FE got creative with their use of tar :)
Extracting all layers of the first archive
We are given a file, tar-and-feathers.tgz
, which is a POSIX tar archive. Trying to extract from the archive results in a new archive named something like 25
and so on. We quickly realised that the name of each layer represented a hex value. Our idea was then to fully extract all layers from the original archive, and then assemble all the hex values we are given into a new file. Of course we need to also make sure that we get the order of the layers right when creating the new file. For extraction we used this quick and dirty script:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
#!/bin/bash
found=1
next='tar-and-feathers.tgz'
while [[ ${found} -eq 1 ]]; do
echo "Untaring - $next"
tmp=$(tar -tf $next)
tar -tf $next>>file
tar -xf $next
next=$tmp
done
exit 0
This script extracts all the layers, while adding the names of the layers into a file.
We can then write all these bytes to a file, resulting in a new tar archive, which contains these files:
1
E2.tar offsets.py runme.py* tar-and-feathers.tgz top.png
The challenge in layer 2
The source of runme.py
can be found below:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
#!/usr/bin/env python3
import os
import sys
import subprocess
from offsets import offsets
if len(sys.argv) != 2:
print(f'Usage: {sys.argv[0]} <outfile>', file=sys.stderr)
exit(-1)
INIT = 'tar-and-feathers.tgz'
def run(cmd):
return subprocess.check_output(cmd, shell=True)
def unpack1(name):
filemagic = run(f'file {name}')
if b'bzip2' in filemagic:
run(f'mv {name} {name}.bz2')
run(f'bunzip2 {name}.bz2')
return unpack1(name)
return run(f'tar xfv {name}').strip().decode()
def getbyte(n):
print(f'getbyte({n}) = ', end='', file=sys.stderr, flush=True)
prev = None
for _ in range(n + 1):
next = unpack1(prev or INIT)
if prev and prev != next:
os.unlink(prev)
try:
byte = int(next, 16)
except:
os.unlink(next)
raise
prev = next
os.unlink(next)
print(f'0x{byte:02x}', file=sys.stderr)
return byte
def unpack(path):
data = bytes(getbyte(offset) for offset in offsets)
with file(path, 'wb') as fd:
fd.write(data)
unpack(sys.argv[1])
runme.py
seems to be a script that extracts each layer of the original tar archive, until it reaches a specific layer or offset, and then prints that layers name as a byte. It does this for each of the offsets in the offsets
file, meaning it takes ages for it to run, and we would probably still be waiting on this script to be done if we hadn’t optimized it. We basically this script, found the highest offset, ran it once looking for that offset and wrote all the extracted layer names to a file.
Modified runme.py
:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
#!/usr/bin/env python3
import os
import sys
import subprocess
from offsets import offsets
if len(sys.argv) != 2:
print(f'Usage: {sys.argv[0]} <outfile>', file=sys.stderr)
exit(-1)
INIT = 'tar-and-feathers.tgz'
def run(cmd):
return subprocess.check_output(cmd, shell=True)
def unpack1(name):
filemagic = run(f'file {name}')
if b'bzip2' in filemagic:
run(f'mv {name} {name}.bz2')
run(f'bunzip2 {name}.bz2')
return unpack1(name)
return run(f'tar xfv {name}').strip().decode()
def getbyte(n):
print(f'getbyte({n}) = ', end='', file=sys.stderr, flush=True)
prev = None
with open("outfile", 'w') as fd:
for _ in range(n + 1):
next = unpack1(prev or INIT)
if prev and prev != next:
os.unlink(prev)
try:
byte = int(next, 16)
except:
os.unlink(next)
raise
prev = next
fd.write(next+'\n')
os.unlink(next)
print(f'0x{byte:02x}', file=sys.stderr)
return byte
def unpack(path):
data = bytes(getbyte(50382))
unpack(sys.argv[1])
We then created the script below so that instead of extracting all the layers it simply looked up the value of the layer in an array, and wrote the bytes to a file named output
:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
from offsets import offsets
import binascii
with open("outfile", "r") as f:
l = f.readlines()
for i,line in enumerate(l):
l[i] = line.strip()
print(l)
with open ("output", "wb") as f:
for offset in offsets:
f.write(binascii.unhexlify(l[offset]))
This gets you a pdf:
While text in the pdf has been “redacted” by putting a black bar over it, the text is still in the file and can be extracted via the following:
1
2
3
4
bitis@Workstation ~/c/f/t/_file_decoded.extracted> pdftotext download-1.pdf && cat download-1.txt
flag{it’s turtles all the way down}
1