introspection

challenge

challenge.py

#!/usr/local/bin/python
import subprocess
import sys

pkl = raw_input("Enter your introspective pickle: ")

pickle_result = subprocess.check_output([sys.executable, "unpickle.py", pkl])
cpickle_result = subprocess.check_output([sys.executable, "cunpickle.py", pkl])

print open("flag.txt").read().strip() if "pickle" in pickle_result and "cPickle" not in pickle_result and "cPickle" in cpickle_result and "pickle" not in cpickle_result else "You have much to learn."

unpickle.py

#!/usr/local/bin/python
import pickle
import io
import base64
import sys

WiseUnpickler = pickle.Unpickler(io.BytesIO(base64.b64decode(sys.argv[1])))
WiseUnpickler.find_class = None

try: print WiseUnpickler.load()
except: pass

cunpickle.py

#!/usr/local/bin/python
import cPickle
import io
import base64
import sys

cWiseUnpickler = cPickle.Unpickler(io.BytesIO(base64.b64decode(sys.argv[1])))
cWiseUnpickler.find_global = None

try: print cWiseUnpickler.load()
except: pass

Dockerfile

FROM pwn.red/jail

COPY --from=python:2-slim-buster / /srv
COPY --chmod=755 introspection.py /srv/app/run
COPY cunpickle.py unpickle.py flag.txt /srv/app/

solution

first off, this challenge was done on python2 (hence the print statement with no parens)

essentially, we had to make a single pickle that returns "pickle" when unpickled with the pure-python implementation (pickle), but returns "cPickle" when unpickled with the c implementation (cPickle)
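to make the win condition easier to read, here is the one-liner from challenge.py unpacked into a function. the name earns_flag is made up for illustration; the four checks are copied from challenge.py. the substring checks are case sensitive, so "pickle" is not found inside "cPickle", which is why a plain "cPickle" output can pass.

```python
def earns_flag(pickle_result, cpickle_result):
    """Same four substring checks as challenge.py (hypothetical helper name)."""
    return (
        "pickle" in pickle_result            # pure-python output mentions pickle
        and "cPickle" not in pickle_result   # ...but not cPickle
        and "cPickle" in cpickle_result      # c output mentions cPickle
        and "pickle" not in cpickle_result   # lowercase "pickle" is not in "cPickle"
    )


# the outputs we are aiming for
wanted = earns_flag("pickle\n", "cPickle\n")
# both implementations agreeing is not enough
same_output = earns_flag("pickle\n", "pickle\n")
```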

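if we look at the python 2 docs for pickle, they say that the interfaces of pickle and cPickle are nearly identical and that the differences are pointed out where necessary: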

image

in reality, their differences are NOT all pointed out where necessary. there are two major ones, which i will describe below

we can start by opening up the python2 source for both: the c implementation (Modules/cPickle.c) and the python implementation (Lib/pickle.py)

the unpickler is a stack-based virtual machine with a bunch of opcodes, which are listed starting at line 102 of pickle.py.
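as a small illustration of the stack machine idea, here is a hand-written pickle that uses only text opcodes (STRING, PUT, GET, STOP). this is a sketch and not part of the challenge; it uses bytes literals so it runs on python 2 and python 3.

```python
import pickle
import pickletools

# STRING pushes a value, PUT copies the top of the stack into the memo,
# GET pushes a memo entry back onto the stack, STOP returns the top of the stack.
program = b'S"a"\np0\nS"b"\ng0\n.'

# genops only walks the opcodes, it does not execute them
opcode_names = [op.name for op, arg, pos in pickletools.genops(program)]

# the stack ends as ["a", "b", "a"], so the result is the memoized "a"
result = pickle.loads(program)
```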

one such opcode is the INT opcode, which is implemented as below in python

image

the c implementation is below

image

do you see any differences?

well, here you can see that the python implementation tries int() first and falls back to long(), both in base 10, but the c implementation parses the text c-style, where the base is inferred from the prefix

we can abuse this with a value like 010, which the c implementation reads as octal (8) but the python implementation reads as decimal (10)
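the two parsing behaviours for INT can be modelled roughly like this. both function names are made up, and the second one is only a simplified stand-in for c-style base detection (it assumes strtol with base 0 and ignores signs, whitespace and overflow), not the real cPickle code.

```python
def parse_like_pickle_py(text):
    """Base-10 parse, like the int() call in pickle.py."""
    return int(text)


def parse_like_c_base0(text):
    """Simplified model of C-style base detection (assumed strtol, base 0)."""
    digits = text.lower()
    if digits.startswith("0x"):
        return int(digits, 16)      # hex prefix
    if len(digits) > 1 and digits.startswith("0"):
        return int(digits, 8)       # a leading zero means octal
    return int(digits, 10)


python_value = parse_like_pickle_py("010")  # 10
c_value = parse_like_c_base0("010")         # 8
```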

alternatively, we can exploit a discrepancy in how the long_binget opcode (and its sibling long_binput, which the exploit below uses) decodes its 4-byte memo index. this is definitely the easiest way to solve this challenge

image

image

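we can exploit the fact that the python implementation decodes those 4 bytes with mloads (marshal.loads), which returns a signed integer, while the c implementation treats them as unsigned.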
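a quick way to see the signed/unsigned split for the bytes used in the exploit. marshal.loads is the same call pickle.py makes; the struct line is only a stand-in for what the c implementation computes, assuming a build where a c long is wider than 32 bits.

```python
import marshal
import struct

raw = b"\xff\xff\xff\xff"

# pickle.py does mloads('i' + 4 bytes), i.e. a signed 32-bit marshal int
signed_index = marshal.loads(b"i" + raw)       # -1

# stand-in for the c side: the same bytes read as an unsigned 32-bit value
unsigned_index = struct.unpack("<I", raw)[0]   # 4294967295
```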

so, as the script below shows, the same pickle prints something different under each implementation

#!/usr/bin/python2.7

import pickle
import io
from pickle import *
import cPickle

p = PROTO + '\x02'
p += STRING + '"cPickle"\n'
p += PUT + b'-1\n'
p += STRING + b'"pickle"\n'
# memo is {'-1': "cPickle"}, stack has "pickle" at the top

p += LONG_BINPUT + '\xff\xff\xff\xff'  # pickle will recognize this as -1, cPickle will recognize this as 4294967295
# cPickle will have memo as {'-1': "cPickle", '4294967295': "pickle"}
# pickle will have memo as {'-1': "pickle"}
p += GET + b'-1\n'
# we return what is at -1 in memo

p += STOP

print(p)

cWiseUnpickler = cPickle.Unpickler(io.BytesIO(p))
cWiseUnpickler.find_global = None
print(cWiseUnpickler.load(), cWiseUnpickler.memo)

WiseUnpickler = pickle.Unpickler(io.BytesIO(p))
WiseUnpickler.find_class = None
print(WiseUnpickler.load(), WiseUnpickler.memo)
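to follow the memo step by step without a python2 install, here is a toy interpreter for just the opcodes the payload uses. it is a model and not the real unpicklers: run_toy_unpickler and signed_index are made-up names, and all memo keys are stored as text, like in the comments of the script above.

```python
import base64
import io
import struct


def run_toy_unpickler(payload, signed_index):
    """Toy model: only PROTO, STRING, PUT, LONG_BINPUT, GET and STOP."""
    stream = io.BytesIO(payload)
    stack, memo = [], {}
    while True:
        op = stream.read(1)
        if op == b"\x80":    # PROTO: skip the version byte
            stream.read(1)
        elif op == b"S":     # STRING: push the text between the quotes
            stack.append(stream.readline()[1:-2].decode("ascii"))
        elif op == b"p":     # PUT: memo[text key] = top of stack
            memo[stream.readline()[:-1].decode("ascii")] = stack[-1]
        elif op == b"r":     # LONG_BINPUT: 4-byte little-endian key
            fmt = "<i" if signed_index else "<I"
            (index,) = struct.unpack(fmt, stream.read(4))
            memo[str(index)] = stack[-1]
        elif op == b"g":     # GET: push memo[text key]
            stack.append(memo[stream.readline()[:-1].decode("ascii")])
        elif op == b".":     # STOP: the top of the stack is the result
            return stack.pop()
        else:
            raise ValueError("opcode not modelled: %r" % op)


# same bytes as the payload built in the script above
payload = b'\x80\x02S"cPickle"\np-1\nS"pickle"\nr\xff\xff\xff\xffg-1\n.'

python_side = run_toy_unpickler(payload, signed_index=True)   # "pickle"
c_side = run_toy_unpickler(payload, signed_index=False)       # "cPickle"

# unpickle.py and cunpickle.py base64-decode their argument,
# so the payload has to be base64-encoded before it is submitted
submission = base64.b64encode(payload)
```

with the signed index, LONG_BINPUT overwrites the '-1' entry with "pickle"; with the unsigned index it lands in a separate entry, so GET -1 still returns "cPickle".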