Looking at Compiled Python Files

Python is like Emacs for me: you use it all the time, but every now and then you discover something new and wonder how you lived without it. I've known for a while that you can compile Python files, but I've never really bothered looking into them much. Well, I have some time, so here are my notes on looking into compiled Python files.

Hello World

First I'm going to need a file to actually compile. I also want to see the differences (if any) between a script that imports a module and one that does not.

def hello(name):
    return ("hello {}".format(name))


print(hello("world!"))
hello world!

Got to be a little different to every other "Hello World" programme out there.

Now it's time to compile this script with the snippet below.

import py_compile

py_compile.compile("hello.py")

That has made a folder called __pycache__, and inside it I have the compiled .pyc file.
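As a rough sketch of where that file ends up: py_compile.compile returns the path it wrote, and importlib.util.cache_from_source works out the same path without compiling anything. The temporary directory and the throwaway hello.py written here are my own additions, only there so the example runs on its own.

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path

# Work in a throwaway directory so nothing is left next to real code.
workdir = Path(tempfile.mkdtemp())
source = workdir / "hello.py"
source.write_text('def hello(name):\n    return "hello {}".format(name)\n')

# compile() returns the path of the byte-compiled file it wrote.
compiled = py_compile.compile(str(source))

# cache_from_source() computes where the cached file for a source file lives.
expected = importlib.util.cache_from_source(str(source))

print(compiled)
print(Path(compiled).parent.name)  # normally the __pycache__ folder
print(compiled == expected)
```

The file name carries the interpreter tag (cpython-37 in my case), so different Python versions can keep their compiled files side by side.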

Initial Analysis

Ok so like any binary file, there are a few things I can do to get the lowdown on the file.

file ./__pycache__/hello.cpython-37.pyc && \
    strings ./__pycache__/hello.cpython-37.pyc && \
    cat ./__pycache__/hello.cpython-37.pyc
./__pycache__/hello.cpython-37.pyc: data
a}f^O
hello {})
format)
name
hello.py
hello
world!N)
printr
<module>
B

a}f^O@sddZeeddS)cCs
d|S)Nzhello {})format)namerhello.pyhellosrzworld!N)rprintrrrr<module>s

As expected, file only recognises it as generic data, but the strings output shows that the names and string constants survive intact, so with a little luck the source could be reconstructed if needed.
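To rely a little less on luck, here is a minimal sketch of reading the file back with the standard library. It assumes CPython 3.7 or later, where a .pyc starts with a 16-byte header (magic number, flags, then either the source mtime and size or a hash) followed by a marshalled code object. Older versions use a shorter header, so treat the offset as an assumption rather than a rule. The source is recompiled into a temporary directory purely to keep the example self-contained.

```python
import dis
import importlib.util
import marshal
import py_compile
import tempfile
from pathlib import Path

# Recreate and compile the hello script in a throwaway directory.
workdir = Path(tempfile.mkdtemp())
source = workdir / "hello.py"
source.write_text(
    'def hello(name):\n'
    '    return "hello {}".format(name)\n'
    '\n'
    'print(hello("world!"))\n'
)
pyc_path = py_compile.compile(str(source))

raw = Path(pyc_path).read_bytes()

# Assumed layout for CPython 3.7+: 16 header bytes, then the marshalled code.
HEADER_SIZE = 16
magic = raw[:4]
code = marshal.loads(raw[HEADER_SIZE:])

print(magic == importlib.util.MAGIC_NUMBER)  # True when this interpreter wrote it
print(code.co_names)   # names the module refers to
print(code.co_consts)  # constants, including the nested function's code object
dis.dis(code)          # readable listing of the bytecode
```

Note that marshal is only safe on files you trust, and the format is specific to the Python version that wrote it, which is what the magic number check is for.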

Import this

I'm not going to import this, as I don't think that would be a good test :), but I want to see what happens when I import a library. I have more knowledge of the requests library, so I'll compile the script below.

import requests


def get(url):
    r = requests.get(url, timeout=5, verify=False)
    return r.content


print(get("https://httpbin.org/get"))
b'{\n  "args": {}, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.22.0", \n    "X-Amzn-Trace-Id": "Root=1-5e668044-f4814cc60b85f9445f50695c"\n  }, \n  "origin": "86.20.106.90", \n  "url": "https://httpbin.org/get"\n}\n'

It's not pretty, but it'll do for a simple test. Like last time, let's compile it.

import py_compile

py_compile.compile("get.py")

And much like last time, it's made a __pycache__ folder with a compiled .pyc of the above file inside.

Initial Analysis

Like last time, I'll see if there is anything noteworthy in the strings of the file and the file type.

file ./__pycache__/get.cpython-37.pyc && \
    strings ./__pycache__/get.cpython-37.pyc && \
    cat ./__pycache__/get.cpython-37.pyc
./__pycache__/get.cpython-37.pyc: data
timeoutZ
verify)
requests
getZ
content)
get.pyr
https://httpbin.org/get)
printr
<module>
B

f^@s ddlZddZeeddS)NcCstj|ddd}|jS)NF)ZtimeoutZverify)requestsgetZcontent)Zurlrrget.pyrsrzhttps://httpbin.org/get)rrprintrrrr<module>s

Ah bummer, there's nothing structurally different between the output of this example and the previous one; the import only shows up as the name requests. Oh well, most people would see this as a failure, but I've learnt from it and I'm sharing that with the reader.
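One way to see why the two outputs look so alike, as a sketch: an import statement compiles to an instruction that looks the module up by name at run time, so none of the library's own code ends up in the .pyc. The snippet below compiles the same source in memory with the built-in compile (nothing is executed, so requests does not even need to be installed) and lists the instructions that mention it.

```python
import dis

source_text = (
    "import requests\n"
    "\n"
    "def get(url):\n"
    "    r = requests.get(url, timeout=5, verify=False)\n"
    "    return r.content\n"
)

# Compile only; the import is never executed here.
module_code = compile(source_text, "get.py", "exec")

# Collect the import instructions and the module name each one refers to.
import_ops = [
    (ins.opname, ins.argval)
    for ins in dis.get_instructions(module_code)
    if ins.opname == "IMPORT_NAME"
]

print(import_ops)
print(module_code.co_names)
```

So the compiled file only records that it wants something called requests; the library itself is compiled separately, in its own __pycache__ folders, when it is first imported.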