Looking at Compiled Python Files
Python
is like Emacs
for me, where you use it all the time but every now and then you discover something new and wonder how you lived without it. I've known for a while you can compile python
files but I've never really bother looking into them much. Well I have some time so here are my notes at looking into compiled python
files.
Hello World
So I'm going to need a file to actually compile first, but I want to see the differences (if any) when you import a module to when you do not.
def hello(name): return ("hello {}".format(name)) print(hello("world!"))
hello world!
Got to be a little different to every other "Hello World" programme out there.
Now its time to compile this script with the snippet below.
import py_compile py_compile.compile("hello.py")
So thats actually made a folder called __pycache__
and inside of there I have the compiled .pyc
file.
Initial Analysis
Ok so like any binary file, there are a few things I can do to get the lowdown on the file.
file ./__pycache__/hello.cpython-37.pyc && \ strings ./__pycache__/hello.cpython-37.pyc && \ cat ./__pycache__/hello.cpython-37.pyc
./__pycache__/hello.cpython-37.pyc: data a}f^O hello {}) format) name hello.py hello world!N) printr <module> B a}f^O @ s d d Z ee d dS )c C s d | S )Nzhello {})format)name r hello.pyhello s r zworld!N)r printr r r r <module> s
As expected its just a data binary, and the source code could with a little luck be easily interpreted if needed.
Import this
I'm not going to import this
as i dont think that would be a good test :), but I want to see what happens when I import a libary. I have more knowelege of the requests
libary so I'll compile the below script.
import requests def get(url): r = requests.get(url, timeout=5, verify=False) return r.content print(get("https://httpbin.org/get"))
b'{\n "args": {}, \n "headers": {\n "Accept": "*/*", \n "Accept-Encoding": "gzip, deflate", \n "Host": "httpbin.org", \n "User-Agent": "python-requests/2.22.0", \n "X-Amzn-Trace-Id": "Root=1-5e668044-f4814cc60b85f9445f50695c"\n }, \n "origin": "86.20.106.90", \n "url": "https://httpbin.org/get"\n}\n'
Its not pretty but it'll do for the simple test. Like last time lets compile it again.
import py_compile py_compile.compile("get.py")
And much like last time its made a folder __pycache__
and a compiled .pyc
of the above file.
Initial Analysis
Like last time, I'll see if there is anything noteworthy in the strings of the file and the file type.
file ./__pycache__/get.cpython-37.pyc && \ strings ./__pycache__/get.cpython-37.pyc && \ cat ./__pycache__/get.cpython-37.pyc
./__pycache__/get.cpython-37.pyc: data timeoutZ verify) requests getZ content) get.pyr https://httpbin.org/get) printr <module> B f^ @ s d dl Z dd Zeed dS ) Nc C s t j| ddd}|jS )N F)ZtimeoutZverify)requestsgetZcontent)Zurlr r get.pyr s r zhttps://httpbin.org/get)r r printr r r r <module> s
Ah bummber, nothing different from the output of this example and the output from the previous one. Oh well, most people would see this as a failure, but I've learnt from this and sharing it with the reader.