Python Pickle Security Problems and Solutions

If you are familiar with Python, you may have used the Pickle standard library module for object serialization. This module allows a developer to convert a Python object into data that can be transferred over the network, written to a file, or even stored away in a database. When the object is later needed, the Pickle module can convert the serialized data into a regular Python object.

When building distributed systems, machines need a data serialization format to communicate with each other. The Pickle module may seem ideal for this, but anyone using it should be aware of a few security problems.

Python Pickle: Code an Attacker Might Use

In this example, we will use ZeroMQ to send data serialized with Pickle from one Python instance to another.

Client Code (

import pickle
import zmq

context = zmq.Context()
sock = context.socket(zmq.PULL)
sock.connect("tcp://localhost:8006")

# Receive a message
message = sock.recv()

# Unpickle the data from the socket
pickle.loads(message)

Server Code (

import pickle
import subprocess
import zmq

context = zmq.Context()
sock = context.socket(zmq.PUSH)
sock.bind("tcp://*:8006")


class Payload(object):
    """ Executes /bin/ls when unpickled. """

    def __reduce__(self):
        """ Run /bin/ls on the remote machine. """
        return (subprocess.Popen, (('/bin/ls',),))


# Send the payload over the socket
sock.send(pickle.dumps(Payload()))

In separate shells, run and

>> python

As you can see, the client executed the code that was defined in Payload.__reduce__(). A more advanced attack would involve the attacker gaining remote access to a shell on the target system.
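The same mechanism can be observed in a single process, without ZeroMQ. The following is only a minimal sketch and not part of the example above: the class name HarmlessPayload is hypothetical, and the built-in eval stands in for subprocess.Popen so that no program is launched.

```python
import pickle


class HarmlessPayload(object):
    """Hypothetical stand-in for Payload that evaluates a harmless expression."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of the object's state.
        return (eval, ("6 * 7",))


data = pickle.dumps(HarmlessPayload())

# The receiver does not get a HarmlessPayload back. It gets whatever the
# callable chosen by the sender returned.
result = pickle.loads(data)
print(result)
```

The point to notice is that the receiving side never opted in to calling eval; the serialized data made that choice.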

There are valid reasons for running the code in __reduce__ though. Implementing the __reduce__ method in objects provides a way to save the state of objects that were previously difficult to serialize. However, allowing the serialized object to dictate how it should be unserialized could provide attackers with a simple attack vector to execute arbitrary code.
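One way to limit what serialized data can dictate is to subclass pickle.Unpickler and override find_class, the hook that resolves every global the data asks for. The sketch below assumes Python 3 (it relies on the builtins module); the names SAFE_BUILTINS, RestrictedUnpickler, and restricted_loads are illustrative, and the allow-list is an assumption you would need to adapt. It reduces the attack surface but should not be treated as a complete defense.

```python
import builtins
import io
import pickle

# Assumption: only these builtins may be referenced by incoming data.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global outside the allow-list."""

    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else (subprocess.Popen, eval, os.system, ...) is rejected.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Drop-in counterpart to pickle.loads() that uses the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

With this in place, plain containers, numbers, and strings still load, while a payload whose __reduce__ points at a callable outside the allow-list raises pickle.UnpicklingError before anything is called.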

Even if the attacker does not control the server, they may have access to the network between the client and the server. In this scenario, the attacker can inject a payload into the communication channel between the two machines.

Python Pickle Security Best Practices

  • If possible, encrypt the network connection between the machines communicating pickled data. This will prevent modification of pickled data. Using SSL/TLS to encrypt network connections between systems is very common and effective in preventing attackers from tampering with network traffic.
  • If network connection encryption is not possible, use a digital signature to maintain data integrity and ensure network traffic is not altered in transit.
  • If pickled data is stored on disk, apply strict file permissions so that unauthorized users cannot modify it.
  • Since it is easy to execute arbitrary code when unpickling data, it may be best to avoid using the Pickle module. Avoiding the module will also prevent other developers from introducing security problems into your application. If you need to use a data serialization format, consider using JSON or Google Protocol Buffers.
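The integrity check from the second point can be sketched with the standard library. Note that this uses an HMAC, a keyed message authentication code, rather than a public-key digital signature; it assumes both machines already share a secret key through some secure channel. The names SECRET_KEY, sign_and_dump, and verify_and_load are hypothetical.

```python
import hashlib
import hmac
import pickle

# Assumption: both machines obtained this key over a separate, secure channel.
SECRET_KEY = b"replace-with-a-long-random-key"
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_and_dump(obj, key=SECRET_KEY):
    """Pickle obj and prepend an HMAC-SHA256 tag computed over the pickled bytes."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify_and_load(message, key=SECRET_KEY):
    """Check the tag first; only unpickle if the payload is untouched."""
    tag, payload = message[:DIGEST_SIZE], message[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)
```

The sender would pass the result of sign_and_dump() to sock.send(), and the receiver would call verify_and_load() on the bytes from sock.recv(). This only helps if the sender is trusted and the key stays secret; it does nothing against a compromised server.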

Python Pickle: Security Risk & Alternative

At SmartFile, we use Google Protocol Buffers for communication between software systems and view Python Pickle as a security risk. As a security measure, we disallow usage of the Pickle module in all of our software dependencies.

How Can I Find Pickle Usage in My Code?

To check your code base for usage of Pickle, you can use bandit, a security linter from the OpenStack Security Group. This tool will help you find common security problems in Python code. It is also useful for checking your project dependencies for usage of Pickle.

Install Bandit

> pip install bandit

Run Analysis

> bandit -r .

[bandit]    INFO    using config: /home/travis/venv/etc/bandit/bandit.yaml
[bandit]    INFO    running on Python 2.7.10
Run started:
    2015-11-06 22:32:03.032735
Run metrics:
    Total lines of code: 33
    Total lines skipped (#nosec): 0
    Total issues (by severity):
        Undefined: 0
        Low: 4
        Medium: 1
        High: 0
    Total issues (by confidence):
        Undefined: 0
        Low: 0
        Medium: 0
        High: 5
Files skipped (0):
Test results:
>> Issue: [blacklist_imports] Consider possible security implications associated with pickle module.
   Severity: Low   Confidence: High
   Location: ./
    import pickle
2   import zmq
3
>> Issue: [blacklist_calls] Pickle library appears to be in use, possible security issue.
   Severity: Medium   Confidence: High
   Location: ./
    message= sock.recv()
9   pickle.loads(message)
>> Issue: [blacklist_imports] Consider possible security implications associated with pickle module.
   Severity: Low   Confidence: High
   Location: ./
    import pickle
2   import subprocess
3   import zmq
>> Issue: [blacklist_imports] Consider possible security implications associated with subprocess module.
   Severity: Low   Confidence: High
   Location: ./
    import pickle
2   import subprocess
3   import zmq

Adding bandit to your Continuous Integration service (such as Jenkins or Travis CI) can help prevent team members from introducing potential security problems.
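To give a rough idea of what an import check like the one reported above involves, here is a small sketch built on the standard library's ast module. It is not how bandit is implemented and is no substitute for it; the function name find_pickle_imports and the PICKLE_MODULES set are assumptions made for illustration.

```python
import ast

# Assumption: these are the module names worth flagging; adjust as needed.
PICKLE_MODULES = {"pickle", "cPickle", "_pickle"}


def find_pickle_imports(source, filename="<string>"):
    """Return sorted (line number, module name) pairs for pickle-related imports."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Compare the top-level package so "pickle.something" also matches.
            if name.split(".")[0] in PICKLE_MODULES:
                hits.append((node.lineno, name))
    return sorted(hits)


sample = "import pickle\nimport zmq\nfrom pickle import loads\n"
print(find_pickle_imports(sample))
```

A check like this only sees direct imports in the source it is given. A dedicated linter such as bandit also flags calls and covers many other problem classes, which is why it is the better choice for a CI job.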


