Stop Using Pickle To Save Machine Learning Models

Elfao
3 min read · Mar 6, 2022

You have probably done it already: saving a machine learning model is easy, but there are good ways and bad ways to do it.

One common way to save and load models is pickle, a binary protocol that transforms Python objects into a stream of bytes. That isn’t a very good idea, for several reasons.
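To make "a stream of bytes" concrete, here is a minimal sketch of a pickle round trip. The `model` dict is only a hypothetical stand-in for a trained model, not a real estimator.

```python
import pickle

# Hypothetical stand-in for a trained model: a plain dict of parameters.
model = {"weights": [0.5, -1.2, 3.0], "bias": 0.1}

payload = pickle.dumps(model)      # Python object -> bytes
restored = pickle.loads(payload)   # bytes -> Python object

print(type(payload).__name__)  # bytes
print(restored == model)       # True
```

The same two calls work on most Python objects, which is exactly why pickle is so tempting for saving models.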

Pickle has flaws that make it a poor choice. The most important ones are that it is insecure and that it is tied to the Python language.
The Python documentation for the pickle module itself comes with a warning.

Warning: The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
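Why is loading a pickle dangerous? A pickle can name any callable, and `pickle.loads` calls it with the given arguments. The sketch below uses a harmless `print`, and the class name `Innocent` is made up for the illustration. An attacker would simply pick a more harmful callable.

```python
import pickle

class Innocent:
    # __reduce__ tells pickle how to rebuild the object:
    # "call this callable with these arguments" at load time.
    def __reduce__(self):
        return (print, ("arbitrary code ran during unpickling",))

payload = pickle.dumps(Innocent())

# Loading the bytes calls print(...). No Innocent object is rebuilt,
# and the receiver does not even need the Innocent class.
result = pickle.loads(payload)

print(result)  # None, the return value of print
```

Nothing in the payload tells the person loading it that a function call is about to happen.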

But these aren’t the only problems. There are other flaws, detailed on this blog: old pickles look like old code, pickling is implicit, it over-serializes, __init__ isn’t called, it is Python only, the output is unreadable, it appears to pickle code, and it is slow.
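One flaw from that list, "__init__ isn’t called", is easy to check yourself. Here is a small sketch with a hypothetical `Scaler` class that counts how many times its constructor runs.

```python
import pickle

class Scaler:
    init_calls = 0  # class-level counter of how often __init__ runs

    def __init__(self, factor):
        Scaler.init_calls += 1
        self.factor = factor

original = Scaler(2.0)  # __init__ runs once here
clone = pickle.loads(pickle.dumps(original))

print(Scaler.init_calls)  # 1
print(clone.factor)       # 2.0
```

The clone gets its attributes back, but any validation or setup done in `__init__` is skipped when the object is loaded.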

OK, I agree that if you save your model yourself and reuse it later, the danger is quite limited or even non-existent.
But think about what happens if your model is sent to a partner or to other teams who don’t know you or who use other programming languages. They will rightly ask whether they can load the model and if they will not load…

Elfao

Data scientist with 4 years of experience. I have worked in different fields, such as digital marketing and consulting, and I currently work for a finance start-up.