Three years and three months ago, I signed up to maintain Gentoo's pipenv ebuild. I thought it would be a fun way to learn about Python in Gentoo. I already knew that pipenv had issues. I also knew that the project was a bit dormant following the departure of Kenneth Reitz, the original developer. Despite that, I took on the commitment to work on that ebuild.
Gentoo's maintainers didn't like the fact that Pipenv bundles tons of Python packages in its vendor directory, which is arguably a security issue. Hence, bug 717666 ("dev-python/pipenv: bundles humongous number of packages") was opened. I also think that, even though disk space is cheap, it's a waste of resources. It seemed like a nice project to work on. But what started as fixes for the ebuild, to "unvendor" those packages, turned into skepticism.
Were those packages even used? It seemed that many of those Python packages were old and unused. Specifically, vistir seemed like a grab-bag of hacks to support multiple platforms and Python versions, but many of those hacks already had better fixes in upstream packages or the standard library.
Eager to fix those issues, I started looking for a good opportunity to fix them directly in Pipenv. Then, in June 2023, I saw that Frost Ming, the maintainer of Pipenv, was looking for help.
And so, together with Matt Davis, I took over the project. My first goal was to find
a way to unvendor all those packages. This required a lot of refactoring and patches
to requirementslib, vistir, and a few other packages that pipenv consumes.
Eventually, we fixed many issues in vistir but ended up dropping it because it was no longer
needed. Matt did a tremendous job fixing requirementslib until that was no longer possible,
and he ended up internalizing many parts and rewriting huge amounts of code.
I continued to incrementally find things in the vendor directory that were no longer needed thanks to
fixes in the standard library. An obvious example: pipenv was still shipping six even though it only
supported Python 3 versions. Another example: we shipped orderedmultidict. Python has had ordered
dictionaries for quite a while, so it was a bit of a red flag.
We found out that we could safely remove this library in favour of regular dictionaries,
thus removing hundreds of lines of code immediately.
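To illustrate why a plain dictionary can stand in for an ordered one, here is a minimal sketch. It is not pipenv's actual code: the helper `group_in_order` and the sample data are hypothetical, and it only relies on the fact that regular dicts preserve insertion order (guaranteed by the language since Python 3.7).

```python
# A minimal sketch, not pipenv code: group repeated keys with a plain dict
# while keeping the order in which each key was first seen.
def group_in_order(pairs):
    """Hypothetical helper: collect (key, value) pairs into a regular dict."""
    grouped = {}
    for key, value in pairs:
        # setdefault creates the list on first sight of the key;
        # the dict remembers that insertion order on its own.
        grouped.setdefault(key, []).append(value)
    return grouped


# Made-up sample data for illustration only.
pairs = [("requests", ">=2.0"), ("click", "*"), ("requests", "<3.0")]
grouped = group_in_order(pairs)

print(list(grouped))        # keys in first-seen order
print(grouped["requests"])  # all values recorded for a repeated key
```

Whether this covers every feature of a dedicated ordered multi-dictionary depends on how the library was used; in our case regular dictionaries turned out to be enough.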
Many other vendored libraries were dropped, and in some cases we switched to the copies
found inside pip's own vendor directory, thus reducing duplication.
In parallel, I kept bumping versions of pipenv in Gentoo, incrementally removing more packages from the vendor directory in favor of adding dependencies on other ebuilds. A few times, I tried removing all the packages at once, but failed. Finally, I did the inevitable and started creating ebuilds for packages that didn't have any in Gentoo. I have not upstreamed them yet, so that I can keep experimenting with Python ebuilds.
This September, I finally decided to deal this issue a final blow, and it's now closed. The Gentoo version no longer bundles packages in vendor! My work on this issue also has a positive impact on all other pipenv users, because recent versions of pipenv no longer ship many packages in vendor. Pipenv release 2023.10.20 has only 16 vendored libraries, compared to 59 in 2020.8.13.
Finally, despite an almost 20% reduction in size, pipenv is still a huge program. The wheel size of version 2020.8.13 is 3.9MB, while the wheel size of version 2023.10.20 is 3.2MB. I was hoping for more, but the work is not done yet. Future versions of pipenv will continue receiving bug fixes, along with speed improvements and size reductions.
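For the curious, the arithmetic behind those figures can be checked with a few lines of Python. This is just a quick sketch using the numbers quoted in this post; the helper name `percent_reduction` is my own.

```python
# Quick check of the reductions quoted in this post.
def percent_reduction(before, after):
    """Return how much smaller `after` is than `before`, in percent."""
    return (before - after) / before * 100


wheel = percent_reduction(3.9, 3.2)    # wheel size in MB: 2020.8.13 vs 2023.10.20
vendored = percent_reduction(59, 16)   # number of vendored libraries

print(f"wheel size: {wheel:.1f}% smaller")          # 17.9% smaller
print(f"vendored libraries: {vendored:.1f}% fewer")  # 72.9% fewer
```

So the number of vendored libraries dropped far more sharply than the wheel size did, which suggests that most of the removed libraries were small.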
If you use pipenv, please consider sponsoring our work via GitHub Sponsors.