Excuse this embarrassingly real-world post on software packages — no endofunctors or compactly generated spaces today, I’m sorry to say.
Problems in the packagehood
Like most other programmers, I’ve experienced my share of trouble when trying to install this version or that of some library or other. The typical example is your application being brought down by a new version of a package you’ve never even heard about. Let me explain how this can happen.
Suppose you are developing the application Cooperative Obstruction of Locomotion (COoL). Naturally, you need to use the library called Best Looping Effects (BLE). So you add a dependency to your build system saying ‘BLE=*’ (meaning all versions of the library will do). Next morning you try to build your application only to find out it does not work at all. What happened? The author of BLE decided to post a new version 126.96.36.199, introducing breaking changes from the earlier version 188.8.131.52. Wiser, you put into your dependencies ‘BLE=184.108.40.206’, since that’s the version that worked yesterday and yay, your application works again. And that’s the end of this post, right?
Yeah, right — the very next day, as you rebuild the system, you discover it does not work once more. What now? Well, it turns out that the BLE library included a dependency ‘ASS=*’ (Astoundingly Stupendous Search) and just last night the author of ASS updated the package from 220.127.116.11 to 18.104.22.168, introducing breaking changes in the process. Oops.
What can you do? That depends on your package manager. If you can specify that you want to use version 22.214.171.124 of ASS when building BLE, you can update your package and it will work, for now. Or you can convince the author of BLE to fix their ASS dependency to specify the exact version. But you won’t be able to use that without changing your code — remember that BLE version 126.96.36.199 had changes that broke your application, so the new 188.8.131.52 will have them too.
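To make the failure mode concrete, here is a minimal sketch of a toy resolver. It is not how any real package manager works; the registry layout, the `resolve` function, the `overrides` parameter and all version numbers are made up for illustration.

```python
# Toy registry: package -> {version tuple: {dependency name: constraint}}.
# All names and version numbers here are hypothetical.
REGISTRY = {
    "BLE": {
        (1, 5): {"ASS": "*"},
        (1, 6): {"ASS": "*"},
    },
    "ASS": {
        (2, 0): {},
        (3, 0): {},  # published last night, with breaking changes
    },
}


def resolve(name, constraint, overrides=None):
    """Return a {package: version} plan for `name` and its transitive deps.

    `constraint` is either "*" (newest available) or an exact version tuple.
    `overrides` forces exact versions, including for transitive dependencies.
    """
    overrides = overrides or {}
    constraint = overrides.get(name, constraint)
    versions = REGISTRY[name]
    version = max(versions) if constraint == "*" else constraint
    plan = {name: version}
    for dep, dep_constraint in versions[version].items():
        plan.update(resolve(dep, dep_constraint, overrides))
    return plan
```

Pinning BLE with `resolve("BLE", (1, 5))` still pulls in the newest ASS, because the wildcard lives inside BLE's own metadata. Only something like `resolve("BLE", (1, 5), {"ASS": (2, 0)})` freezes the whole chain, and that only helps if your package manager offers such an override at all.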
Okay, imagine the pink world where ponies roam freely, leprechauns dance in polygonal arrangements and application developers only specify exact versions of their dependencies. Does it solve all your problems? The answer is no. One simple reason is that you might need multiple versions of the same library. For example, BLE 1.5 might include a function wiggleLoop, which is removed in BLE 1.6 and replaced with wobbleLoop. What if you want to both wiggle and wobble? Tough luck, there’s no package manager that I know of that would allow you to keep multiple versions of some library. You have to pick. And it’s quite probable that BLE had some serious reasons for dropping wiggleLoop, so from version 1.6 onwards you will never see it again. The typical action here is to keep your BLE dependency updated to the newest version, because newer versions offer plenty of cool stuff as well as bugfixes. Thus, with a broken heart, you drop all loop wiggling support. All that’s left for you is wailing every second Friday around 8pm, the only thing to wiggle being your whisky on the rocks.
A remark is in order here, lest people think I don’t know what I’m talking about. Serious package managers do not use * dependencies (the reason being that it’s completely stupid, as I hope to have illustrated above), but they still use inequalities like ‘BLE>1.0’ AND ‘BLE<=5.2’ to specify that any version of BLE above 1.0 and up to 5.2 will work, say. Do you think anyone has really tested that every package depending on BLE works with all those versions? Well, let me answer that myself: hardy har har; no, nobody checked. What happens at most is that the new package is compiled (if the language is statically typed, otherwise not even that) and a few tests are run. Long story short, you can rest assured much and more will break.
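For the record, checking such a range is the easy part. Here is a rough sketch, assuming plain dotted numeric versions; the helper names `parse` and `satisfies` are mine and do not come from any real package manager.

```python
def parse(version, width=3):
    """Turn "1.5" into (1, 5, 0) so that "1.0" and "1.0.0" compare equal."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, lower_exclusive, upper_inclusive):
    """True if lower_exclusive < version <= upper_inclusive."""
    return parse(lower_exclusive) < parse(version) <= parse(upper_inclusive)
```

A call like `satisfies("3.7.2", "1.0", "5.2")` happily says yes. Whether anybody ever built and ran the dependent package against BLE 3.7.2 is a question this check cannot answer.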
The root of all evil
There are many other issues with traditional package management, but instead of describing each one in turn, let me try to explain what I think is the source of all the trouble and propose a possible, if unorthodox, solution. Well, one big problem pointed out above is the deletion of code. If the creator of BLE had never removed wiggleLoop in the first place, you would have no problems at all. But it’s not just deletion. It’s all change (or mutation, to use the technical term) altogether. For example, wiggleLoop could have been changed to actually fizzle loops instead of wiggling them, and nobody might have told you about it. You upgrade to a new version of BLE, happily thinking you’ll profit from all the new bugfixes and performance improvements, and BOOM: you receive a huge number of angry calls from your clients asking why their loops are all fizzled up. Now imagine loops are stock shares, fizzle is the action of selling and wiggle is simply reporting the current share count. You’ve just lost your clients millions of dollars. Stuff like this does happen occasionally, and on a smaller scale it happens all the time.
Thus my proposal is simple: do not change anything! There is a very old paradigm in programming, called purely functional style, that is finally enjoying something of a boom after a very long period of dormancy in academic circles. The word pure means that all functions are innocent and cannot do anything immoral. For example, you might have a function + that adds two numbers, and using it, 1 + 1 is always 2, no matter how many times you carry out that computation. This is so painfully obvious, and yet programmers everywhere are busy creating functions like add(1, 1) that actually do compute the addition (thereby passing the tests) but, as nobody told you, also incidentally launch at least 1+1 nuclear missiles. Of course, every time you call such a function, the number of missiles launched will be different, just to spice it up. The key difference here is between a mathematical function, an object that always does the same thing, and an imperative programming function, a thing that can and will capture your children and skin them alive, making even Lord Ramsey green with envy. And in case you did not know, the reality of these impure functions is the brave new world of computing we live in every day. That’s one of the reasons why you need to turn it off and on again: because nobody friggin’ knows what’s going on.
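In case the missiles seem too abstract, here is a small sketch of the two kinds of function. The names `pure_add`, `impure_add` and `launched` are invented for this example, and the global list merely stands in for the outside world.

```python
launched = []  # global state standing in for "the outside world"


def pure_add(a, b):
    """A mathematical function: the result depends only on the arguments."""
    return a + b


def impure_add(a, b):
    """Returns the same sum, but every call also mutates global state."""
    # Launch one more missile than have been launched so far,
    # so each call does something different behind your back.
    launched.extend(["missile"] * (len(launched) + 1))
    return a + b
```

Both functions pass a test that only looks at the return value. Only one of them leaves `launched` in the state you found it.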
Okay, back to package management. Applying the pure functional perspective, let’s keep our code management pure. That means you cannot delete anything, ever. wiggleLoop will always stay in place and you can use it till the end of time. Also, nobody can change a function; they have to create a new one, just like the old one but a bit different. Thus there will be wobbleLoop 1, wobbleLoop 2, and so on, for every single change, and you can pick whichever one of them you like as your dependency. Maybe the version wobbleLoop 345 that was created two years ago was the best one ever for your purposes? Cool, no problem! You can go back, pick it and use it as it was back then. Thus your codebase just became a persistent data structure!
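A minimal sketch of what "never delete, never change" could look like, assuming an in-memory store. The class `AppendOnlyRegistry` and its methods are hypothetical names of my own, not an existing tool.

```python
class AppendOnlyRegistry:
    """Stores every version of every function; nothing is ever overwritten."""

    def __init__(self):
        self._versions = {}  # name -> list of implementations, oldest first

    def publish(self, name, implementation):
        """Add a new version and return its 1-based version number."""
        history = self._versions.setdefault(name, [])
        history.append(implementation)
        return len(history)

    def get(self, name, version):
        """Return exactly the implementation published as `version`."""
        history = self._versions[name]
        if not 1 <= version <= len(history):
            raise KeyError(f"{name} has no version {version}")
        return history[version - 1]
```

There is deliberately no `delete` and no `update` method. Publishing wobbleLoop 346 leaves wobbleLoop 345 exactly where it was.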
Now, this proposal makes most sense for purely functional languages with static type systems. Static because you only need to consider functions that type-check and functional because pure functions only depend on the variables they reference and on their source code and nothing else. Thus if you can compute a cryptographic hash of the function’s source code and the referenced variables, you are guaranteed to have one stable piece of functionality that you can reuse forever. And this is my proposal!
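Here is roughly what such a hash could look like. This is a sketch under my own assumptions: the function `function_hash` is hypothetical, SHA-256 is just one possible choice of hash, and "referenced variables" are represented by the hashes of the definitions they point to.

```python
import hashlib


def function_hash(source, dependencies=None):
    """Hash a function's source together with the hashes of what it references.

    `dependencies` maps each referenced name to the hash of its definition.
    """
    digest = hashlib.sha256()
    digest.update(source.encode("utf-8"))
    # Sort by name so the result does not depend on dictionary order.
    for name, dep_hash in sorted((dependencies or {}).items()):
        digest.update(b"\0" + name.encode("utf-8"))
        digest.update(b"\0" + dep_hash.encode("ascii"))
    return digest.hexdigest()
```

Because the hash of a dependency feeds into the hash of its user, changing a helper yields a new hash for everything built on top of it, while the old hashes keep pointing at the old, unchanged behaviour.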
The new way
We do not need files. We do not need packages. All we need is those stable bits of functionality. I can’t remember how many times I needed to download megabytes of recursive dependencies just to use a single function of some package! All I wanted was some little bit called BLE.wiggleLoop that could have been stored in a global function database indexed by the hash of that function. And in case it was not already stored there, it would be straightforward to compute it, because it itself depends on just several other references that are similarly easily obtainable. The function database forms an acyclic graph, just like our current package databases, so it’s obvious how to work with it.
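As a sketch of how little you would actually need to fetch, here is a toy function database and a traversal that collects everything one function depends on. The records, the short stand-in hashes and the function `closure` are all invented for illustration.

```python
# Hypothetical database: hash -> record. Real keys would be full
# cryptographic hashes; short labels keep the example readable.
DB = {
    "h-wiggle": {"name": "BLE.wiggleLoop", "deps": ["h-step", "h-clamp"]},
    "h-wobble": {"name": "BLE.wobbleLoop", "deps": ["h-step", "h-noise"]},
    "h-step": {"name": "BLE.step", "deps": ["h-clamp"]},
    "h-clamp": {"name": "util.clamp", "deps": []},
    "h-noise": {"name": "util.noise", "deps": []},
}


def closure(db, root):
    """Return the set of hashes needed to use `root`, including itself."""
    needed, stack = set(), [root]
    while stack:
        key = stack.pop()
        if key not in needed:
            needed.add(key)
            stack.extend(db[key]["deps"])
    return needed
```

Asking for `closure(DB, "h-wiggle")` yields three entries and never touches wobbleLoop or its noise helper, which is the whole point: you pull the function you asked for and its actual dependencies, not the rest of the package.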
Just think about how many times you did not know whether to put some function in file A or file B. If the function is mergeTwoObjects(a, b), where a is from file A and b is from file B, then it belongs in neither file and every decision is completely arbitrary. We have a similar situation if the function does some stuff to two packages. E.g., you have a function getOn putting Passengers into Buses. Do you put it in your Passenger module or your Bus module? Or perhaps you have created a special Passenger-Bus module? Yep, still completely arbitrary.
Where’s the catch, Jim?
Obviously, I realize that there are many questions left to be answered. For example, how do you manage a hundred versions of the same function? Well, you already do that if you’re using some versioning system, don’t you? Except versioning systems make it pretty hard to go back a year, pick some function from back then and use it today. We would need many adjustments in our editing software that would let us easily walk between the versions, and also dynamically update the view of all the dependencies, since older functions depend on other older functions. But the result would be pretty cool. In this world, all code is equal and nothing is ever obsolete.
Another big question is how this kind of management would play with standard package managers. Obviously we cannot just forget about all the existing software. We would need a builder that would take a source file as input and store in the DB all the functions contained therein, as well as their dependencies. Doing this for all standard packages would allow us to use all the old functionality. By patching standard package managers we would then be able to pull new function bits into old packages too. It’s all a lot of work, but I don’t see any insurmountable obstacle and I think it’s definitely the right way to go in the long run.
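To get a feeling for such a builder, here is a rough sketch that uses Python's standard `ast` module to split a source file into its top-level functions and store each one under a hash. The function `ingest` and the record layout are my own assumptions; a real builder would also have to deal with classes, globals, imports and mutual recursion (which this sketch simply rejects).

```python
import ast
import hashlib


def ingest(source, db):
    """Store every top-level function of `source` in `db`, keyed by hash.

    Returns a {function name: hash} mapping for the ingested file.
    """
    tree = ast.parse(source)
    functions = {
        node.name: node for node in tree.body if isinstance(node, ast.FunctionDef)
    }
    hashes = {}

    def store(name, visiting=()):
        if name in hashes:
            return hashes[name]
        if name in visiting:
            raise ValueError(f"cyclic reference involving {name}")
        node = functions[name]
        text = ast.get_source_segment(source, node)
        # Names used in the body that refer to other top-level functions.
        used = {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}
        deps = sorted(used & (functions.keys() - {name}))
        dep_hashes = [store(dep, visiting + (name,)) for dep in deps]
        key = hashlib.sha256("\0".join([text, *dep_hashes]).encode()).hexdigest()
        db[key] = {"name": name, "source": text, "deps": dep_hashes}
        hashes[name] = key
        return key

    for name in functions:
        store(name)
    return hashes
```

Ingesting the same file twice produces the same keys, and ingesting a file where one helper changed adds new entries for that helper and its users while leaving every older entry in the DB untouched.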