
Packages gone wild

Excuse this embarrassingly real-world post on software packages — no endofunctors or compactly generated spaces today, I’m sorry to say.

Problems in the packagehood

Like most other programmers, I’ve experienced my share of trouble when trying to install this version or that of some library or other. The typical example is your application being brought down by a new version of a package you’ve never even heard about. Let me explain how this can happen.

Suppose you are developing the application Cooperative Obstruction of Locomotion (COoL). Naturally, you need to use the library called Best Looping Effects (BLE). So you add a dependency to your build system saying ‘BLE=*’ (meaning all versions of the library will do). The next morning you try to build your application, only to find out it does not work at all. What happened? The author of BLE decided to post a new version, introducing breaking changes from the earlier version. Wiser, you put into your dependencies ‘BLE=’, since that’s the version that worked yesterday and yay, your application works again. And that’s the end of this post, right?

Yeah, right — the very next day, as you rebuild the system, you discover it does not work once more. What now? Well, it turns out that the BLE library included a dependency ‘ASS=*’ (Astoundingly Stupendous Search) and just last night the author of ASS updated the package from to, introducing breaking changes in the process. Oops.

What can you do? That depends on your package manager. If you can specify that you want to use version of ASS when building BLE, you can update your package and it will work, for now. Or you can convince the author of BLE to fix their ASS dependency to specify the exact version. But you won’t be able to use that without changing your code — remember that BLE version had changes that broke your application, so the new will have them too.
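Here is a toy sketch of the two-day disaster above. Everything in it is an assumption made for illustration: the `registry` dictionary, the `resolve` helper and, in particular, the version numbers, which are invented and not the ones from the story.

```python
# Hypothetical registry: package name -> published versions.
# The version numbers are made up for illustration only.
registry = {
    "BLE": [(1, 0), (1, 1)],
    "ASS": [(2, 0)],
}

def resolve(name, constraint):
    """Pick a version; '*' means 'whatever happens to be newest right now'."""
    published = registry[name]
    if constraint == "*":
        return max(published)
    if constraint not in published:
        raise LookupError(f"{name} {constraint} is not published")
    return constraint

# Day 1: COoL pins BLE exactly, but BLE itself asks for 'ASS=*'.
ble = resolve("BLE", (1, 0))
ass_day1 = resolve("ASS", "*")

# Overnight, the author of ASS publishes a breaking release.
registry["ASS"].append((3, 0))

# Day 2: same pin on BLE, yet the build silently picks up a different ASS.
ass_day2 = resolve("ASS", "*")
```

Nothing in COoL’s own dependency list changed between the two days; the wildcard one level down did all the damage.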

Okay, imagine the pink world where ponies roam freely, leprechauns dance in polygonal arrangements and application developers only specify exact versions of their dependencies. Does it solve all your problems? The answer is no. One simple reason is that you might need multiple versions of the same library. For example, BLE 1.5 might include a function wiggleLoop, which is removed in BLE 1.6 and replaced with wobbleLoop. What if you want to both wiggle and wobble? Tough luck, there’s no package manager that I know of that would allow you to keep multiple versions of the same library side by side. You have to pick. And it’s quite probable that the BLE authors had some serious reasons for dropping wiggleLoop, so from version 1.6 onwards you will never see it again. The typical action here is to keep your BLE dependency updated to the newest version, because newer versions offer lots of cool stuff as well as bugfixes. Thus, with a broken heart, you drop all loop wiggling support. All that’s left for you is wailing every second Friday around 8pm, the only thing left to wiggle being your whisky on the rocks.

A remark here is in order, lest people think I don’t know what I’m talking about. Serious package managers do not use * dependencies (the reason being that it’s completely stupid, as I hope to have illustrated above) but they still use inequalities like ‘BLE>1.0’ AND ‘BLE<=5.2’ to specify that any version of BLE from 1.1 to 5.2 will work, say. Do you think anyone has really tested that all packages depending on BLE work with all those versions? Well, let me answer that myself: hardy, har, har; no, nobody checked. What happens at most is that the new package is compiled (if the language is statically typed, otherwise not even that) and a few tests are run. Long story short, you can rest assured much and more will break.
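To make the range semantics concrete, here is a minimal sketch of what a constraint like ‘BLE>1.0’ AND ‘BLE<=5.2’ actually checks. It assumes versions are plain dotted integers, and the helper names `parse_version` and `satisfies` are hypothetical, not taken from any real package manager.

```python
def parse_version(text):
    """Turn '1.10' into (1, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))

def satisfies(version, lower_exclusive, upper_inclusive):
    """True if lower_exclusive < version <= upper_inclusive."""
    v = parse_version(version)
    return parse_version(lower_exclusive) < v <= parse_version(upper_inclusive)

candidates = ["1.0", "1.1", "3.7", "5.2", "5.3"]
accepted = [v for v in candidates if satisfies(v, "1.0", "5.2")]
```

Note what the check does not do: it compares numbers and nothing else. Whether your code really works with every accepted version is left entirely to faith.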

The root of all evil

There are many other issues with traditional package management but instead of describing each one in turn, let me try to explain what I think is the source of all the trouble and propose a possible, if unorthodox, solution. Well, one big problem pointed out above is the deletion of code. If the creator of BLE had never removed wiggleLoop in the first place, you would have no problems at all. But it’s not just deletion. It’s all change (or mutation, to use the technical term) altogether. For example, wiggleLoop could have been changed to actually fizzle loops instead of wiggling them and nobody might have told you about it. You upgrade to a new version of BLE, happily thinking you’ll profit from all the new bugfixes and performance improvements and BOOM: you receive a huge number of angry calls from your clients asking why their loops are all fizzled up. Now imagine loops are stock shares, fizzle is the action of selling and wiggle is simply reporting the current share count. You’ve just lost your clients millions of dollars. Stuff like this does happen occasionally, and on a smaller scale it happens all the time.

Thus my proposal is simple: do not change anything! There is a very old paradigm in programming, called purely functional style, that is finally enjoying something of a boom after a very long period of dormancy in academic circles. The word pure means that all functions are innocent and cannot do anything immoral. For example, you might have a function + that adds two numbers, and with it 1 + 1 is always 2, no matter how many times you carry out that computation. This is so painfully obvious and yet programmers everywhere are busy creating functions like add(1, 1) that actually do compute the addition (thereby passing the tests) but what they did not tell you is that they also incidentally launch at least 1+1 nuclear missiles. Of course, every time you call such a function, the number of missiles launched will be different, just to spice it up. The key difference here is between a mathematical function, an object that always does the same thing, and an imperative programming function, a thing that can and will capture your children and skin them alive, making even Lord Ramsay green with envy. And in case you did not know, the reality of these impure functions is the brave new world of computing we live in every day. That’s one of the reasons why you need to turn it off and on again: because nobody friggin’ knows what’s going on.
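For the skeptics, here is a small sketch of the difference. Python does not enforce purity, so this only illustrates the idea; `pure_add`, `impure_add` and `launch_log` are names invented for the example.

```python
def pure_add(a, b):
    """Depends only on its arguments: same inputs, same output, nothing else happens."""
    return a + b

launch_log = []  # hidden shared state standing in for 'the outside world'

def impure_add(a, b):
    """Returns the right sum (so the tests pass) but also touches shared state."""
    launch_log.append(a + b)  # the side effect no signature will ever tell you about
    return a + b

results = [impure_add(1, 1) for _ in range(3)]
missiles_launched = sum(launch_log)
```

Both functions return 2 for `(1, 1)` every time, so a test that only looks at return values cannot tell them apart. The difference is entirely in what else has changed once the call returns.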

Okay, back to package management. Applying the pure functional perspective, let’s keep our code management pure. That means you cannot delete anything, ever. wiggleLoop will always stay in place and you can use it till the end of time. Also, nobody can change a function; they have to create a new one, just like the old one but a bit different. Thus there will be wobbleLoop 1, wobbleLoop 2, and so on, for every single change, and you can pick whichever one of them you like as your dependency. Maybe the version wobbleLoop 345 that was created two years ago was the best one ever for your purposes? Cool, no problem! You can go back, pick it and use it as it was back then. Thus your codebase just became a persistent data structure!
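As a rough sketch of what “never delete, never change” could look like, consider an append-only registry. The class name `AppendOnlyRegistry`, its methods and the source strings are all hypothetical, chosen only to illustrate the idea.

```python
class AppendOnlyRegistry:
    """Every publish adds a new version; nothing is ever overwritten or removed."""

    def __init__(self):
        self._history = {}  # name -> list of sources; index i holds version i + 1

    def publish(self, name, source):
        versions = self._history.setdefault(name, [])
        versions.append(source)
        return len(versions)  # the freshly assigned version number

    def get(self, name, version):
        return self._history[name][version - 1]

registry = AppendOnlyRegistry()
v1 = registry.publish("wobbleLoop", "wobbleLoop = map wobble")
v2 = registry.publish("wobbleLoop", "wobbleLoop = map wobble . reverse")
old_source = registry.get("wobbleLoop", v1)
```

There is deliberately no `delete` or `update` method: asking for version 1 returns the same source forever, no matter how many later versions pile up.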

Now, this proposal makes most sense for purely functional languages with static type systems. Static because you only need to consider functions that type-check and functional because pure functions only depend on the variables they reference and on their source code and nothing else. Thus if you can compute a cryptographic hash of the function’s source code and the referenced variables, you are guaranteed to have one stable piece of functionality that you can reuse forever. And this is my proposal!
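A minimal sketch of such a hash, assuming SHA-256 and assuming we hash the raw source text together with the hashes of everything the function references. A real system would more likely hash a normalized syntax tree, so that whitespace and renaming do not matter. The function name `function_hash` and the little source strings are made up for the example.

```python
import hashlib

def function_hash(source, dependency_hashes):
    """Hash a function's source together with the hashes of everything it references."""
    digest = hashlib.sha256()
    digest.update(source.encode("utf-8"))
    for dep in sorted(dependency_hashes):  # sorted, so dependency order does not matter
        digest.update(b"\0")               # separator between the hashed fields
        digest.update(dep.encode("ascii"))
    return digest.hexdigest()

step = function_hash("step x = x + 1", [])
wiggle_v1 = function_hash("wiggleLoop = map step", [step])

# Changing a dependency changes the hash of everything built on top of it.
step_changed = function_hash("step x = x + 2", [])
wiggle_on_changed_step = function_hash("wiggleLoop = map step", [step_changed])
```

The source of wiggleLoop is identical in both cases, yet the two hashes differ, because the thing it refers to as `step` is no longer the same function.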

The new way

We do not need files. We do not need packages. All we need are those stable bits of functionality. I can’t remember how many times I needed to download megabytes of recursive dependencies just to use a single function of some package! All I wanted was some little bit called BLE.wiggleLoop that could have been stored in a global function database indexed by the hash of that function. And in case it was not already stored there, it would be straightforward to compute it because it itself depends on just several other references that are similarly easily obtainable. The function database forms an acyclic graph, same as our current package databases, so it’s obvious how to work with it.
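Here is a toy version of such a function database, to show how little you would need to fetch. It is only a sketch under my own assumptions: an in-memory dictionary stands in for the global database, and `store` and `closure` are hypothetical names.

```python
import hashlib

database = {}  # hash -> (source, tuple of dependency hashes)

def store(source, deps=()):
    """Store a function under the hash of its source and its dependencies' hashes."""
    deps = tuple(sorted(deps))
    key = hashlib.sha256("\0".join((source,) + deps).encode("utf-8")).hexdigest()
    database[key] = (source, deps)
    return key

def closure(key, seen=None):
    """Collect a function's hash plus the hashes of all its transitive dependencies."""
    seen = set() if seen is None else seen
    if key not in seen:
        seen.add(key)
        for dep in database[key][1]:
            closure(dep, seen)
    return seen

step = store("step x = x + 1")
wiggle = store("wiggleLoop = map step", [step])
unrelated = store("fizzleLoop = const []")

needed = closure(wiggle)  # everything required to use wiggleLoop, and nothing more
```

Since a function’s hash is computed from the hashes of its dependencies, those have to exist first, which is what keeps the graph acyclic in this sketch.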

Just think about how many times you did not know whether to put some function in file A or file B. If the function is mergeTwoObjects(a, b) where a is from file A and b is from file B, then it belongs in neither file and every decision is completely arbitrary. We have a similar situation if the function does some stuff to two packages. E.g., you have a function getOn putting Passengers into Buses. Do you put it in your Passenger module or your Bus module? Or perhaps you have created a special Passenger-Bus module? Yep, still completely arbitrary.

Where’s the catch, Jim?

Obviously, I realize that there are many questions left to be answered. For example, how do you manage a hundred versions of the same function? Well, you already do that if you’re using some versioning system, don’t you? Except versioning systems make it pretty hard to go back a year, pick some function from back then and use it today. We would need many adjustments in our editing software that would let us easily walk between the versions and also dynamically update the view of all the dependencies, since older functions depend on other older functions. But the result would be pretty cool. In this world, all code is equal and nothing is ever obsolete.

Another big question is how this kind of management would play with standard package managers. Obviously we cannot just forget about all the existing software. We would need a builder that would take as input a source file and store in the DB all the functions contained therein, as well as their dependencies. Doing this for all standard packages would allow us to use all the old functionality. By patching standard package managers we would then be able to pull new function bits into old packages too. It’s all a lot of work but I don’t see any insurmountable obstacle and I think it’s definitely the right way to go in the long run.
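To show that such a builder is not science fiction, here is a rough sketch that uses Python source files as a stand-in for “standard packages”. The function `import_source` and the record layout are my own invention. For brevity it records references by name rather than by hash and only looks at top-level functions; a real builder would have to resolve every reference to a hash and handle imports.

```python
import ast
import hashlib

def import_source(source_text, database):
    """Store every top-level function of a module in `database`, keyed by hash."""
    tree = ast.parse(source_text)
    functions = {node.name: node for node in tree.body
                 if isinstance(node, ast.FunctionDef)}
    stored = {}
    for name, node in functions.items():
        body = ast.get_source_segment(source_text, node)
        # Other functions from the same file that this function mentions.
        referenced = sorted({n.id for n in ast.walk(node)
                             if isinstance(n, ast.Name)
                             and n.id in functions and n.id != name})
        key = hashlib.sha256(body.encode("utf-8")).hexdigest()
        database[key] = {"name": name, "source": body, "references": referenced}
        stored[name] = key
    return stored

legacy_module = '''
def step(x):
    return x + 1

def wiggle_loop(xs):
    return [step(x) for x in xs]
'''

db = {}
index = import_source(legacy_module, db)
```

After the import, `index` maps each function name to its hash and `db` holds the individual functions, so the file they came from no longer matters.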



Whence and whither?

Welcome, dear reader.

This blog is meant to be a brain dump of my thoughts on math, physics and computer science. Nevertheless, I’ll try to make these ideas as approachable as possible, so that you can share them with your grandparents and/or grandchildren (if you have neither, feel free to talk about the post with yourself).

But let’s not dwell on idle chatter and jump instead straight into the deep end. There is a certain branch of mathematics known as category theory that deals with pictures such as this one.

A -> B

It turns out that one can describe most of the objects that one encounters in mathematics and science with similar pictures. I will not dwell on this fundamental fact here, since I mean to describe such objects extensively in future installments. Just take my word for it that pink bubbles and arrows are important.

A consequence of this observation is that mathematics contains an implicit notion of direction that one might not have a priori expected to find there. In the picture above, A is the producer, B is the consumer, and not the other way around. In this post I’d like to discuss precisely this kind of asymmetry and point out that it is not found just in mathematics but also all around us and is perfectly natural.

I — and I suspect many other people — consider myself a consumer. I eagerly read my twitter and RSS feeds. But I do not tweet that much and since this is my very first post, it’s obvious that neither do I blog that much. You can often see me with my nose deep in some mathematical book or other. But I do not write books. I listen to music all the time but I do not compose it nor even interpret it. Why is that? And is it all right or should I change something about it?

I’d like to argue that being a consumer is perfectly natural. For one thing, suppose everybody was a composer and nobody went to concerts. If a virtuosic piece is played and no one is around to hear it, does it make a sound? Okay, I do not want to push it with my paraphrases but you get the point.

Obviously, I do not claim that everybody is either a composer or a listener and nothing else. We get this relationship for every single product, be it art, science, or a piece of furniture and a single person can and does participate in dozens of them. And the world kind of likes to balance these relationships out so that there is a steady flow of products from producers to consumers. Yes, I am talking about the (in)famous (in)visible hand of the free market. But that’s not quite what I wanted to talk about.

I know you hoped I’d forgotten already, but let’s get back to those feelings of guilt you get when you haven’t done anything at all all weekend beyond downloading images of kittens, tucking into nachos and dribbling over Khaleesi. That is your producer instinct kicking in. In mathematics there is a fundamental principle of duality. I remarked above that everyone being a composer would not be very useful. But neither would everybody being a listener: we would have nothing to listen to (well, technically, nature itself does make some interesting sounds too but who cares).

The point is that while it’s fine being a consumer in many areas, one should also produce something from time to time. That’s why I’m starting this blog — duh!



P.S.: I hope you did not expect to find some grand life-changing ideas here. It is my first post after all. Here, have a picture of a kitten and fare well, dear reader!