JuliaLang: The Ingredients for a Composable Programming Language
One of the most remarkable things about the Julia programming language is how well its packages compose. You can almost always reuse someone else’s types or methods in your own software without issues. At a high level this is generally taken to be true of all programming languages, because that is what a library is. However, experienced software engineers often note that it’s surprisingly difficult in practice to take something from one project and use it in another without tweaking. In the Julia ecosystem, though, this seems to mostly work. This post explores some theories on why, and offers some loose recommendations to future language designers.
This blog post is based on a talk I was invited to give at the 2020 F(by) conference. Hopefully videos from that will be up soon, and I will link them here. This blog post is a bit ad hoc in its ordering and content because of its origin as a talk. I trust the reader will forgive me.
Name collisions make package authors come together and create base packages (like StatsBase) and agree on what the functions mean.
They don’t have to, since the user can still solve it, but it encourages the practice. Thus we have package authors thinking about how other packages might be used with theirs.
Package authors can even overload functions from multiple namespaces if they want, e.g. all of MLJBase.predict, StatsBase.predict, SkLearn.predict, which might all have slightly different interfaces targeting different use cases.
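As a rough sketch of what overloading functions from several namespaces can look like, the example below uses two made-up modules, StatsLike and MLLike, as stand-ins for real base packages. Their names, the MeanModel type, and both signatures are assumptions for illustration and do not reflect the actual interfaces of the packages named above.

```julia
# Two hypothetical "base" packages, each owning its own `predict` function.
module StatsLike
    function predict end   # declared here; methods are added by other packages
end

module MLLike
    function predict end
end

# A hypothetical model package that supports both interfaces.
module MyModels
    import ..StatsLike, ..MLLike

    struct MeanModel
        mean::Float64
    end

    # StatsLike-style interface (assumed): predict(model, inputs)
    StatsLike.predict(m::MeanModel, xs::AbstractVector) = fill(m.mean, length(xs))

    # MLLike-style interface (assumed): predict(model, fitresult, inputs)
    MLLike.predict(m::MeanModel, fitresult, xs::AbstractVector) = fill(fitresult, length(xs))
end

model = MyModels.MeanModel(2.0)
StatsLike.predict(model, [1, 2, 3])      # dispatches to the StatsLike-style method
MLLike.predict(model, 5.0, [1, 2, 3])    # dispatches to the MLLike-style method
```

Because each `predict` lives in its own namespace, the two functions never collide, and a user picks whichever interface suits their use case.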
It’s easier to create a package than a local module.
Many languages have one module per file, and you can load that module e.g. via import Filename from your current directory.
You can make this work in julia also, but it is surprisingly fiddly.
What is easy, however, is to create and use a package.
What does making a local module generally give you?
The feeling you are doing good software engineering
Easier to transition later to a package
What does making a julia package give you?
All the above plus
Standard directory structure, src, test etc
Managed dependencies, both what they are, and what versions
Easy redistribution – harder to have local state
Test-able using package manager’s pkg> test MyPackage
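To give a feel for how little ceremony is involved, here is a minimal sketch using the Pkg standard library to generate a package skeleton. The package name MyPackage and the use of a temporary directory are choices made for this illustration only.

```julia
using Pkg

# Generate a skeleton package inside a throwaway directory.
pkgdir = cd(mktempdir()) do
    Pkg.generate("MyPackage")
    joinpath(pwd(), "MyPackage")
end

# The skeleton contains a Project.toml (name, UUID, dependencies)
# and a src/MyPackage.jl file holding the module itself.
println(isfile(joinpath(pkgdir, "Project.toml")))
println(isfile(joinpath(pkgdir, "src", "MyPackage.jl")))

# Tests go in test/runtests.jl, which is what `pkg> test MyPackage` runs.
```

Dependencies added while the package's environment is active get recorded in its Project.toml, which is where the managed dependencies and versions come from.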
Julia uses a JIT compiler, so even compilation errors don’t arrive until run-time. As a dynamic language the type system says very little about correctness.
Testing julia code is important. If code-paths are not covered in tests there is almost nothing in the language itself to protect them from having any kind of error.
So it’s important to have Continuous Integration and other such tooling set up.
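A small sketch of why coverage matters: the hypothetical function below has a branch that calls a function that does not exist. Julia accepts the definition without complaint, and the mistake only shows up once a test actually runs that branch. The names `describe` and `undefined_helper` are made up for this example.

```julia
using Test

# The `else` branch calls a function that was never defined.
# Nothing complains when `describe` itself is defined.
function describe(x)
    if x > 0
        return "positive"
    else
        return undefined_helper(x)
    end
end

# This test passes and never touches the broken branch.
@test describe(1) == "positive"

# Only exercising the other branch reveals the error.
@test_throws UndefVarError describe(-1)
```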
Trivial package creation is important
Many people who create julia packages are not traditional software developers; e.g. a large portion are academic researchers. People who don’t think of themselves as “Developers” are less inclined to take the step to turn their code into a package.
Recall that many julia package authors are graduate students who are trying to get their next paper complete. Lots of scientific code never gets released, and lots of the code that does never gets made usable for others. But if they start out writing a package (rather than a local module that just works for their script) then it is already several steps closer to being released. Once it is a package, people start thinking more like package authors, and start to consider how it will be used.
It’s not a silver bullet but it is one more push in the right direction.
Multiple Dispatch + Duck-typing
Assume it walks like a duck and talks like a duck, and if it doesn’t, fix that.
Julia’s combination of duck-typing with multiple dispatch is quite neat. It lets us support any object that meets the implicit interface expected by a function (duck-typing), while also giving us a chance to handle it as a special case if it doesn’t (multiple dispatch), all in a fully extensible way.
This pairs with a weakness of julia: its lack of a static type system. A static type system’s benefits come from ensuring interfaces are met at compile time. This largely makes it incompatible with duck-typing. (There are other interesting options in this space though, e.g. Structural Typing.)
The example in this section will serve to illustrate how duck-typing and multiple dispatch give the expressivity that is essential for composability.
Aside: Open Classes
Another closely related factor is Open Classes. I’m not going to talk about that today; I recommend finding other resources on it, such as Eli Bendersky’s blog post on the expression problem. In short, you need to allow new methods to be added to existing classes. Some languages (e.g. Java) require that methods literally be placed inside the same file as the class. This means there is no way to add methods from another code-base, even unrelated ones.
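For a concrete feel of what this openness looks like in Julia, here is a minimal sketch. The Shapes module stands in for someone else's library; it and every name in it are hypothetical.

```julia
# A hypothetical library module that defines a type and one function on it.
module Shapes
    struct Circle
        radius::Float64
    end
    area(c::Circle) = pi * c.radius^2
end

# In a different code-base: a brand-new function for the library's existing type...
perimeter(c::Shapes.Circle) = 2 * pi * c.radius

# ...and a new method on the library's existing function, for my own type.
struct Square
    side::Float64
end
Shapes.area(s::Square) = s.side^2

perimeter(Shapes.Circle(1.0))
Shapes.area(Square(3.0))
```

Neither addition required touching the file where Shapes is defined.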
We would like to use some code from a library
Suppose I have a type from the Ducks library.
And I have some code I want to run, that I wrote:
Let’s give it a try:
3 adult ducks, 2 baby ducks:
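The pieces just described might look roughly like the following sketch. The function names (`walk`, `talk`, `raise_young`, `simulate_farm`), the returned strings, and the rule that the first adult raises all the young are assumptions made for illustration, not a listing from any real Ducks library.

```julia
# --- Stand-in for the Ducks library (all names are assumptions) ---
struct Duck end

# Duck-typed: no type annotations, so these accept any argument.
walk(animal) = "Waddle"
talk(animal) = "Quack"
raise_young(parent, child) = "Lead to water"

# --- My own code ---
function simulate_farm(adults, babies)
    events = String[]
    for animal in adults
        push!(events, walk(animal), talk(animal))
    end
    # Assumption: the first adult acts as parent to all the babies.
    parent = first(adults)
    for child in babies
        push!(events, raise_young(parent, child))
    end
    return events
end

events = simulate_farm([Duck(), Duck(), Duck()], [Duck(), Duck()])
foreach(println, events)
```

Each adult waddles and quacks, and both babies get led to water.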
Great, it works
OK, now I want to extend it with my own type: a Swan.
Let’s test with just one first:
The Waddle was right, but Swans don’t Quack.
We did some duck-typing – Swans walk like ducks, but they don’t talk like ducks.
We can solve that with single dispatch.
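A minimal sketch of that fix, assuming the library's `walk` and `talk` are generic duck-typed functions as named here (the names and strings are illustrative assumptions):

```julia
struct Duck end
struct Swan end   # my new type

# Duck-typed behaviour, as the library might define it (assumed names).
walk(animal) = "Waddle"
talk(animal) = "Quack"

# Single dispatch on the animal's type: Swans get their own `talk`.
talk(::Swan) = "Hiss"

walk(Swan())   # still the generic method: "Waddle"
talk(Swan())   # the Swan-specific method: "Hiss"
talk(Duck())   # unchanged: "Quack"
```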
Great, now let’s try a whole farm of Swans:
That’s not right. Swans do not lead their young to water.
They carry them.
Once again we can solve this with single dispatch.
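Again as a sketch under the same assumed names, dispatching on the parent's type alone is enough here:

```julia
struct Duck end
struct Swan end

# Generic, duck-typed fallback (assumed to come from the library).
raise_young(parent, child) = "Lead to water"

# Single dispatch on the parent's type: Swan parents carry their young.
raise_young(parent::Swan, child) = "Carry on back"

raise_young(Duck(), Duck())   # "Lead to water"
raise_young(Swan(), Swan())   # "Carry on back"
```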
Now I want a Farm with mixed poultry.
2 Ducks, a Swan, and 2 baby swans
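A sketch of this mixed farm, using the same assumed names and assuming that the first adult raises all the young:

```julia
struct Duck end
struct Swan end

raise_young(parent, child) = "Lead to water"         # generic fallback
raise_young(parent::Swan, child) = "Carry on back"   # Swan parents

adults = [Duck(), Duck(), Swan()]
babies = [Swan(), Swan()]

# Assumption: the first adult (a Duck here) raises all the young.
parent = first(adults)
outcomes = [raise_young(parent, child) for child in babies]
```

Since the parent is a Duck, only the generic method applies, whatever the type of the child.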
That’s not right again.
We had a Duck raising a baby Swan, and it led the baby Swan to water.
If you know about raising poultry, then you will know: Ducks given baby Swans to raise will just abandon them.
But how will we code this?
Option 1: Rewrite the Duck
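One way such a rewrite could look is sketched below. It assumes the library's `raise_young` method for Duck can be edited directly, and that the library is able to see my Swan type; both are assumptions of this illustration.

```julia
struct Duck end
struct Swan end   # my type, which the library now has to know about

# The library's method, edited to special-case my type.
function raise_young(parent::Duck, child)
    if child isa Swan
        return "Abandon"
    else
        return "Lead to water"
    end
end

raise_young(Duck(), Duck())   # "Lead to water"
raise_young(Duck(), Swan())   # "Abandon"
```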
Rewriting the Duck has problems
Have to edit someone else’s library to add support for my type.
This could mean adding a lot of code for them to maintain.
Does not scale: what if other people wanted to add Chickens, Geese, etc.?
Variant: could copy their code into my library, but then I will run into issues like not being able to update.
Scaled to other people adding new types, this is even worse, since there is no longer a central canonical source to copy.
Variant: could fork their code.
That is giving up on code reuse.
There are engineering solutions around this. Design patterns allow one to emulate features a language doesn’t have. For example, the Duck could allow one to register behaviour for a given baby animal, which is basically ad hoc runtime multiple dispatch. But this would require the Duck to be rewritten this way.
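To illustrate the registration idea, here is a rough sketch of a registry keyed on the child's type. The registry, `register_child!`, and all other names are hypothetical; a real library would have to be designed this way up front.

```julia
struct Duck end
struct Swan end

# Hypothetical registry owned by the Duck library: child type => behaviour.
const DUCK_CHILD_BEHAVIOUR = Dict{DataType,Function}()
register_child!(f, T::DataType) = (DUCK_CHILD_BEHAVIOUR[T] = f)

# The library looks up the child's type at run time.
function raise_young(parent::Duck, child)
    handler = get(DUCK_CHILD_BEHAVIOUR, typeof(child), nothing)
    return handler === nothing ? "Lead to water" : handler(parent, child)
end

# In my code: register what a Duck does with a baby Swan.
register_child!(Swan) do parent, child
    "Abandon"
end

raise_young(Duck(), Duck())   # "Lead to water"
raise_young(Duck(), Swan())   # "Abandon"
```

The lookup on `typeof(child)` is a hand-rolled version of what dispatch on the second argument would do.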
Option 2: Inherit from the Duck
(NB: this example is not valid julia code)
Inheriting from the Duck has problems:
Have to replace every Duck in my code-base with DuckWithSwanSupport.
If I am using other libraries that might return a Duck, I have to deal with that also.
Again there are design patterns that can help, like using Dependency Injection to control how all Ducks are created. But now all libraries have to be rewritten to use it.
Still does not scale:
If someone else implements DuckWithChickenSupport, and I want to use both their code and mine, what do I do?