Created
March 8, 2013 05:36
Save markhibberd/5114434 to your computer and use it in GitHub Desktop.
version rant
Why do I believe this distinction is important?
Versions are virtual, they express intent, not reality. Artifacts are real. How
they are identified is up for debate but I am of the belief that it is not, and
never can be, by versions.
As a concrete example, if I am using cabal and make the statement that I depend on
version 1.7.* of library barney, I am not stating that I think I work with all
versions of barney that start with 1.7. I am merely stating that my expectations
are that I am compatible with version 1.7 of barney and that I *believe* barney intends
to convey some semantics with that version number and will modify the number if those
semantics change. I am entering into a contract of trust, but it is a contract I must have if
I want to work effectively with barney. If barney breaks that contract - I should
reconsider and determine whether my expectations were incorrect or whether barney was a
complete arsehole.
This is where I believe cabal (and pretty much every dependency tool) dies. It stops there.
I have not, and can not, make any claims as to correctness. I do not know what artifacts
version 1.7.* encompasses, and I would argue that you can not either (even if the dependency
was stricter - 1.7.1), so how can I claim that I work with them?
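To make that gap concrete, here is a minimal Python sketch. It is not cabal's actual matching logic; `matches`, `artifact_id` and the two sample builds are hypothetical names made up for illustration. The point it tries to show is that a constraint is only a predicate on version strings, so two different artifacts that carry the same number look identical to it.

```python
import hashlib

def matches(constraint: str, version: str) -> bool:
    """True if a dotted version satisfies a wildcard constraint such as '1.7.*'."""
    want = constraint.split(".")
    have = version.split(".")
    if want[-1] == "*":
        prefix = want[:-1]
        return have[:len(prefix)] == prefix
    return want == have

def artifact_id(data: bytes) -> str:
    """Short content hash standing in for a real artifact identifier."""
    return hashlib.sha256(data).hexdigest()[:8]

# Two hypothetical builds of barney that both claim to be 1.7.1.
artifact_a = b"barney 1.7.1, built on monday"
artifact_b = b"barney 1.7.1, built on tuesday with a different compiler"

# The constraint accepts the version number, and so accepts both builds...
both_accepted = matches("1.7.*", "1.7.1")
# ...even though they are different artifacts.
really_different = artifact_id(artifact_a) != artifact_id(artifact_b)
```

Nothing in `matches` ever sees the bytes, which is the whole problem: the claim "I work with 1.7.*" is a claim about labels.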
This flows on to what I wanted to accomplish with chalk. There should be versions that
specify intent, but there should also be identifiers that track real artifacts (a hash
or signature being the most easily consumable representations). This is where it becomes
important to link building, testing and publishing. I think I should be able to publish
metadata that represents real artifacts, and clients should be able to leverage that
metadata. I can publish something that depends on 1.7.*, but has been built with ac1df12e
and fee1fbf1. Clients should have the ability to say "I don't care, give me 1.7.*" or "give
me something that has been verified with a specific build" (to what extent means you are
in a trust relationship with the publisher, but a more useful one). The more useful
metadata you have, the more awesome the client can be. For example: "I want all these
dependencies, with a preference for stable, and a preference for something that has been
tested with experimental compiler flag x."
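As a rough sketch of what such published metadata might look like (this is not chalk's design; the `Release` record, the `acceptable` function, the policy names and the release's own hash `0badc0de` are all assumptions made for illustration), a release could carry both its declared range and the hashes of the dependency builds it was actually verified with. The client then picks how much evidence it wants:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Release:
    name: str
    version: str                 # intent: what the publisher claims
    artifact: str                # reality: hash of the built artifact (hypothetical value)
    depends_on: str              # declared range, e.g. "1.7.*"
    verified_with: frozenset = frozenset()  # hashes of dependency builds it was built/tested against

def acceptable(release: Release, candidate_hash: str, policy: str) -> bool:
    """Decide whether a dependency build is acceptable for this release.

    Assumes the candidate already satisfies release.depends_on.
    policy "any":      trust the version range alone.
    policy "verified": require the publisher's evidence for this exact build.
    """
    if policy == "any":
        return True
    if policy == "verified":
        return candidate_hash in release.verified_with
    raise ValueError(f"unknown policy: {policy}")

mine = Release(
    name="mylib",
    version="0.1.0",
    artifact="0badc0de",
    depends_on="1.7.*",
    verified_with=frozenset({"ac1df12e", "fee1fbf1"}),
)
```

With this shape, `acceptable(mine, "ac1df12e", "verified")` holds, while an unknown build only passes under the "any" policy. A real tool would need richer evidence (test results, compiler flags), but the split between claimed range and verified builds is the part that matters here.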
I want this to be true. And it sucks every day that I don't have this level of tooling.
Basically, it's all just fucked and we should go back to writing monolithic codebases because this reuse stuff is more trouble than it's worth :P
But seriously... you've obviously got more of a handle on this than me, but I really suspect, were all of our ideas to be implemented, we'd still have a shitload of problems. I'm only just starting to realise how goddamn complex this situation is.
Couldn't agree more.
I think what I want is to have assertions of compatibility in a very concrete form. If dependency resolution gets me B and C, where B depends on C, I want some way of knowing that B behaves correctly when used with C.
This could be information stored in the dependency system - e.g. a test result could note that a CI build has demonstrated that B compiles with C, or that B's tests have passed when using C.
Perhaps you have a few relationships to model in the dependency system:
If I then ask the dependency system to resolve the Y for my X, and I state a preference for stability, then the dependency system should prefer 1.2.8 over 1.2.*, as the CI result provides evidence of correctness.
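A minimal sketch of that preference, assuming the dependency system records which versions of Y have a passing CI result against X (the `resolve` function and the sample data are hypothetical, not any real resolver's API):

```python
def resolve(candidates, evidence, prefer_stable=True):
    """Pick one version from candidates that already match the requested range.

    evidence: set of versions for which a CI result demonstrated compatibility.
    With prefer_stable, a verified version beats any unverified one;
    otherwise the numerically highest version wins.
    """
    def key(version):
        numbers = tuple(int(part) for part in version.split("."))
        if prefer_stable:
            return (version in evidence, numbers)
        return (numbers,)
    return max(candidates, key=key)

candidates = ["1.2.7", "1.2.8", "1.2.9"]   # everything matching 1.2.*
evidence = {"1.2.8"}                       # CI has only ever run against 1.2.8

stable_pick = resolve(candidates, evidence)                        # "1.2.8"
latest_pick = resolve(candidates, evidence, prefer_stable=False)   # "1.2.9"
```

The range still defines what is allowed; the evidence only changes the ordering inside it.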
I think we need to ask: what does it mean for me to trust another component? Right now, we all use systems of versions and use our own tests to verify - I depend on Q version 1.* and I trust that Q behaves "as I expect it to" for all versions 1.*.
Really, I wish to state a set of properties that I expect Q to exhibit - e.g. quickcheck properties, proofs of properties.
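As a hand-rolled illustration of what "stating a property" could mean (this uses only Python's standard library rather than QuickCheck itself, and `q_v1`, `q_v2`, `holds` and `preserves_length` are made-up stand-ins), the consumer writes down what it relies on and runs that against whichever release of Q it is handed:

```python
import random

def holds(prop, gen, trials=200, seed=0):
    """Check a property against randomly generated inputs (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return all(prop(gen(rng)) for _ in range(trials))

def int_list(rng):
    """Generate a short list of small integers, so duplicates are common."""
    return [rng.randint(0, 5) for _ in range(rng.randint(0, 10))]

# Stand-ins for two releases of Q that both sit inside "1.*".
def q_v1(xs):
    return sorted(xs)

def q_v2(xs):
    return sorted(set(xs))   # silently drops duplicates

def preserves_length(q):
    """The property this consumer actually relies on."""
    return lambda xs: len(q(xs)) == len(xs)

v1_ok = holds(preserves_length(q_v1), int_list)
v2_ok = holds(preserves_length(q_v2), int_list)
```

Here `v1_ok` is true and `v2_ok` is false, even though a version range would have accepted both releases. Random testing is weaker than a proof, but it is already a more concrete statement than "1.*".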
I think you absolutely nailed it with "The more useful metadata you have, the more awesome the client can be."
I got to thinking about this from working with cabal - I tweeted this a while back. This is how I think it happened:
I directly depended on a component and one of its dependencies. However, the component had been built against a version of that dependency that was outside of my spec.
e.g. I depend on X version 1.3 and Y version 1.7. X depends on Y version 1.*. However, when I had cabal install Y and X, X was built against Y version 1.1, and it couldn't handle it.
If you looked at all of the specs of the dependency tree, an answer could be resolved. However, when one component was compiled, the versions of its dependencies were resolved and those resolutions crystallised.
So, I thought: well, if I have an X version 1.3 that depends on Y version 1.1 and another that depends on Y version 1.2... well, there are 2 different X version 1.3s.
So, what the heck does a version mean for a built artifact, if you can resolve dependencies differently and still call it the same number?
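One way to picture the "two different X version 1.3" problem is to give every built artifact an identity that folds in the dependencies it was resolved against. The sketch below is only an illustration of that idea (the `build_id` function and its hashing scheme are assumptions, not how cabal identifies packages):

```python
import hashlib

def build_id(name, version, resolved_deps):
    """Identity of a built artifact: its own name and version plus the
    build ids of the exact dependencies it was compiled against.

    resolved_deps: mapping of dependency name -> build id used at compile time.
    """
    h = hashlib.sha256()
    h.update(f"{name}-{version}".encode())
    for dep in sorted(resolved_deps):      # sorted so the id is order-independent
        h.update(f"|{dep}={resolved_deps[dep]}".encode())
    return h.hexdigest()[:12]

y_1_1 = build_id("Y", "1.1", {})
y_1_2 = build_id("Y", "1.2", {})

# Same source, same version number, different resolved dependency.
x_against_y_1_1 = build_id("X", "1.3", {"Y": y_1_1})
x_against_y_1_2 = build_id("X", "1.3", {"Y": y_1_2})
```

Both artifacts are "X version 1.3", yet their ids differ, and rebuilding against the same Y reproduces the same id. The version number names the source's intent; the id names the thing that actually got built.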
I think ultimately, the idea of using version numbers to model changes in software components just doesn't work. It's supposed to be a simple, practical model, but it's neither.
In the end, I think this simple model glosses over some fundamentally important complexities. What does it mean for one component to depend on another and be correct? Can you trust that their definition of correct (passes their tests) makes you correct? Do you need to test your dependencies?
Within a component, everything's fine. Things change in parallel. Dependent behaviors can be tested. But once you toss in that component barrier it all falls to shit.
Like many quality problems in software, I really think we need the approach of statically provable properties of programs. We need much stronger guarantees than "I think it works with 3.*".