Here's the thing: Google didn't solve the problem. Google didn't need to solve the problem, because their code didn't actually reference public GitHub. Everything they referenced was either internal or an internal fork of something external.
They could do this because they have an amazing CI/CD pipeline. If somebody updated HEAD on an internal dependency, the DevOps/SRE teams could confidently redeploy all dependent services, with alerts and automated rollbacks for failures ... /1
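Sketched very loosely, that property looks something like this (a Python cartoon with invented service names; this is nothing like Google's actual tooling):

```python
# A cartoon of live-at-head CI (made-up names; nothing like Google's
# actual system). The point: pinning to HEAD is only safe when every
# dependent gets rebuilt, tested, and auto-rolled-back on failure.
DEPENDENTS = {"internal-lib": ["search-svc", "ads-svc", "mail-svc"]}

def on_head_updated(lib: str, run_tests) -> None:
    for service in DEPENDENTS[lib]:
        if run_tests(service):
            print(f"{service}: redeployed against new {lib} HEAD")
        else:
            print(f"{service}: tests failed -> alert + automated rollback")

# Pretend mail-svc breaks against the new HEAD:
on_head_updated("internal-lib", run_tests=lambda svc: svc != "mail-svc")
```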
... But 99.9% of software companies don't have this quality of tooling. They don't have this confidence in deploys. They don't have the resources to "internal fork" every dependency they need, and they're not paying most of these public projects they use.
So they need to make some concessions.
The first concession is typically "shrink wrapping" your dependencies. You declare the versions of your dependencies and the build system pulls in those exact versions every time /2
Now you're not referencing main/HEAD, but you do know exactly what you're getting. The build system can cache those artifacts.
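A minimal sketch of what pinning buys you, assuming a hypothetical manifest and a `deps.lock` file (names and format invented, not any real tool's):

```python
import hashlib
import json

# Hypothetical manifest: the versions you declared (names made up).
MANIFEST = {
    "left-pad": "1.3.0",
    "some-http-lib": "2.1.4",
}

def resolve(manifest: dict) -> dict:
    """Pretend-resolve each declared dependency to exact, verifiable
    content. In a real tool the bytes would be a downloaded tarball."""
    resolved = {}
    for name, version in manifest.items():
        fake_tarball = f"{name}@{version}".encode()
        resolved[name] = {
            "version": version,
            # The hash records the bytes you actually got, so later
            # builds can verify a cached copy instead of re-fetching.
            "sha256": hashlib.sha256(fake_tarball).hexdigest(),
        }
    return resolved

# Write the lockfile once; every later build reads it back instead of
# asking the network what "latest" means today.
with open("deps.lock", "w") as f:
    json.dump(resolve(MANIFEST), f, indent=2, sort_keys=True)
```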
Other people's repos don't come with warranties, so you need to build your own assurances... /3
"As soon as the source code is merged to the main
branch, it should be considered published
."
The reason this doesn't work is that the people who write these open source libraries don't actually provide that guarantee. Often they don't want their code to work this way.
If I'm building an open source library, sometimes I need to make a backwards-breaking change, and sometimes I need to make a security fix on an old version. I can't do both with a single 'main' branch .../4
So at a minimum I end up with several git branches, each tagged with a specific version.
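Roughly, the bookkeeping looks like this sketch (the `release-X.Y` branch and `vX.Y.Z` tag naming is hypothetical; real projects each pick their own scheme):

```python
def security_fix_targets(affected: list) -> dict:
    """For each affected release line, the maintenance branch a fix
    gets backported to and the patch tag it ships as. Branch/tag
    naming here is hypothetical; the point is that one 'main' can't
    serve 2.x (breaking changes) and 1.x (security patches) at once."""
    targets = {}
    for version in affected:
        major, minor, patch = (int(p) for p in version.split("."))
        targets[f"release-{major}.{minor}"] = f"v{major}.{minor}.{patch + 1}"
    return targets

# One CVE, several release lines, several branches:
print(security_fix_targets(["1.4.2", "2.0.1"]))
# {'release-1.4': 'v1.4.3', 'release-2.0': 'v2.0.2'}
```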
But like most open source libraries, mine ends up using other libraries that have the same versioning challenges (backwards breaks, security patches)... So now I need to point my code at their specific branches.
And at this point, you basically have packaging, except you're scouring the internet on every build, loading everyone's dependencies... /5
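Here's the naive version of that build step, sketched with a toy in-memory "internet" standing in for real clones (all repo names invented):

```python
# Toy dependency graph standing in for the public internet; each repo
# pins the branches of its own dependencies (all names made up).
FAKE_INTERNET = {
    ("github.com/me/app", "main"): [("github.com/a/lib", "release-1.4")],
    ("github.com/a/lib", "release-1.4"): [("github.com/b/util", "release-2.0")],
    ("github.com/b/util", "release-2.0"): [],
}

def fetch_deps(url: str, branch: str) -> list:
    """Stand-in for a network clone + manifest read. In the naive
    scheme this hits the internet on every build, at every level."""
    print(f"cloning {url}@{branch} ...")
    return FAKE_INTERNET[(url, branch)]

def resolve(url: str, branch: str, seen=None) -> None:
    """Walk the whole transitive graph over the network. This is
    'packaging', minus the caching and the guarantees."""
    seen = seen if seen is not None else set()
    if (url, branch) in seen:
        return
    seen.add((url, branch))
    for dep_url, dep_branch in fetch_deps(url, branch):
        resolve(dep_url, dep_branch, seen)

resolve("github.com/me/app", "main")
```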
Then one day you have a left-pad incident or a supply-chain attack.
So you start caching external repos and their dependencies. Then you start building them and running their tests and trying to keep them up to date...
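A sketch of that first caching step, with a hypothetical mirror directory and a toy upstream so it runs end to end:

```python
import hashlib
from pathlib import Path

CACHE = Path("dep-cache")  # hypothetical internal mirror directory

def cached_fetch(name: str, expected_sha256: str, download) -> bytes:
    """Serve a dependency from the local mirror, downloading at most
    once -- and refuse anything whose bytes don't match the hash you
    recorded when you first vetted it. That check is what survives a
    left-pad-style deletion or a tampered upload."""
    CACHE.mkdir(exist_ok=True)
    path = CACHE / name
    data = path.read_bytes() if path.exists() else download(name)
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise RuntimeError(f"{name}: hash mismatch, refusing to build")
    path.write_bytes(data)  # keep a copy even if upstream vanishes
    return data

# Toy upstream so the sketch runs end to end:
upstream = {"left-pad-1.3.0.tar.gz": b"pretend tarball bytes"}
expected = hashlib.sha256(upstream["left-pad-1.3.0.tar.gz"]).hexdigest()
cached_fetch("left-pad-1.3.0.tar.gz", expected, upstream.__getitem__)
```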
And then you've basically reverse-built a package manager. It's the only way to get the contracts you need from the code you use.
And that notion of contracts is what underpins all of this... /6
Google was able to pin directly to main because they had very strong internal contracts with all of the code they used.
The vast majority of companies do not have these contracts in place. They leverage OSS code that they're not paying for, and for which there is no explicit contract.
As this post highlights [www.softwaremaxims.com]: "I am not your supplier".
Package managers help regulate these contracts by at least providing you with a consistent copy of code... /7
They also tend to act as "trust middlemen": maintaining relationships with top developers, and providing tooling to help manage quality and standardize things like versioning, lifecycle management, and security checking.
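One small example of the "standardize versioning" part: semver caret ranges. A toy resolver against a made-up registry index:

```python
# Toy registry index (made-up data) and a caret-range resolver: the
# kind of versioning convention a package manager standardizes.
INDEX = {"some-lib": ["1.2.0", "1.4.2", "1.5.0", "2.0.0"]}

def resolve_caret(name: str, spec: str) -> str:
    """Resolve '^X.Y.Z' to the highest published version with the same
    major version -- the usual 'compatible changes only' contract."""
    base = tuple(int(p) for p in spec.lstrip("^").split("."))
    candidates = [tuple(int(p) for p in v.split(".")) for v in INDEX[name]]
    best = max(c for c in candidates if c[0] == base[0] and c >= base)
    return ".".join(str(p) for p in best)

print(resolve_caret("some-lib", "^1.2.0"))  # -> 1.5.0
```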
I know package management feels like a technical failing. In some ways it is.
But package managers are not trying to fix a purely technical problem. They're also trying to fix this very human problem of contracts. //