node_modules is the black box where other programmers solve your problems.
Isn't this the point of abstraction? You don't need to understand internal combustion engines and fuel injectors to drive a car.
But corporate entities manufacture those components with legal guarantees, regulatory frameworks, and an international reputation to maintain. When they fail, insurance can cover recovery, diagnosis, and repair.
There are no regulatory guarantees for third-party code. When a product-crippling bug appears in the middle of the night, you aren't insured, and there's no one to sue. There's no recovery team coming.
When you trace the bug back to a wonky line of code in one of your "blazing fast" dependencies, all you can do is get angry on GitHub. You didn't pay for the code, and that means the chain of responsibility stops with you.
The `node_modules` folder is a wild place. The average codebase has hundreds, if not thousands, of transitive dependencies, and trying to audit them all is an exercise in futility.
With nearly two million packages overall, the npm registry is already an order of magnitude larger than equivalent registries for other programming languages.
The simple answer is to do with popularity, but we're not interested in simple answers today. Like any complex problem, there are many factors to consider.
It doesn't help that both Node.js and the web have 'mid-level' standard libraries, either. Too low-level to use directly in applications, but high-level enough that the average developer can spot opportunities for abstractions.
These gaps created a culture of using libraries to solve all but the simplest problems. If you want to build a user interface, you reach for something like React, not `document.createElement`. If you want to build a server, you reach for something like Express, not `http.createServer`. Even before npm existed, working directly with the DOM was uncommon. Almost everyone reached for jQuery.
When you learn that libraries are the way to solve problems, you solve novel problems (real and imagined) with novel libraries. More programmers, more problems. More problems, more packages.
Tooling is no different. Nobody builds their own toolchain from scratch; instead, you reach for libraries. Projects often begin with the installation of a formatter, a linter, a bundler, a type checker, a minifier, a testing framework, or some combination of the above.
These tools are developed in relative isolation. The formatter knows nothing about the linter, the linter knows nothing about the type checker, and so on. These tools need to be wired together with plugins and configuration files. This creates another explosion of supporting packages.
This melting pot of ideas, opinions, and subcultures helps the language learn from the successes and failures of other languages. It also fragments the identity of the language, the idioms from which people learn, and the ecosystem as a whole.
Almost every popular package that solves a problem with object-oriented programming has an alternative that provides the same functionality with a pure functional flavour. For each package that embraces the dynamic aspects of the language, there's another that constrains them with static types. Every school of thought believes that its way is best, and they all rewrite existing libraries to prove it.
These tribal rivalries also bleed into stylistic preferences for frameworks: React, Vue, Svelte, Angular, Web Components, and more. These ecosystems tend to solve the same kinds of problems in parallel, each creating a unique package ecosystem.
Even language features have proven to be controversial enough to create their own divisions. It's easy to forget how many libraries were duplicated whilst the community took a few years to decide that promises were probably better than callbacks.
Maturing packages tend to lose velocity as they grow in size, and developers begin to search for alternatives that solve a vertical slice of the problem with significantly fewer lines of code.
Eventually some of the alternatives mature and the cycle repeats itself. It only takes a few generations of splitting to arrive at atomic, single-purpose packages.
For the authors, these tiny modules are easier to reason about in isolation, which makes them easier to test, easier to optimise, and easier to version. Consumers can install the exact set of tiny packages they need without worrying about whether their toolchain will "tree shake" everything correctly.
The dream is a lego brick utopia. Each package solves a single problem, and they all slot together neatly to build a lego castle.
However, not everyone is putting lego into the bucket. The classical inheritance fans are using wooden bricks, the functional folks prefer magnets, the reactive streams crowd have marbles, and the well-meaning but misguided rookies are pushing playdough into the gaps.
There's a reason that toy shops don't sell lego bricks individually. Small modules dilute the coherence of the ecosystem as a whole.
Inexperienced but overconfident developers tend to discover that it is harder to make a meaningful contribution to an existing package than it is to create a simpler version from scratch.
Recreating existing software is a wonderful way to learn. However, npm set the bar for publishing a package so low that these learning exercises have significantly diluted the value of the average package in the registry.
There are hundreds of thousands of unused and unmaintained packages, published by developers who thought they were making a meaningful contribution to the ecosystem with half-baked thought experiments, catchy names, and unusually polished logos.
I'm as guilty as anyone else. My own npm profile lists some shamefully empty packages that my younger self thought would grow into popular and useful open source projects.
Not all of them have misplaced their confidence, though; some just want to look better on paper, some don't realise that they're solving a solved problem, and others are so afraid of repeating themselves that they package up and publish every tiny abstraction they ever stumble upon.
The darker side of this problem is known as typosquatting: registering malicious packages with names deliberately similar to popular ones. After you hit `npm install`, your system is at the mercy of the `postinstall` scripts of whichever package name you typed. On a good day, a misspelling results in a registry miss with a friendly warning. On a bad day, you send your production environment variables to a bad actor.
npm has naming rules that prevent some common typosquatting techniques, but as long as humans have fat fingers and slow reaction times, we'll keep making mistakes, and bad actors will keep capitalising on them, because it is laughably easy to publish packages on npm.
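To see why install-time scripts are so dangerous, consider what a typosquatted package's manifest could look like. This is a hypothetical config fragment; the package name and script file are invented for illustration:

```json
{
  "name": "expresss",
  "version": "1.0.0",
  "scripts": {
    "postinstall": "node exfiltrate-env.js"
  }
}
```

npm runs `postinstall` automatically after installation, with no prompt. Running `npm install --ignore-scripts` (or setting `ignore-scripts=true` in your `.npmrc`) prevents lifecycle scripts like this from executing, at the cost of breaking the packages that legitimately depend on them.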
Why Does It Matter?
The dilution of npm should matter to you because understanding the code that you add to your project is important.
You should know whether you just installed something that's going to try to steal crypto from your users. You should know whether the package is actively maintained, or whether it's going to be your job to make sure that the code is still secure.
You don't need to understand low-level implementation details, but you should at least have a high-level understanding of how a package works, unless you want to be caught with your hands in your pockets when something goes wrong.
With a high-quality package registry, you can install packages with confidence that the code is correct, maintained, and not malicious. There's a better chance that semantic versioning is applied correctly and that breaking changes will be documented.
Where Do We Go From Here?
It's too late for npm to ever become a high-quality package registry. Death by a thousand cuts has taken its toll on the ecosystem, and the idea of a fresh start with Deno is appealing to many.
There's still plenty that we can do to improve the current story though.
We can encourage the trend towards integrated tools. Node's move towards a standard test runner is a step in the right direction. The fewer tools we need to install to be productive, the better.
We can prefer lego sets to lego bricks. Use the standard library when possible. Don't use one-line packages when the equivalent method exists in `lodash`. The current generation of build tools is great at removing unused code from your bundles, and unifying around high-quality packages gives the whole community a shared frame of reference.
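As an illustration, here are two classic micro-package use cases that the language itself now covers, so there's no package to audit at all:

```javascript
// Instead of installing a one-line flattening package:
const nested = [1, [2, [3, [4]]]];
const flat = nested.flat(Infinity); // [1, 2, 3, 4]

// Instead of installing a one-line padding package:
const padded = "5".padStart(3, "0"); // "005"
```

Every micro-dependency you replace with a standard-library call is one fewer entry in `node_modules` to keep secure.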
We can aim to create a culture of contribution. It's easy to shame mature packages for having clunky codebases and huge bundle sizes, but it's better to join the conversation to see what is being done and how you can help. When popular packages are improved rather than abandoned, everyone wins.
We can make a habit of picking one package with many contributors, rather than many packages with one contributor. We can make a habit of picking the package with zero dependencies.
We can reduce tribalism by being open to solutions written in different programming styles, even when that means sacrificing some purity in our own projects.
We can take more responsibility when deciding to publish packages. Is the problem that we're solving real or imagined? Have we shared the codebase with other developers to find out whether the solution is generally applicable? Are the appropriate tests and documentation in place? Are we committed to maintaining the codebase, or is it just a completed weekend project?
Packages could be published under a scope by default (e.g. `@danprince/foo`), but to be published without a scope (e.g. `foo`), a package would need to reach a high integrity score (some combination of downloads, contributors, and published versions) and would need to undergo a manual quality and security review.
Unscoped packages would be blessed as part of the ecosystem, a mark of quality which would help us all navigate the registry. If we can have a committee that decides on language features, I see no reason why we couldn't have a similar process for blessing packages too.
We don't need two million packages. We probably don't even need two thousand.