> So let's vendor it. How much code is there? After removing all tests, we end up with 29 individual crates vendored taking up 62MB disk space. Tokei reports 209,150 lines of code.
> Now this is a bit misleading, because like many times most of this is within windows-*. But how much of windows-* does getrandom need? A single function.
See also the Azure CLI. There's a github issue, still open, from 2018 complaining about their 350 MB monstrosity bloating up a docker container. By now the thing weighs more than 1.5 GB. Fucking MS.
I assume, as an ex-Microsoft person, that’s due to shipping the org chart at a team level. Each subpackage of the Azure CLI probably has its own team that hates talking to anyone else, so it implements its own versions of just what it needs in its subpackage. They probably have more than one Python included at this point because some team needed Python 3.12 and another is still on 3.6 and claims updating would take a year with a team of 10.
(Disclaimer, I didn’t work anywhere near this, and am just making up a guess).
mmh0000 35 days ago [-]
There's a great meme of the team structure at Microsoft:
Maybe you could have just told us what the "unbreakable law" is, so we can all see in 2 seconds? Instead of posting a link to a video that doesn't seem very keen to tell us even in the first couple of minutes?
cookie_monsta 35 days ago [-]
Thank you for compounding the problem. Skipping ahead to the conclusion, I am guessing the law is "be lean and flexible"?
db48x 35 days ago [-]
No, the unbreakable law is Conway’s Law (not that Conway):
“Organizations which design systems (in the broad sense used here) are constrained to produce designs which are copies of the communication structures of these organizations.”
cookie_monsta 35 days ago [-]
Oh, ok. Thanks, but curious why that link instead of so many others which talk explicitly about that law
db48x 35 days ago [-]
He talks of nothing else for most of an hour, and provides several salient and memorable examples.
isoprophlex 35 days ago [-]
beautiful, because this is probably exactly what is happening here... all the subcommands bring their own jungle of crap along.
adolph 35 days ago [-]
Any system that doesn’t ruthlessly self edit winds up like a genomic katamari.
Systems are like babies: once you get one, you have it.[a.] They don’t go away. On the contrary, they display the most remarkable persistence. They not only persist; they grow. And as they grow, they encroach. The growth potential of Systems was explored in a tentative, preliminary way by Parkinson, who concluded that Administrative Systems maintain an average rate of growth of five to six percent per annum (corrected for inflation) regardless of the work to be done.
- from Systemantics by John Gall
0. https://en.wikipedia.org/wiki/Junk_DNA
(Well aware of how “junk” DNA is linked to functions elsewhere, and that the subtleties involved represent an evolutionary aggregation equivalent to why acli is 1.5G.)
The reason why the windows-sys crate (and, below it, the windows-targets crate) is so beefy is that they are basically a bunch of binary blobs that are needed to link stuff together due to how import libs work on Windows. https://kennykerr.ca/rust-getting-started/understanding-wind...
In theory that would not be necessary any more on more modern rustc versions, but if you want to target rustc < 1.70 you still need that.
akx 35 days ago [-]
That's my issue! <3
EDIT: It's gotten even worse in the last 6 years!
portaltonowhere 35 days ago [-]
I agree with his sentiments in the article. I love Rust as a PL, but the situation with certain crates and dependency trees is a bit of a nightmare IMO. It's certainly a trade off.
I recently ripped out the rand crate and replaced it with some much simpler code ported from a C++ codebase. Still does what I need it to do but way fewer LOC and way less complexity. Is it as flexible as what rand and related crates offer? Maybe not, but that flexibility comes at a cost.
Ygg2 35 days ago [-]
I also disagree, first off rand is working on simplifying it. Plus out of those dependencies it's hard to see something I'd rather do myself than trust other people with.
Windows-sys is necessary for the Windows OS kernel, libc is a similar thing for *Nix, and cfg-if is necessary for specializing targets per OS, arch, or SIMD capabilities.
The biggest offender is honestly zerocopy-derive, which pulls in the most dependencies.
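For readers who have not met per-target specialization, here is a minimal sketch using only the compiler's built-in `#[cfg]` attribute and `cfg!` macro. The cfg-if crate wraps this same pattern in an if/else-style macro. The function names `os_family` and `is_x86_64` are made up for illustration and do not come from any of the crates discussed.

```rust
/// Label for the OS family this build targets.
/// Exactly one of the three definitions below is compiled in.
#[cfg(windows)]
pub fn os_family() -> &'static str {
    "windows"
}

#[cfg(unix)]
pub fn os_family() -> &'static str {
    "unix"
}

#[cfg(not(any(windows, unix)))]
pub fn os_family() -> &'static str {
    "other"
}

/// Whether this build targets a 64-bit x86 CPU.
/// `cfg!` expands to a compile-time boolean constant.
pub fn is_x86_64() -> bool {
    cfg!(target_arch = "x86_64")
}
```

Code for the targets that are not selected is dropped before type checking, which is why a crate like getrandom can carry one backend per platform while only one of them ends up in any given build.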
jicea 33 days ago [-]
I really like Armin's "food for thought" articles and I'm also concerned with the attitude toward dependencies in Rust. I like the language a lot, but I cringe when I clone some project and see the number of dependencies... When you add a dependency you get a lot of code for "free", but nothing is free: now you potentially have new bugs, security failures, and things to update. It's a balance.
On the article, two random thoughts:
- I like that the Rust standard lib is "tiny" and a lot of stuff is delegated to third-party crates. I wish crates supported namespaces (for instance std) so it's easier to see which dependencies are blessed. For the moment, you can obtain a good name, like http, and squat it forever (http being just an example, I don't know if there is an actual crate).
- When you vendor a Rust project, does it also vendor the dependencies behind optional feature flags? For instance, Rand depends on Serde because of an optional flag for serializing a random generator. Serde's LOC should be ignored in that case if we count the code lines.
vlovich123 35 days ago [-]
I think that the rand crate is much bigger than it needs to be and is conflating unrelated concepts *. This is a sore spot for the stdlib - it should standardize 1 PRNG and a CSPRNG so that they’re available on all platforms as a default and the types that everyone can use so that you can properly plug in whatever PRNG / CSPRNG that you want. It should also standardize what interfaces random distributions should conform to and implement really common ones like Norm & Uniform. Those two changes alone would remove the need for the vast majority of dependencies, especially if a crate wants to delegate selection of the RNG to their users.
That being said, I simultaneously think the concerns are slightly overblown on the safety part. Having stable pillar crates that everyone builds around is a good thing not a bad thing. The build issues for things like that should be solved at the language/tooling level (e.g. pulling in a crate for 1 function should be trivially cheap) while relying on the network effects of auditing the components (i.e. it’s OK to rely on a crate with a stronger chain of trust than you have yourself).
* To be fair, they call out alternatives that you might find more appealing, but the type and module system being what it is (+ the name rand being so concise and appealing when you come at it with a first glance), it becomes the de facto standard.
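To make the proposal above concrete, here is a rough sketch of what "standardize the interfaces, plug in whatever generator you want" could look like. Everything in it is hypothetical: `RandomSource`, `CryptoSource`, `Distribution`, `Uniform` and `roll_die` are invented names, not anything in std or in rand, and SplitMix64 is just a small stand-in PRNG picked for brevity.

```rust
/// Hypothetical minimal interface any generator would implement.
pub trait RandomSource {
    fn next_u64(&mut self) -> u64;
}

/// Hypothetical marker for generators that claim cryptographic strength.
/// Nothing in this sketch implements it.
pub trait CryptoSource: RandomSource {}

/// Hypothetical interface for distributions, independent of the generator.
pub trait Distribution<T> {
    fn sample<R: RandomSource>(&self, rng: &mut R) -> T;
}

/// SplitMix64: a small, fast PRNG. Not a CSPRNG.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }
}

impl RandomSource for SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Uniform distribution over [low, high), up to floating-point rounding.
pub struct Uniform {
    low: f64,
    high: f64,
}

impl Uniform {
    pub fn new(low: f64, high: f64) -> Self {
        assert!(low < high, "low must be smaller than high");
        Uniform { low, high }
    }
}

impl Distribution<f64> for Uniform {
    fn sample<R: RandomSource>(&self, rng: &mut R) -> f64 {
        // Use the top 53 bits so the value maps exactly onto an f64 in [0, 1).
        let unit = (rng.next_u64() >> 11) as f64 / (1u64 << 53) as f64;
        self.low + (self.high - self.low) * unit
    }
}

/// Library code stays generic and lets the caller choose the generator.
pub fn roll_die<R: RandomSource>(rng: &mut R) -> u32 {
    // Plain modulo reduction: very slightly biased, fine for a sketch.
    (rng.next_u64() % 6) as u32 + 1
}
```

A crate written against `RandomSource` would not need to depend on any particular generator, which is the "delegate selection of the RNG to their users" part of the argument.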
dwattttt 35 days ago [-]
I think it's more a conscious decision to default to a CSPRNG unless you know better. If you don't know the difference, a safer default means fewer problems.
EDIT: it's a similar situation to general purpose allocators. If you know you can use a simple one, it's orders of magnitude less code and complexity than a general one.
vlovich123 35 days ago [-]
The Rust stdlib doesn’t come with a CSPRNG so the point is moot. On the other hand, if you’re rolling your own crypto you deserve what you get if you don’t know the difference between CSPRNG and PRNG & you can name them differently even (e.g. rand(), insecure_rand()).
dwattttt 35 days ago [-]
This was in response to the size of rand & dependencies, trying to be the safe default for every use. I don't have a strong opinion on whether it belongs in the stdlib.
glitchc 35 days ago [-]
I'm not sure if this is possible. Not all systems have access to an entropy source of the same quality.
vlovich123 35 days ago [-]
Then the CSPRNG API just wouldn’t be available (just the traits). There are plenty of platform-specific APIs available within the stdlib. The PRNG would always be available though.
koakuma-chan 35 days ago [-]
> All of these are great crates, but do I need all of this just to generate a random number?
I love Armin’s blog. I don’t always agree with everything he says, but always come away with a new appreciation for his POV.
the_mitsuhiko 35 days ago [-]
Thank you for that. I appreciate this.
malcolmgreaves 35 days ago [-]
Does --release do tree shaking to remove unused code in the final executable?
the_mitsuhiko 35 days ago [-]
Rust's compiler is very good at removing most unused code. You are however going to pay a lot for the compilation. In case of some of those common dependencies you are not infrequently ending up with them multiple times in your dependency tree since not all libraries move up to the latest version. For instance today I have three different versions of windows-sys in my dependency tree and three zerocopy versions.
malcolmgreaves 35 days ago [-]
I see — thank you for the information!
Do you happen to also know if there is any ongoing work in rustc to make compilation faster for this situation?
I’m imagining that one could typecheck and then see if ASTs are used / unused and eliminate them before generating code. Maybe that would speed up compilation? Perhaps this is already being done.
the_mitsuhiko 35 days ago [-]
Rust's compilation unit is an entire crate. There is not much that can be done here as far as I can tell without changing the compilation model. It's not like C++ where you can just compile individual object files. You really are hoping that the linker cleans it up.
R. R. Coveyou, R. D. MacPherson,
"Fourier Analysis of Uniform Random
Number Generators", Journal of the
ACM, Volume 14, Issue 1, Jan. 1967,
Pages 100-119.
From memory (might check the paper):
i -- a positive integer
ip1 -- a positive integer
Set i = 1
Do Forever
    Set ip1 = (i * 5^15 + 1) mod 2^47
    Return(ip1)
    Set i = ip1
End
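For anyone who wants to try it, here is a small Rust sketch of the recurrence as written above, x <- (x * 5^15 + 1) mod 2^47. The constants are taken from the comment, which is quoted from memory, so they are not checked against the paper here. The type name `Lcg47` and its methods are made up for this example.

```rust
/// Multiplier and modulus as recalled in the comment above.
const MULTIPLIER: u64 = 30_517_578_125; // 5^15
const MASK: u64 = (1 << 47) - 1; // x & MASK == x mod 2^47

/// Linear congruential generator: x <- (x * 5^15 + 1) mod 2^47.
pub struct Lcg47 {
    state: u64,
}

impl Lcg47 {
    pub fn new(seed: u64) -> Self {
        Lcg47 { state: seed & MASK }
    }

    /// Advance the generator and return the new 47-bit state.
    pub fn next_u47(&mut self) -> u64 {
        // 2^47 divides 2^64, so wrapping u64 arithmetic followed by the
        // mask gives the same result as exact arithmetic mod 2^47.
        self.state = self.state.wrapping_mul(MULTIPLIER).wrapping_add(1) & MASK;
        self.state
    }

    /// Scale the 47-bit state to a float in [0, 1).
    pub fn next_f64(&mut self) -> f64 {
        self.next_u47() as f64 / (1u64 << 47) as f64
    }
}
```

Starting from a seed of 1, the first value returned is 5^15 + 1 = 30517578126. As with any power-of-two-modulus LCG, the low bits repeat with short periods, so this is fine for simulations and useless for anything security related.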
some1else 35 days ago [-]
Here's one I use for cryptographic purposes:
let invocations = 0;
export function rand3000() {
  invocations += 1;
  const timestamp = new Date().getTime() + invocations * 1000;
  const masked = (timestamp ^ (timestamp >> 8)) & 0xFF;
  const result = masked / 255;
  return result;
}
stephc_int13 35 days ago [-]
1/ What?
If this kind of insane dependencies for such a simple thing is common practice in the Rust ecosystem then the language is in a lot worse place than I imagined it to be.
koakuma-chan 35 days ago [-]
Most of those dependencies are for procedural macros (compile time only), don't be misled by this silly article.
kibwen 35 days ago [-]
It's not insane, the author has been bitten by their poor experiences with dependencies in other languages and is misapplying that experience to Rust out of hand.
Listen, I'd be as happy as anyone to have random numbers in the Rust standard library. Compared to the Rust developers, I'm a believer in stdlib maximalism, downsides be damned. But all this recent hand-wringing about dependencies is a tiresome moral panic.
burntsushi 35 days ago [-]
"moral panic" is a bit of a reach don't you think? Increasing dependencies is a real problem with real downsides. There are plenty of characters expressing unreasonable things, but that doesn't mean everyone expressing concern about dependencies is indulging in a moral panic. There is nuance!
If there weren't real costs to dependencies then I personally never would have published regex-lite.
kibwen 30 days ago [-]
> If there weren't real costs to dependencies
The OP isn't addressing the real costs of dependencies, the moral panic in question is the automatic assertion that more dependencies is worse than fewer dependencies, which implies that e.g. all the work you have done to cleanly separate regex out into reusable regex-syntax and regex-automata crates has done a disservice to your users. There are real arguments to be made about wrangling one's trusted computing base, but this isn't making that argument, and by throwing the baby out with the bathwater it sets us back as a profession.
burntsushi 27 days ago [-]
> The OP isn't addressing the real costs of dependencies
Sorry, what? The section on "Compilation Times" is absolutely a real cost!
> the moral panic in question is the automatic assertion that more dependencies is worse than fewer dependencies
I agree that is a moral panic, but I disagree that Armin is indulging in that assertion.
https://github.com/Azure/azure-cli/issues/7387
https://www.reddit.com/r/ProgrammerHumor/comments/6jw33z/int...
1. https://en.wikipedia.org/wiki/Katamari_Damacy
0. https://en.wikipedia.org/wiki/Systemantics
You can use getrandom directly.
getrandom v0.3.1
├── cfg-if v1.0.0
└── libc v0.2.169
There still is:
I'm not surprised. Reddit is a cesspool