It feels like the problem with these isn't the ideas but rather the "let's just" approach/expectation in front of them.
For instance "Let's just add an API." I think the approach to an API as "just" a feature to your product will be about as successful as saying "let's just add a UI". To implement a successful UI one needs to be thoughtful, thorough, and bring in people who specialize in it. Why should any other interface for your product be any different? It's not that it's a bad, or good, idea, rather one that shouldn't "just" be done.
btilly 17 days ago [-]
At a previous employer we had a rule. The only person allowed to say, "just", was the developer responsible for actually making it work. Another developer who said it, just volunteered themselves.
> “Precisely. It’s what I call a ‘Lullaby Word.’ Like ‘should,’ it lulls your mind into a false sense of security. A better translation of ‘just’ in Jeff’s sentence would have been, ‘have a lot of trouble to.'”
I do a lot of audition and sometimes start talking in these phrases - it is really annoying for other people when my vocabulary differs in meaning from the standard body of language.
bryanrasmussen 17 days ago [-]
Ok so if you have a problem and the developer thinks that the solution to it is to add an API, all they have to do to get their way is to say "let's just add an API!", although they would of course be the one that had to make it work.
Yes I like this rule, how many things I could have solved at the worst run place I ever worked at if it had been implemented there. Although I guess the management might not have liked that essentially I could choose what got done and what I got to work on just by running my big mouth.
Facetiousness aside, I'm trying to point out that most often when people say "let's just" it is because they are advocating a course of action, and that rule won't work unless the course they are advocating gets chosen. If they say "let's just make an API" and you say "no, we're doing a UI, and you will work on it because you used the word 'just'", I guess that's a way to lose your tech talent relatively quickly.
sharpy 17 days ago [-]
I am totally going to advocate for this at my workplace. I myself am guilty of it - not saying "just", but questioning people's estimates based on my limited understanding of the research they had to do to come up with the estimates in the first place.
ffsm8 17 days ago [-]
It sounds pretty toxic if you're not allowed to even question other people's estimates. IME that's a large part of Grooming/refinements
dambi0 17 days ago [-]
Aiming for an environment where uninformed challenges to estimates are discouraged is an entirely different thing to forbidding any challenges to estimates.
tjalfi 17 days ago [-]
We had a similar rule for application support at my last job. Anyone who introduced an application was assigned as the support lead.
cgriswald 17 days ago [-]
There’s a fundamental difference between being able to recognize the best option and being able to implement the option.
These sorts of rules are really about people feeling devalued or disliking being volunteered or told what to do (often by people they consider less knowledgeable). They aren’t really about effectively distributing work.
“Just” gets a bad rap. There’s a sort of hidden assumption here that “you can just” is equivalent to “you can easily”. It sometimes means that but more generally it means something more like “…will be easiest” which can be true even when the action itself is hard or a lot of work.
btilly 16 days ago [-]
"Just" absolutely deserves its bad rap. There is nothing that you can say with it that you can't say without it. And the alternative is almost always strictly better.
BlarfMcFlarf 17 days ago [-]
Punishing people with more work doesn’t make sense in a well run organization. Work is continuous and more work only affects the backlog, not the actual developer’s life or experience.
btilly 17 days ago [-]
It was not a punishment. It was more, "You're sure that you know an efficient way to do this? You get a chance to prove that you're right." It made people careful because being wrong had a consequence, but it usually moved work to the person who thought that they had the best approach.
This happened in Sprint Planning. Obviously by the end of the meeting, everyone would have a full sprint. So volunteering for one thing required giving something else up. This was definitely part of the equation.
watwut 17 days ago [-]
It is literally punishment for saying something plus passive aggressive explanation. Somewhat manipulative.
"Giving someone a chance" is a positive thing and won't make people careful.
btilly 16 days ago [-]
How so is it "literally punishment"?
People wind up with as much work as they think they can reasonably do. They would have done that whether or not they "just volunteered" themselves. You only wind up with too much if you didn't estimate your own work well.
As for being passive aggressive, the word "just" usually is used in a passive aggressive way. Making people careful about saying it was an improvement. And we all agreed that it was.
watwut 12 days ago [-]
Per your description, it is used to stop people from saying or doing something by assigning an uncomfortable, unwanted consequence. Someone thinking something is easy does not, and should not, imply they actually want it. Most people do not want to do easy tasks all the time.
And no, they did not volunteer themselves. That is just manipulative language.
And both of these are passive aggression.
CRConrad 16 days ago [-]
Preferably used in sentences like "That's just not possible" or "No, we'll just not do that", right? (OK, probably not... But let a man dream.)
jrs235 17 days ago [-]
Needs to also be for "that's easy"...
dataflow 17 days ago [-]
Wouldn't this be an easy way to get out of doing whatever current work you don't want to be doing?
lukan 17 days ago [-]
Can't you just work harder and smarter?
btilly 16 days ago [-]
Taking on what others thought was more work, in return for abandoning less current work, is a loss. Unless, of course, you're right about the better approach.
So no, not easy. Not unless you're right, and someone else was wrong.
StefanBatory 17 days ago [-]
Or it led to people being silent and keeping any ideas to themselves ;P
btilly 16 days ago [-]
People were free to offer any idea that they want. But you couldn't say "just".
Equivalent weasel words are easy to come by. "I don't see why it wouldn't work to ..." But now you're asking for an explanation, without dismissing the person who will be explaining it.
gdsdfe 17 days ago [-]
That's a good one!
justin_oaks 17 days ago [-]
Exactly. There's no "just" when you're adding an API. APIs require a lot of work and complexity to do right:
APIs need to be well-designed or the client may need to make multiple API calls when one should suffice. Or the API could be confusing and people will call it wrong or fail to use it at all.
APIs require authentication and authorization. This means setting up OAuth2 or at least having a secure API token generation, storage, and validation.
APIs need to handle data securely or you'll leak data you shouldn't, or allow modification you shouldn't.
APIs need to be performant or your database may be crushed under the load. This may involve caching which then adds the complexity of cache invalidation and other servers/processes to support caching.
APIs need to be rate-limited or sloppy clients will hammer your API.
APIs require thorough documentation or they're useless. You may even have to add SDKs (libraries for different programming languages) to make it easier for people to use the API.
APIs need good error messages or users who are getting started will not know why their calls are failing.
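To make the rate-limiting item concrete, here is a minimal sketch of a token-bucket limiter. The class and parameter names are hypothetical and it only works inside a single process; a real API would more likely rely on a gateway or a shared store, so treat it as an illustration of the idea rather than a recommendation.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True and spend one token if the request may proceed, else False."""
        now = self.clock()
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Even this toy version leaves open the questions that make it more than a "just": one bucket per API key or per IP, where the state lives when there are several servers, and what the client is told when it is refused.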
nyrikki 17 days ago [-]
I disagree, those are all technical issues that are typically blocked by socio-technical and organizational problems.
Retrofitting an API is a context specific problem, with few things that generalize but the above aren't the hard parts, just the complex ones.
I am going to point you to a Public Policy book here, look at page 32 and their concept of 'wicked problems', which applies to tech but doesn't suffer the problems with the consulting industry co-opting as much as in the tech world.
Obviously most companies won't accept the Amazon style API edicts that require Externalizable interfaces and force a product mindset and an outside in view.
Obviously an API is still a "Cognitively complex problem" as defined by the above.
But what makes most API projects fail is the fact that aspects arise that result in ‘wickedness’ A.K.A intractability.
API gateways, WAFs, RESTful libs with exponential backoff with jitter, etc. all exist and work well, and for almost any system where you have a single DB you are concerned about killing, it is possible to just add an anti-corruption layer, provided you don't have too much code debt, complicated centralized orchestration, etc.
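For readers who haven't met it, "exponential backoff with jitter" on the client side looks roughly like the sketch below. It uses the "full jitter" variant (a uniformly random delay up to an exponentially growing cap); the function names and default values are assumptions for illustration, not taken from any particular library.

```python
import random
import time


def backoff_delays(retries, base=0.5, cap=30.0, rng=random.random):
    """Yield one delay per retry, uniform in [0, min(cap, base * 2**attempt))."""
    for attempt in range(retries):
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_retries(fn, retries=5, sleep=time.sleep, **backoff):
    """Call fn(); on an exception, sleep a jittered delay and try again."""
    for delay in backoff_delays(retries, **backoff):
        try:
            return fn()
        except Exception:  # a real client would only retry errors known to be transient
            sleep(delay)
    return fn()  # final attempt; any exception propagates to the caller
```

The randomness is the point: without it, clients that failed together retry together and hit the recovering service in synchronized waves.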
But these general, but difficult tasks like adding an API almost never fail due to tech reasons, but due to politics, poor communication, focusing on tech and not users, unrealistic timelines, and turf battles.
You are correct that "APIs require thorough documentation or they're useless" but more importantly they need enforceable standards on contract ownership and communication.
The direction of control, stability, audience, and a dozen other factors affect the tradeoffs there as to what is appropriate.
Typically that control is dictated purely by politics and not focused on outcomes.
That is often what pushes these complex problems into failed initiatives, and obviously this is far more complicated. Often orgs can't deal with the very real uncertainty in these efforts and waste lots of project time on high effort, low value tasks like producing Gantt charts that are so beyond the planning horizon that they can only result in bad outcomes at best.
The point is that unless you are on the frontier of knowledge and technical capabilities, it is almost never the tech that causes these efforts to fail.
harrall 17 days ago [-]
When I read OP’s list of wants for an API, I don’t read it so literally.
I read it as an example of the complexity required to implement something like an API.
I agree that it’s also a people problem but I disagree that many of these failed initiatives aren’t due to bad tech as well. A complex initiative requires picking the right tech that gives you a good benefit/cost ratio. Knowledge of tech plays hand in hand with the political aspects and IMO many orgs lack both completely.
nyrikki 17 days ago [-]
> A complex initiative requires picking the right tech that gives you a good benefit/cost ratio.
That choice should never be a one way door, you don't control your customers or the future, it is the frame problem.
The idea is to be able to iterate and adapt, not get things perfect from the start.
It is fine and expected that one will use past experience as a starting point, but you always have to reflect and check your assumptions along the way.
It is a complicated topic ruled by nuances and contexts, but if you cannot pivot away from your initial assumptions over time, that lets you know you are leaking implementation details.
While time is limited and abstractions have a real cost, you need to apply them where they make sense.
Some, like just breaking code into two files in the same directory, are fairly low cost; others are much more expensive.
For several years I was jumping in to save failed cloud migrations.
Whenever tech was the 'blocker' it was because of a myth that there was only one way to do things.
But let's say you are writing machine code for Apple silicon? Do you really think those registers are concrete? They aren't. They are a facade hiding 100s behind a legacy interface.
Vendor mitigation is important and often neglected, and also results in people producing balls of mud.
But there are a lot of distributed monoliths pretending to be micro services out there. And we had decades of people producing fragile enterprise service buses that were built because people wanted to future proof their systems.
The sizing of components and balancing integration and disintegration drivers is incredibly hard, you will never get it right.
You have to leave options open, no matter if that is through abstractions or keeping biz logic centralized to assist in an easy rewrite of context boundaries that arise from scale or changing needs.
Obviously choices have benefits and costs, but for most needs you can keep options open, if not it is probably best to reevaluate the reason for choosing a solution.
Obviously there are predatory vendors like Oracle that base their entire income on captive customers, but that relates to the vendor mitigation above.
I think the US federal government de-risking guide is a good overview of that topic. Microsoft is a strong driver for this BTW.
For anyone who wants to read more about what the Amazon experience was like, I highly recommend Steve Yegge's platform rant at https://gist.github.com/chitchcock/1281611.
com 17 days ago [-]
It’s depressing how far Google seems to have fallen since then.
eddythompson80 17 days ago [-]
lol, I can already hear the responses of plenty of managers/management I worked with to these points.
> APIs need to be well-designed or the client may need to make multiple API calls when one should suffice. Or the API could be confusing and people will call it wrong or fail to use it at all.
That's fine. It's a first iteration; we can change it later if people complain. Let's get something out there and iterate.
> APIs require authentication and authorization.
Let's not worry about authorization/permissions. It's just one key or whatever account they use to log in with now.
> APIs need to handle data securely or you'll leak data you shouldn't, or allow modification you shouldn't.
You're saying you don't know how to do that?
> APIs need to be rate-limited or sloppy clients will hammer your API.
Either "Let's worry about that later when it happens" or "Here is the first GitHub link from a Google result for 'api rate limit open source free'. Let's use that"
> APIs require thorough documentation or they're useless.
Either "We need to have the API first; docs, CLIs, etc. could come later after we have gauged the usage or had asks for them. We can handhold the customers asking for them for now." or "here is a GitHub project that autogenerates docs, SDKs and CLIs from an OpenAPI spec"
> APIs need good error messages or users who are getting started will not know why their calls are failing.
You're saying you don't know how to do that?
"It doesn't have to be perfect. We have customers who are asking for it and to win them we need to implement something then we can work with them to improve it. Otherwise we will lose them"™
9dev 17 days ago [-]
Then again, these answers are valid. Your fancy API documentation will only be skimmed, nobody is going to notice the elaborate and elegant URI design or the discussion around whether to use PUT or PATCH, and the super-secure token mechanism you come up with will be written on a sheet of paper in the office.
Most projects are far too fluid in their shape to warrant a proper design up-front anyway.
lucianbr 17 days ago [-]
Recently I was working on a 12-year old project where the product people kept asking for "quick wins" and "leave the refactorings and in-depth analysis and difficult problems for later". When will they get to the "later", if 12 years was not enough? I eventually left.
Of course, YMMV. Every company and manager is different.
justin_oaks 17 days ago [-]
In every organization I've been in, "later" meant "never".
lazyasciiart 17 days ago [-]
They’re really not. That approach is how you get “why do I need this account ID to look it up in the API but the only way I can get it is by using the UI? Why can I update this object but not see its current state?” and other complaints that users very reasonably have about a thing I worked on.
9dev 17 days ago [-]
Don’t get me wrong, I love to obsess over that kind of detail, but complaints like these usually aren’t a deal breaker, and yet it’s reasonable to count those complaining users as successful conversions regardless.
liontwist 17 days ago [-]
Yep, helps a lot when your management is in the domain.
Phased implementation isn’t the worst idea, they just have to be committed to ending the experiment or seeing it through.
mdaniel 17 days ago [-]
And there's an implied risk that once you have added an API for customers to exfiltrate data or integrate with their own system (or, worse, IMHO: a competitor's tool) then they'll never return to your site again
I'm cognizant that if it truly is a make-or-break for the deal, there may not be any choice, but along with all the risks you cited is an underlying obsolescence one
Groxx 17 days ago [-]
Every single one of these is true of a website, but we have far more websites than APIs.
If there's one thing APIs suffer from more, it's "social cost to make changes". A concurrent vN+1 largely resolves that though, unless your API consuming ecosystem is large enough to be worth investing vastly more resources into.
saghm 17 days ago [-]
Is "let's just make a website?" a common thing people say? It sounds like the issue isn't with "make an API", it's with "just".
vasco 17 days ago [-]
Most professional advice is like relationship advice: people extrapolate something that went wrong for them into general cases, but that rarely directly applies to someone who is not you in that very similar situation. And the advice that seems to work consistently gets so generic that it is almost useless. Something like "be thoughtful, try to do the right thing and reflect back to see how it went" doesn't write blog posts or sell books!
jabroni_salad 17 days ago [-]
I am pretty sure presales as a career exists solely because of the word 'just'. We have to make sure they won't get spooked when we start lifting the rest of the iceberg out of the water.
karmakaze 17 days ago [-]
Yes exactly. The article would be much better positioned as What it actually takes to make systems ideas "just work". Then it isn't about whether it can or can't but rather how it succeeds or fails. Having details about what it takes then stops people from saying "just" because they don't know what they're talking about.
But I pretty much agree 100% about DSLs, it's an unnecessary/cute complication. The only people who should be allowed to make them should be ones who have made a successful programming language and updated it to deal with all their mistakes or a 2nd version/2nd language, and still got many things wrong.
aprilthird2021 17 days ago [-]
Yes, and a very well designed API can be the foundation of and moat for a successful business. Stripe is one of the best examples. Even as hundreds of copycats pop up, their excellent API design and the trust devs have in them to nail the next API design, means they can continue to expand into other financial products with ease.
fosk 17 days ago [-]
Furthermore, API is a product that needs its own lifecycle to be versioned, decommissioned, continuously improved.
Often I see developers creating new APIs ad-hoc all the time instead of curating and enhancing the one they already have.
gmuslera 17 days ago [-]
Cargo cults around these system ideas are dangerous. The idea may be good, but adopting it without understanding what really makes it work or not is dangerous.
fullstackchris 17 days ago [-]
Dangerous? In the sense a bad API is dangerous? If so then I've been in danger consuming catastrophic APIs for at least a decade now
nswanberg 17 days ago [-]
Yegge wrote about the business idea version of this as "Shit's Easy Syndrome":
It'd have been delightfully ironic had either of these Steves concluded their essays with a named methodology to "just" apply whenever faced with these "let's just" situations but alas...
(4) Anomaly detection is not inherently a problem of distributed systems like the others, but someone facing the problems they've been burned with might think they need it. Intellectually it's tough. The first algorithm I saw that felt halfway smart was https://scikit-learn.org/1.5/modules/outlier_detection.html#... which is sometimes a miracle and I had good luck using it on text with the CNN-based embeddings we had in 2018 but none at all w/ SBERT.
travisjungroth 17 days ago [-]
I've written two DSLs (one with a team) and I'd consider them both successful. They solved the problem and no one cursed them out. I think the most important factor is they were both small.
They were very similar. I even reused the code. One was writing rules to validate giant forms, the other was writing rules for decisions based on form responses.
Ok, just ranting on DSLs. Good DSLs take someone from can't to can. A DSL that's meant to save time is way less likely to be useful because it's very likely to not save you time.
In both of my DSLs, it's that we needed to get complex domain behavior into the program. So you either need to teach a programmer the domain, partner a programmer with a domain expert, or teach a domain expert how to program.
Putting the power in the hands of the domain expert is attractive when there's a lot of work to be done. It frees up programmers to do other things and tightens the feedback loop. If it's a deep domain, it's not like you want to send your programmer to school to learn how to do this. If it's shallow, you can probably have someone cheaper do it.
A DSL comes with a lot of cognitive overhead. If the other option is learning a full programming language, this becomes more reasonable.
A time saving DSL is where someone already knows how to write code, they just want to write less of it. This is generally not as good because the savings are marginal. Then when some programmer wants to change something, they have to learn/remember this whole DSL instead of more straightforward code.
Actually, this makes a simpler rule of thumb. A DSL for programmers is less likely to be a good idea than a DSL for non-programmers.
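As a rough illustration of what a small form-validation rules language can look like (this is a hypothetical sketch, not the DSL described above): the rules are plain data a domain expert could edit, and a tiny interpreter applies them. The rule format, field names, and messages are all made up.

```python
import operator

# Hypothetical rule format: (field, operator, expected value, error message).
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    "in": lambda value, allowed: value in allowed,
}

RULES = [
    ("age", ">=", 18, "applicant must be an adult"),
    ("country", "in", {"US", "CA"}, "unsupported country"),
]


def validate(form, rules=RULES):
    """Return the message of every rule the form breaks; a missing field counts as broken."""
    errors = []
    for field, op, expected, message in rules:
        if field not in form or not OPS[op](form[field], expected):
            errors.append(message)
    return errors
```

Staying small is what keeps the overhead low: five operators and one tuple shape are learnable in minutes, which is roughly the budget a non-programmer has.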
f1shy 17 days ago [-]
DSLs are just great; they have never failed for me. I have done it several times, the last one just this past year, for programming complex ASICs. I've seen them work like a charm countless times.
I cannot understand why it seems it was bad for him…
travisjungroth 17 days ago [-]
I think there are two big reasons DSLs often underdeliver. The first is not understanding the cognitive overhead budget. If it's something where it's being used infrequently or by a lot of new people, that's a lot of overhead to be spent each time. Sometimes people think of writing DSLs for tests because they have to write a bunch, but it's really easy for this to suck. I have a test turn red and now I have to learn a DSL to deal with it? Ew.
The second is fuzzier. It's putting a DSL over something complex and hoping this will fix things. Writing SQL queries for this system takes a bunch of time and is error prone? Just put a DSL over it! Except all those details and errors are probably going to leak right through your DSL.
You want to master the domain before you put a DSL over it.
f1shy 17 days ago [-]
>> The first is not understanding the cognitive overhead budget. If it's something where it's being used infrequently or by a lot of new people, that's a lot of overhead to be spent each time. Sometimes people think of writing DSLs for tests because they have to write a bunch, but it's really easy for this to suck. I have a test turn red and now I have to learn a DSL to deal with it?
What is the alternative to the DSL with lower cognitive load? I do not follow. Every single DSL I’ve seen REDUCES the cognitive load, by allowing one to express the concept in the very language of the problem at hand, with which the SME should be more than familiar.
About the second point: I see many criticisms in this thread based on DSLs above SQL. Whatever somebody is doing above SQL and selling as a DSL, it is not. Period. I cannot think of any possible way of doing a DSL above a query language. No doubt people hate the idea. It is a BAD one.
travisjungroth 17 days ago [-]
> What is the alternative to the DSL with lower cognitive load?
In the test example, writing it directly in the programming language. This will usually lead to code that is more verbose and repetitive, but understanding the first example will be faster.
I think of cognitive load like a line. X is the number of cases you’re working with, Y is cognitive load [0]. For someone who already knows a programming language, the DSL is going to have a higher Y intercept since you have to learn something new before you understand the first case. Hopefully, it’s a shallower slope so as you deal with more cases the upfront cost gets paid back. If you have lots of people dealing with one case or doing it infrequently enough they have to relearn each time, this payoff never happens.
This model extends past DSLs to all abstractions. It’s why people often end up happier with test code that’s less abstract/DRY. The access pattern supports it.
Looking at it this way also explains why a DSL for a non-programmer is more likely to be useful. The intercept can be lower than an actual programming language, so you’re ahead from the start.
[0] It’s really more of a curve, but the line model works conceptually.
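The line model can be written down directly. A minimal sketch with made-up units and a hypothetical function name: each approach is a line (intercept = up-front learning cost, slope = extra load per case), and the break-even point is where the two lines cross.

```python
def break_even_cases(dsl_intercept, dsl_slope, plain_intercept, plain_slope):
    """Number of cases at which the DSL line crosses below the plain-code line.

    Returns None when the DSL's slope is not shallower, since it then never
    gains on plain code as cases accumulate. Returns 0.0 when the DSL also
    starts lower, i.e. it is ahead from the first case.
    """
    if dsl_slope >= plain_slope:
        return None
    return max(0.0, (dsl_intercept - plain_intercept) / (plain_slope - dsl_slope))
```

With illustrative numbers, a DSL costing 10 units up front and 1 per case, against plain code costing 2 up front and 3 per case, breaks even at 4 cases. Someone who only ever touches one or two cases never gets there.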
epgui 17 days ago [-]
These examples sound like things that aren’t really DSLs… Or in other words it sounds like someone is trying to make something “simpler than it actually is”.
DSLs are supposed to be for making it easier to perform computation in a specific context. Software tests have about as many degrees of freedom as the programming language they are written in, so I’m not sure they are an ideal use case for a DSL— not without a lot of discipline at least.
For a DSL to make sense, IMHO, you need to be able to write down a complete and correct specification for it. I doubt that is even possible in the given examples :shrug:
layer8 17 days ago [-]
Are your DSLs used by other people and do they share your opinion? In my experience DSLs are nice to work with for the creator, but it’s much more work in documentation, training, intelligible error handling, and so on, to make a DSL that’s easy for others to learn and use.
I do like DSLs, but the value proposition is often difficult, IMO.
crabbone 17 days ago [-]
There's a difference between "works once for my very specific problem" and "works most of the time for a wide range of problems".
DSLs, in my experience, usually fail in the latter definition. It's very hard to make a small language that precisely captures its domain of application, produces easy-to-manage programs no matter the size, and is easy to analyze in terms of performance and side-effects.
bjourne 17 days ago [-]
There are hundreds of DSLs for ASIC design but not a single one of them has ever been used for actual tapeout. It's 100% unheard of. Hence, I doubt your DSL saved any time over using an RTL language directly. Sorry for sounding harsh, but if you work in the area you understand my skepticism about ASIC design DSLs.
throw16180339 17 days ago [-]
> I cannot understand why seems it was bad for him…
There are many poorly designed libraries, and DSL design is no easier. While I haven’t personally encountered any, I’m sure there are numerous half-baked DSLs out there.
For example, bash and SQL DSLs may be immediately useful by protecting against shell and SQL injection: shutil.run(sh"command {arg}") may translate to subprocess.run(["command", os.fspath(arg)])
No shell--no shell injection. The assumption is that it enables sh"a | b > {c}" syntax (otherwise just call subprocess.run directly). Implementing it in pure Python by hand would be more verbose, less readable, and more error-prone.
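The sh"..." syntax above is hypothetical, but the translation it describes can be approximated in today's Python. A minimal sketch, assuming a made-up helper name sh_argv: split the template into words first, then substitute, so an argument can never add words or shell operators. It does not handle pipes or redirection, and literal braces in the template would need escaping.

```python
import shlex
import subprocess


def sh_argv(template, **args):
    """Split the template into words, then substitute, so each value stays one argument."""
    return [word.format(**args) for word in shlex.split(template)]


def run(template, **args):
    """Run the command without a shell; the argv list is passed to the program as-is."""
    return subprocess.run(sh_argv(template, **args), check=True, capture_output=True, text=True)
```

For example, sh_argv("echo {arg}", arg="a; rm -rf /") gives ["echo", "a; rm -rf /"]: the semicolon is just text inside one argument because nothing ever re-parses it.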
travisjungroth 17 days ago [-]
Yes, there are definitely counterexamples. It’s not black and white at all.
Majromax 17 days ago [-]
I think the theory of domain-specific languages is very valuable, even if there's rarely a need for a full implementation of one.
As I see it, a DSL is just the end-state of a programmer creating abstractions and reusable components to ultimately solve the real problem. The nouns and verbs granted by a programming interface constrain how one thinks, so a flexible and intuitive vocabulary and grammar can make the "real program" powerful and easy to maintain. Conversely, a rigid and irregular interface makes the "real program" a brittle maintenance nightmare.
travisjungroth 17 days ago [-]
I agree. The line between a DSL and regular old programming abstractions is fuzzy. Learning some language design is very helpful because you’ll see every abstraction is a little piece of language.
lifeisstillgood 17 days ago [-]
>>> A DSL for programmers is less likely to be a good idea than a DSL for non-programmers.
Nail on the head time - somewhere else in the thread is jooq which is (yet another) SQL DSL where you end up with from(table).where(student=bob)
This is a perfect example of why the programmer should (just?) learn SQL instead of the DSL - and your comment nails it
PaulHoule 16 days ago [-]
(1) JooQ shines for query generation. For instance at work we have a search engine that can search maybe 50 fields that are mostly materialized but involve the occasional subquery. Also you can easily write a template method like
and have type inference do the right thing in the compiler and IDE and all of that. You get "hygienic macros": you can write a Java class which is parameterized by one or more related tables, which can be specialized by adding more parameters, subclassing, etc.
(2) Circa 2005 coding PHP I came to the conclusion that the ORM I needed was
insert(table,{col1: val1, col2:val2, ...})
because writing simple inserts and updates against SQL is for birds, let freshers do it and they will forget to escape something. Such a "framework" can be so simple that you can bend it to the needs of your application. JooQ gives you something like that but backed by the Java type system.
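A minimal sketch of that insert(table, {...}) idea, written in Python with sqlite3 rather than the original PHP. The function signature and the identifier check are assumptions for illustration; the point is only that values are bound as parameters, so nobody has to remember to escape them.

```python
import re
import sqlite3

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def insert(conn, table, row):
    """Insert a dict as one row; values are bound as parameters, never spliced into the SQL."""
    # Identifiers cannot be bound as parameters, so restrict them to a safe pattern instead.
    if not row or not all(IDENTIFIER.fullmatch(name) for name in [table, *row]):
        raise ValueError("empty row or unsafe table/column name")
    columns = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return conn.execute(sql, tuple(row.values()))
```

A call such as insert(conn, "users", {"name": name, "age": age}) then stores whatever is in name verbatim, quotes and all.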
lifeisstillgood 16 days ago [-]
I do get it (I think!) - but there is a world of difference between “I have 20 years SQL experience and do not want to spend hours maintaining 200 SQL templates, and believe the overhead of this DSL is worth the trade off” vs “use the DSL and you won’t have to teach junior Devs SQL!”
My comment is more aimed at the second part. SQL is tied to the implementation and demands coders understand it all. A DSL can allow domain experts to express their understanding without having to worry about software trade offs.
The most successful “DSL” I know of like this is fitnesse tests - just a large number of simple tests where domain experts can spreadsheet style throw in the “gotchas”.
Something like that but more sophisticated is a holy grail - Behaviour driven tests like cucumber come close but there is that weird intermediate translation from English phrase to random function - now you have to understand the function to use the phrase and suddenly you are reading real code to be able to use the fake code and it never feels clean
One day I will be clever enough to be able to write a really good test DSL
It’s just whenever I think of “Given user is logged in, visit “textbox” and enter “word” .. it just looks like a BDD test not a DSL. Like I said, one day I will be clever enough
PaulHoule 15 days ago [-]
Sure. Note though that there's a long tradition of systems for embedding SQL in conventional programming languages such as
which for whatever reasons never caught on in the open source world. (I'd blame limitations of current compiler technologies and the values of people who make compilers... If we had composable parsers you could just say "here's a spot for a SQL query in a Java method" in 10 lines of code) JooQ approaches that without requiring any change in the compiler. In the past it was awkward to embed SQL in Java because there were no multi-line strings. In Python you could write
do_query("""
... a really crazy complicated query with lots of joins and subqueries
that is carefully indented to fit in with the rest of the program ...
""",{"arg1": val1, "arg2": val2})
but without real map literals, multi-line strings and such this was terribly awkward. (If you think List.of(), Map.of() and such are cool I was writing a computer chess program last month that used List.of(A,B) to create a list that was used in an inner loop and it was terrifying how slow it was compared to using an ArrayList)
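For what it's worth, here is a minimal runnable sketch of that style, assuming SQLite through Python's standard sqlite3 module. The do_query helper, the tables, and the parameter names are all hypothetical, made up only for illustration:

```python
import sqlite3

def do_query(conn, sql, params):
    """Hypothetical helper: run a parameterized query and return all rows."""
    return conn.execute(sql, params).fetchall()

# Throwaway in-memory database with made-up tables, just for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, score INTEGER);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 10), (2, 1, 3), (3, 2, 7);
""")

# The SQL lives in a triple-quoted string, indented to fit the program.
# Values are passed as named parameters, never pasted into the string.
rows = do_query(
    conn,
    """
    SELECT a.name, COUNT(*) AS n_posts
      FROM authors AS a
      JOIN posts AS p ON p.author_id = a.id
     WHERE p.score >= :min_score
       AND a.name <> :excluded
     GROUP BY a.name
     ORDER BY a.name
    """,
    {"min_score": 5, "excluded": "carol"},
)
print(rows)
```

The named placeholders (:min_score, :excluded) are bound by the driver, which is what keeps a junior from forgetting to escape something.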
ramchip 16 days ago [-]
I'm not familiar with jooq, but I've used Ecto a ton, and the point was never to avoid learning SQL. It's about making queries composable and mapping to domain objects automatically so eg. there aren't dozens of queries to update when you add a field to a domain object.
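A rough sketch of what "composable" can mean here, in Python rather than Elixir. This is not Ecto's API; the Query class, its methods, and the User model are hypothetical and only illustrate the idea that filters stack and that the column list is derived from the domain object:

```python
import sqlite3
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class User:
    id: int
    name: str
    active: int

@dataclass(frozen=True)
class Query:
    """Hypothetical immutable query value; where() returns a new Query."""
    model: type
    table: str
    wheres: tuple = ()
    params: tuple = ()

    def where(self, clause, *params):
        return Query(self.model, self.table,
                     self.wheres + (clause,), self.params + params)

    def to_sql(self):
        # Columns come from the dataclass, so adding a field to User
        # changes every query built on it without touching any SQL text.
        cols = ", ".join(f.name for f in fields(self.model))
        sql = f"SELECT {cols} FROM {self.table}"
        if self.wheres:
            sql += " WHERE " + " AND ".join(self.wheres)
        return sql

    def all(self, conn):
        return [self.model(*row) for row in conn.execute(self.to_sql(), self.params)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [(1, "ann", 1), (2, "bob", 0), (3, "al", 1)])

users = Query(User, "users")                    # base query
active = users.where("active = ?", 1)           # reusable building block
active_a = active.where("name LIKE ?", "a%")    # composed from the block above
print(active_a.to_sql())
print(active_a.all(conn))
```

You still have to know SQL to write the where clauses; the gain is only that the pieces can be reused and the mapping is not repeated per query.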
stickfigure 17 days ago [-]
jOOQ is a disaster and I would not recommend it to anyone.
You write some SQL queries, test them in datagrip or whatnot, then spend the next several hours figuring out how to convert them to the DSL. This problem is compounded when you use "exotic" SQL features like json expressions. Debugging is "print the generated sql, copy it into datagrip/whatnot, tune the query, then figure out how to retrofit that back into the DSL".
It's a huge waste of time.
The primary selling point of jOOQ is "type safe queries". That became irrelevant when IntelliJ started validating SQL in strings in your code against the real data. The workflow of editing SQL and testing it directly against the database is just better.
jOOQ reinforces the OP's point about DSLs.
geophile 17 days ago [-]
This is a very specific and popular subset of the DSL point: Let's just invent a language L that is better than horrible standard language X but translates to X. Imagine the vast cubicle farms of X programmers who will throw off their chains and flock to our better language L!
In many scenarios (including JOOQ and all ORMs), X is SQL. I should know, I spent years working on a Java-based ORM. So believe me when I say: ORMs are terrible. To use SQL effectively, you have to understand how databases work at the physical level -- what's a B-tree lookup, what's a scan, how these combine, etc. etc. You can often rely on the optimizer to do a good job, but must also be able to figure out the physical picture when the optimizer (or DBA) got things wrong. You're using an ORM? To lift a phrase from another don't-do-this context: congratulations, you now have two problems. You now have to get the ORM to generate the SQL to do what really needs to be done.
And then there are the generalizations of the point made above: There are lots of tools that work with SQL. Lots of programmers who know SQL. Lots of careers that depend on SQL. Nobody gives a shit about your ORM just because it saves you the trouble of the easiest part of the data management problem.
stickfigure 17 days ago [-]
This is an odd take. Your programming language works with objects, the data is in relational tables, you need software to map the relations to objects. Thus the Object Relational Mapper. There's no reason you can't write SQL and let an ORM handle the result set mapping.
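A minimal sketch of "write the SQL yourself, let a mapper handle the result set", assuming Python with the standard sqlite3 module. The query_as helper and the OrderSummary class are hypothetical names, not any real ORM's API:

```python
import sqlite3
from dataclasses import dataclass, fields

@dataclass
class OrderSummary:
    customer: str
    order_count: int
    total_cents: int

def query_as(conn, cls, sql, params=()):
    """Hypothetical helper: run hand-written SQL and map rows onto cls by column name."""
    cur = conn.execute(sql, params)
    names = [d[0] for d in cur.description]      # column names or aliases
    wanted = {f.name for f in fields(cls)}
    return [cls(**{n: v for n, v in zip(names, row) if n in wanted})
            for row in cur]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, cents INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("ann", 500), ("ann", 250), ("bob", 100)])

# The SQL is written and tuned by hand; only the mapping is automated.
summaries = query_as(
    conn,
    OrderSummary,
    """
    SELECT customer,
           COUNT(*)   AS order_count,
           SUM(cents) AS total_cents
      FROM orders
     GROUP BY customer
     ORDER BY customer
    """,
)
print(summaries)
```

The column aliases have to match the field names, which is the whole contract between the query and the object in this sketch.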
sgarland 17 days ago [-]
If that’s all you were doing, then maybe, but it never is. ORMs enable people who have no idea how RDBMS works to use them, which rarely ends well.
I’m not suggesting that to use RDBMS you should know how to administrate and tune it (though it helps), but knowing their language, and understanding a single data structure (B+ trees) isn’t too much to ask, I think.
gedy 17 days ago [-]
> ORMs enable people who have no idea how RDBMS works to use them, which rarely ends well.
In some cases, but the more frequent issue I saw back in the day was the DBA making some really complex schema tuned for what they wanted, then an application trying to use the data in a pretty reasonable OOP manner (1 to many relationships, etc) and the DBA pissed they were using an ORM instead of their perfect SQL queries and procedures.
sgarland 17 days ago [-]
> the DBA pissed they were using an ORM instead of their perfect SQL queries and procedures.
Tbh, I don't understand why this is seen as a bad thing. Correction: I know why it is (any changes are obviously going to be dramatically slowed down), but in the long run, I don't understand why people are against it. You wanted something done correctly, so you went to the SME for that specific field, and had them do it for you. Then you decided to throw it away?! Why are you bothering to ask them in the first place?
> 1 to many relationships, etc
I know this was just an example, but 1:M is a perfectly natural part of any RDBMS, and in no way requires an ORM to be done.
gedy 15 days ago [-]
> Then you decided to throw it away?! Why are you bothering to ask them in the first place?
Usually this was a mismatch of mgmt or expectations. Hiring old school DBAs and letting them think they "own the data", while plopping them into a huge dev team changing the big SaaS features daily is a recipe for trouble.
I don't fault DBAs per se, though I did work with some who wouldn't look outside their blinders at all.
stickfigure 17 days ago [-]
Of course. And after you understand SQL and databases, ORMs can save you a lot of typing. I've never understood the either/or attitude.
sgarland 17 days ago [-]
Sometimes it can. Other times, you have the SQL already written in your head, but then you have to figure out how to coerce the ORM to doing what you want.
stickfigure 17 days ago [-]
Even Hibernate has `em.createNativeQuery("type your sql here", SomeResult.class)`. I've never seen an ORM (for an RDBMS) that didn't make it easy to run SQL.
crabbone 17 days ago [-]
Then what's the point of using Hibernate? Just use the ODBM driver... why are you dragging the gorilla and all of the jungle with you if all you wanted was a banana?
stickfigure 17 days ago [-]
Since we're talking about Hibernate, I assume you mean the JDBC driver? Because the API is tedious and unpleasant.
The mapping of database results to java objects with Hibernate is convenient. The basic "load entity, change a couple fields, let Hibernate persist it" flow is convenient. In a limited set of cases, basic entity graph navigation is convenient.
As I said, if you're working in an object-based language, by definition you need something that maps relations to objects. Hibernate is a competent choice. There are other competent choices, but JDBC is not one of them unless your app is trivial.
crabbone 16 days ago [-]
Yeah, I confused multiple acronyms here :)
Anyways. Hibernate works on top of JDBC, so, if you like its interface, then it means you could make your own, but skipping >99% of the rest of Hibernate code that has nothing to do with wrapping the driver.
Or, imagine there was a library Hibernate', that threw away all the ORM stuff, and only offered mapping of SQL results to Java objects and sending queries to the database. Then, why not use Hibernate' instead of Hibernate?
NB. About triviality. From experience: trivial apps tend to work OK with ORM. Non-trivial will usually ditch the ORM because of performance, missing functionality and general difficulty with servicing it. So, it's the other way around: if you are shooting for the stars, you are probably not going to use Hibernate, Hibernate is one of the variety of tools that helps losers lose less, it's not a tool for the winners.
stickfigure 16 days ago [-]
What you've said makes no sense. The "ORM stuff" is what I want, the Object Relational Mapping. Taking relational data and converting it back and forth to objects. And Hibernate is actually pretty good at this.
I think you've built up a strawman in your mind of what you think "ORM" is. Yes Hibernate is huge and has a lot of features that people shouldn't use. But you can say the same about Microsoft Word, the problem is that everyone uses a different 5% of the huge feature set.
People who work with these technologies on a daily basis don't screw up the core acronyms. I suggest softening your opinion and dropping the platitudes.
crabbone 16 days ago [-]
It's clear that you want ORM. But you didn't explain why you want it. For all I know, you like to suffer, and that's why you want it, but you've made no compelling argument for people who don't like to suffer to use ORM.
BTW. I'm absolutely on-board with you: nobody should use Microsoft Word. There's absolutely no reason to do that. It's a marketing ploy with a lot of grease money paid to people in charge of procurement in various places. It's absolutely not about 5% of features. It's just downright worst kind of text editor that's in popular use today. Ask me how I know this? I worked in a newspaper! Somehow, Microsoft never ventured into this field, and didn't sell their garbage there. And nobody uses Microsoft in book publishing or any other sort of publishing. Not for any % of its features. So much so that if you bring a manuscript (as an outside author) to publish a book or an article in a newspaper / magazine, and it will be in MS Word format, you'll be most likely asked to convert it to another format. And we are talking about people who need a lot of different features of text editing!
And, I really don't care about what you have to suggest. You aren't in a position to make suggestions really ;)
specialist 15 days ago [-]
Heh. Nice. I like your zinger.
My go to metaphor has been "XYZ is an angry 800lb gorilla sitting between you and your work."
reaanb2 15 days ago [-]
> you need software to map the relations to objects
If you start with a network data model perspective and build that into your system, then it follows that you'll want a network data model to SQL mapper. That's what ORMs are, and the need for them comes from your approach, not from the tools you use.
There's a different approach - use OOP to build computational abstractions rather than model data. Use it to decompose the solution rather than model the problem. Have objects that talk to the database, exchange sets of facts between it and themselves, and process sets of facts. In the process, you can also start viewing data relationally - as n-ary relations over sets of values - as opposed to binary relationships between tables of records.
Information systems are not domain simulations, simulations compute the future state of the domain whereas information systems derive facts from known facts at the present time.
For a visual metaphor, car engineers don't use roadmaps as design diagrams and they don't model the problem domain in the systems they build. A car isn't built from streets, turns, road signs, traffic lights, etc. And despite that, cars function perfectly well in the problem domain. A car generally doesn't need to be refactored and reassembled when roads or rules change.
crabbone 17 days ago [-]
Nah. That's an inconsequential part of the interaction between the database and the application. The reality is that your code has both, the database and the application. And if you want to write good software, you need to know how both work and be an expert at that.
It's infinitely easier and less error-prone to keep the interface between the database and the application to the minimum (just convert the final results of a query to the application objects and embed complete queries in the application code) than to try and create complex query builders behind the scenes of object-to-object interaction.
If you want to make a good product, you may start with ORM, as it may, for a time, delay the need of understanding the relationship between the application and the database, and allow you to experiment faster at the expense of lost performance. Once you know what you need to do, ORM just no longer works: you will have to break it at least in order to deal with performance issues, but often you will also find yourself dealing with the fact that a lot of what you want to express in your queries is either too difficult or even impossible to express in a particular ORM.
hot_gril 12 days ago [-]
I used MySQL before I understood it at the physical level, and now I'm using some other ones that I don't really understand. A MySQL/Postgres noob can get pretty far just knowing to avoid seq scans. It's not ideal, but it'll work. Understanding schema design is more important.
The thing is, ORMs encourage bad schema design and get in the way of the SQL you want. I've seen entire projects ruined this way. I think the only valid reason for an ORM was before RDBMSes had json etc types. Maybe you had a table with very many cols that you just want to get/set, say a "user profile" table. This also contributed to the NoSQL fad. Nowadays you can throw that into one json col.
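A small sketch of what "knowing to avoid seq scans" looks like in practice. It assumes SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in for EXPLAIN in MySQL/Postgres; the table, index, and plan helper are hypothetical, and the exact plan wording differs between SQLite versions, so no output is shown:

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
query = "SELECT name FROM users WHERE email = ?"

# Without an index on email, the planner has to scan the whole table.
before = plan(conn, query, ("a@example.com",))

# With an index, the same query becomes a B-tree search.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(conn, query, ("a@example.com",))

print(before)
print(after)
```

The first plan should mention a SCAN of users and the second a SEARCH using idx_users_email; checking for that difference is most of what the noob-level rule amounts to.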
sroussey 17 days ago [-]
> To use SQL effectively, you have to understand how databases work at the physical level -- what's a B-tree lookup, what's a scan, how these combine, etc.
This is a good reason to use an ORM. But also, as an ORM designer, don’t let the ORM be flexible to do any SQL. Only let it do performant data access.
hot_gril 17 days ago [-]
Yep, abstracting away SQL is a common and very costly mistake. The article is about more general system design, otherwise I would have expected to see that in the list.
crabbone 17 days ago [-]
I've never seen a good DSL beside something like regular expressions, and even there, I hear, a lot of people are upset by the language.
Examples of popular DSLs that I would characterize as bad if not outright failures:
* HCL (Terraform configuration language). It was obvious from the very beginning that very common problems haven't been addressed in the language, like provisioning a variable number of similar appliances. The attempts to add the functionality later were clumsy and didn't solve the problem fully.
* E4X (A JavaScript DSL for working with XML). In simple cases allowed for more concise expression of operations on XML, but very quickly could become an impenetrable wall of punctuation. This is very similar to Microsoft's Linq in that it gave no indication to the authors of how computationally complex the underlying code would be. Eventually, any code using this DSL would rewrite it in a less terse, but more easy to analyze way.
* XUL (Firefox' UI language for extending the browser's chrome). It worked OK if what you wanted to do was Firefox extensions, but Firefox also wanted to sell this as a technology for enterprise to base their in-house applications on Firefox, and it was very lacking in that domain. It would require a lot of trickery and round-about ways of getting simple things done.
* Common Lisp's string formatting language (as well as many others in this domain). Similar to above: works OK for small problems, but doesn't scale. Some formatting problems require some very weird solutions, or don't really have a solution at all (I absolutely hate it when I see code that calls format recursively).
All in all. The most typical problem I see with this approach is that it's temporary and doesn't scale well. I.e. it will very soon run into the problems it doesn't have a good solution for. Large programs in DSL languages are often a nightmare to deal with.
corinroyal 17 days ago [-]
I'm always baffled by hate for DSLs until I realize that what people are criticizing aren't DSLs, but DSLs you have to write from scratch. If you host your DSL on Lisp, then all you have to write is your domain logic, not the base language. Most of the work is already done, and your language is useful from day one. I don't understand why people insist on creating new languages from scratch just to watch them die on the vine, when these langs could have been hosted DSLs on Lisp and actually get used.
TimTheTinker 17 days ago [-]
Not just Lisp, but any language that has strong support for either literal in-language data expressions like JSON or YAML, or meta-language support like Ruby, Elixir, JSX/TSX (or both!).
Every time you write a React JSX expression, terraform file, config.yaml, etc., you're using a DSL.
I once wrote a JSON DSL in Ruby that I used for a template-based C# code generator. This enabled a .NET reporting web app to create arbitrarily shaped reports from arbitrary RDBMS tables, saving our team thousands of hours. Another team would upload report data to a SQL Server instance, write a JSON file in the DSL, check it against a tiny schema validator website, submit it, and their reports would soon be live. One of the most productive decisions I ever made.
hot_gril 12 days ago [-]
Technically yeah, but JSX isn't what people think of when you mention a DSL. I know JS, I know HTML, so I know JSX immediately since it's just templatized HTML inside JS.
CyberDildonics 17 days ago [-]
This is generally a terrible way to work. Making a bunch of custom syntax even in the same language is just adding more stuff to memorize for no gain.
Even in C using the "goes to operator" of while(i --> 0) or using special operator overloading like the C++ STL >> and << operators for concatenation is just making people memorize nonsense so someone writing can be clever.
People don't give presentations with riddles and limericks either. It can be clever as a puzzle but when things need to get done, it is just indulging someone showing off their cleverness at the expense of everyone who has to deal with it.
f1shy 17 days ago [-]
I think you misunderstood what a DSL is, or at least the point of the OP?
We are advocating exactly to keep the syntax the same as the base language, and add semantic value through the abstractions of the language.
CyberDildonics 17 days ago [-]
I didn't misunderstand anything.
If you're not changing any syntax and are just using normal function calls, that's an API and that's direct.
If you're not just using normal function calls and are making your own "semantic value through abstractions of the language" you aren't making something that is direct and are creating something that needs to be memorized.
The cleverness and indirection of the new stuff that hides what is really going on is 99% of the time not worth what it gives you, because you have to memorize this new clever thing someone came up with, then you have to learn what it is actually doing underneath that is being hidden.
f1shy 17 days ago [-]
> If you're not changing any syntax and are just using normal function calls, that's an API and that's direct.
No. Sorry. Wrong. Look at SICP, where they explain the concept of an embedded DSL. Hint: you may be conflating syntax and language.
CyberDildonics 17 days ago [-]
Everyone understands the concept. Understanding why you shouldn't do it is what takes experience.
If you look at the source code for Doom it is very straightforward. No fancy stuff, no cleverness, no pageantry of someone else's idea of what "good programming" is, just what needs to happen to make the program.
I'll even give you an example of an exception. Most for loops in C and successors are more complicated than they need to be. Many loops are looping from 0 to a final index and they need a variable to keep track which index they are on. Instead of a verbose for loop, you can make a macro to always loop from 0 and always give you an index variable, so you just give it the length and what symbol to use. Then you have something simplified and that's useful. It's shorter, it's clear, it will save bugs and be easier to read when you need nested loops through arrays with multiple dimensions.
I already gave examples before where clever extra syntax creates an exceptional situation but gains nothing.
The fundamental point here is that these opportunities are rare. Thinking that making up new syntax is a goal of programming is doing a disservice to everyone who has to deal with it in the future.
f1shy 17 days ago [-]
You are 100% right on all of it, except you are talking about syntax extensions (the for example) and not DSLs. A DSL does not need a new syntax; it is a collection of abstractions that allow you to express problems in the language of the problem domain. It is not an API, because it is not an interface to a functionality. It is not about exposing functionality, but about adding semantic value to the upper layers. It may (not necessarily, but may) be formed by a collection of functions, in which case it is similar to an API in that sense. Sometimes it may indeed include extensions to a language, but in that case by the standard means of abstraction preferred in that language: classes, templates, functions, structures. The key is to reduce the cognitive load for the end programmers, who could be experts in the problem at hand, but not in the underlying language of the embedding.
There is also the possibility of embedding in a non-programming language, like XML (e.g. the launch language in ROS), or S-exps in the Oracle listener config file. You can also do it ad hoc, like in the .msg files of ROS. But it is always about semantics, not syntax. Syntax is only the medium.
CyberDildonics 16 days ago [-]
> Is not about exposing functionality, but to add semantic value to the upper layers.
> Sometimes may include indeed extensions to a language, but in that case by the standard means of abstraction preferred in that language: classes, templates, functions, structures.
You keep saying that there are no problems and that it isn't like anything mentioned but you don't have any examples.
What is an example of "adding semantic value" that isn't using the languages normal constructs but is still not something someone needs to learn and memorize?
f1shy 16 days ago [-]
You said a DSL has to have its own syntax, or have to change the language and it implies more cognitive load. That is just not the case, as stated with sources like SICP and Wikipedia.
The whole idea of a DSL is exactly to avoid learning something new. Of course there will be some piece of information to be learned, but what are we comparing against? Is there a solution where somebody does not need to learn absolutely anything? Of course not! You have to learn something, to be able to use it, the question is how to minimize the cognitive load.
You are right that an example would help. Here are a couple that I recently worked on:
1) We had a very complex ASIC which had a complicated way of configuring it: there were RF parameters and also a program that runs in the ASIC; say “repeat 20 times {send, receive, analyze, phase-shift}”. Of course the real thing is much more complicated. Now, the ASIC manufacturer gives an API for doing everything, which involves setting registers, flags, internal state machines, etc. We have an expert who knows lots about RF and the application, but is weak in programming. We did it in Lisp, but I will try to explain it as if it were C: we made a bunch of functions, lots of which are very API-like, setters and getters. But to program the sequence, we have functions that do flow control. In C it looks a little bit awkward; in Lisp it is much better. The example above would be: “repeat(20); send(); receive(); analyze(); phase_shift(); iterate();” The guy who writes that “code” does not care about the base language (he had previously never heard of Lisp and was only capable of basic Python). But he was already writing those programs in pseudocode for documentation, so the cognitive load for him is minimal. He has to remember to add “();” at the end of each instruction, and the loops are “repeat(n) … iterate”. That’s it! That was much less than if he had to learn the whole API of the ASIC; he is not a programmer, he is an RF engineer. You may say: it is an API. But look, there was already an API. It makes no sense to do an API over an API. It was all about transforming the language of the API into the language of the problem at hand. The API tries to expose every detail of the hardware, in a language which is based on hardware and C; the DSL tries to hide details or translate things into the language of the problem. So the user of the DSL has to learn less.
2) There was an automated planner with lots of rules. Think about it as “1000 ifs, some nested”; originally, without a DSL, all was hardcoded in C++. We developed, based on libconfig (think JSON with C syntax), a little language to express the ifs. Note: there was no new syntax invented; it is the underlying JSON/libconfig, which are well-known syntaxes. We only made a big “foreach” over all elements in the config file, and each was passed into a big “case” to dispatch the substructure to the handling function for each instruction. It took 1 day to implement. After that, the intelligence was in separate files, it could be reloaded dynamically, and the people doing the intelligence did not need to be C experts.
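To make the second example a bit more concrete, here is a rough sketch of the foreach-plus-dispatch shape. It is written in Python with plain JSON instead of the C++ and libconfig described above, and every rule name, field, and handler in it is hypothetical:

```python
import json

# The "intelligence" lives in a data file; here it is inlined as a string.
RULES_JSON = """
[
  {"op": "if_gt", "field": "temperature", "value": 80, "then": "start_fan"},
  {"op": "if_eq", "field": "mode", "value": "idle", "then": "sleep"},
  {"op": "always", "then": "log_state"}
]
"""

def handle_if_gt(rule, state):
    return state[rule["field"]] > rule["value"]

def handle_if_eq(rule, state):
    return state[rule["field"]] == rule["value"]

def handle_always(rule, state):
    return True

# The big "case": one handling function per instruction name.
HANDLERS = {
    "if_gt": handle_if_gt,
    "if_eq": handle_if_eq,
    "always": handle_always,
}

def plan_actions(rules, state):
    """The big 'foreach': dispatch each rule and collect the actions that fire."""
    actions = []
    for rule in rules:
        handler = HANDLERS[rule["op"]]
        if handler(rule, state):
            actions.append(rule["then"])
    return actions

rules = json.loads(RULES_JSON)
actions = plan_actions(rules, {"temperature": 91, "mode": "busy"})
print(actions)
```

Reloading the rules is then just parsing the file again, and the people editing it only need to know the handful of op names, not the host language.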
CyberDildonics 16 days ago [-]
> a DSL has to have its own syntax
If it's the same language it can't be a new language. You didn't link anything with your sources.
> The whole idea of a DSL is exactly to avoid learning something new
But you have to learn the DSL and you have to throw away all your tools. These are two big problems they introduce so the problem they solve better be big and tools/debugging needs to be part of making the DSL. This is why a small DSL is not a good idea.
> We had a very complex ASIC which had a complicated way of configuring it: there were RF parameters
This is another side of the story. Passing parameters is data. Inside a program this is a very bad idea because you can already pass around all the data you want any way you want though function calls and memory layouts.
Passing data from one program to another or one computer to another is different, but then that isn't a language, that's a data format like any other file. GCode is a list of 'commands', but fundamentally it is a data format. If you look at the .obj format, it is ascii and needs to be parsed, but not thought of as a language.
> Think about it as “1000 ifs, some nested”, originally without DSL, all was hardcoded in C++. We developed based on libconfig (think JSON with C syntax) a little language to express the ifs
This sounds like a data format. If something isn't being executed directly, it's data. If it is being executed directly, don't make a new language, because it takes a decade and hundreds of people to get it to work well.
f1shy 15 days ago [-]
I would really like to have a face-to-face conversation, because I see you have a genuine interest in the discussion, but it seems we are talking past each other.
> If it's the same language it can't be a new language. You didn't link anything with your sources.
A language is more than the syntax. For example common lisp, emacs lisp, racket and scheme are different languages with exact same syntax. Java and C have very similar syntax, but are 2 languages. Source SICP https://web.mit.edu/6.001/6.037/sicp.pdf or the videos in youtube.
A DSL does not need to have a new syntax. Source wikipedia article, under embedded DSL.
If your DSL follows existing syntax, you can use the tools. Note my example with JSON.
> Passing parameters is data. (…)
> Passing data from one program to another or one computer to another is different, but then that isn't a language
Well, actually it is. And data and code cannot be told apart. I can only recommend going through the SICP lectures on YouTube. Your example with GCode is good: code is data, data is code. Also, about the example, consider that it is, as said, a great simplification; there are lots of details and constraints that I cannot possibly enumerate here. Also note that one way of passing data between 2 computers is RPC, which is a language (procedures and functions are called remotely, executing code on the remote computer, which works with the data); that was actually the case in the example.
> This sounds like a data format. If something isn't being executed directly, it's data. If it is being executed directly, don't make a new language, because it takes a decade and hundreds of people to get it to work well.
A C program is also a data format. All is a data format. At the end in the compiler or interpreter the program is an AST, ALWAYS! And an AST is just a data structure!
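A small illustration of the "a program is an AST, and an AST is just a data structure" claim, using Python's standard ast module instead of C (an assumption made only because it is easy to run). The SwapAddForSub class is a hypothetical name for this sketch:

```python
import ast

source = "1 + 2 * 3"

# Parsing turns the program text into an ordinary tree of objects.
tree = ast.parse(source, mode="eval")
node_types = [type(node).__name__ for node in ast.walk(tree)]

# The tree can be compiled and run like any other code...
original_value = eval(compile(tree, "<ast>", "eval"))

class SwapAddForSub(ast.NodeTransformer):
    """Rewrite every '+' into '-' by editing the tree as plain data."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

# ...or edited first, like any other data structure, and then run.
rewritten = SwapAddForSub().visit(ast.parse(source, mode="eval"))
rewritten = ast.fix_missing_locations(rewritten)
rewritten_value = eval(compile(rewritten, "<ast>", "eval"))

print(node_types)
print(original_value, rewritten_value)
```

Whether that makes every data format a language is the actual disagreement here; the sketch only shows that the code-to-data direction is mechanical.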
lispm 15 days ago [-]
> common lisp, emacs lisp, racket and scheme are different languages with exact same syntax
Far from it. On the s-expression level there are already differences. On the actual language level, Common Lisp for example provides function definitions with named arguments, declarations, documentation strings, etc.
For example the syntax for function parameter definition in CL is:
Above is a syntax definition in an EBNF variant used by Common Lisp to describe the syntax of valid forms in the language. There are different operator types and built-in operators and macro operators have especially lots and sometimes complex syntax. See for example the extensive syntax of the LOOP operator in Common Lisp.
f1shy 15 days ago [-]
Yes, of course I meant the basic S-exp syntax. They are indeed very different languages. IMHO the biggest differences are scoping, and 1-Lisp vs. 2-Lisp, which make for different worlds.
lispm 15 days ago [-]
all four now use lexical scope. Scheme also supports dynamic scope.
1-lisp or 2-lisp is also a difference, though all support lexical closures and function objects.
Racket now has a variant without s-expressions. That's also a huge difference.
CyberDildonics 15 days ago [-]
You keep saying there is some mythical "DSL" that isn't actually a new language, no new syntax, works with whatever tools (no word on what language or what tools), not an API, "adds semantic value", but there are no examples after all these comments.
> Well actually it is.
This is conflating the term 'language' to mean whatever you want at the moment. There are things that execute and things that don't. These two should be kept as separate as possible, but this is a lesson that people usually need to learn for themselves after being burned many times by complexity that doesn't need to be there.
> And data and code cannot be told apart. I can only recommend to go through the SICP lectures in youtube
> A C program is also a data
You aren't the first person to be mesmerized by SICP, but if someone gets involved in thinking something is a silver bullet, they will tend to try to find information that validates this belief and reject info that doesn't. This pattern is found elsewhere in life too.
To understand some context, early in the life of LISP and Scheme, there weren't as many scripting languages and people mostly hadn't had a lot of experience with being able to eval tiny programs in their programs. These days that might be used to enable people to write small expressions in a GUI instead of a constant parameter. Many times in programming history people see something new and think it will solve all their problems.
Java went through the same thing. For a long time people thought deep inheritance hierarchies would save them until gradually people realized how ridiculous and complicated it made things that could be simple. Inheritance from a base object let people use general data structures and garbage collection + batteries included seemed great, but programmers conflated everything together and thought this terrible aspect of programming was a step forward.
Lisp was very influential; people didn't have scripting languages back then. But it isn't a modern way to program.
Data formats are a separate issue and mixing in execution to those is a bad idea too, because the problem they solve is getting data into a program. When you put in execution you no longer know what you're looking at. Instead of being able to see or read directly the data you want, now you need to execute something to see what the values actually are. When you need to execute something you have all sorts of complexity including the need to debug and iterate just to see what was once directly visible.
f1shy 15 days ago [-]
>You keep saying there is some mythical "DSL" that isn't actually a new language, no new syntax, works with whatever tools (no word on what language or what tools), not an API, "adds semantic value", but there are no examples after all these comments.
I gave you 2 examples, one in lisp, one based on JSON. I said no new syntax, but indeed you have to learn something, if it is a DSL, it is a new language, is on the very name. As long as you make something new, it has to be learned. The point is, if the new thing looks very near the problem domain, an expert in that domain will have no problem in learning it faster than anything else. Again, what are the alternatives?
I do think data and code must no be separated strictly. I do bot like the OOP hype because the reasons you mentioned about Java. BUT: the idea of putting together data and the code in an object I find good in general.
> You aren't the first person to be mesmerized by SICP, but if someone gets involved in thinking something is a silver bullet, they will tend to try to find information that validates this belief and reject info that doesn't. This pattern is found elsewhere in life too.
I do think SICP is great, and it was a before and after for me. But I did not find any silver bullet there, quite the opposite: I learned many good ideas, DSLs among them, but I use them only when they make sense.
> Java went through the same thing.
My take on Java (a little off topic): like many other popular languages, it started as a bunch of very good ideas and became a victim of its own popularity. It was overhyped as the solution for everything and got bloated, and many subpar programmers started writing tons of it, until the whole ecosystem was totally ruined. Something similar happened with BASIC and VB, and is happening with Python to a certain degree.
> because the problem they solve is getting data into a program. When you put in execution you no longer know what you're looking at. Instead of being able to see or read directly the data you want, now you need to execute something to see what the values actually are. When you need to execute something you have all sorts of complexity including the need to debug and iterate just to see what was once directly visible.
It sounds to me like you got burned by a shitty mixing of code and data, that made your life hard.
> This is conflating the term 'language' to mean whatever you want at the moment. There are things that execute and things that don't.
A language does not have to be executable. There are query, configuration, and markup languages. A DSL need not be a new scripting language, or even executable. It can be for configuration. And note that I’m not stretching the definition by any means: TeX and MD are languages, it says so all over the documentation. Also SQL is a language. Maybe we have a different definition of language, and that is where all the confusion comes from? Again, I’m 100% sure that if we met we would be on the same page on 95% of the topics! :)
CyberDildonics 15 days ago [-]
It seems like most of what you're saying is just conflating different meanings of the overloaded term 'language'.
Yes, people use the term language for different things, it doesn't mean they are the same.
Also what you called a language in your first example everyone else would call an API. What you called a language in your second example is just a config file.
It seems that the reality of what you're saying is that you are using 'lots of little languages' because you are calling lots of things languages that no one else does.
vbezhenar 17 days ago [-]
If you wrote some functions, it's not DSL, it's functions.
If you're calling them in a fancy way with overloads and whatnot, it's not DSL, it's fancy functions.
DSL is domain specific language. It includes domain specific syntax, domain specific semantics and domain specific libraries.
f1shy 17 days ago [-]
Absolutely not. It may have specific syntax, but that is not required. Where did you get that definition? In fact the typical example is in Lisps, where you add no syntax.
It is not about fancy functions, and not about new syntax. It is about adding semantic value. If somebody adds a collection of functions that allow the expression of solutions to a problem in the very language of the problem, that is a DSL. If the syntax chosen, for whatever reason (e.g. simplicity), happens to be the same as that of some underlying language, that takes nothing away from the fact that it is a DSL.
If you look at the examples in SICP, they are “just” fancy functions. But they are DSLs.
An extract of the wikipedia article:
An embedded domain-specific language (eDSL), also known as an internal domain-specific language, is a DSL that is implemented as a library in a "host" programming language. The embedded domain-specific language leverages the syntax, semantics and runtime environment (sequencing, conditionals, iteration, functions, etc.) and adds domain-specific primitives that allow programmers to use the "host" programming language to create programs that generate code in the "target" programming language.
f1shy 17 days ago [-]
Exactly. Good DSLs are typically (although not always) embedded in another language. When that is the case, they tend to be a perfect abstraction (if decently implemented).
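For anyone who wants to see what "embedded, no new syntax" can look like, here is a minimal sketch in Python. It is not taken from SICP or the Wikipedia article; the names `Field`, `Rule`, `needs_review` and the order records are all made up for illustration. The only point is that plain classes and operator overloading in the host language can read like the problem domain without any parser.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Record = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A predicate over a record; rules combine with & and |."""
    test: Callable[[Record], bool]

    def __and__(self, other: "Rule") -> "Rule":
        return Rule(lambda r: self.test(r) and other.test(r))

    def __or__(self, other: "Rule") -> "Rule":
        return Rule(lambda r: self.test(r) or other.test(r))

    def __call__(self, record: Record) -> bool:
        return self.test(record)


@dataclass(frozen=True)
class Field:
    """Names a field of a record; comparisons produce Rule objects."""
    name: str

    def __gt__(self, value: Any) -> Rule:
        return Rule(lambda r: r[self.name] > value)

    def equals(self, value: Any) -> Rule:
        return Rule(lambda r: r[self.name] == value)


# Hypothetical domain vocabulary built from the primitives above.
total = Field("total")
country = Field("country")
needs_review = (total > 1000) | (country.equals("XX") & (total > 100))

orders = [
    {"id": 1, "total": 50, "country": "XX"},
    {"id": 2, "total": 500, "country": "XX"},
    {"id": 3, "total": 5000, "country": "DE"},
]
flagged = [o["id"] for o in orders if needs_review(o)]
print(flagged)  # [2, 3]
```

Whether you call this a DSL or "just" fancy functions is exactly the disagreement in this thread; the code is the same either way.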
PaulHoule 16 days ago [-]
With static imports in Java I build DSLs that are basically Lisp-style DSLs like
var f = f1(f2(a,b,f3(c,quote(f4))));
which have a grammar backed by the full faith and credit of the Java type system. You can code gen the static imports.
seanmcdirmid 17 days ago [-]
Eh, you can host DSLs in Kotlin and C# these days, you don’t even have to sell your engineering team on Lisp. The biggest challenge is to explain how an embedded DSL differs from being just a library (interop outside of the eDSL to the host language is still hard).
billyp-rva 17 days ago [-]
> (1) DSLs work great sometimes.
I'll take a stab at fleshing this out: DSLs work great when they have an IDE with autocomplete and a quick (or instant) feedback loop.
diggan 17 days ago [-]
I was gonna say something like "DSLs work great when they're small, purposeful and easy to test", I guess yours kind of helps when they're not what I'd suggest :)
tinthedev 17 days ago [-]
This whole thread said exactly what I wanted to write. Feels bad to be so pre-empted.
DSLs that solve a specific problem with a page or two of documentation overhead are great.
Trying to reinvent paradigms, or scope creep, is where the pain comes in. Seems like the post author has been burned by that type of DSL.
earnestinger 17 days ago [-]
> DSLs that solve a specific problem with a page or two of documentation overhead are great.
Do you have any example? I’ve heard lots of good things about DSLs, but never had the luck to witness their full glory.
(except for regex, which I love, but it has more than two pages of docs)
tinthedev 15 days ago [-]
I've coded some myself, and have used some... but it depends on where you draw the line.
I'd consider Python's f-string syntax a DSL of sorts.
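A quick illustration of why the format spec inside an f-string feels like a tiny language of its own. The values and variable names are arbitrary; the comment about the spec layout is a simplified view, not the full grammar from the Python docs.

```python
value = 1234.5678
name = "cpu"

# Everything after the colon is a small program.
# Simplified layout: [fill][align][#][0][width][,][.precision][type]
fixed = f"{value:.2f}"      # two decimal places
grouped = f"{value:,.1f}"   # thousands separator, one decimal place
padded = f"{name:>8}"       # right-align in a field of width 8
hexed = f"{255:#06x}"       # hex with 0x prefix, zero-padded to width 6
percent = f"{0.256:.1%}"    # multiply by 100 and append a percent sign

print(fixed, grouped, repr(padded), hexed, percent)
# 1234.57 1,234.6 '     cpu' 0x00ff 25.6%
```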
YAML might be considered a simple DSL, if you don't consider it a language/format instead. It's a bit more than 2-3 pages, but it's not hundreds of pages. And a simplified version could be constructed with <10 pages.
The same goes for Markdown. I'd call that a DSL too, and it's even simpler than YAML.
Then, something more tiered as: CSV, JSON, TOML, INI, AsciiDoc
Once you're in the short form, it's a bit blurry what's a format, what's a DSL, and what is a language.
PS. Sorry for the late answer, I missed the direct question for a bit.
f1shy 17 days ago [-]
I would say that is the line between DSL and just “L” another language…
PaulHoule 17 days ago [-]
Maybe his problem is with those yucky distributed systems as in Kube, Antilles, etc. I write plain ordinary Java programs that work with a cloud API and compile bash scripts that colonize machines.
verdverm 17 days ago [-]
I think of Kube as more of abstractions whereas Wasp & Dark lang would be DSLs for the same concepts
crabbone 17 days ago [-]
Counterexample: regex. In terms of how successful DSLs are, I think something like Perl's regular expressions is at least in the top ten. Most regex users don't care about there being an IDE for it, and I don't think there's a lot of value in regex autocomplete, even if such a thing existed.
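As a small reminder of how much a regex packs into one line, here is a sketch using Python's `re` module (Perl-style syntax, though not Perl itself). The log format and the group names are invented for the example.

```python
import re

# Hypothetical log format: "<date> <LEVEL> <message>"
LOG_LINE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) "  # ISO-style date
    r"(?P<level>[A-Z]+) "            # log level in capitals
    r"(?P<message>.*)"               # rest of the line
)

match = LOG_LINE.fullmatch("2024-07-19 ERROR disk quota exceeded")
fields = match.groupdict() if match else {}
print(fields)
# {'date': '2024-07-19', 'level': 'ERROR', 'message': 'disk quota exceeded'}
```

Writing the same parser by hand with string splitting is doable, but it is longer and says less about the shape of the input.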
hot_gril 12 days ago [-]
Regex is a good counterexample. It's the only useful DSL I can think of. That said, IDE support for regex would be cool, especially considering many languages have special syntax for regex.
mdaniel 17 days ago [-]
> and a quick (or instant) feedback loop.
And yet Terraform/Tofu continues to poison people's brains. It boggles(!) the mind
f1shy 17 days ago [-]
So any DSL as an embedded language inside another.
Spivak 17 days ago [-]
Also k8s is living proof that control loops can and do work. That's like the entire point of its existence.
kernelbandwidth 16 days ago [-]
Counterpoint: k8s is a bad orchestration system with bad scaling properties compared to the state machine versions (Borg, Tupperware/Twine, and presumably others). I say this as someone who has both managed k8s at scale and was a core engineer of one of the proprietary schedulers.
xorcist 15 days ago [-]
It is, and from experience it is also a good example how control loops are harder than you think. Few people understand it, and it is the source of much underlying trouble.
einpoklum 17 days ago [-]
(3.b) "Being clever rather than over-provisioning" is not generally thought of as a good idea. People would be rather apprehensive if you told them "I'm doing something really clever so we can under- or exactly-provision". I mean, sure, it may indeed work, but that's not the same thing.
(5) Hybrid parallelism - also, many people think it's a bad idea because it makes your software system more complex. Again, it may be very useful sometimes, but it's not like many people would go "yes, that's just what I'm missing right now, let's do parallelism with different hardware and different parts of the workflow and everything will run something kind of different and it'll all work great like a symphony of different instruments".
pizlonator 17 days ago [-]
I think you’re responding to the tweet from Martin that Steven includes at the top, not to Steven’s list.
echelon 17 days ago [-]
The tweet has some claims that don't fly.
You don't get the luxury of offline migrations or single-master writes in the high-volume payments space. You simply don't have the option. The money must continually flow.
kelnos 17 days ago [-]
That's fine. The points made in the tweet are interesting to discuss too.
hot_gril 17 days ago [-]
I've yet to see a DSL work great. Every single time, I'm asking "why isn't this just Python (or some other lang)" especially the times when it's some jacked up variant of Python.
dude187 17 days ago [-]
Groovy running on jython
hot_gril 12 days ago [-]
Oh wait, regex is a DSL
shermantanktop 17 days ago [-]
There are many successful examples of all of these. Using “almost” as an escape hatch doesn’t work here.
This is just pessimism and weary cynicism. I get it, I’ve felt that way too, and sometimes it’s hard to talk an eager engineer out of a bad idea. But for me, this vibe is toxic.
tptacek 17 days ago [-]
It's just a high-engagement (if you like, "bait-y") way of saying "these are deceptively tricky things to get right or deploy effectively".
yuliyp 17 days ago [-]
A lot of those "successful" examples have teams of battle-scarred engineers dealing with all the failures of those ideas. Control loops running away to infinity or max/min bounds, a cache that can't recover from a distributed failure, corrupted live-migrated state, bursts causing overload at inconvenient times, spurious anomaly-detection alerts informing you of all the world's holidays, etc.
Underneath all of those ideas is a tangle of complexity that almost everyone underestimates.
api 17 days ago [-]
I know I'm not alone in this, but after doing this for more than 20 years I can't shake the idea that we are doing it wrong -- meaning programming. Is it really this nit-picky, brittle, and hard?
The brittleness is what gets me. In physical mechanical and even analog electrical systems there are tolerances. Things can almost-work and still work for varying degrees of work. Software on the other hand is unbelievably brittle to the point that after 50 years of software engineering we still really can't re-use or properly modularize code. We are still stuck in "throw it away and do it over" and constantly reinventing wheels because there is no way to make wheels fit. The nature of digital means there are no tolerances. The concept isn't even valid. Things fit 100% or 0%.
We keep inventing languages. They don't help. We keep inventing frameworks. They don't help. We keep trying "design patterns" and "methodologies." They don't help. If anything all the stuff we invent makes the problem worse. Now we have an ecosystem with 20 different languages, 50 different runtimes, and 30 variants of 5 OSes. Complexity goes up, costs go up, reusability never happens, etc.
I remember for a while seeing things like the JVM and CLR (the VM for C# and friends) as a way out-- get away from the brittle C API and fully compiled static native code and into a runtime environment that allowed really solid error handling and introspection. But that paradigm never caught on for whatever reason, probably because it wasn't free enough. WASM is maybe promising.
intelVISA 17 days ago [-]
No, it's not terribly hard for a suitably compensated and skilled team with the appropriate tools and timeframe. Yes, we are doing it wrong.
Most of the 'inventions' you describe are more aimed toward reducing the barriers of entry: the promise that your team of expensive C wizards would now use Java at greater speed with less defects became "now we can just use cheap CS grads" at slightly worse but still acceptable levels.
Without any real consequences for poor software (see CrowdStrike's YTD despite its multi-billion dollar farce in July) it's only logical that the standard will always be "bare minimum that can be shipped". Developer productivity is a misnomer really - it just means company profits increase thanks to a widening pool to hire from and even more crapware per dollar can now be squeezed from each worker.
skywhopper 17 days ago [-]
Nah. CLR/JVM/WASM are the same pipe dream of a universal architecture/shared-compute utopia.
But I think you have the wrong take on “reusability”. Every non-software engineering project is an exercise in custom solutions as well. The reusable parts are the tools and materials. Likewise in software engineering the languages, OSes, protocols, libraries, design patterns, and frameworks are the reusable bits. Code is how we describe how it all fits together, but a huge amount of what it takes to run a system is being constantly reused, much bigger than the code we write to implement it.
api 17 days ago [-]
I’m aware that many HN people view this as a pipe dream, but why? It works. The largest compute platform in the world, namely the web, is like this, and many of the largest businesses run on the JVM and the CLR. Loads of nasty problems go away when you are not directly mangling bits in memory and when you have a real runtime.
Of course modern safe languages like Rust give you some of those benefits in compiled code too.
wakawaka28 17 days ago [-]
>The brittleness is what gets me. In physical mechanical and even analog electrical systems there are tolerances. Things can almost-work and still work for varying degrees of work.
I think you are misrepresenting how flexible software is versus hardware. Mechanical and electrical systems have tolerances but if you go outside those tolerances, the whole system can be destroyed. Nothing like that is common in software. Worst-case outcomes might be like "the performance isn't as good as we want" or "this code is difficult to work with." Software components are very flexible compared to anything physical, even in the worst cases.
>We are still stuck in "throw it away and do it over" and constantly reinventing wheels because there is no way to make wheels fit. The nature of digital means there are no tolerances. The concept isn't even valid. Things fit 100% or 0%.
I don't know how one can look at the amazing array of libraries out there and conclude that we have no reuse. Sometimes people build their own solutions because they need something very simple and the libraries are too big to be worth importing and learning in those circumstances. That's not a flaw in the libraries. It's human nature.
>We keep inventing languages. They don't help. We keep inventing frameworks. They don't help. We keep trying "design patterns" and "methodologies." They don't help. If anything all the stuff we invent makes the problem worse. Now we have an ecosystem with 20 different languages, 50 different runtimes, and 30 variants of 5 OSes. Complexity goes up, costs go up, reusability never happens, etc.
All of this is too pessimistic. These tools do help. Exactly how many languages do you think we should have? Do you think exactly one group is going to develop for each use case and satisfy everyone?
>WASM has its uses but I can't escape the idea that it's like "let's build a VM and carry all the shortcomings of C into it."
I'm not a web guy but this sounds silly. It's not meant to be written directly. Complaining about shortcomings of WASM is literally like complaining about shortcomings of assembly language. It's not intended for human consumption, in modern times.
ghaff 17 days ago [-]
There's probably more tendency to be sloppy in software because "we can always fix it in post." But absolutely, with hardware, if you don't get it right--especially with heavy construction or modern electronics, you're going to have to rip a lot out and start over.
vacuity 17 days ago [-]
I've thought that we should be able to move so fast with software, and sometimes I see that successfully, but usually it seems like we leverage computers' superior performance poorly. To pick on frontend, it's trivial to create the next shiny Javascript framework, but what of it? Or, how come refactoring can be so painful, when the semantic change might be small? I think the flexibility and performance of computers is such that we programmers are usually incapable of effectively using them. It's like a 3D optimization problem visualization, looking for the highest peak around, except the cursor moves a lot faster than it can peruse the landscape. It zooms around aimlessly, easily getting to arbitrary places but without the capacity to make sense of them. When the train is in motion, switching tracks is hard, even if that would be the best move.
ghaff 17 days ago [-]
It's easy to assume that you can always fix things after the fact. And I was only half joking with the fix it in post comment. Modern film suffers from some of the same problem. No need to get it right on the first pass. We can always apply corrections later.
NBJack 17 days ago [-]
I think that's fair: the success stories are very rarely "yay, we did it!" but much more often "this single change was the sole focus of a team/multiple teams for X months, and the launch/release/fix was considered a culmination of a significant investment of resources".
samatman 17 days ago [-]
Basically agree. This is a list of systems ideas which are harder than one might initially think, and should be approached seriously, never casually.
That slant-rhymes with "sound good but almost never work" but in detail is completely different. When treated as difficult problems, and committed to accordingly, having them work and work well is a normal result, eminently achievable.
As afterthoughts, or when naïvely thought to be easy, then yeah, they frequently go poorly.
phil21 17 days ago [-]
> This is just pessimism and weary cynicism.
I don't read it that way. I read it as engineers engaging in pre-optimization for no business benefit. It's utterly rampant in the industry because it's fun to design and build a redundant auto-scaling spaceship vs. just over-provisioning your server by 200% for a tenth (or less!) of the cost and having backups ready to deploy in a few hours.
Sometimes these ideas make sense - after you need them. Not designed-in at the early product stage. Very few products go on to need the scale, availability, or complexity most of these implementations try to solve.
pizlonator 17 days ago [-]
I think Steven is saying that these things are hard and usually don’t work out, not that they’re impossible.
f1shy 17 days ago [-]
Mmmm danger! Then let’s stop doing anything that is difficult?
If that is not the message, what is it? “These things are hard, often don’t work, but GO FOR IT”?
I pretty much read: “try to avoid”, which is bad advice in my opinion. Like “documenting SW properly while doing development is hard, and often goes wrong”. So what?!
Spivak 17 days ago [-]
Don't do anything difficult until you've exhausted the easy ways first is generally good advice.
It's "work smarter not harder" for knowledge workers who don't realize that using your brain more is the hard work in the saying.
aprilthird2021 17 days ago [-]
Pessimism and weary cynicism can be very valuable in many tech environments though. It keeps you stable, working on tried and tested things, and safe. There's a lot of situations where that's super valuable
shermantanktop 17 days ago [-]
I had to learn that attitude in order to operate at one employer. I had to unlearn it to survive at the next, where engineers were better and routinely pulled off things I had dismissed as unrealistic.
The value of that approach is very situational…though I will acknowledge that the majority of places probably warrant at least some of that.
riwsky 17 days ago [-]
So many people here trying to thread the needle looking for subtle decision functions for exceptions. It's pretty simple, really: these ideas are awesome when I do them, and never work as intended when that idiot before me did them.
plagiarist 17 days ago [-]
That sounds exactly right. And sometimes the idiot before me was myself from several months ago.
stevebmark 17 days ago [-]
I would add "Domain Driven Design" - locking your business design in place by trying to make your application match your business structure is a recipe for disaster. If you have a small or stagnant business you probably won't notice any issues. If your business is successful and/or grows, you're going to immediately regret trying to build domains with horrific descriptive names tied to your already obsolete business practices. Instead, design around functionality layers (how we've been doing it for decades, tried and true), and as much as possible keep business logic in config, rows in databases, and user workflows, which makes them extremely flexible.
eddythompson80 17 days ago [-]
You'll regret both options. You mentioned the pitfalls of "domain driven design" (outdated language, little code/systems reuse for new endeavors).
However, a highly abstract design with all the business logic in config, workflows, etc. will only make your system extremely flexible as long as everyone up and down the organization is fairly aware of the abstractions, the config, and the uncountable permutations they can take for your business logic to emerge.
Those permutations quickly explode into a labyrinth of unknown/unexpected behaviors that people will rely on. It also makes the cost of onboarding new developers, or changing the development team, insurmountable. Your organization will be speaking 2 different languages. Most seemingly straightforward "feature asks" that break your abstraction either become a massive system re-design/re-architecture or a "let's just hack this abstraction so it's a safer smaller change for now". The former will always be really hard unless you have excellent engineers who have full understanding of the entire system, its behavior, and its code base, along with excellent engineering practices and processes, and it will still take you months or years to pull off. The latter is the more likely to happen, and it's why all those "highly abstract, functionality layers, config driven, business logic emerging" projects start perfect and flexible and end up as a "what the fuck is even this".
After a system is implemented, that emergent business logic becomes the language everyone will speak in. Having your organization speaking 2 or 3 completely irreconcilable languages is very painful, and unless you have multiple folks up, down and sideways in the organization who can fluently translate between them, you'll be in a world of pain and wish you had some closer representation of your domain.
jkaptur 17 days ago [-]
Along with "make impossible states impossible to represent". If you're designing your types to make a state unrepresentable, you'd better be absolutely sure the state really is impossible for the lifetime of the design.
wesselbindt 16 days ago [-]
I don't understand, you're not allowed to change the types? Suppose you disallow any number other than 1, 2, and 3, and you model this with some enum with three members or something. Then you find out that actually a prospective client would love to also work with the number 4. Then you just add 4 to the enum in your next release, no?
spencerflem 17 days ago [-]
If it's not impossible, you should handle it though.
I think what that quote is against is the common middle ground where states are expected to be 'impossible' and thus not handled, and cause bugs when they turn out not to be impossible after all.
Either deciding that they are possible or that they must be impossible is usually better, and which one to go with depends on the specifics.
pvillano 17 days ago [-]
The variant of this I've found useful is to have separate types for raw/dirty and parsed/validated data.
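Roughly what that looks like, as a minimal sketch in Python. The signup form, the field names, and the validation rules are all hypothetical and deliberately crude; the shape is what matters: one type for untrusted input, one for checked data, and a single function that converts between them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawSignup:
    """Untrusted input exactly as it arrived (hypothetical form fields)."""
    email: str
    age: str


@dataclass(frozen=True)
class Signup:
    """Validated data; by convention only parse_signup constructs this."""
    email: str
    age: int


def parse_signup(raw: RawSignup) -> Signup:
    """Validate once at the boundary; raise ValueError on bad input."""
    email = raw.email.strip().lower()
    # Crude illustrative check, not a real email validator.
    if email.count("@") != 1 or email.startswith("@") or email.endswith("@"):
        raise ValueError(f"invalid email: {raw.email!r}")
    age = int(raw.age)  # raises ValueError for non-numeric text
    if not 0 < age < 150:
        raise ValueError(f"age out of range: {age}")
    return Signup(email=email, age=age)


def send_welcome(signup: Signup) -> str:
    # Accepts only the validated type, so nothing is re-checked here.
    return f"Welcome, {signup.email}"


greeting = send_welcome(parse_signup(RawSignup(" Ada@Example.com ", "36")))
print(greeting)  # Welcome, ada@example.com
```

Python won't stop anyone from building a `Signup` by hand, so this is a convention backed by a type checker rather than a hard guarantee.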
Brian_K_White 17 days ago [-]
I don't understand the load-responsive control loop one. That's a basic and fundamental component in countless systems. The centrifugal governor on a 1800's steam engine or 1900's victrola record player is a load-responsive control loop. All of electronics is a mesh of load-responsive control loops. The automatic transmission in your car...
lclarkmichalek 17 days ago [-]
The usual issue is the addition of control loops without much understanding of the signals (CPU utilization is a fun one), and the addition of control loops without the consideration of other control loops. For example, you might find that your cross region load balancer gets into a fight with your in-process load shedding, because the load balancer's signals do not account for load shedding (or the way they account for the load shedding is inaccurate). Other issues might be the addition of control loops to optimize service local outcomes, to the detriment of global outcomes.
My general take is that you want relatively few control loops, in positions of high leverage.
panic 17 days ago [-]
It’s not totally clear, but it could be talking about CPU load in particular, which has some problems as described in https://arxiv.org/abs/2312.10172.
Spivak 17 days ago [-]
I've always used connection backlog as the metric for load and it's worked pretty well. Most web servers have it as a number you can expose as a metric. It's not perfect but it's at least a true measure of when servers are behind.
nostrademons 17 days ago [-]
There is a pattern to all of these problems, notably that they are all orthogonal concerns that add constraints to the sequential data-munging programming model that programmers are familiar with. Whenever you add constraints, you add things that future programmers need to think about, for all future development they do on the system. It is very easy to get into a situation where the system is overconstrained and it's impossible to make forward progress without relaxing some of the constraints. Even if it's not impossible, it's going to be slow, as developers need to consider how their new feature interacts with the API/security/synchronization/latency/other-platforms/native-code that the system has already committed to supporting.
That's also why it's possible to support all of these attributes. If you make say transparent data synchronization a core value prop of the platform, then all future development supports that first, and you evolve your feature set based on what's possible with that constraint. That feature set might not be exactly what your users want, but it's what you support. Your product appeals to the customers for whom that is their #1 purchase decision.
fancyfredbot 17 days ago [-]
I've worked on several DSLs, a P2P cache, and a project employing hybrid parallelism. All of them worked. All great fun to create. With one exception these projects were good investments (The P2P cache wasn't necessary so never really paid off). My point is that it's definitely wrong to say these things almost never work. They are complicated but that complexity brings functionality which is hard to achieve in another way. The lesson from the P2P cache example was to be sure you actually need that functionality.
ibejoeb 17 days ago [-]
"Let's just sync the data" is the reason why rough days exist, as far as I'm concerned. I've run into so many systems that were designed for "internet scale" or whatever that add queues, event processing, etc., when the natural range is so far below that threshold. These teams are either naive or, at worst, taking advantage of non-engineering management and funneling money toward playing with these problems for the fun of it.
mrkeen 17 days ago [-]
I read "just sync the data" as naively doing reads over here and writes over there, and hoping that the two sources of truth don't diverge.
Queues and event processing are a necessity to do it right.
ibejoeb 17 days ago [-]
That's sort of the point. The operative word is "just." In reality, it adds a huge subsystem that needs more resources than the overall system does. Rather than solve the underlying problem directly, they introduce (more interesting) problems to solve transitively via these complexities.
mrkeen 17 days ago [-]
I don't know of any way to replicate data ("solve the underlying problem") aside from a replicated log.
It's like git solving the problem without commits, or banking solving the problem without transactions.
ibejoeb 17 days ago [-]
Ah, ok, I'm failing to communicate my point. The underlying problem in this scenario is something basic which doesn't require replication/sync at all. I'll use an actual example that fell on me a while back. There was an app that displayed a news feed. The theoretical upper limit of the volume, due to the size of audience, was hundreds of posts per day, while the practical number was dozens. The architecture was:
* postgresql as the system of record
* firestore as the upstream source for clients
* ES for full-text search
* client-side store for actual client-side reads
* http api for mutations
That required three sync systems: pg->firestore, pg->ES, and firestore->local store. Then it needed messages for the async mutations to propagate back to the clients. And then these things require more things to make them work, like data transformers to support the three different formats for each stage.
This certainly did not require some giant CQRS system and could have been built entirely on postgres. It was a fractal of code that didn't go toward the actual objective.
bhouston 17 days ago [-]
I've executed a bunch of these ideas successfully. So it reads a little weird.
mdavid626 17 days ago [-]
Either you really know what you are doing, or, you really don't.
liontwist 17 days ago [-]
Anyone have any experience with the control loop one?
My uninformed guess is CS people just underestimate the skills and experience needed to analyze feedback systems and so write it off as a bad technology after a poor implementation.
Maybe the real problem is you want to know when your system is maxing out the range you anticipated?
avidiax 17 days ago [-]
Control theory is an entire discipline.
One problem that constantly comes up is ratcheting or poisoning your own inputs.
Let's say you want to block "noisy neighbors" from taking large amounts of resources, while allowing all loads to have bursty use of the full system power. Easy, right? Detect the noisy neighbors and throttle them to a small percentage. Unthrottle when they show some substantial idle time. But now, many of those noisy neighbors can't get out of jail because they will, of course, use nearly 100% of their restricted load, even if they would now be well-behaved and merely bursty.
There is also a cascading effect, where you have N related bursty loads. One bursts for too long, gets throttled, and now the load is handled by the remaining N-1 loads. But that makes those loads more likely to get throttled, and so on. Only unthrottling all the loads simultaneously will allow them to return to normal bursty operation.
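A toy simulation makes the "can't get out of jail" effect easy to see. This is only a sketch under made-up assumptions: the `simulate` function, its thresholds (`hot`, `cool`, `window`) and the capacity numbers are hypothetical and not taken from any real scheduler.

```python
def simulate(arrivals, full_cap=100, jail_cap=10, hot=0.9, cool=0.5,
             window=3, judge_by="usage"):
    """Return, per tick, whether the load is throttled (hypothetical model)."""
    throttled = False
    backlog = 0
    streak = 0
    history = []
    for work in arrivals:
        cap = jail_cap if throttled else full_cap
        backlog += work
        served = min(backlog, cap)
        backlog -= served
        if judge_by == "usage":
            load = served / cap        # fraction of the *current* limit in use
        else:
            load = work / full_cap     # offered demand vs. the unthrottled limit
        if not throttled:
            streak = streak + 1 if load >= hot else 0
            if streak >= window:       # sustained heavy use: throttle
                throttled, streak = True, 0
        else:
            streak = streak + 1 if load <= cool else 0
            if streak >= window:       # sustained idle time: unthrottle
                throttled, streak = False, 0
        history.append(throttled)
    return history

# A three-tick burst, then a modest 20% of full capacity.
arrivals = [100] * 3 + [20] * 20
print(simulate(arrivals)[-1])                     # still throttled at the end
print(simulate(arrivals, judge_by="demand")[-1])  # released once demand drops
```

Judged by usage of the restricted quota, the load looks 100% busy forever, because 20 units of demand never fit into a cap of 10. Judging by offered demand against the unthrottled capacity is one possible way out, assuming demand can be measured at all.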
bobnamob 17 days ago [-]
The fun thing about control theory in CS is that (in my experience) it's far easier to build accurate simulations or even run scaled down experiments than it would be in a physical setting.
See articles like [1] or any of Marc Brooker's [2] blog for inspiration
DSLs make the impossible possible. Whether this is good or bad depends on the design and implementation.
heisenbit 17 days ago [-]
Compared to general purpose programming when using real world DSLs one also can encounter the case where the unusually possible becomes virtually impossible.
fancyfredbot 17 days ago [-]
Which often is exactly what the DSL authors would have intended. If a DSL can make the common case faster, easier to program, and less buggy then that DSL is a success. The uncommon case should be written in a general purpose language just as it would have been if there was no DSL. Writing a DSL to be general enough to handle every possible use case is probably going to end up being a general purpose language, and it's probably going to be worse than the one you started with.
k__ 17 days ago [-]
Most of these things work pretty well.
It's just that they don't "just" work.
Just because AWS or Google can pull it off doesn't mean it's something anyone can do.
sgarland 17 days ago [-]
THIS. And adding to it, most people really do not like being told that they aren’t as clever as AWS or Google.
jcims 17 days ago [-]
Yep, I think the title should be “things that are way harder than they seem”.
mjaseem 17 days ago [-]
I feel "Let's just" is doing much of the work here. You can add the phrase in front of any idea and make it sound awful:
"Let's just make planes fly themselves"
"Let's just put a giant battery in a car instead of an engine"
"Let's just make electricity from wind"
"Let's just be friends"
I would love to see a list like this without such a big asterisk.
earnestinger 17 days ago [-]
Let’s just make a list without asterisk.
ChrisMarshallNY 17 days ago [-]
The Programmer’s Credo:
We do what we do. Not because it is easy; but because we thought it would be easy.
nottorp 17 days ago [-]
> Let's make it cross-platform.
So if making it cross platform "won't work", this is not about what won't work, but what is cheaper?
Since the OP kinda mentions gaming, let the customers fiddle with their wine installs and steam decks and spend the time on adding more loot boxes instead?
IshKebab 17 days ago [-]
Yeah I don't buy that at all. He says "9 times out of 10" and then gives MS Office as the proof? MS Office is the 10th time out of 10, not the 1-9.
I think they absolutely could have made Office cross platform (it doesn't really do anything you can't do in Qt), and also the fact that they forked it and made two entirely separate sets of apps has pretty serious consequences. The feature sets are surprisingly different. E.g. you can add PDFs to documents on Mac; not on Windows.
Most software can be made cross platform (as long as you don't explicitly prevent it by using a platform specific GUI toolkit or whatever).
throwaway63467 17 days ago [-]
Hm I can think of several instances for each point where the ideas work just great. It’s just difficult getting complex stuff to work in general.
CyberDildonics 17 days ago [-]
A lot of times what people want from a DSL is really just something that doesn't have to compile and/or something to run in a VM. Sometimes people just want a text format for data that could be done with json.
Making a new language throws away everything and starts over including debugging, tools, syntax checking, auto completion, documentation etc. It is taken way too lightly and just becomes a hassle.
You can see this in multiple GUI libraries, where getting the parameters to set up a GUI is really not difficult and is just data through function calls, but it gets made into a separate XML-like DSL markup language with its own quirks and opacity, and for most people that XML is supplied as a big string from within the language that they're using.
This stuff persists because it sounds easier on paper, but it just creates more problems in practice, and it takes experience to realize all that you're losing. That's where designers need to be experienced and do what works instead of what will suck in people that don't know any better.
adzm 17 days ago [-]
With how easy it is nowadays to embed Lua, and even JavaScript with just a little bit more effort, it's pretty inexcusable to develop your own language for a tiny space. A great example of this is Adobe embracing JavaScript for its own extension and scripting and expression language, which has allowed it to benefit from an entire ecosystem of JavaScript libraries.
xyzsparetimexyz 17 days ago [-]
I've been writing glsl recently, and I think it's a good example of a DSL that _doesn't_ work. It should have just been C without the standard library, or ideally C++. Not being able to use pointers is especially painful. Hlsl is better and has some nice quality of life features. Rust-gpu is the best approach but basically nobody uses it.
poooka 17 days ago [-]
Disagree with plugin talk. I use a plugin system that loads from submodules so I can have the same architecture in multiple air-gapped networks. Can have implementations load (for that environment) outside of the main code base. Nothing wrong w that given the complexities of maintaining complex ass configs if everything was shoved into the same artifact.
ChrisMarshallNY 17 days ago [-]
Happy 2025, folks!
My (recent) experience:
> Let's make that asynchronous
It can be done, but here, there be dragonnes.
I just had to make a formerly synchronous load in a recently released app, into an async one, because users with large connection lists (think "friends," in Facebook, but not as "friendly"), were having extremely slow loads.
This was a big change.
First, I had to swap out an entire SDK that accessed the most important server in the app, because the old one didn't play well, with threads (my bad). That actually went fairly smoothly, because of the abstraction (boo hiss, I guess?) that I had used for the SDK. Took about a day, to have the operation running smoothly.
Testing...Testing...Testing...
Next, I had to test like crazy, for weeks, on the user-level code, because the new threading brought the beast out in that code. I found all kinds of places, where I had written thread-unstudly code. None of the issues were serious, and many folks would have said "Fuck it. Let's ship," but I'm a bit anal about certain things, and I'm not being paid, anyway...
In the end, the conversion was a success (not one complaint —fingers crossed), and we got the results we needed, but it took a hell of a lot of testing (especially monkey testing), and Release Day was a nervous one. The UI basically didn't change at all (except for some loading throbbers on the profile avatars), but under the hood, a lot had changed.
Sometimes, it needs to be done, but it's tempting to make it seem easy (which many folks will). I am a rather scarred veteran of "That should be easy," so I went in, with eyes open, and (as noted) had already prepared, with a certain level of abstraction. I figured that the SDK swap would be [relatively] easy, but didn’t bargain for all the little bugs, in the code I thought was already sorted.
Guthur 17 days ago [-]
I don't necessarily think the ideas themselves are wrong but rather the devil is in the details.
A good example is the DSL one. In reality every significantly complex software system is essentially a DSL, it will have its own collection of nouns and verbs that constitute the application domain, this is a language and rarely is it understandable to anyone without the domain knowledge.
The problems often arise when custom denotational semantics are added that hinder composition. This is overcome by 'not using a DSL' and instead just exposing the semantics of your implementation language, but essentially an API is little different from a DSL, just with all the baggage and coupling to the underlying implementation details.
NBJack 17 days ago [-]
I'm wondering about the first example of device drivers. Whether we are talking about Linux or Windows, anything beyond generic device drivers (which push the problem to the hardware) will frequently be vendor specific. You often can't get the full potential of a device until you do so.
exabrial 17 days ago [-]
I thought live migrating process state did work...? we have containers that can freeze/resume.
Granted, you have to freeze the whole damn OS image, but it does seem to work.
All of the others I agree, I've seen them tried so many times in my lifetime, from mainframes to serverless, and nobody gets them right.
thraxil 17 days ago [-]
Yeah, and I was live migrating entire VMs between hosts on a Xen cluster 10-15 years ago. It was easy for me to get working, reliable, and I was far from an expert.
bee_rider 17 days ago [-]
What do people want to use hybrid parallelism for in systems programming?
It works really well in the HPC world I think (MPI+OpenMP on a node is the de-facto standard). But… I dunno, I guess when I think systems programming I think of the stuff doing all the bookkeeping. The bookkeeper better not steal all my cores!
DSL is a sort of interesting one. What’s a DSL for systems programming? I’ll naively throw C and Rust as the systems programming DSLs. Of course, the domain of systems programming is, uh, controlling all the hardware. So it isn’t that surprising that the DSLs of systems programming quickly become the languages that everybody wants to use for everything, right? The problem isn’t the “specific language” part, it is that a good enough systems programming language quickly gets the domain of “everything,” haha.
IDK about anomaly detection being on that list. When I worked at a large tech company, the in-house anomaly detection and root cause analysis capabilities worked like magic compared to other places I've been. When done right, it can be extremely valuable.
jldugger 17 days ago [-]
Half of the magic is knowing _when_ to apply it. If you just dump your prometheus timeseries data into an anomaly detection system and ask it to constantly scan for anomalies, you will always find them.
As an SRE, I don't actually care about anomalies all that much. For initial alerting, I want phase shift detection. One customer sending a few bad API calls on one specific minute is uninteresting and pretty much inactionable. That same error rate over 10 minutes is more interesting and more likely to be a systems problem I can actually resolve. But the raw AD stream is just too damn noisy for all sorts of reasons. This is why our alerting tools have a `duration` field: any signal above threshold must remain so for multiple observation periods before summoning human inspection. And why health checks have grace periods and retries before killing services.
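The `duration` idea can be sketched in a few lines. This is a minimal illustration, not any particular alerting tool's implementation; the function name `alert_ticks` and the sample data are made up.

```python
def alert_ticks(samples, threshold, duration):
    """Return the indexes at which an alert would fire.

    An alert fires only once the signal has stayed above `threshold`
    for `duration` consecutive observations.
    """
    fired = []
    streak = 0
    for i, value in enumerate(samples):
        streak = streak + 1 if value > threshold else 0
        if streak == duration:   # fire once per sustained breach
            fired.append(i)
    return fired

# Hypothetical per-minute error rates: one blip, then a sustained problem.
error_rate = [0.0, 0.2, 0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.0]
print(alert_ticks(error_rate, threshold=0.1, duration=1))  # [1, 4]
print(alert_ticks(error_rate, threshold=0.1, duration=3))  # [6]
```

With `duration=1` the single bad minute pages someone; with `duration=3` only the sustained breach does, at the cost of a two-observation delay.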
Where anomaly detection works better, IMO, is post-alert analysis. At that point anomalies are welcomed as hypotheses, since the system features complex interactions between components. I've built a couple of dashboards using extremely simple math, like Laplace smoothing and time series correlation that help surface relevant information from the flood of metrics and logs collected. But critically, these tools generally don't use the time domain as their baseline. Usually, I'm comparing a cluster against another one in a different region, or one metric against another, rather than now versus twelve hours ago.
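As a rough idea of what "extremely simple math" could look like when comparing one cluster against another, here is a sketch using add-one (Laplace) smoothing on error rates. The endpoint names, counts, and the `rank_by_divergence` helper are all hypothetical; the dashboards described above may work quite differently.

```python
def smoothed_rate(errors, requests, alpha=1.0):
    """Add-alpha (Laplace) smoothed error rate for a success/failure outcome."""
    return (errors + alpha) / (requests + 2 * alpha)

def rank_by_divergence(cluster_a, cluster_b):
    """Rank endpoints by how much worse cluster A looks than cluster B."""
    scores = {}
    for name in cluster_a.keys() & cluster_b.keys():
        rate_a = smoothed_rate(*cluster_a[name])
        rate_b = smoothed_rate(*cluster_b[name])
        scores[name] = rate_a / rate_b
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# (errors, requests) per endpoint, made-up numbers.
region_a = {"/login": (30, 1000), "/search": (1, 2)}
region_b = {"/login": (3, 1000), "/search": (0, 3)}
for name, ratio in rank_by_divergence(region_a, region_b):
    print(name, round(ratio, 2))
```

Without smoothing, `/search` would show a 50% error rate against 0% and dominate the ranking on the strength of five requests. With smoothing, the well-sampled `/login` regression comes out on top.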
kitd 17 days ago [-]
Sounds like you hit the 1/10. Which is great and I agree, very rewarding.
cesaref 17 days ago [-]
Let's consider the original list. It starts with DSLs. Are we saying that using SQL 'almost never works'? Shader languages? BPF? If I squint a bit i'd say that all the inference engines are either DSLs or APIs which wrap an underlying DSL, so TensorFlow etc.
I could probably come up with similar examples from the rest of the list. If the message is 'system ideas which seem simple but are hard, so you should only use existing off the shelf examples of' then i'd certainly have more sympathy for the statement.
banq 17 days ago [-]
Why do people keep talking about it? Is it because they've fallen into a dialectical trap? No, it's because they lack the awareness that "Context is King."
cratermoon 16 days ago [-]
"Microsoft essentially existed because it made its business building cross-platform apps"
If your idea of cross-platform is making it work on all versions of Windows from 3.1 to Vista, sure.
Microsoft has always been about capturing users within its ecosystem.
LAC-Tech 17 days ago [-]
There are equations/properties for data that syncs nicely and allows multi-master writes. Ie, if your merges follow a few simple properties, this stuff will work.
These are basically what CRDTs are. And I know what you're thinking; "but I'm not writing a multi-user text editor!" or "automerge won't scale!". But CRDTs aren't a library, they're a set of properties. If your whole data system obeys the properties - you have a CRDT.
klysm 17 days ago [-]
I’m not really sure what you’re saying. Do you mean you achieve the guarantees of CRDTs without using CRDTs? What technology are you implicitly referring to?
Whenever I hear someone say “sync” data, I instantly get scared. Consensus is fraught with peril and very very very difficult to implement correctly.
LAC-Tech 17 days ago [-]
It's more shifting the view point. CRDTs are not just something you 'use'. They are laws a system must obey to achieve strong eventual consistency.
Eventual Consistency:
If Node A and Node B have received the same set of events (ie in any order), they will eventually have the same state.
Strong eventual consistency:
Exactly the same, but replace "eventually" with "immediately".
Meaning as long as all nodes get the same events, they'll have the same state, straight away. That's sync that works.
As long as your merge algorithm is commutative, associative, and idempotent, your sync will work. That's what the CRDT people uncovered.
A data structure can obey these laws (the aforementioned CRDT libraries)
A database can obey these laws (the original Amazon Dynamo did this, it was a CRDT).
Any arbitrary system can obey these laws (there's a 1982 paper that described this for a distributed file system)
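To make "a set of properties, not a library" concrete, here is a minimal sketch of a grow-only counter whose merge is a pointwise maximum. The properties usually listed for a state-based CRDT merge are commutativity, associativity, and idempotence, and the sketch checks those three. The node names and counts are made up.

```python
def merge(a, b):
    """Pointwise maximum of two per-node counters (a grow-only counter)."""
    return {node: max(a.get(node, 0), b.get(node, 0))
            for node in a.keys() | b.keys()}

def value(state):
    """The counter's value is the sum of every node's contribution."""
    return sum(state.values())

a = {"n1": 3, "n2": 1}
b = {"n2": 4, "n3": 2}
c = {"n1": 5}

assert merge(a, b) == merge(b, a)                      # commutative
assert merge(merge(a, b), c) == merge(a, merge(b, c))  # associative
assert merge(a, a) == a                                # idempotent

print(value(merge(merge(a, b), c)))  # 11
```

Nothing here depends on a CRDT library. Any merge function that passes the three checks for all inputs gives replicas that converge however the states are exchanged.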
klysm 17 days ago [-]
Okay I’m with you, but if I’m not mistaken I believe you have slightly mis-characterized what strong eventual consistency is.
> Strong eventual consistency: Exactly the same, but replace "eventually" with "immediately".
I believe this isn’t quite correct. I was under the impression that the delivery doesn’t have to be immediate, but rather that any two nodes with the same set of events, regardless of received order, must be in the same state.
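The "same set of events, regardless of received order" condition can be checked mechanically on a small example. The sketch below uses a last-writer-wins register with a `(timestamp, node_id)` tie-break; the event tuples and helper names are assumptions made for illustration.

```python
from itertools import permutations

def apply(state, event):
    """Keep whichever write has the larger (timestamp, node_id) pair."""
    if state is None:
        return event
    return max(state, event, key=lambda e: (e[0], e[1]))

def replay(events):
    """Fold a sequence of events into a final register state."""
    state = None
    for event in events:
        state = apply(state, event)
    return state

# (timestamp, node_id, value); two writes share timestamp 2 on purpose.
events = [(1, "a", "x"), (2, "b", "y"), (2, "a", "z")]
finals = {replay(order) for order in permutations(events)}
print(finals)  # {(2, 'b', 'y')}
```

Every delivery order ends in the same state, and nothing about the check requires delivery to be immediate.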
I'm taking my definitions from section 2.2 here. I feel like I've summarised it fairly accurately, but if I've made a mistake would be happy to be corrected.
mlhpdx 17 days ago [-]
Everything on the list I’ve seen done (been hands-on with) very successfully, most of them repeatedly. I’ve also seen many, many more failed (and often pointless) attempts. I don’t have any great answer to why other than the importance (practical value to customers) of it being clear, well understood by leaders and doers, and high.
If having an API is core to the value customers will realize then it’s likely a good API will emerge. Etc.
adzm 17 days ago [-]
> Windows NT is riddled with excess abstractions that were never really used primarily because they were there from the start before there was a real plan to use them.
It is? Like what? I know there are some abstractions for cpu architecture etc but they've come in handy for x64 and now arm and others in the past.
I know the author certainly has some insight into this but I've never really thought of NT as being riddled with excess abstractions.
senderista 17 days ago [-]
It's possible that he's referring to purely internal interfaces?
dools 17 days ago [-]
An amazing example of a DSL was Butler for Trello before it was acquired and dumbed down to a “no code” tool. It was magical to use. I had started consulting using it but they said I wasn’t allowed to use it anymore so I replicated the functionality as a library to run inside a Google sheet using Google apps script.
ghjfrdghibt 17 days ago [-]
The "let's make it cross platform" is what led me to dart and flutter. These aren't mentioned, and for their developers it is definitely a hard problem. But as far as I'm concerned they're doing a bang up job.
I agree with sync though. Hard problem with no simple solution for an idiot like me to do.
Sparkyte 17 days ago [-]
All rely heavily on one aspect, consistency. If your system engineering can consistently operate without change, innovation or issue, then it is possible that these ideas could exist in a closed environment. However they just don't work when dealing with external factors.
liontwist 17 days ago [-]
> Most of the first 25 years of computer science was figuring out how to make things work asynchronously
This is not true. Computers focused on single threaded designs before getting thrown into parallelism (need parallelism to run into dining philosopher problems).
mrkeen 17 days ago [-]
I'm not sure what counts as the "first 25 years", but Dining Philosophers is from 1965.
tinthedev 17 days ago [-]
The whole post, and both the Twitter messages sound like they refer to a very specific style of work.
These things "sound good but almost never work" when they're taken as an afterthought or committed to without due research/process/design.
Any architect worth their salt will avoid implementing hard solutions to these problems. These are mostly here to not solve crucial issues... but the article seems to be addressing a developer profile that adds APIs or asynchronous processing casually.
Sounds like a strawman, or just an exceedingly pessimistic look at the industry.
jmsdnns 17 days ago [-]
p2p cache sharing is an interesting one. it seemed to work for spotify for a long time, but they eventually found it easier to use edge caching. afaik they still use it for places with less developed infrastructure
this part of the abstract gets right to the point: 8.8% of music data played comes from Spotify's servers while the median playback latency is only 265 ms (including cached tracks)
dividuum 17 days ago [-]
Steam added a P2P caching mechanism recently and allows downloading games for other local machines that already have the game installed. Greatly helps with avoiding duplicate downloads on a steam deck, for example.
I also added a similar mechanism to my own product years ago. Works flawlessly. Using content based addressing already made it quite easy to implement.
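Why content-based addressing helps here can be shown in a short sketch: if a chunk's address is the hash of its bytes, a downloader can verify anything a peer hands over without trusting the peer. The `address` and `fetch` helpers and the dict-backed "peers" below are hypothetical stand-ins, not how Steam or any specific product does it.

```python
import hashlib

def address(blob: bytes) -> str:
    """A chunk's address is the SHA-256 digest of its content."""
    return hashlib.sha256(blob).hexdigest()

def fetch(addr, peers, origin):
    """Try local peers first; fall back to the origin server."""
    for peer in peers:
        blob = peer.get(addr)
        # Accept a peer's copy only if it hashes to the requested address.
        if blob is not None and address(blob) == addr:
            return blob, "peer"
    return origin[addr], "origin"

chunk = b"game asset v1"
addr = address(chunk)
origin = {addr: chunk}
bad_peer = {addr: b"tampered"}
good_peer = {addr: chunk}

print(fetch(addr, [bad_peer, good_peer], origin)[1])  # peer
print(fetch(addr, [bad_peer], origin)[1])             # origin
```

A corrupted or malicious copy fails the hash check and is skipped, so the worst a bad peer can do is waste some bandwidth.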
quotemstr 17 days ago [-]
The ideas in the first tweet screenshot: uh, they do sound good and they do work. Your system is full of control loops. (What do you think regulates how many dirty pages the kernel writes back to disk?)
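For a feel of what such a loop looks like, here is a deliberately tiny proportional controller that flushes "dirty pages" faster the further the count is above a target. It is a sketch with made-up numbers and names (`simulate_writeback`, `inflow`, `target`, `gain`), not the kernel's actual writeback algorithm.

```python
def simulate_writeback(ticks, inflow=50.0, target=200.0, gain=0.5):
    """Proportional feedback: flush rate grows with the excess over target."""
    dirty = 0.0
    history = []
    for _ in range(ticks):
        dirty += inflow                    # new dirty pages this tick
        error = dirty - target             # how far above the target we are
        flush = min(dirty, max(0.0, gain * error))
        dirty -= flush
        history.append(dirty)
    return history

history = simulate_writeback(100)
print(history[:4])            # [50.0, 100.0, 150.0, 200.0]
print(round(history[-1], 3))  # 250.0
```

The level settles at 250 rather than the target of 200. That steady-state offset is typical of proportional-only control and is one of the details that gets glossed over when someone says "let's just add a control loop".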
rogerthis 17 days ago [-]
Event sourcing.
PS: on mobile, can't expand (issues around versioning, data migration)
klysm 17 days ago [-]
Incredibly hard to make work in practice because of the semantic muddiness that results from having events about events etc. You have to think about the domain and its first derivative
bcrosby95 17 days ago [-]
Unless you're a large corporation: fighting Conway's Law.
hot_gril 17 days ago [-]
Why "unless" and not "especially when"? :)
mikewarot 17 days ago [-]
>Let's just add access controls later.
This (Computer Security) is a solved problem, kids. One of the lessons learned in the Viet Nam conflict was the need for a computer system that could safely handle multiple levels of classified data. The solution was Multilevel security, included in that was the Bell-LaPadula model. Several actually secure OSs emerged over the decades since, including KeyKOS, CapROS and Eros.
I'm hoping that someday I can make Genode, the latest capability based operating system, my daily driver, so I never have to worry about virus scanners again.
nostrademons 17 days ago [-]
Computer security is a solved problem for systems with a fully-specified set of capabilities, usage patterns, and legal operations. You follow the Principle of Least Privilege, build in defense in depth, use capabilities, and never expose anything that the user doesn't have a reason to access.
The problem is that the constraints this imposes on the system usually do not line up with the constraints that the market will pay for. It's very common for customers to change their mind; decide they need to hack around access protections; add new users with new roles that are some hybrid of current access; ask for new features; not think through who should have access to new features; want to enable serendipity where untrusted users discover new use-cases and new markets for their product; and so on. It's also very common for them to ignore security as a differentiator when making their purchase decision, figuring that if there's a breach, somebody else will pay for it, or they'll be long gone from the company and unable to be blamed for it. So the market ends up bypassing the secure solutions that exist and choosing to buy insecure systems that can offer the features they want right now.
When security is absolutely critical, like in military or certain financial applications, it's pretty easy to achieve. There are companies like Galois that specialize in "high assurance systems". But they are expensive for their feature set, and so the general public would rather buy from cheaper and more insecure options.
mikewarot 17 days ago [-]
Eventually we're going to have to collectively decide that the operating system is the correct place to enforce capabilities, as is done in mainframe OSs.
Memory holing any mention of this solution isn't productive in the long run.
nostrademons 17 days ago [-]
Yes, the operating system is the correct place to enforce capabilities.
The problem with this is that no mainstream OS does this correctly, which means that correctly doing security requires writing a new OS and getting all the userspace programs ported over to it (which is a non-trivial port, because the programming model for capabilities is pretty significantly different from mainstream OSes). It's very hard to convince users to ditch their entire computing ecosystem for a new one unless all of their devices get pwned and they can't access their computing ecosystem anyway.
mikewarot 17 days ago [-]
I'm convinced the way this will be done is to take a capabilities based OS, and tack on an emulation layer to allow Windows or Linux binaries to run, and only let them see the things that the user has decided they need to see, by emulating the dialog boxes to the app, and then transparently enforcing those choices. Thus a copy of a windows Text editor could run, and ONLY get access to the file the user chooses, without having to re-write anything.
The crux of the issue is command line programs... I'm not sure how to deal with those, but I suspect it'll be an outer job control language.
wpollock 17 days ago [-]
You're right about Bell-LaPadula, but note that model isn't used as much as the Biba model, the basis of Windows UAP. Even "solved" problems tend to evolve or get superseded over time; nothing seems to stay solved for long!
mikewarot 17 days ago [-]
Windows UAP is a horrible thing. It would be far better to just replace the system dialog boxes like file open, save, etc. with power boxes that then give file capabilities to applications.
It would be minimal work to refactor applications, and provide almost perfect security with no UX change.
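The shape of the power box idea can be sketched even in a language that cannot enforce it. In the toy below, the application code is only ever handed a read handle for the one file the user picked. `PowerBox`, the file names, and the in-memory "file system" are all hypothetical, and plain Python does nothing to stop an app from calling `open()` itself, so this shows the API shape only.

```python
import io

class PowerBox:
    """Stands in for a trusted system dialog that owns the file system."""

    def __init__(self, files):
        self._files = files

    def open_for_read(self, user_choice):
        # The app never sees a path or a directory listing, only a handle
        # to the single file the user selected in the dialog.
        return io.StringIO(self._files[user_choice])

def word_count(handle):
    """Application code: works on the handle it was given, nothing else."""
    return len(handle.read().split())

box = PowerBox({"notes.txt": "buy milk and eggs", "secrets.txt": "hunter2"})
print(word_count(box.open_for_read("notes.txt")))  # 4
```

From the user's side nothing changes: they still pick a file in a dialog. The difference is that the pick itself is the grant of access.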
jitl 14 days ago [-]
This is how macOS apps work by default these days. Each sandboxed app can R/W to a private "container" directory only, and to access anything else you present an "open" dialog that gives you back a special URL object with the requested capabilities for that resource. This is pretty fascinating: https://www.mothersruin.com/software/Archaeology/reverse/boo...
mdavid626 17 days ago [-]
Hmm? As the joke goes, the only secure computer is the one which is turned off. But I guess, that's not even true nowadays (taking security keys from cooled memory modules).
devjab 17 days ago [-]
Around here people abstract because they are taught to abstract in multiple different CS education programs. For whatever reason they are taught Clean Architecture even though it’s horrible. Then they spend the next 5-10 years learning not to abstract or write 5 line functions which then lead down a chain of 900 function calls. As an external examiner for some of these CS students it always pains me to examine them in a curriculum that I know will lead to dead-end jobs in stagnant small companies. It is what it is though. I guess it’s hard to compete with the entire industry of pseudo-science and shitty consultancy when nobody is pushing an “indirection manifesto”.
Hilariously Uncle Bob will write off any criticism as “they misunderstood the principles”. He’s correct too, but maybe the principles are simply too vague when we’ve had them for 20 years and our industry has never been more of a mess.
bedobi 17 days ago [-]
A lot of Robert C. Martin's pieces are just variations on his strong belief that ill-defined concepts like "craftsmanship" and "clean code" (which are basically just whatever his opinions are on any given day) are how to reduce defects and increase quality, not built-in safety and better tools, and if you think built-in safety and better tools are desirable, you're not a Real Programmer (tm).
I'm not the only one who is skeptical of this toxic, holier-than-thou and dangerous attitude.
Removing braces from if statements is a great example of another dangerous thing he advocates for no justifiable reason
The current state of software safety discussion resembles the state of medical safety discussion 2, 3 decades ago (yeah, software is really really behind time).
Back then, too, the thoughts on medical safety also were divided into 2 schools: the professionalism and the process oriented. The former school argues more or less what Uncle Bob argues: blame the damned and * who made the mistakes; be more careful, damn it.
But of course, that stupidity fell out of favor. After all, when mistakes kill, people are serious about it. After a while, serious people realize that blaming and clamoring for care backfires big time. That's when they applied, you know, science and statistics to safety.
So, tools are upgraded: better color coded medicine boxes, for example, or checklists in surgery. But it's more. They figured out what trainings and processes provide high impacts and do them rigorously. Nurses are taught (I am not kidding you) how to question doctors when weird things happen; identity verification (ever notice why nurses ask your birthday like a thousand times a day?) got extremely serious; etc.
My take: give it a few more years, and software, too, probably will follow the same path. We need more data, though.
linhns 17 days ago [-]
I don't know if he considers the expiration date of his ideas. Some of his ideas actually create worse software and crater projects.
devjab 17 days ago [-]
I’m not sure any of them do really. It’s been 22 years since TDD made its entry into our field and it’s still worse than the runtime assertions which helped put people on the moon. I know I was lashing out at uncle Bob before but it’s really all of them.
I do agree with these people that nobody has ever regretted writing a test. Well, I mean, someone probably has, but the idea of it is fairly solid. It’s just also useless, because it’s so vague. You can write a lot of tests and never be safe at runtime.
hitchstory 17 days ago [-]
There isn't much consensus on the right kind of test to write with TDD, but when you get it right or wrong it makes or breaks TDD.
Recently I've been writing mostly "end-to-end unit tests" - stateless, faking all external services (database, message queue, etc.) with TDD which works great.
There is a sweet spot on default test types - at as high a level as possible while being hermetic seems to be ideal.
The other un-talked about thing is that to be able to always write this kind of test you need test infrastructure which isn't cheap to build (all those fakes).
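A small sketch of what such a hermetic, high-level test might look like, with in-memory fakes standing in for the database and the message queue. Everything here (`FakeDatabase`, `FakeQueue`, `sign_up`, the topic name) is invented for illustration and not taken from any real project.

```python
class FakeDatabase:
    """In-memory stand-in for the real database."""
    def __init__(self):
        self.rows = {}

    def insert(self, key, row):
        if key in self.rows:
            raise KeyError(key)  # mimic a unique-key violation
        self.rows[key] = row

class FakeQueue:
    """In-memory stand-in for the real message queue."""
    def __init__(self):
        self.published = []

    def publish(self, topic, message):
        self.published.append((topic, message))

def sign_up(email, db, queue):
    """The feature under test, wired to whatever db/queue it is given."""
    email = email.strip().lower()
    db.insert(email, {"email": email, "verified": False})
    queue.publish("welcome-email", {"to": email})
    return email

def test_sign_up_stores_user_and_queues_welcome_mail():
    db, queue = FakeDatabase(), FakeQueue()
    sign_up("  Ada@Example.com ", db, queue)
    assert db.rows == {"ada@example.com": {"email": "ada@example.com", "verified": False}}
    assert queue.published == [("welcome-email", {"to": "ada@example.com"})]

test_sign_up_stores_user_and_queues_welcome_mail()
```

The test exercises the whole feature from its entry point but touches no network or disk. The cost is visible too: the fakes are code that has to be written and kept faithful to the real services.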
devjab 16 days ago [-]
My biggest problem with TDD is that it doesn't actually protect you from the things you miss. This is great for the consultants who sell TDD courses, because they get to tell you that you did it wrong. They'll be right, but that fact is also useless. I don't have a problem with testing though, as I said I think it's hard to regret doing it, but I'm not sure why we wouldn't push something like runtime assertions instead if we were to do the whole "best practice" thing. With runtime assertions you'll get both the executable documentation and the ability to implement things like triple modular redundancy, so basically everything TDD does but much better.
Yet many developers don't even know what a runtime assertion is while everyone knows what TDD is. I guess it doesn't really matter if you're working on something which can crash and then everyone will be like "oh it's just IT, it does that".
hitchstory 16 days ago [-]
It often protects me from things I miss. E.g. when I'm TDD'ing a feature and I'm implementing the 5th-6th test I often find myself inadvertently breaking one of the earlier tests.
I do think there is such a thing as overtesting - i.e. regretted tests. TDD actually protects you from this to an extent by tying each test / modification of a test to a change in code.
Runtime assertions definitely give you more bang for the buck (they are ridiculously cheap) but they are complementary to tests, not a replacement. It attacks the same problem from the bottom up instead of top down.
I also find that when you combine the two, the tests become more useful - previously passing tests will fail when they trigger those assertions.
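A minimal sketch of the combination, assuming a made-up `allocate` function: the assertions state what must hold on every call, and the test drives the code through them.

```python
def allocate(total, weights):
    """Split `total` into integer shares proportional to `weights`."""
    # Preconditions, checked on every call and not only in tests.
    assert total >= 0, "total must not be negative"
    assert weights and all(w > 0 for w in weights), "weights must be positive"

    shares = [total * w // sum(weights) for w in weights]
    shares[0] += total - sum(shares)  # hand the rounding remainder to the first share

    # Postcondition: nothing was lost or invented by rounding.
    assert sum(shares) == total, "shares must add up to total"
    return shares

def test_even_split():
    assert allocate(10, [1, 1, 1]) == [4, 3, 3]

test_even_split()

try:
    allocate(10, [])  # violates a precondition
except AssertionError as exc:
    print("caught:", exc)
```

If a later change breaks the remainder handling, `test_even_split` fails on the postcondition even though the test itself never mentions sums. One caveat: Python drops `assert` statements when run with `-O`, so checks that must survive in production need an explicit `raise`.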
debarshri 17 days ago [-]
I would like to add building a PaaS/ internal tools that abstract cloud for productivity sake. Bad idea.
Spiwux 17 days ago [-]
I've implemented several of these ideas, and they worked just fine.
ww520 17 days ago [-]
Haha. This is a great list. I have heard variants of these ideas over the years.
qianli_cs 17 days ago [-]
Or let’s just reinvent a new database
Aloha 17 days ago [-]
Or worse, let's reinvent the database, but not call it or think of it as a database.
Or even worse yet, let's reinvent the filesystem then host it on top of a filesystem.
hot_gril 17 days ago [-]
I've heard "it's not a database" multiple times, in reference to whatever custom DB our team inherited. Terrifying.
asveikau 17 days ago [-]
Just remember that this is the genius who brought Windows 8 and the Windows RT tablet that wouldn't run recompiled Win32 apps.
plagiarist 17 days ago [-]
Maybe some experience making large mistakes is how they came up with the list?
asveikau 17 days ago [-]
Judging by some of the list I don't think he's learned the right lessons.
retrocryptid 17 days ago [-]
I think each of the ideas listed: dsls, control loops, being clever, etc. They all have their place, and I wouldn't say they never work. I've had each of these work well in specific situations.
But the OP does have a point, they each can introduce more trouble than they're worth. Were I to write this post, I would have titled it something more like "Systems Ideas You Really Should Think About Long And Hard Before Doing."
But yeah, that might not be enough warning.
insane_dreamer 17 days ago [-]
"seldom work as expected" would have been more accurate than "almost never work"
I did like this:
> More importantly, an offering an API doesn’t mean anyone wants to use it. Almost every new API comes up because the co/product wants features, but it doesn’t want to prioritize them enough and the theory is the API will be “evangelized” to some partner in the space. Turns out those people are not sitting around waiting to fill in holes in your product.
It was a rule that worked well for us.
These sorts of rules are really about people feeling devalued or disliking being volunteered or told what to do (often by people they consider less knowledgeable). They aren’t really about effectively distributing work.
“Just” gets a bad rap. There’s a sort of hidden assumption here that “you can just” is equivalent to “you can easily”. It sometimes means that, but more generally it means something more like “…will be easiest”, which can be true even when the action itself is hard or a lot of work.
This happened in Sprint Planning. Obviously by the end of the meeting, everyone would have a full sprint. So volunteering for one thing required giving something else up. This was definitely part of the equation.
"Giving someone a chance" is a positive thing and won't make people careful.
People wind up with as much work as they think they can reasonably do. They would have done that whether or not they "just volunteered" themselves. You only wind up with too much if you didn't estimate your own work well.
As for being passive aggressive, the word "just" usually is used in a passive aggressive way. Making people careful about saying it was an improvement. And we all agreed that it was.
And no, they did not volunteer themselves. That is just manipulative language.
And both of these are passive aggression.
So no, not easy. Not unless you're right, and someone else was wrong.
Equivalent weasel words are easy to come by. "I don't see why it wouldn't work to ..." But now you're asking for an explanation, without dismissing the person who will be explaining it.
APIs need to be well-designed or the client may need to make multiple API calls when one should suffice. Or the API could be confusing and people will call it wrong or fail to use it at all.
APIs require authentication and authorization. This means setting up OAuth2 or at least having a secure API token generation, storage, and validation.
APIs need to handle data securely or you'll leak data you shouldn't, or allow modification you shouldn't.
APIs need to be performant or your database may be crushed under the load. This may involve caching, which then adds the complexity of cache invalidation and other servers/processes to support caching.
APIs need to be rate-limited or sloppy clients will hammer your API.
APIs require thorough documentation or they're useless. You may even have to add SDKs (libraries for different programming languages) to make it easier for people to use the API.
APIs need good error messages or users who are getting started will not know why their calls are failing.
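To give a feel for how much sits behind even one of those bullets, here is a minimal sketch of per-client rate limiting using a token bucket. This is my own illustration, not something from the thread: the class name and parameters are hypothetical, and it keeps state in one process only, which a real multi-server API could not get away with.

```python
import time


class TokenBucket:
    """Hypothetical in-memory token bucket for a single client."""

    def __init__(self, rate_per_sec: float, capacity: float, now=time.monotonic):
        self.rate = rate_per_sec      # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity        # start with a full bucket
        self._now = now               # injectable clock, handy for tests
        self._last = now()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True if the request may proceed, consuming `cost` tokens."""
        current = self._now()
        elapsed = current - self._last
        self._last = current
        # Refill for the time that passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Even this toy leaves open where the buckets live, how clients are identified, and what the caller is told when `allow()` returns False.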
Retrofitting an API is a context-specific problem with few things that generalize, but the above aren't the hard parts, just the complex ones.
I am going to point you to a Public Policy book here, look at page 32 and their concept of 'wicked problems', which applies to tech but doesn't suffer the problems with the consulting industry co-opting as much as in the tech world.
https://library.oapen.org/bitstream/id/f3358d25-56c0-47f2-90...
Obviously most companies won't accept the Amazon style API edicts that require Externalizable interfaces and force a product mindset and an outside in view.
Obviously an API is still a "Cognitively complex problem" as defined by the above.
But what makes most API projects fail is the fact that aspects arise that result in ‘wickedness’ A.K.A intractability.
API gateways, WAFs, RESTful libs with exponential backoff and jitter, etc. all exist and work well. In almost any system where you have a single DB you are concerned about killing, it is possible to just add an anti-corruption layer, provided you don't have too much code debt, complicated centralized orchestration, etc.
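For anyone who hasn't met it, client-side exponential backoff with jitter looks roughly like the sketch below. It assumes the "full jitter" variant (a random delay between zero and the capped exponential); the function name and defaults are hypothetical, and real libraries differ in which errors they retry.

```python
import random
import time


def call_with_retry(func, attempts=5, base=0.1, cap=10.0,
                    sleep=time.sleep, rng=random.random):
    """Call func(); on failure wait a jittered, growing delay and retry.

    Delay before retry n (0-based) is rng() * min(cap, base * 2**n).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception:  # a real client would catch narrower errors
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(rng() * min(cap, base * 2 ** attempt))
```

The jitter is what keeps a crowd of failing clients from retrying in lockstep against the DB you were worried about.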
But these general, but difficult tasks like adding an API almost never fail due to tech reasons, but due to politics, poor communication, focusing on tech and not users, unrealistic timelines, and turf battles.
You are correct that "APIs require thorough documentation or they're useless", but more importantly they need enforceable standards on contract ownership and communication.
The direction of control, stability, audience, and a dozen other factors affect the tradeoffs there as to what is appropriate.
Typically that control is dictated purely by politics and not focused on outcomes.
That is often what pushes these complex problems into failed initiatives, and obviously this is far more complicated. Often orgs can't deal with the very real uncertainty in these efforts and waste lots of project time on high effort, low value tasks like producing gantt charts that are so beyond the planning horizon that they can only result in bad outcomes at best.
The point is that unless you are on the frontier of knowledge and technical capabilities, it is almost never the tech that causes these efforts to fail.
I read it as an example of the complexity required to implement something like an API.
I agree that it’s also a people problem but I disagree that many of these failed initiatives aren’t due to bad tech as well. A complex initiative requires picking the right tech that gives you a good benefit/cost ratio. Knowledge of tech plays hand in hand with the political aspects and IMO many orgs lack both completely.
That choice should never be a one way door, you don't control your customers or the future, it is the frame problem.
The idea is to be able to iterate and adapt, not get things perfect from the start.
It is fine and expected that one will use past experience as a starting point, but you always have to reflect and check your assumptions along the way.
It is a complicated topic ruled by nuances and contexts, but if you cannot pivot away from your initial assumptions over time, that lets you know you are leaking implementation details.
While time is limited and abstractions have a real cost, you need to apply them where they make sense.
Some, like just breaking code into two files in the same directory, are fairly low cost; others are much more expensive.
For several years I was jumping in to save failed cloud migrations.
Whenever tech was the 'blocker' it was because of a myth that there was only one way to do things.
But let's say you are writing machine code for Apple silicon? Do you really think those registers are concrete? They aren't. They are a facade hiding 100s behind a legacy interface.
Vendor mitigation is important and often neglected, which also results in people producing balls of mud.
But there are a lot of distributed monoliths pretending to be micro services out there. And we had decades of people producing fragile enterprise service buses that were built because people wanted to future proof their systems.
The sizing of components and balancing integration and disintegration drivers is incredibly hard, you will never get it right.
You have to leave options open, no matter if that is through abstractions or keeping biz logic centralized to assist in an easy rewrite of context boundaries that arise from scale or changing needs.
Obviously choices have benefits and costs, but for most needs you can keep options open, if not it is probably best to reevaluate the reason for choosing a solution.
Obviously there are predatory vendors like Oracle that base their entire income on captive customers, but that relates to the vendor mitigation above.
I think the US federal government de-risking guide is a good overview of that topic. Microsoft is a strong driver for this BTW.
https://guides.18f.gov/derisking-government-tech/
> APIs need to be well-designed or the client may need to make multiple API calls when one should suffice. Or the API could be confusing and people will call it wrong or fail to use it at all.
That's fine. It's a first iteration; we can change it later if people complain. Let's get something out there and iterate.
> APIs require authentication and authorization.
Let's not worry about authorization/permissions. It's just one key or whatever account they use to log in with now.
> APIs need to handle data securely or you'll leak data you shouldn't, or allow modification you shouldn't.
You're saying you don't know how to do that?
> APIs need to be rate-limited or sloppy clients will hammer your API.
Either "Let's worry about that later when it happens" or "Here is the first github link from a google result for 'api rate limit open source free'. Let's use that"
> APIs require thorough documentation or they're useless.
Either "We need to have the API first; docs, clis, etc could come later after we have gauged the usage or had asks for them. We can handhold the customers asking for them for now." or "here is a github project that autogenerates docs, SDKs and clis from an OpenAPI spec"
> APIs need good error messages or users who are getting started will not know why their calls are failing.
You're saying you don't know how to do that?
"It doesn't have to be perfect. We have customers who are asking for it and to win them we need to implement something then we can work with them to improve it. Otherwise we will lose them"™
Most projects are far too fluid in their shape to warrant a proper design up-front anyway.
Of course, YMMV. Every company and manager is different.
Phased implementation isn’t the worst idea, they just have to be committed to ending the experiment or seeing it through.
I'm cognizant that if it truly is a make-or-break for the deal, there may not be any choice, but along with all the risks you cited is an underlying obsolescence one
If there's one thing APIs suffer from more, it's "social cost to make changes". A concurrent vN+1 largely resolves that though, unless your API consuming ecosystem is large enough to be worth investing vastly more resources into.
But I pretty much agree 100% about DSLs, it's an unnecessary/cute complication. The only people who should be allowed to make them should be ones who have made a successful programming language and updated it to deal with all their mistakes or a 2nd version/2nd language, and still got many things wrong.
Often I see developers creating new APIs ad-hoc all the time instead of curating and enhancing the one they already have.
https://steve-yegge.blogspot.com/2009/04/have-you-ever-legal...
It'd have been delightfully ironic had either of these Steves concluded their essays with a named methodology to "just" apply whenever faced with these "let's just" situations but alas...
(2) Elastic Load Balancer is a control loop responsive to workloads, that kind of thing is a commodity
(3) Under-provisioning is rampant in most industries; see https://erikbern.com/2018/03/27/waiting-time-load-factor-and... and https://www.amazon.com/Goal-Process-Ongoing-Improvement/dp/0...
(4) Anomaly detection is not inherently a problem of distributed systems like the others, but someone facing the problems they've been burned with might think they need it. Intellectually it's tough. The first algorithm I saw that felt halfway smart was https://scikit-learn.org/1.5/modules/outlier_detection.html#... which is sometimes a miracle and I had good luck using it on text with the CNN-based embeddings we had in 2018 but none at all w/ SBERT.
They were very similar. I even reused the code. One was writing rules to validate giant forms, the other was writing rules for decisions based on form responses.
Ok, just ranting on DSLs. Good DSLs take someone from can't to can. A DSL that's meant to save time is way less likely to be useful because it's very likely to not save you time.
In both of my DSLs, it's that we needed to get complex domain behavior into the program. So you either need to teach a programmer the domain, partner a programmer with a domain expert, or teach a domain expert how to program.
Putting the power in the hands of the domain expert is attractive when there's a lot of work to be done. It frees up programmers to do other things and tightens the feedback loop. If it's a deep domain, it's not like you want to send your programmer to school to learn how to do this. If it's shallow, you can probably have someone cheaper do it.
A DSL comes with a lot of cognitive overhead. If the other option is learning a full programming language, this becomes more reasonable.
A time saving DSL is where someone already knows how to write code, they just want to write less of it. This is generally not as good because the savings are marginal. Then when some programmer wants to change something, they have to learn/remember this whole DSL instead of more straightforward code.
Actually, this makes a simpler rule of thumb. A DSL for programmers is less likely to be a good idea than a DSL for non-programmers.
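For a flavor of what a "can't to can" DSL for form validation might look like, here is a deliberately tiny sketch where the domain expert edits a list of rules and a small interpreter applies them. The rule vocabulary (`required`, `between`), the field names, and `validate` are all hypothetical and invented for this example; it is not a reconstruction of either DSL described above.

```python
# Rules a non-programmer could plausibly maintain: (field, kind, *arguments).
RULES = [
    ("age", "required"),
    ("age", "between", 18, 120),
    ("email", "required"),
]


def validate(form: dict, rules: list) -> list:
    """Interpret the declarative rules against a form; return error messages."""
    errors = []
    for field, kind, *args in rules:
        value = form.get(field)
        if kind == "required":
            if value in (None, ""):
                errors.append(f"{field} is required")
        elif kind == "between":
            low, high = args
            if value is not None and not (low <= value <= high):
                errors.append(f"{field} must be between {low} and {high}")
        else:
            # Unknown vocabulary is rejected rather than silently ignored.
            raise ValueError(f"unknown rule kind: {kind}")
    return errors
```

The interpreter stays with the programmers, and the rule list is the part the domain expert owns. That only holds up while the rule vocabulary stays small.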
I cannot understand why it seems it was bad for him…
The second is fuzzier. It's putting a DSL over something complex and hoping this will fix things. Writing SQL queries for this system takes a bunch of time and is error prone? Just put a DSL over it! Except all those details and errors are probably going to leak right through your DSL.
You want to master the domain before you put a DSL over it.
What is the alternative to the DSL with lower cognitive load? I do not follow. Every single DSL I’ve seen REDUCES the cognitive load, by allowing one to express the concept in the plain language of the problem at hand, with which the SME should be more than familiar.
About the second point: I see many criticisms in this thread based on DSLs above SQL. Whatever somebody is doing above SQL and selling as a DSL, it is not one. Period. I cannot think of any possible way of doing a DSL above a query language. No doubt people hate the idea. It is a BAD one.
In the test example, writing it directly in the programming language. This will usually lead to code that is more verbose and repetitive, but understanding the first example will be faster.
I think of cognitive load like a line. X is the number of cases you’re working with, Y is cognitive load [0]. For someone who already knows a programming language, the DSL is going to have a higher Y intercept since you have to learn something new before you understand the first case. Hopefully, it’s a shallower slope so as you deal with more cases the upfront cost gets paid back. If you have lots of people dealing with one case or doing it infrequently enough they have to relearn each time, this payoff never happens.
This model extends past DSLs to all abstractions. It’s why people often end up happier with test code that’s less abstract/DRY. The access pattern supports it.
Looking at it this way also explains why a DSL for a non-programmer is more likely to be useful. The intercept can be lower than an actual programming language, so you’re ahead from the start.
[0] It’s really more of a curve, but the line model works conceptually.
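Taking the line model literally, the payback point is where the two lines cross. A small sketch with made-up units; the function and parameter names are hypothetical and the numbers in the usage note are purely illustrative.

```python
def break_even_cases(dsl_upfront, dsl_per_case, plain_upfront, plain_per_case):
    """Cases needed before the DSL line drops below the plain-code line.

    Each approach is modeled as load = upfront + per_case * cases.
    Returns None when the DSL has no shallower slope, so an upfront
    cost is never paid back under this model.
    """
    if dsl_per_case >= plain_per_case:
        return None
    crossing = (dsl_upfront - plain_upfront) / (plain_per_case - dsl_per_case)
    # A crossing at or below zero means the DSL is ahead from the start.
    return max(0.0, crossing)
```

With an upfront cost of 10 vs 2 and a per-case cost of 1 vs 3, `break_even_cases(10, 1, 2, 3)` gives 4 cases. If the DSL's intercept is lower to begin with (the non-programmer case), it returns 0.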
DSLs are supposed to be for making it easier to perform computation in a specific context. Software tests have about as many degrees of freedom as the programming language they are written in, so I’m not sure they are an ideal use case for a DSL— not without a lot of discipline at least.
For a DSL to make sense, IMHO, you need to be able to write down a complete and correct specification for it. I doubt that is even possible in the given examples :shrug:
I do like DSLs, but the value proposition is often difficult, IMO.
DSLs, in my experience, usually fail in the latter definition. It's very hard to make a small language that precisely captures its domain of application, produces easy-to-manage programs no matter the size, and is easy to analyze in terms of performance and side effects.
There are many poorly designed libraries, and DSL design is no easier. While I haven’t personally encountered any, I’m sure there are numerous half-baked DSLs out there.
For example, bash and SQL DSLs may be immediately useful by protecting against shell and SQL injection: a hypothetical shutil.run(sh"command {arg}") may translate to subprocess.run(["command", os.fspath(arg)])
No shell--no shell injection. The assumption is that it enables sh"a | b > {c}" syntax (otherwise just call subprocess.run directly). Implementing it in pure Python by hand would be more verbose, less readable, and more error-prone.
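A rough sketch of the translation step, under heavy assumptions: there is no `sh"..."` literal in Python, so this uses an ordinary template string plus keyword arguments, and the names `build_argv` and `run` are hypothetical. It only covers a single command; pipes and redirection (the part that would actually justify the DSL) are not handled.

```python
import os
import shlex
import subprocess


def build_argv(template: str, **values) -> list:
    """Split the trusted template into words, then substitute each
    {name} placeholder as exactly one argv item (it is never re-split)."""
    argv = []
    for word in shlex.split(template):
        if word.startswith("{") and word.endswith("}"):
            argv.append(os.fspath(values[word[1:-1]]))
        else:
            argv.append(word)
    return argv


def run(template: str, **values) -> subprocess.CompletedProcess:
    # No shell=True: the argument list goes to the OS without a shell parsing it.
    return subprocess.run(build_argv(template, **values), check=True)
```

So `build_argv("ls -l {path}", path="my file; rm -rf /")` yields `["ls", "-l", "my file; rm -rf /"]`, with the hostile value arriving as one inert argument.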
As I see it, a DSL is just the end-state of a programmer creating abstractions and reusable components to ultimately solve the real problem. The nouns and verbs granted by a programming interface constrain how one thinks, so a flexible and intuitive vocabulary and grammar can make the "real program" powerful and easy to maintain. Conversely, a rigid and irregular interface makes the "real program" a brittle maintenance nightmare.
Nail on the head time - somewhere else in the thread is jooq which is (yet another) SQL DSL where you end up with from(table).where(student=bob)
This is a perfect example of why the programmer should (just?) learn SQL instead of the DSL - and your comment nails it
(2) Circa 2005 coding PHP I came to the conclusion that the ORM I needed was
because writing simple inserts and updates against SQL is for the birds; let freshers do it and they will forget to escape something. Such a "framework" can be so simple that you can bend it to the needs of your application. JooQ gives you something like that but backed by the Java type system.
My comment is more aimed at the second part. SQL is tied to the implementation and demands coders understand it all. A DSL can allow domain experts to express their understanding without having to worry about software trade-offs.
The most successful “DSL” I know of like this is fitnesse tests - just a large number of simple tests where domain experts can spreadsheet style throw in the “gotchas”.
Something like that but more sophisticated is a holy grail. Behaviour-driven tests like cucumber come close, but there is that weird intermediate translation from English phrase to random function. Now you have to understand the function to use the phrase, and suddenly you are reading real code to be able to use the fake code, and it never feels clean.
One day I will be clever enough to be able to write a really good test DSL
It’s just that whenever I think of “Given user is logged in, visit ‘textbox’ and enter ‘word’” ... it just looks like a BDD test, not a DSL. Like I said, one day I will be clever enough
http://infolab.stanford.edu/~junyang/cs145/or-proc.html
which for whatever reasons never caught on in the open source world. (I'd blame limitations of current compiler technologies and the values of people who make compilers... If we had composable parsers you could just say "here's a spot for a SQL query in a Java method" in 10 lines of code) JooQ approaches that without requiring any change in the compiler. In the past it was awkward to embed SQL in Java because there were no multi-line strings. In Python you could write
do_query(" ... a really crazy complicated query with lots of joins and subqueries that is carefully indented to fit in with the rest of the program ... ",{"arg1": val1, "arg2": val2})
but without real map literals, multi-line strings and such this was terribly awkward. (If you think List.of(), Map.of() and such are cool I was writing a computer chess program last month that used List.of(A,B) to create a list that was used in an inner loop and it was terrifying how slow it was compared to using an ArrayList)
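For reference, the Python version of that pattern with the standard `sqlite3` module looks something like the sketch below. `do_query` is the name used above; the `student` table and its rows are made up for the example.

```python
import sqlite3


def do_query(conn, sql, params):
    """Run SQL with named parameters bound by the driver, not by string formatting."""
    return conn.execute(sql, params).fetchall()


# Hypothetical in-memory data so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("bob", 7), ("alice", 9), ("carol", 7)],
)

rows = do_query(
    conn,
    """
    SELECT name
      FROM student
     WHERE grade = :grade
       AND name <> :excluded
     ORDER BY name
    """,
    {"grade": 7, "excluded": "bob"},
)
```

The query stays copy-pasteable into a SQL console, and the values never get spliced into the string.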
You write some SQL queries, test them in datagrip or whatnot, then spend the next several hours figuring out how to convert them to the DSL. This problem is compounded when you use "exotic" SQL features like json expressions. Debugging is "print the generated sql, copy it into datagrip/whatnot, tune the query, then figure out how to retrofit that back into the DSL".
It's a huge waste of time.
The primary selling point of jOOQ is "type safe queries". That became irrelevant when IntelliJ started validating SQL in strings in your code against the real data. The workflow of editing SQL and testing it directly against the database is just better.
jOOQ reinforces the OP's point about DSLs.
In many scenarios (including JOOQ and all ORMs), X is SQL. I should know, I spent years working on a Java-based ORM. So believe me when I say: ORMs are terrible. To use SQL effectively, you have to understand how databases work at the physical level -- what's a B-tree lookup, what's a scan, how these combine, etc. etc. You can often rely on the optimizer to do a good job, but must also be able to figure out the physical picture when the optimizer (or DBA) got things wrong. You're using an ORM? To lift a phrase from another don't-do-this context: congratulations, you now have two problems. You now have to get the ORM to generate the SQL to do what really needs to be done.
And then there are the generalizations of the point made above: There are lots of tools that work with SQL. Lots of programmers who know SQL. Lots of careers that depend on SQL. Nobody gives a shit about your ORM just because it saves you the trouble of the easiest part of the data management problem.
I’m not suggesting that to use RDBMS you should know how to administrate and tune it (though it helps), but knowing their language, and understanding a single data structure (B+ trees) isn’t too much to ask, I think.
In some cases, but the more frequent issue I saw back in the day was the DBA making some really complex schema tuned for what they wanted, then an application trying to use the data in a pretty reasonable OOP manner (1 to many relationships, etc) and the DBA pissed they were using an ORM instead of their perfect SQL queries and procedures.
Tbh, I don't understand why this is seen as a bad thing. Correction: I know why it is (any changes are obviously going to be dramatically slowed down), but in the long run, I don't understand why people are against it. You wanted something done correctly, so you went to the SME for that specific field, and had them do it for you. Then you decided to throw it away?! Why are you bothering to ask them in the first place?
> 1 to many relationships, etc
I know this was just an example, but 1:M is a perfectly natural part of any RDBMS, and in no way requires an ORM to be done.
Usually this was a mismatch of mgmt or expectations. Hiring old school DBAs and letting them think they "own the data", while plopping them into a huge dev team changing the big SaaS features daily is a recipe for trouble.
I don't fault DBAs per se, though I did work with some who wouldn't look outside their blinders at all.
The mapping of database results to java objects with Hibernate is convenient. The basic "load entity, change a couple fields, let Hibernate persist it" flow is convenient. In a limited set of cases, basic entity graph navigation is convenient.
As I said, if you're working in an object-based language, by definition you need something that maps relations to objects. Hibernate is a competent choice. There are other competent choices, but JDBC is not one of them unless your app is trivial.
Anyways. Hibernate works on top of JDBC, so, if you like its interface, then it means you could make your own, but skipping >99% of the rest of Hibernate code that has nothing to do with wrapping the driver.
Or, imagine there was a library Hibernate', that threw away all the ORM stuff, and only offered mapping of SQL results to Java objects and sending queries to the database. Then, why not use Hibernate' instead of Hibernate?
NB. About triviality. From experience: trivial apps tend to work OK with ORM. Non-trivial will usually ditch the ORM because of performance, missing functionality and general difficulty with servicing it. So, it's the other way around: if you are shooting for the stars, you are probably not going to use Hibernate. Hibernate is one of the variety of tools that helps losers lose less; it's not a tool for the winners.
I think you've built up a strawman in your mind of what you think "ORM" is. Yes Hibernate is huge and has a lot of features that people shouldn't use. But you can say the same about Microsoft Word; the problem is that everyone uses a different 5% of the huge feature set.
People who work with these technologies on a daily basis don't screw up the core acronyms. I suggest softening your opinion and dropping the platitudes.
BTW. I'm absolutely on-board with you: nobody should use Microsoft Word. There's absolutely no reason to do that. It's a marketing ploy with a lot of grease money paid to people in charge of procurement in various places. It's absolutely not about 5% of features. It's just downright worst kind of text editor that's in popular use today. Ask me how I know this? I worked in a newspaper! Somehow, Microsoft never ventured into this field, and didn't sell their garbage there. And nobody uses Microsoft in book publishing or any other sort of publishing. Not for any % of its features. So much so that if you bring a manuscript (as an outside author) to publish a book or an article in a newspaper / magazine, and it will be in MS Word format, you'll be most likely asked to convert it to another format. And we are talking about people who need a lot of different features of text editing!
And, I really don't care about what you have to suggest. You aren't in a position to make suggestions really ;)
My go to metaphor has been "XYZ is an angry 800lb gorilla sitting between you and your work."
If you start with a network data model perspective and build that into your system, then it follows that you'll want a network data model to SQL mapper. That's what ORMs are, and the need for them comes from your approach, not from the tools you use.
There's a different approach - use OOP to build computational abstractions rather than model data. Use it to decompose the solution rather than model the problem. Have objects that talk to the database, exchange sets of facts between it and themselves, and process sets of facts. In the process, you can also start viewing data relationally - as n-ary relations over sets of values - as opposed to binary relationships between tables of records.
Information systems are not domain simulations, simulations compute the future state of the domain whereas information systems derive facts from known facts at the present time.
For a visual metaphor, car engineers don't use roadmaps as design diagrams and they don't model the problem domain in the systems they build. A car isn't built from streets, turns, road signs, traffic lights, etc. And despite that, cars function perfectly well in the problem domain. A car generally doesn't need to be refactored and reassembled when roads or rules change.
It's infinitely easier and less error-prone to keep the interface between the database and the application to the minimum (just convert the final results of a query to the application objects and embed complete queries in the application code) than to try and create complex query builders behind the scenes of object-to-object interaction.
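A minimal sketch of what that thin interface can look like, using Python's `sqlite3` and a dataclass. The schema (`customer`, `orders`) and the names `Order` and `orders_for` are hypothetical; the point is only that the full query is embedded in the code and the mapping is one line.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Order:
    id: int
    customer: str
    total: float


def orders_for(conn: sqlite3.Connection, customer: str) -> list:
    # The complete query lives next to the code that uses it.
    rows = conn.execute(
        """
        SELECT o.id, c.name, o.total
          FROM orders o
          JOIN customer c ON c.id = o.customer_id
         WHERE c.name = :customer
         ORDER BY o.id
        """,
        {"customer": customer},
    )
    # The only "mapping" step: one result row becomes one application object.
    return [Order(*row) for row in rows]
```

The one-to-many relationship is handled by the join itself; nothing here builds queries behind the scenes.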
If you want to make a good product, you may start with ORM, as it may, for a time, delay the need of understanding the relationship between the application and the database, and allow you to experiment faster at the expense of lost performance. Once you know what you need to do, ORM just no longer works: you will have to break it at least in order to deal with performance issues, but often you will also find yourself dealing with the fact that a lot of what you want to express in your queries is either too difficult or even impossible to express in a particular ORM.
The thing is, ORMs encourage bad schema design and get in the way of the SQL you want. I've seen entire projects ruined this way. I think the only valid reason for an ORM was before RDBMSes had json etc types. Maybe you had a table with very many cols that you just want to get/set, say a "user profile" table. This also contributed to the NoSQL fad. Nowadays you can throw that into one json col.
This is a good reason to use an ORM. But also, as an ORM designer, don’t let the ORM be flexible enough to do any SQL. Only let it do performant data access.
Examples of popular DSLs that I would characterize as bad if not outright failures:
* HCL (Terraform configuration language). It was obvious from the very beginning that very common problems haven't been addressed in the language, like provisioning a variable number of similar appliances. The attempts to add the functionality later were clumsy and didn't solve the problem fully.
* E4X (A JavaScript DSL for working with XML). In simple cases it allowed for more concise expression of operations on XML, but very quickly could become an impenetrable wall of punctuation. This is very similar to Microsoft's LINQ in that it gave no indication to the authors of how computationally complex the underlying code would be. Eventually, any code using this DSL would get rewritten in a less terse but easier to analyze way.
* XUL (Firefox's UI language for extending the browser's chrome). It worked OK if what you wanted to do was Firefox extensions, but Firefox also wanted to sell this as a technology for enterprises to base their in-house applications on Firefox, and it was very lacking in that domain. It would require a lot of trickery and roundabout ways of getting simple things done.
* Common Lisp's string formatting language (as well as many others in this domain). Similar to above: works OK for small problems, but doesn't scale. Some formatting problems require some very weird solutions, or don't really have a solution at all (I absolutely hate it when I see code that calls format recursively).
All in all. The most typical problem I see with this approach is that it's temporary and doesn't scale well. I.e. it will very soon run into the problems it doesn't have a good solution for. Large programs in DSL languages are often a nightmare to deal with.
Every time you write a React JSX expression, terraform file, config.yaml, etc., you're using a DSL.
I once wrote a JSON DSL in Ruby that I used for a template-based C# code generator. This enabled a .NET reporting web app to create arbitrarily shaped reports from arbitrary RDBMS tables, saving our team thousands of hours. Another team would upload report data to a SQL Server instance, write a JSON file in the DSL, check it against a tiny schema validator website, submit it, and their reports would soon be live. One of the most productive decisions I ever made.
Even in C using the "goes to operator" of while(i --> 0) or using special operator overloading like the C++ STL >> and << operators for concatenation is just making people memorize nonsense so someone writing can be clever.
People don't give presentations with riddles and limericks either. It can be clever as a puzzle but when things need to get done, it is just indulging someone showing off their cleverness at the expense of everyone who has to deal with it.
We are advocating exactly to keep the syntax the same as the base language, and add semantic value through the abstractions of the language.
If you're not changing any syntax and are just using normal function calls, that's an API and that's direct.
If you're not just using normal function calls and are making your own "semantic value through abstractions of the language" you aren't making something that is direct and are creating something that needs to be memorized.
The cleverness and indirection of the new stuff that hides what is really going on is 99% of the time not worth what it gives you, because you have to memorize this new clever thing someone came up with, then you have to learn what it is actually doing underneath that is being hidden.
No. Sorry. Wrong. Look at SICP, where they explain the concept of an embedded DSL. Hint: you may be conflating syntax and language.
If you look at the source code for Doom, it is very straightforward. No fancy stuff, no cleverness, no pageantry of someone else's idea of what "good programming" is, just what needs to happen to make the program.
I'll even give you an example of an exception. Most for loops in C and successors are more complicated than they need to be. Many loops are looping from 0 to a final index and they need a variable to keep track which index they are on. Instead of a verbose for loop, you can make a macro to always loop from 0 and always give you an index variable, so you just give it the length and what symbol to use. Then you have something simplified and that's useful. It's shorter, it's clear, it will save bugs and be easier to read when you need nested loops through arrays with multiple dimensions.
I already gave examples before where clever extra syntax creates an exceptional situation but gains nothing.
The fundamental point here is that these opportunities are rare. Thinking that making up new syntax is a goal of programming is doing a disservice to everyone who has to deal with it in the future.
There is also the possibility of embedding in a non-programming language, like XML (e.g. the launch language in ROS), or S-expressions in the Oracle listener config file. You can also do it ad hoc, as in the .msg files of ROS. But it is always about semantics, not syntax. Syntax is only the medium.
Sometimes it may indeed include extensions to a language, but in that case by the standard means of abstraction preferred in that language: classes, templates, functions, structures.
You keep saying that there are no problems and that it isn't like anything mentioned but you don't have any examples.
What is an example of "adding semantic value" that isn't using the languages normal constructs but is still not something someone needs to learn and memorize?
The whole idea of a DSL is exactly to avoid learning something new. Of course there will be some piece of information to be learned, but what are we comparing against? Is there a solution where somebody does not need to learn absolutely anything? Of course not! You have to learn something, to be able to use it, the question is how to minimize the cognitive load.
You are right that some examples would help. I have a couple that I recently worked on:
1) We had a very complex ASIC which had a complicated way of configuring it: there were RF parameters and also a program that runs in the ASIC, say “repeat 20 times {send, receive, analyze, phase-shift}”; of course the real thing is much more complicated. The ASIC manufacturer provides an API for doing everything, which involves setting registers, flags, internal state machines, etc. We have an expert who knows a lot about RF and the application but is weak in programming. We did it in Lisp, but I will try to explain it as if it were C: we made a bunch of functions, many of them very API-like setters and getters. But to program the sequence, we have functions that do flow control. In C it looks a little awkward; in Lisp it is much better. The example above would be: “repeat(20); send(); receive(); analyze(); phase_shift(); iterate();” The guy who writes that “code” does not care about the base language (he had never heard of Lisp before and only knew basic Python). But he was already writing those programs in pseudocode for documentation, so the cognitive load for him is minimal. He has to remember to add “();” at the end of each instruction, and that loops are “repeat(n) … iterate”. That’s it! That was much less than if he had to learn the whole API of the ASIC; he is not a programmer, he is an RF engineer. You may say it is an API, but look, there was already an API. It makes no sense to build an API over an API. It was all about translating the language of the API into the language of the problem at hand. The API tries to expose every detail of the hardware, in a language based on hardware and C; the DSL tries to hide details or translate things into the language of the problem. So the user of the DSL has to learn less.
2) There was an automated planner with lots of rules. Think about it as “1000 ifs, some nested”; originally, without a DSL, all was hardcoded in C++. We developed, based on libconfig (think JSON with C syntax), a little language to express the ifs. Note: no new syntax was invented; it is the underlying JSON/libconfig, which are well-known syntaxes. We only made a big “foreach” over all elements in the config file, passing each one through a big “case” that dispatches the substructure to the handling function for each instruction. It took 1 day to implement. After that, the intelligence was in separate files, it could be reloaded dynamically, and the people writing the intelligence did not need to be C experts.
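A rough sketch of what that "foreach plus case" dispatcher might look like, in Python rather than C++ and with plain JSON standing in for libconfig. The rule fields (`op`, `field`, `value`, `then`) and the handler names are invented for illustration; nothing here is taken from the actual planner.

```python
import json

# Hypothetical rule file. The keys "op", "field", "value" and "then" are
# assumptions made for this sketch, not the planner's real schema.
RULES_TEXT = """
[
  {"op": "gt", "field": "battery",  "value": 20,       "then": "continue_mission"},
  {"op": "lt", "field": "distance", "value": 5,        "then": "slow_down"},
  {"op": "eq", "field": "mode",     "value": "manual", "then": "hand_over"}
]
"""

def handle_gt(rule, state):
    return state[rule["field"]] > rule["value"]

def handle_lt(rule, state):
    return state[rule["field"]] < rule["value"]

def handle_eq(rule, state):
    return state[rule["field"]] == rule["value"]

# The big "case": one handling function per instruction.
HANDLERS = {"gt": handle_gt, "lt": handle_lt, "eq": handle_eq}

def evaluate(rules, state):
    """Return the actions of every rule whose condition holds for `state`."""
    actions = []
    for rule in rules:  # the big "foreach" over the config file
        handler = HANDLERS.get(rule["op"])
        if handler is None:
            raise ValueError(f"unknown instruction: {rule['op']!r}")
        if handler(rule, state):
            actions.append(rule["then"])
    return actions

rules = json.loads(RULES_TEXT)
state = {"battery": 15, "distance": 3, "mode": "manual"}
print(evaluate(rules, state))  # ['slow_down', 'hand_over']
```

Because the rules live in a separate text file, they can be re-read and re-evaluated without recompiling anything, which is the property described above.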
If it's the same language it can't be a new language. You didn't link anything with your sources.
> The whole idea of a DSL is exactly to avoid learning something new
But you have to learn the DSL and you have to throw away all your tools. These are two big problems they introduce, so the problem they solve had better be big, and tools/debugging need to be part of making the DSL. This is why a small DSL is not a good idea.
> We had a very complex ASIC which had a complicated way of configuring it: there were RF parameters
This is another side of the story. Passing parameters is data. Inside a program this is a very bad idea because you can already pass around all the data you want any way you want through function calls and memory layouts.
Passing data from one program to another or one computer to another is different, but then that isn't a language, that's a data format like any other file. GCode is a list of 'commands', but fundamentally it is a data format. If you look at the .obj format, it is ascii and needs to be parsed, but not thought of as a language.
> Think about it as “1000 ifs, some nested”, originally without DSL, all was hardcoded in C++. We developed based on libconfig (think JSON with C syntax) a little language to express the ifs
This sounds like a data format. If something isn't being executed directly, it's data. If it is being executed directly, don't make a new language, because it takes a decade and hundreds of people to get it to work well.
> If it's the same language it can't be a new language. You didn't link anything with your sources.
A language is more than the syntax. For example, Common Lisp, Emacs Lisp, Racket and Scheme are different languages with the exact same syntax. Java and C have very similar syntax, but are 2 languages. Source: SICP https://web.mit.edu/6.001/6.037/sicp.pdf or the videos on YouTube.
A DSL does not need to have a new syntax. Source: the Wikipedia article, under embedded DSL.
If your DSL follows existing syntax, you can use the tools. Note my example with JSON.
>> Passing parameters is data. (…) Passing data from one program to another or one computer to another is different, but then that isn't a language
Well, actually it is. And data and code cannot be told apart. I can only recommend going through the SICP lectures on YouTube. Your example with GCode is good: code is data, data is code. Also, about the example, consider that it is, as said, a great simplification; there are lots of details and constraints that I cannot possibly enumerate here. Also note that one way of passing data between 2 computers is via RPC, which is a language (procedures and functions are called remotely, executing code on the remote computer, which works with the data); that was actually the case in the example.
> This sounds like a data format. If something isn't being executed directly, it's data. If it is being executed directly, don't make a new language, because it takes a decade and hundreds of people to get it to work well.
A C program is also a data format. Everything is a data format. In the end, in the compiler or interpreter, the program is an AST, ALWAYS! And an AST is just a data structure!
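To make the "an AST is just a data structure" point concrete, here is a small sketch using Python's standard `ast` module (Python is chosen only for convenience; the claim above was about C). The source string and variable names are made up for the illustration.

```python
import ast

source = "total = price * (1 + tax)"
tree = ast.parse(source)

# The tree is an ordinary data structure: we can walk it and collect
# every identifier without executing anything.
names = sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})
print(names)  # ['price', 'tax', 'total']

# The same data can then be compiled and executed.
namespace = {"price": 100, "tax": 0.5}
exec(compile(tree, filename="<ast>", mode="exec"), namespace)
print(namespace["total"])  # 150.0
```

Whether that makes every data format a language is exactly what the thread disagrees about; the sketch only shows that a program can be inspected as data before it is run.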
Far from it. On the s-expression level there are already differences. On the actual language level, Common Lisp for example provides function definitions with named arguments, declarations, documentation strings, etc.
For example the syntax for function parameter definition in CL is:
Above is a syntax definition in an EBNF variant used by Common Lisp to describe the syntax of valid forms in the language. There are different operator types, and built-in operators and macro operators have especially extensive and sometimes complex syntax. See for example the extensive syntax of the LOOP operator in Common Lisp. 1-lisp or 2-lisp is also a difference, though all support lexical closures and function objects.
Racket now has a variant without s-expressions. That's also a huge difference.
> Well actually it is.
This is conflating the term 'language' to mean whatever you want at the moment. There are things that execute and things that don't. These two should be kept as separate as possible, but this is a lesson that people usually need to learn for themselves after being burned many times by complexity that doesn't need to be there.
> And data and code cannot be told apart. I can only recommend going through the SICP lectures on YouTube
> A C program is also a data
You aren't the first person to be mesmerized by SICP, but if someone gets involved in thinking something is a silver bullet, they will tend to try to find information that validates this belief and reject info that doesn't. This pattern is found elsewhere in life too.
To understand some context, early in the life of LISP and Scheme, there weren't as many scripting languages and people mostly hadn't had a lot of experience with being able to eval tiny programs in their programs. These days that might be used to enable people to write small expressions in a GUI instead of a constant parameter. Many times in programming history people see something new and think it will solve all their problems.
Java went through the same thing. For a long time people thought deep inheritance hierarchies would save them, until gradually people realized how ridiculous and complicated it made things that could be simple. Inheritance from a base object let people use general data structures, and garbage collection + batteries included seemed great, but programmers conflated everything together and thought this terrible aspect of programming was a step forward.
Lisp was very influential, since people didn't have scripting languages back then, but it isn't a modern way to program.
Data formats are a separate issue and mixing in execution to those is a bad idea too, because the problem they solve is getting data into a program. When you put in execution you no longer know what you're looking at. Instead of being able to see or read directly the data you want, now you need to execute something to see what the values actually are. When you need to execute something you have all sorts of complexity including the need to debug and iterate just to see what was once directly visible.
I gave you 2 examples, one in Lisp, one based on JSON. I said no new syntax, but indeed you have to learn something: if it is a DSL, it is a new language, it is in the very name. As long as you make something new, it has to be learned. The point is, if the new thing looks very close to the problem domain, an expert in that domain will have no problem learning it faster than anything else. Again, what are the alternatives?
I do think data and code need not be separated strictly. I do not like the OOP hype, for the reasons you mentioned about Java. BUT: the idea of putting together data and the code in an object I find good in general.
> You aren't the first person to be mesmerized by SICP, but if someone gets involved in thinking something is a silver bullet, they will tend to try to find information that validates this belief and reject info that doesn't. This pattern is found elsewhere in life too.
I do think SICP is great, and it was a before and after for me. But I did not find any silver bullet there, quite the opposite: I learned many good ideas, DSLs among them, but I use them only when they make sense.
> Java went through the same thing.
My take on Java (a little off topic): like many other popular languages, it started as a bunch of very good ideas and fell victim to its own popularity. It was overhyped as the solution for everything and got bloated, and many subpar programmers started writing tons of it, until the whole ecosystem was totally ruined. Something similar happened with BASIC and VB, and is happening with Python to a certain degree.
> because the problem they solve is getting data into a program. When you put in execution you no longer know what you're looking at. Instead of being able to see or read directly the data you want, now you need to execute something to see what the values actually are. When you need to execute something you have all sorts of complexity including the need to debug and iterate just to see what was once directly visible.
It sounds to me like you got burned by a shitty mixing of code and data, that made your life hard.
> This is conflating the term 'language' to mean whatever you want at the moment. There are things that execute and things that don't.
A language does not have to be executable. There are query, configuration, and markup languages. A DSL need not be a new scripting language, or even executable. It can be for configuration. And note that I’m not stretching the definition by any means: TeX and Markdown are languages, it says so all over the documentation. SQL is a language too. Maybe we have different definitions of language, and that is where all the confusion comes from? Again, I’m 100% sure that if we met we would be on the same page on 95% of the topics! :)
Yes, people use the term language for different things, it doesn't mean they are the same.
Also what you called a language in your first example everyone else would call an API. What you called a language in your second example is just a config file.
It seems that the reality of what you're saying is that you are using 'lots of little languages' because you are calling lots of things languages that no one else does.
If you're calling them in a fancy way with overloads and whatnot, it's not a DSL, it's fancy functions.
DSL is domain specific language. It includes domain specific syntax, domain specific semantics and domain specific libraries.
It is not about fancy functions, and not about new syntax. It is about adding semantic value. If somebody adds a collection of functions that allows solutions to a problem to be expressed in the very language of the problem, that is a DSL. If the syntax chosen, for whatever reason (e.g. simplicity), happens to be the same as that of some underlying language, that takes nothing away from the fact that it is a DSL.
If you look at the examples of SICP, they are “just” fancy functions. But they are DSLs.
An extract of the wikipedia article:
An embedded domain-specific language (eDSL), also known as an internal domain-specific language, is a DSL that is implemented as a library in a "host" programming language. The embedded domain-specific language leverages the syntax, semantics and runtime environment (sequencing, conditionals, iteration, functions, etc.) and adds domain-specific primitives that allow programmers to use the "host" programming language to create programs that generate code in the "target" programming language.
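A minimal sketch of an embedded DSL in that sense, modelled loosely on the "repeat(20); send(); receive(); analyze(); phase_shift(); iterate();" pseudocode from earlier in the thread. The `Sequence` class and its method names are hypothetical, and the "target" here is just a flat list of step names rather than real hardware calls.

```python
class Sequence:
    """Records domain-level steps; the host language supplies all the syntax."""

    def __init__(self):
        self._blocks = [[]]   # innermost open block is last
        self._counts = []     # repeat counts of the currently open loops

    def _emit(self, step):
        self._blocks[-1].append(step)
        return self           # returning self allows call chaining

    def send(self):
        return self._emit("send")

    def receive(self):
        return self._emit("receive")

    def analyze(self):
        return self._emit("analyze")

    def phase_shift(self):
        return self._emit("phase_shift")

    def repeat(self, n):
        """Open a loop whose body runs n times."""
        self._counts.append(n)
        self._blocks.append([])
        return self

    def iterate(self):
        """Close the innermost loop and unroll it into the enclosing block."""
        if not self._counts:
            raise ValueError("iterate() without matching repeat()")
        body = self._blocks.pop()
        self._blocks[-1].extend(body * self._counts.pop())
        return self

    def program(self):
        """Return the flat list of steps the sequence expands to."""
        if self._counts:
            raise ValueError("repeat() without matching iterate()")
        return list(self._blocks[0])


seq = Sequence().repeat(2).send().receive().analyze().phase_shift().iterate()
print(seq.program())
```

Nothing about the syntax is new: it is ordinary method calls. What is new is the vocabulary, which is the distinction the two sides of this thread keep arguing over.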
I'll take a stab at fleshing this out: DSLs work great when they have an IDE with autocomplete and a quick (or instant) feedback loop.
DSLs that solve a specific problem with a page or two of documentation overhead are great.
Trying to reinvent paradigms or scope creep is where the pain comes in. Seems like the post author has been burned by that type of DSL.
Do you have any examples? I’ve heard lots of good things about DSLs, but never had the luck to witness their full glory.
(except for regex, which I love, but it has more than two pages of docs)
I'd consider Python's f-string syntax a DSL of sorts.
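For anyone who hasn't looked closely, the part after the colon in an f-string replacement field is a small format-specification language of its own. A few examples, with made-up values:

```python
name = "total"
value = 1234.5678

aligned = f"{name:>10}|"   # right-align in a 10-character field
money = f"{value:,.2f}"    # thousands separator, two decimals
hexed = f"{255:#06x}"      # hex with 0x prefix, zero-padded to width 6
percent = f"{0.256:.1%}"   # scale by 100, one decimal, append %

for line in (aligned, money, hexed, percent):
    print(line)
```

It fits the "page or two of documentation" criterion reasonably well: alignment, width, sign, grouping, precision and type cover most of it.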
YAML might be considered a simple DSL, if you don't consider it a language/format instead. It's a bit more than 2-3 pages, but it's not hundreds of pages. And a simplified version could be constructed with <10 pages.
Similar to YAML, but for Markdown. I'd call that a DSL too, and it's even simpler than YAML.
Then, something more tiered as: CSV, JSON, TOML, INI, AsciiDoc
Once you're in the short form, it's a bit blurry what's a format, what's a DSL, and what is a language.
PS. Sorry for the late answer, I missed the direct question for a bit.
And yet Terraform/Tofu continues to poison people's brains. It boggles(!) the mind
(5) Hybrid parallelism - also, many people think it's a bad idea because it makes your software system more complex. Again, it may be very useful sometimes, but it's not like many people would go "yes, that's just what I'm missing right now, let's do parallelism with different hardware and different parts of the workflow and everything will run something kind of different and it'll all work great like a symphony of different instruments".
You don't get the luxury of offline migrations or single-master writes in the high-volume payments space. You simply don't have the option. The money must continually flow.
This is just pessimism and weary cynicism. I get it, I’ve felt that way too, and sometimes it’s hard to talk an eager engineer out of a bad idea. But for me, this vibe is toxic.
Underneath all of those ideas is a tangle of complexity that almost everyone underestimates.
The brittleness is what gets me. In physical mechanical and even analog electrical systems there are tolerances. Things can almost-work and still work for varying degrees of work. Software on the other hand is unbelievably brittle to the point that after 50 years of software engineering we still really can't re-use or properly modularize code. We are still stuck in "throw it away and do it over" and constantly reinventing wheels because there is no way to make wheels fit. The nature of digital means there are no tolerances. The concept isn't even valid. Things fit 100% or 0%.
We keep inventing languages. They don't help. We keep inventing frameworks. They don't help. We keep trying "design patterns" and "methodologies." They don't help. If anything all the stuff we invent makes the problem worse. Now we have an ecosystem with 20 different languages, 50 different runtimes, and 30 variants of 5 OSes. Complexity goes up, costs go up, reusability never happens, etc.
I remember for a while seeing things like the JVM and CLR (the VM for C# and friends) as a way out-- get away from the brittle C API and fully compiled static native code and into a runtime environment that allowed really solid error handling and introspection. But that paradigm never caught on for whatever reason, probably because it wasn't free enough. WASM is maybe promising.
Most of the 'inventions' you describe are more aimed toward reducing the barriers of entry: the promise that your team of expensive C wizards would now use Java at greater speed with less defects became "now we can just use cheap CS grads" at slightly worse but still acceptable levels.
Without any real consequences for poor software (see CrowdStrike's YTD despite its multi-billion dollar farce in July) it's only logical that the standard will always be "bare minimum that can be shipped". Developer productivity is a misnomer really - it just means company profits increase thanks to a widening pool to hire from and even more crapware per dollar can now be squeezed from each worker.
But I think you have the wrong take on “reusability”. Every non-software engineering project is an exercise in custom solutions as well. The reusable parts are the tools and materials. Likewise in software engineering the languages, OSes, protocols, libraries, design patterns, and frameworks are the reusable bits. Code is how we describe how it all fits together, but a huge amount of what it takes to run a system is being constantly reused, much bigger than the code we write to implement it.
Of course modern safe languages like Rust give you some of those benefits in compiled code too.
I think you are misrepresenting how flexible software is versus hardware. Mechanical and electrical systems have tolerances but if you go outside those tolerances, the whole system can be destroyed. Nothing like that is common in software. Worst-case outcomes might be like "the performance isn't as good as we want" or "this code is difficult to work with." Software components are very flexible compared to anything physical, even in the worst cases.
>We are still stuck in "throw it away and do it over" and constantly reinventing wheels because there is no way to make wheels fit. The nature of digital means there are no tolerances. The concept isn't even valid. Things fit 100% or 0%.
I don't know how one can look at the amazing array of libraries out there and conclude that we have no reuse. Sometimes people build their own solutions because they need something very simple and the libraries are too big to be worth importing and learning in those circumstances. That's not a flaw in the libraries. It's human nature.
>We keep inventing languages. They don't help. We keep inventing frameworks. They don't help. We keep trying "design patterns" and "methodologies." They don't help. If anything all the stuff we invent makes the problem worse. Now we have an ecosystem with 20 different languages, 50 different runtimes, and 30 variants of 5 OSes. Complexity goes up, costs go up, reusability never happens, etc.
All of this is too pessimistic. These tools do help. Exactly how many languages do you think we should have? Do you think exactly one group is going to develop for each use case and satisfy everyone?
>WASM has its uses but I can't escape the idea that it's like "let's build a VM and carry all the shortcomings of C into it."
I'm not a web guy but this sounds silly. It's not meant to be written directly. Complaining about shortcomings of WASM is literally like complaining about shortcomings of assembly language. It's not intended for human consumption, in modern times.
That slant-rhymes with "sound good but almost never work" but in detail is completely different. When treated as difficult problems, and committed to accordingly, having them work and work well is a normal result, eminently achievable.
As afterthoughts, or when naïvely thought to be easy, then yeah, they frequently go poorly.
I don't read it that way. I read it as engineers engaging in pre-optimization for no business benefit. It's utterly rampant in the industry because it's fun to design and build a redundant auto-scaling spaceship vs. just over-provisioning your server by 200% for a tenth (or less!) of the cost and having backups ready to deploy in a few hours.
Sometimes these ideas make sense - after you need them. Not designed-in at the early product stage. Very few products go on to need the scale, availability, or complexity most of these implementations try to solve.
If that is not the message, what is? “These things are hard, often don’t work, but GO FOR IT”?
I pretty much read it as “try to avoid”, which is bad advice in my opinion. Like “documenting SW properly while doing development is hard, and often goes wrong”. So what?!
It's "work smarter not harder" for knowledge workers who don't realize that using your brain more is the hard work in the saying.
The value of that approach is very situational…though I will acknowledge that the majority of places probably warrant at least some of that.
However, a highly abstract design with all the business logic in config, workflows, etc. will only make your system extremely flexible as long as everyone up and down the organization is fairly aware of the abstractions, the config, and the uncountable permutations they can take for your business logic to emerge.
Those permutations quickly explode into a labyrinth of unknown/unexpected behaviors that people will rely on. It also makes the cost of onboarding new developers or changing the development team insurmountable. Your organization will be speaking 2 different languages. Most seemingly straightforward "feature asks" that break your abstraction either become a massive system re-design/re-architect or a "let's just hack this abstraction so it's a safer smaller change for now". The former will always be really hard unless you have excellent engineers who fully understand the entire system, its behavior and its code base, along with excellent engineering practices and processes, and it will still take you months or years to pull off. The latter is more likely to happen, and it's why all those "highly abstract, functionality layers, config driven, business logic emerging" projects start perfect and flexible and end up as a "what the fuck is even this".
After a system is implemented, that emergent business logic becomes the language everyone will speak in. Having your organization speaking 2 or 3 completely irreconcilable languages is very painful, and unless you have multiple folks up, down and sideways in the organization who can fluently translate between them, you'll be in a world of pain and wish you had some closer representation of your domain.
I think what that quote is against is the common middle ground where states are expected to be 'impossible' and thus not handled, and then cause bugs when they turn out to be possible after all.
Either deciding that they are possible or deciding that they must be impossible is usually better, and which one to go with depends on the specifics.
My general take is that you want relatively few control loops, in positions of high leverage.
That's also why it's possible to support all of these attributes. If you make say transparent data synchronization a core value prop of the platform, then all future development supports that first, and you evolve your feature set based on what's possible with that constraint. That feature set might not be exactly what your users want, but it's what you support. Your product appeals to the customers for whom that is their #1 purchase decision.
Queues and event processing are a necessity to do it right.
It's like git solving the problem without commits, or banking solving the problem without transactions.
* postgresql as the system of record
* firestore as the upstream source for clients
* ES for full-text search
* client-side store for actual client-side reads
* http api for mutations
That required three sync systems: pg->firestore, pg->ES, and firestore->local store. Then it needed messages for the async mutations to propagate back to the clients. And then these things require more things to make them work, like data transformers to support the three different formats for each stage.
This certainly did not require some giant CQRS system and could have been built entirely on postgres. It was a fractal of code that didn't go toward the actual objective.
My uninformed guess is that CS people just underestimate the skills and experience needed to analyze feedback systems, and so write it off as a bad technology after a poor implementation.
Maybe the real problem is you want to know when your system is maxing out the range you anticipated?
One problem that constantly comes up is ratcheting or poisoning your own inputs.
Let's say you want to block "noisy neighbors" from taking large amounts of resources, while allowing all loads to have bursty use of the full system power. Easy, right? Detect the noisy neighbors and throttle them to a small percentage. Unthrottle when they show some substantial idle time. But now, many of those noisy neighbors can't get out of jail because they will, of course, use nearly 100% of their restricted load, even if they would now be well-behaved and merely bursty.
There is also a cascading effect, where you have N related bursty loads. One bursts for too long, gets throttled, and now the load is handled by the remaining N-1 loads. But that makes those loads more likely to get throttled, and so on. Only unthrottling all the loads simultaneously will allow them to return to normal bursty operation.
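The first failure mode (the throttled tenant that can never leave jail) is easy to reproduce in a toy model. The sketch below is only an illustration: every number and parameter name is an assumption, and the `judge_by_demand` variant is one possible alternative release rule, not something proposed above.

```python
def simulate(demands, full=100, jail=10, hot=80, idle_frac=0.5, judge_by_demand=False):
    """Return, per tick, whether the tenant ends that tick throttled.

    All parameters are hypothetical: `full` is the unthrottled capacity,
    `jail` the throttled cap, `hot` the usage that triggers throttling.
    """
    throttled = False
    history = []
    for demand in demands:
        limit = jail if throttled else full
        used = min(demand, limit)          # usage can never exceed the current cap
        if not throttled:
            if used >= hot:
                throttled = True           # noisy: send to jail
        elif judge_by_demand:
            if demand < idle_frac * full:  # compare what it wanted against full capacity
                throttled = False
        elif used < idle_frac * limit:     # naive: "idle" measured against the jail cap
            throttled = False
        history.append(throttled)
    return history


demands = [90, 30, 30, 30, 30]             # one burst, then well-behaved load
print(simulate(demands))                   # [True, True, True, True, True]
print(simulate(demands, judge_by_demand=True))  # [True, False, False, False, False]
```

With the naive rule, a demand of 30 always saturates the cap of 10, so the tenant never looks idle and never gets released, even though 30 out of 100 would be perfectly acceptable. The cascading case with N related loads would need a model where throttled work spills over to the other tenants, which this sketch leaves out.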
See articles like [1] or any of Marc Brooker's blog posts [2] for inspiration
[1] https://medium.com/yandex/good-retry-bad-retry-an-incident-s...
[2] https://brooker.co.za/blog/2022/04/11/simulation.html
That’s what I said.
It's just that they don't "just" work.
Just because AWS or Google can pull it off doesn't mean it's something anyone can do.
"Let's just make planes fly themselves"
"Let's just put a giant battery in a car instead of an engine"
"Let's just make electricity from wind"
"Let's just be friends"
I would love to see a list like this without such a big asterisk.
We do what we do. Not because it is easy; but because we thought it would be easy.
So if making it cross platform "won't work", this is not about what won't work, but what is cheaper?
Since the OP kinda mentions gaming, let the customers fiddle with their wine installs and steam decks and spend the time on adding more loot boxes instead?
I think they absolutely could have made Office cross platform (it doesn't really do anything you can't do in Qt), and also the fact that they forked it and made two entirely separate sets of apps has pretty serious consequences. The feature sets are surprisingly different. E.g. you can add PDFs to documents on Mac; not on Windows.
Most software can be made cross platform (as long as you don't explicitly prevent it by using a platform specific GUI toolkit or whatever).
Making a new language throws away everything and starts over, including debugging, tools, syntax checking, auto completion, documentation etc. It is taken way too lightly and just becomes a hassle.
You can see this in multiple GUI libraries, where getting the parameters to set up a GUI is really not difficult and is just data through function calls, but it gets made into a separate XML-like DSL markup language with its own quirks and opacity, and for most people that XML is supplied as a big string from within the language they're using.
This stuff persists because it sounds easier on paper, but it just creates more problems in practice, and it takes experience to realize all that you're losing. That's where designers need to be experienced and do what works instead of what will suck in people who don't know any better.
My (recent) experience:
> Let's make that asynchronous
It can be done, but here, there be dragonnes.
I just had to make a formerly synchronous load in a recently released app, into an async one, because users with large connection lists (think "friends," in Facebook, but not as "friendly"), were having extremely slow loads.
This was a big change.
First, I had to swap out an entire SDK that accessed the most important server in the app, because the old one didn't play well, with threads (my bad). That actually went fairly smoothly, because of the abstraction (boo hiss, I guess?) that I had used for the SDK. Took about a day, to have the operation running smoothly.
Testing...Testing...Testing...
Next, I had to test like crazy, for weeks, on the user-level code, because the new threading brought the beast out in that code. I found all kinds of places, where I had written thread-unstudly code. None of the issues were serious, and many folks would have said "Fuck it. Let's ship," but I'm a bit anal about certain things, and I'm not being paid, anyway...
In the end, the conversion was a success (not one complaint —fingers crossed), and we got the results we needed, but it took a hell of a lot of testing (especially monkey testing), and Release Day was a nervous one. The UI basically didn't change at all (except for some loading throbbers on the profile avatars), but under the hood, a lot had changed.
Sometimes, it needs to be done, but it's tempting to make it seem easy (which many folks will). I am a rather scarred veteran of "That should be easy," so I went in, with eyes open, and (as noted) had already prepared, with a certain level of abstraction. I figured that the SDK swap would be [relatively] easy, but didn’t bargain for all the little bugs, in the code I thought was already sorted.
A good example is the DSL one. In reality every significantly complex software system is essentially a DSL: it will have its own collection of nouns and verbs that constitute the application domain. This is a language, and rarely is it understandable to anyone without the domain knowledge.
The problems often arise when custom denotational semantics are added that hinder composition. This is overcome by 'not using a DSL' and instead just exposing the semantics of your implementation language, but essentially an API is little different from a DSL, just with all the baggage and coupling to the underlying implementation details.
Granted, you have to freeze the whole damn OS image, but it does seem to work.
All of the others I agree, I've seen them tried so many times in my lifetime, from mainframes to serverless, and nobody gets them right.
It works really well in the HPC world I think (MPI+OpenMP on a node is the de-facto standard). But… I dunno, I guess when I think systems programming I think of the stuff doing all the bookkeeping. The bookkeeper better not steal all my cores!
DSL is a sort of interesting one. What’s a DSL for systems programming? I’ll naively throw C and Rust as the systems programming DSLs. Of course, the domain of systems programming is, uh, controlling all the hardware. So it isn’t that surprising that the DSLs of systems programming quickly become the languages that everybody wants to use for everything, right? The problem isn’t the “specific language” part, it is that a good enough systems programming language quickly gets the domain of “everything,” haha.
As an SRE, I don't actually care about anomalies all that much. For initial alerting, I want phase shift detection. One customer sending a few bad API calls on one specific minute is uninteresting and pretty much inactionable. That same error rate over 10 minutes is more interesting and more likely to be a systems problem I can actually resolve. But the raw AD stream is just too damn noisy for all sorts of reasons. This is why our alerting tools have a `duration` field: any signal above threshold must remain so for multiple observation periods before summoning human inspection. And why health checks have grace periods and retries before killing services.
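A rough sketch of the `duration` idea, assuming one sample per observation period. The function name, parameters, and sample data are invented for illustration and do not come from any particular alerting tool.

```python
from typing import Iterable, List


def sustained_breaches(samples: Iterable[float], threshold: float, duration: int) -> List[int]:
    """Return the sample indices at which an alert would fire.

    An alert fires only once the signal has stayed above `threshold`
    for `duration` consecutive observation periods.
    """
    fired = []
    run = 0  # consecutive periods above threshold so far
    for i, value in enumerate(samples):
        if value > threshold:
            run += 1
            if run == duration:  # fire once per sustained run
                fired.append(i)
        else:
            run = 0  # any dip below threshold resets the count
    return fired


# One bad minute (index 2) is ignored; the run starting at index 5 fires at index 7.
error_rates = [0.0, 0.1, 9.0, 0.2, 0.1, 6.0, 7.0, 8.0, 6.5, 0.1]
print(sustained_breaches(error_rates, threshold=5.0, duration=3))  # [7]
```

Grace periods and retries on health checks follow the same shape: count consecutive failures and act only when the count reaches a limit.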
Where anomaly detection works better, IMO, is post-alert analysis. At that point anomalies are welcomed as hypotheses, since the system features complex interactions between components. I've built a couple of dashboards using extremely simple math, like Laplace smoothing and time series correlation that help surface relevant information from the flood of metrics and logs collected. But critically, these tools generally don't use the time domain as their baseline. Usually, I'm comparing a cluster against another one in a different region, or one metric against another, rather than now versus twelve hours ago.
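As an illustration of the "compare one cluster against another" idea, here is a small sketch using additive (Laplace) smoothing. The commenter's actual dashboards are not described in detail, so the function names, the choice of `alpha`, and the numbers are assumptions.

```python
from typing import Tuple

Counts = Tuple[int, int]  # (errors, requests) for one cluster


def smoothed_rate(errors: int, requests: int, alpha: float = 1.0) -> float:
    """Error rate with additive smoothing over two outcomes (error / ok)."""
    return (errors + alpha) / (requests + 2 * alpha)


def relative_error_rate(suspect: Counts, baseline: Counts, alpha: float = 1.0) -> float:
    """How many times higher the suspect cluster's smoothed error rate is."""
    return smoothed_rate(*suspect, alpha) / smoothed_rate(*baseline, alpha)


# The baseline cluster saw zero errors; a raw ratio would divide by zero,
# while the smoothed ratio stays finite and still ranks the suspect as an outlier.
ratio = relative_error_rate(suspect=(50, 1000), baseline=(0, 1000))
print(round(ratio, 2))
```

The baseline here is another cluster rather than the same cluster twelve hours ago, which is the point made above about not using the time domain as the reference.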
I could probably come up with similar examples from the rest of the list. If the message is 'system ideas which seem simple but are hard, so you should only use existing off the shelf examples of' then I'd certainly have more sympathy for the statement.
If your idea of cross-platform is making it work on all versions of Windows from 3.1 to Vista, sure. Microsoft has always been about capturing users within its ecosystem.
These are basically what CRDTs are. And I know what you're thinking; "but I'm not writing a multi-user text editor!" or "automerge won't scale!". But CRDTs aren't a library, they're a set of properties. If your whole data system obeys the properties - you have a CRDT.
Whenever I hear someone say “sync” data, I instantly get scared. Consensus is fraught with peril and very very very difficult to implement correctly.
Eventual Consistency: If Node A and Node B have received the same set of events (ie in any order), they will eventually have the same state.
Strong eventual consistency: Exactly the same, but replace "eventually" with "immediately".
Meaning as long as all nodes get the same events, they'll have the same state, straight away. That's sync that works.
As long as your merge algorithm is commutative, associative, and idempotent, your sync will work. That's what the CRDT people uncovered.
A data structure can obey these laws (the aforementioned CRDT libraries)
A database can obey these laws (the original Amazon Dynamo did this, it was a CRDT).
Any arbitrary system can obey these laws (there's a 1982 paper that described this for a distributed file system)
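For a concrete feel, here is a minimal sketch of a grow-only counter, one of the simplest state-based CRDTs. The laws usually required of the merge function in this formulation are commutativity, associativity, and idempotence. The function names are illustrative and not taken from any CRDT library.

```python
from typing import Dict

GCounter = Dict[str, int]  # node id -> number of increments made by that node


def increment(state: GCounter, node: str) -> GCounter:
    """Return a new state with one more increment recorded for `node`."""
    new_state = dict(state)
    new_state[node] = new_state.get(node, 0) + 1
    return new_state


def merge(a: GCounter, b: GCounter) -> GCounter:
    """Per-node maximum: commutative, associative, and idempotent."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def value(state: GCounter) -> int:
    return sum(state.values())


a = increment(increment({}, "A"), "A")  # node A incremented twice
b = increment({}, "B")                  # node B incremented once
c = increment({}, "C")

assert merge(a, b) == merge(b, a)                      # order of delivery is irrelevant
assert merge(merge(a, b), c) == merge(a, merge(b, c))  # grouping is irrelevant
assert merge(a, a) == a                                # duplicate delivery is harmless
assert value(merge(merge(a, b), c)) == 4
```

Because of these three laws, nodes can exchange states in any order, any number of times, and still converge once they have seen the same updates.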
> Strong eventual consistency: Exactly the same, but replace "eventually" with "immediately".
I believe this isn’t quite correct. I was under the impression that the delivery doesn’t have to be immediate, but rather that any two nodes with the same set of events, regardless of received order, must be in the same state.
I'm taking my definitions from section 2.2 here. I feel like I've summarised it fairly accurately, but if I've made a mistake would be happy to be corrected.
If having an API is core to the value customers will realize then it’s likely a good API will emerge. Etc.
It is? Like what? I know there are some abstractions for cpu architecture etc but they've come in handy for x64 and now arm and others in the past.
I know the author certainly has some insight into this but I've never really thought of NT as being riddled with excess abstractions.
I agree with sync though. Hard problem with no simple solution for an idiot like me to do.
This is not true. Computers focused on single threaded designs before getting thrown into parallelism (need parallelism to run into dining philosopher problems).
These things "sound good but almost never work" when they're taken as an afterthought or committed to without due research/process/design.
Any architect worth their salt will avoid implementing hard solutions to these problems. These are mostly here to not solve crucial issues... but the article seems to be addressing a developer profile that adds APIs or asynchronous processing casually.
Sounds like a strawman, or just an exceedingly pessimistic look at the industry.
Here is a paper written by the people who designed it: https://www.csc.kth.se/~gkreitz/spotify-p2p10/
This part of the abstract gets right to the point: 8.8% of music data played comes from Spotify's servers while the median playback latency is only 265 ms (including cached tracks)
I also added a similar mechanism to my own product years ago. Works flawlessly. Using content based addressing already made it quite easy to implement.
PS: on mobile, can't expand (issues around versioning, data migration)
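Why content-based addressing makes peer-to-peer distribution easier to get right can be sketched as follows. This is an assumption-laden toy, not the commenter's product or Spotify's protocol: `ContentStore` and its methods are hypothetical, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
from typing import Dict, Optional


def address_of(data: bytes) -> str:
    """The address of a blob is simply the hash of its bytes."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = address_of(data)
        self._blobs[key] = data  # identical content always maps to the same key
        return key

    def accept_from_peer(self, key: str, data: bytes) -> bool:
        """Accept data from an untrusted peer only if it hashes to the requested key."""
        if address_of(data) != key:
            return False
        self._blobs[key] = data
        return True

    def get(self, key: str) -> Optional[bytes]:
        return self._blobs.get(key)

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
key = store.put(b"track chunk 0001")
assert store.put(b"track chunk 0001") == key      # deduplicated for free
assert len(store) == 1
assert not store.accept_from_peer(key, b"tampered bytes")
```

Since the key is derived from the content, it does not matter which peer supplied the bytes: a wrong or corrupted chunk is detected locally and can be requested again from someone else.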
This (Computer Security) is a solved problem, kids. One of the lessons learned in the Viet Nam conflict was the need for a computer system that could safely handle multiple levels of classified data. The solution was Multilevel security, included in that was the Bell-LaPadula model. Several actually secure OSs emerged over the decades since, including KeyKOS, CapROS and Eros.
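For readers who have not met the Bell-LaPadula model, its two core confidentiality rules can be sketched in a few lines. This toy covers only linear clearance levels and leaves out compartments and the discretionary access matrix; the level names are illustrative.

```python
from enum import IntEnum


class Level(IntEnum):
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


def can_read(subject: Level, obj: Level) -> bool:
    """Simple security property: no read up."""
    return subject >= obj


def can_write(subject: Level, obj: Level) -> bool:
    """Star property: no write down, so high data cannot leak to low objects."""
    return subject <= obj


assert can_read(Level.SECRET, Level.CONFIDENTIAL)       # reading down is allowed
assert not can_read(Level.SECRET, Level.TOP_SECRET)     # reading up is denied
assert can_write(Level.SECRET, Level.TOP_SECRET)        # writing up is allowed
assert not can_write(Level.SECRET, Level.CONFIDENTIAL)  # writing down is denied
```

The rules themselves are small; the hard part in practice is building a system in which every access path actually goes through checks like these.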
I'm hoping that someday I can make Genode, the latest capability based operating system, my daily driver, so I never have to worry about virus scanners again.
The problem is that the constraints this imposes on the system usually do not line up with the constraints that the market will pay for. It's very common for customers to change their mind; decide they need to hack around access protections; add new users with new roles that are some hybrid of current access; ask for new features; not think through who should have access to new features; want to enable serendipity where untrusted users discover new use-cases and new markets for their product; and so on. It's also very common for them to ignore security as a differentiator when making their purchase decision, figuring that if there's a breach, somebody else will pay for it, or they'll be long gone from the company and unable to be blamed for it. So the market ends up bypassing the secure solutions that exist and choosing to buy insecure systems that can offer the features they want right now.
When security is absolutely critical, like in military or certain financial applications, it's pretty easy to achieve. There are companies like Galois that specialize in "high assurance systems". But they are expensive for their feature set, and so the general public would rather buy from cheaper and more insecure options.
Memory holing any mention of this solution isn't productive in the long run.
The problem with this is that no mainstream OS does this correctly, which means that correctly doing security requires writing a new OS and getting all the userspace programs ported over to it (which is a non-trivial port, because the programming model for capabilities is pretty significantly different from mainstream OSes). It's very hard to convince users to ditch their entire computing ecosystem for a new one unless all of their devices get pwned and they can't access their computing ecosystem anyway.
The crux of the issue is command line programs... I'm not sure how to deal with those, but I suspect it'll be an outer job control language.
It would be minimal work to refactor applications, and provide almost perfect security with no UX change.
Hilariously Uncle Bob will write off any criticism as “they misunderstood the principles”. He’s correct too, but maybe the principles are simply too vague when we’ve had them for 20 years and our industry has never been more of a mess.
I'm not the only one who is skeptical of this toxic, holier-than-thou and dangerous attitude.
Removing braces from if statements is a great example of another dangerous thing he advocates for no justifiable reason:
https://softwareengineering.stackexchange.com/questions/3202...
Which caused the big OSX/iOS SSL bug in 2014, see https://www.imperialviolet.org/2014/02/22/applebug.html
This link and thread on Hacker News are good too:
https://news.ycombinator.com/item?id=15440848
I do agree with these people that nobody has ever regretted writing a test. Well, I mean, someone probably has, but the idea of it is fairly solid. It’s just also useless, because it’s so vague. You can write a lot of tests and never be safe at runtime.
Recently I've been writing mostly "end-to-end unit tests" - stateless, faking all external services (database, message queue, etc.) with TDD, which works great.
There is a sweet spot on default test types - at as high a level as possible while being hermetic seems to be ideal.
The other un-talked about thing is that to be able to always write this kind of test you need test infrastructure, which isn't cheap to build (all those fakes).
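A small sketch of what such a hermetic, high-level test might look like. Everything here is hypothetical: `register_user`, `FakeUserDb`, and `FakeQueue` are invented for illustration, and a real code base would need fakes that mirror its actual database and queue clients.

```python
from typing import Dict, List, Tuple


class FakeUserDb:
    """In-memory stand-in for a real database client."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}

    def exists(self, user_id: str) -> bool:
        return user_id in self.rows

    def insert(self, user_id: str, email: str) -> None:
        self.rows[user_id] = email


class FakeQueue:
    """In-memory stand-in for a real message queue client."""

    def __init__(self) -> None:
        self.messages: List[Tuple[str, dict]] = []

    def publish(self, topic: str, payload: dict) -> None:
        self.messages.append((topic, payload))


def register_user(db: FakeUserDb, queue: FakeQueue, user_id: str, email: str) -> bool:
    """Code under test: the whole use case, driven from its outermost entry point."""
    if db.exists(user_id):
        return False
    db.insert(user_id, email)
    queue.publish("user.registered", {"id": user_id})
    return True


def test_register_user_end_to_end() -> None:
    db, queue = FakeUserDb(), FakeQueue()
    assert register_user(db, queue, "u1", "a@example.com") is True
    assert register_user(db, queue, "u1", "a@example.com") is False  # duplicate rejected
    assert db.rows == {"u1": "a@example.com"}
    assert queue.messages == [("user.registered", {"id": "u1"})]


test_register_user_end_to_end()
```

The test itself is short; the fakes are where the cost sits, which is the infrastructure expense mentioned above.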
Yet many developers don't even know what a runtime assertion is while everyone knows what TDD is. I guess it doesn't really matter if you're working on something which can crash and then everyone will be like "oh it's just IT, it does that".
I do think there is such a thing as overtesting - i.e. regretted tests. TDD actually protects you from this to an extent by tying each test / modification of a test to a change in code.
Runtime assertions definitely give you more bang for the buck (they are ridiculously cheap) but they are complementary to tests, not a replacement. It attacks the same problem from the bottom up instead of top down.
I also find that when you combine the two, the tests become more useful - previously passing tests will fail when they trigger those assertions.
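A minimal sketch of the combination, using a made-up `apply_discount` function. Note that Python strips `assert` statements when run with `-O`, so this assumes assertions are enabled.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    # Runtime assertions: cheap checks that run on every call, in tests and in production.
    assert price_cents >= 0, "price must be non-negative"
    assert 0 <= percent <= 100, "percent must be within 0..100"
    discounted = price_cents * (100 - percent) // 100
    assert 0 <= discounted <= price_cents, "discount must never raise the price"
    return discounted


def test_normal_discount() -> None:
    assert apply_discount(1000, 25) == 750


def test_bad_input_trips_assertion() -> None:
    # An existing test that passes bad data now fails loudly instead of
    # silently producing a negative price.
    try:
        apply_discount(1000, 150)
    except AssertionError:
        return
    raise RuntimeError("expected the runtime assertion to fire")


test_normal_discount()
test_bad_input_trips_assertion()
```

The tests drive the code from the top down, while the assertions guard invariants from the bottom up, so each catches failures the other would miss.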
Or even worse yet, let's reinvent the filesystem, then host it on top of a filesystem.
But the OP does have a point: they each can introduce more trouble than they're worth. Were I to write this post, I would have titled it something more like "Systems Ideas You Really Should Think About Long And Hard Before Doing."
But yeah, that might not be enough warning.
I did like this:
> More importantly, an offering an API doesn’t mean anyone wants to use it. Almost every new API comes up because the co/product wants features, but it doesn’t want to prioritize them enough and the theory is the API will be “evangelized” to some partner in the space. Turns out those people are not sitting around waiting to fill in holes in your product.