Don't code with mocks, period.
Structure your code such that it has a functional core and imperative shell. All logic is unit testable (functional core). All IO and mutation is not unit testable (imperative shell).
Mocks come from a place where logic is heavily intertwined with IO. It means your code is so coupled that you can't test logic without touching IO.
IO should be a dumb layer as much as possible. It should be extremely general and as simple as fetching and requesting in the stupidest way possible. Then everything else should be covered by unit tests, while IO is fundamentally not unit testable.
If you have a lot of complex SQL statements or logic going on in the IO layer (for performance reasons; avoid this if you can), it means your IO layer needs to be treated like a logical layer. Make sure you have stored procedures and unit tests written in your database's language. Your code should interface only with stored procedures as IO, and those stored procedures should be tested as unit tests in the IO layer.
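A minimal sketch of that split in Python (illustrative names only, not anyone's actual code): the pure core takes plain data and is unit tested; the thin shell does the fetch and is left to integration tests.

    import json
    from urllib.request import urlopen

    # Functional core: pure logic over plain data, trivially unit testable.
    def total_owed(invoices):
        return sum(i["amount"] for i in invoices if not i["paid"])

    # Imperative shell: dumb fetch-and-delegate, not unit tested.
    def report_total(api_url):
        with urlopen(api_url) as resp:      # IO in the stupidest way possible
            invoices = json.load(resp)
        return total_owed(invoices)         # every decision happens in the core

    def test_total_owed():
        assert total_owed([{"amount": 10.0, "paid": False},
                           {"amount": 5.0, "paid": True}]) == 10.0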
bcrosby95 19 days ago [-]
> IO should be a dumb layer as much as possible. It should be extremely general and as simple as fetching and requesting in the stupidest way possible. Then everything else should be covered by unit tests, while IO is fundamentally not unit testable.
Another way to put this is to ignore all the cool problems your database can solve for you and hand roll it instead. This is also a recipe for riddling your codebase with data races.
You're trading fairly easy problems for very hard problems. No thanks.
When running tests, optimize your database configuration for running tests against it, and make it possible to run those tests in parallel.
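For Postgres, for example, a few settings go a long way; a sketch of overrides that are only safe for throwaway test instances:

    # postgresql.conf overrides for a disposable test database only;
    # they trade crash safety for speed and must never reach production.
    fsync = off
    synchronous_commit = off
    full_page_writes = off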
AndyNemmity 19 days ago [-]
Right, but you can see how they do it. The problems they are trading are likely not hard for them.
Whatever you have in your personal toolbox changes the results of the tradeoffs.
paulryanrogers 19 days ago [-]
Please don't write a lot of stored procedures. Changing them in large or sharded DBs is not trivial and tooling to support them is minimal.
motorest 19 days ago [-]
> Don't code with mocks, period. Structure your code such that it has a functional core and imperative shell. All logic is unit testable (functional core). All IO and mutation is not unit testable (imperative shell).
I'm not sure what your definition of mocks is here.
Even if you follow a principle where all IO and mutation is not unit testable, you still need test doubles to plug into the interface of your testable functional core, because you need to pass data through to exercise code paths. What do you call the component that implements the interface of your dumb IO layer?
icedchai 19 days ago [-]
I haven't seen the "stored procs only"-style of DB development since the early 2000's, and I'd never go back to that.
SOLAR_FIELDS 19 days ago [-]
For every "I made this way more efficient/clean with a stored procedure" there's some incident where the stored procedure was left out, fired too many times, or just caused some problem due to its lack of visibility in the codebase.
Two4 17 days ago [-]
Exactly. I only break out SPs when I absolutely need to squeeze as much performance from my data store as possible. I'd rather pay for more horizontal scaling than push my logic across a system boundary where it's difficult to monitor and debug, because bugs in SPs are probably the single costliest class of bug in a prod system.
SOLAR_FIELDS 17 days ago [-]
I've always felt this was partly database vendors' fault for not having tooling that integrates well. I think there is some theoretical world where pgTAP and other database-specific test harnesses work fairly seamlessly with language runtimes via some well-designed SDK. Tracing would need to walk into the DB call stack from the application code.
Clearly this is harder to get right than expected, since there aren't many places doing stuff like that, that I'm aware of at least. Maybe if some of these database vendors built out better APIs for things like that, with the expectation that other language runtimes would be calling them, the situation would be better. But it seems a lot of the old-school DB world is perfectly happy to build their products as if they were a black box from the application's perspective.
ramses0 19 days ago [-]
Lambda functions would like to have a word...
icedchai 19 days ago [-]
At least with Lambda you can use mainstream programming languages. You can also have single lambdas that handle multiple endpoints or events. It’s a bit different.
scarface_74 19 days ago [-]
I am actually at a point in my career where I won't work for a company that is dependent on stored procedures. Rollbacks, versioning, branching, testing and everything else become a pain once you introduce stored procedures.
As a lead for mostly green field applications over the past decade, my one no compromise architectural edict is no stored procedures for online production code.
I might use it for background utilities or one off maintenance.
tbrownaw 19 days ago [-]
Is this about having application logic in multiple places, or having application logic co-located with the data? Does it make a difference if the procedures are database triggers vs being directly called by the application code?
Would splitting off some of the logic into a separate service (that does its own database accesses) have the same issues?
scarface_74 19 days ago [-]
Say I have functionality to get customer data. If the SQL logic is in code, I can create a new branch and work on it, then merge my changes, and I can do blue/green deployments. A rollback, if you're using Kubernetes, is a simple matter of reverting to the configuration file containing a reference to the old Docker container, etc.
How do you roll back a dozen stored procedures quickly? How do you just create a new branch or test a regression?
With all of the logic in code, it’s a simple matter of checking out a previous commit.
Triggers are even worse - “spooky actions at a distance”.
gregw2 19 days ago [-]
> How do you roll back a dozen stored procedures quickly?
Liquibase's rollback feature. The SQL changesets and rollback strategy are all defined in my .sql Liquibase files and go inside my git repo alongside my scala/python code and the associated CI/CD automated integration tests + deployments.
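For example, a formatted-SQL changeset can carry its own rollback in the same file; a sketch with hypothetical table and procedure names:

    --liquibase formatted sql

    --changeset gregw2:add-mark-customer-inactive splitStatements:false
    CREATE PROCEDURE mark_customer_inactive(p_id BIGINT)
    LANGUAGE SQL
    AS $$ UPDATE customers SET active = false WHERE id = p_id $$;
    --rollback DROP PROCEDURE mark_customer_inactive(BIGINT);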
Blue/green deployment can be handled via bidirectional active-active database replication across your two disaster recovery databases; you deploy the newer stored procedures on one but not the other. Or you have sharded databases for some but not all customers and deploy blue/green with appropriate blue/green sharding. Or you could have separate version numbers for the stored procedure names (perhaps with a wrapper stored procedure that works differently on different shards/database names to isolate the impact from your non-SQL codebase). Or just put a form of feature flags in your stored procedures; it's not 100% blue/green, but with minor integration test coverage it can be quite safe and allows you to run two code bases at once even in a single database.
Agree with you on triggers (and I agree to a point that stored procedures are often not the right approach despite my above rebuttal... but it is quite straightforward with the right tooling/mindset. For high-volume analytics with aggregations you might be better off leaving the data in columnar database SQL and doing some stored procedures if the code was complex enough; but for CRUD apps I would strongly discourage stored procedures.)
scarface_74 19 days ago [-]
I agree with Liquibase for source control of schema.
But don't you see how much harder this is for developers, with feature flags in stored procedures and so on, compared to standard git? There is tooling around feature flags for code.
I think we are in violent agreement though, for OLAP and analytics, I wouldn’t care as much. Because of the way that Redshift works (columnar store) and how it compiles and optimizes queries across a cluster, I wouldn’t care about stored procedures.
gregw2 19 days ago [-]
Thanks! Good point about integrating with feature flag tooling... I'll have to think more about that.
zaphirplane 19 days ago [-]
In your post you talk about forking and rolling back.
I bet the same can be done for SPs, even if it's a fork-and-add-suffix approach.
The calling code can call either.
On a phone, hope I'm clear enough.
scarface_74 19 days ago [-]
If I need to fork code, am I then going to make copies of all of the stored procedures corresponding to the code and end up with getCustomer1, getCustomer2…, doing the same with all of the other procedures?
And then do I need to change all of my stored procedure references in my code? Do I merge my changes back to the main one when done?
Isn’t just doing a “git branch” easier?
Rolling back is just doing a git revert in your separate Kubernetes repository (you are doing proper GitOps, aren't you?), recommitting your code, and your cluster is now using your old Docker container.
If you aren’t using K8s, you just push your old code.
RaftPeople 18 days ago [-]
> How do you just create a new branch
I'm not really following what you are saying. Sprocs are in the same repo as any other code, a branch is a branch, why do you think you can't create a branch that contains sprocs?
scarface_74 18 days ago [-]
Am I then going to deploy the stored procedures to a shared database that other developers are using with their own branch or am I going to have my own database instance?
zeeZ 18 days ago [-]
Are all those developers running different branches of code with automated schema changes on the same database? What if someone adds or removes a field? Stored procedures are not the issue here
scarface_74 18 days ago [-]
I’m not referring to schema changes.
If I have logic that returns customer data in code, I can branch the code, make and test my changes locally and other developers can do the same without any conflicts.
Now imagine the next major release requires 50 different sql changes. Would that be easier to push and maintain one git branch or 50 stored procedures?
Again, how do you roll back easily and effectively when your release depends on multiple stored procedures?
Adding a nullable field shouldn’t be a breaking change, removing a field would be.
RaftPeople 17 days ago [-]
> Now imagine the next major release requires 50 different sql changes. Would that be easier to push and maintain one git branch or 50 stored procedures?
Why aren't those 50 stored procedure changes also in your one branch?
It kind of sounds like you only put some types of code in git, as opposed to everything. Is that correct?
scarface_74 17 days ago [-]
And now we are getting back to having to deploy those 50 stored procedures to a database to test your code when you branch.
Are you going to use a shared dev database? What happens when other developers are also making changes to other branches?
I’m assuming you need to test with a representative size data set with a representative size database server.
Just like with traditional Kubernetes Gitops where you have your source code in one repo and a pipeline to create your docker containers and your K8s configuration in another repo to manage your cluster and update the referenced Docker images, even if you don’t have stored procs, you would keep your schema change sql files in another repo and “version” your database separately with a separate pipeline.
RaftPeople 16 days ago [-]
> And now we are getting back to having to deploy those 50 stored procedures to a database to test your code when you branch.
> Are you going to use a shared dev database? What happens when other developers are also making changes to other branches?
Valid questions, but there are also valid answers, the primary one is of course "it depends".
Sometimes a shared DB works fine, as long as you manage other work being done to avoid conflicts.
Sometimes an isolated DB is needed for either just this work, or this work plus related project work isolated from other projects.
We do all of the above, and yes it takes effort, but that effort exists regardless of sprocs or not due to the nature of most of our projects and changes, rarely do we have small isolated work.
But I'm guessing our environment is not the same as yours based on your statements; this is our environment:
1-Medium sized enterprise which means we run many apps from vendors, some on prem, some in cloud, as well as internally developed apps - in other words it's not a homogeneous environment and we don't control all aspects.
2-Functionality that frequently either sits on the side of operational systems, or in between the operational systems or as a layer on top that is unifying the business activity across systems and realizing a higher level of abstraction and activity.
3-Focus on end to end testing due to higher bug detection rate (i.e. around 70% bug detection) vs unit testing (i.e. around 30% bug detection), and due to work flowing through multiple apps frequently.
> even if you don’t have stored procs, you would keep your schema change sql files in another repo
Our schema changes are typically in the branch being merged and deployed. Our deployment process has pre and post DB actions sections.
But some changes are too big for that type of thing and require multiple day deployments with various business and technical actions happening in a specific sequence across multiple different apps (e.g. hold new activity for set of skus, stores and web site complete final transactions, clear DCs, switch to new process where those skus have a new unique flow through the systems and the enterprise).
csomar 19 days ago [-]
This throws away most of what your database can do for you. I personally like to do authorization at the database layer. There are a ton of other things a DB can do, like trigger a webhook. Are you going to frequently pull from your DB for live updates?
I think OP's issue is that they didn't understand what mock tests actually test against. They are called mocks for a reason. There should be a better way to provision a database when running tests and have a full simulated environment.
Spivak 19 days ago [-]
I think you've swung the other way a bit too far, I've never seen application logic split between the app and database not become a huge PITA.
I would never have my DB trigger a webhook for me but I would have it act as a message broker back to my own code that triggers a webhook. I don't want my DB ever initiating an outbound connection.
motorest 19 days ago [-]
> This throws away most of what your database can do for you. I personally like to do authorization at the database layer.
The whole point of writing maintainable software is to throw away most of what technologies can do, because you are way better off if you do not make the mistake of following that path.
For example, at first glance doing auth at the database layer is a huge mistake. You design your systems with a few nodes having exclusive access to the DB to ensure your access is secure without having to waste the overhead of checking credentials on each DB access. The main design trait is speed. But you do you.
> There are a tons of other things a DB can do, like trigger a webhook.
There are plenty of reasons and hard learned lessons behind principles such as separation of concerns.
ljm 19 days ago [-]
so what you should do instead is write a load of pure functions and intersperse them with impure functions, and when it comes to testing that, if someone writes a different impure function than the one you wanted, you just fail the test and trigger a post to HN: "don't code with mocks."
ninetyninenine 19 days ago [-]
You failed to read and/or understand what I wrote. Also, how do I fail a test for a new code addition I didn't even write?
Unit tests don't test impure functions; all they do is test pure functions. Impure functions aren't unit tested, period. So when someone writes an impure function I don't like, what happens? Nothing. It wasn't part of the unit tests anyway, understand?
If you’re referring to integration tests well a newly written impure function isn’t part of integration tests anyway so nothing happens in this case either.
I found your last comment about how I'm triggered to write such a comment rude and lacking in even understanding what I wrote. Please try to understand my point before saying something like that… if you can't, please refrain from replying in such a way, as it's against the rules here. Thanks for reading.
piva00 19 days ago [-]
How do you write the expected return from an external API in your test setup?
ninetyninenine 19 days ago [-]
That's an integration test. Your external API returns a number, let's say 10. You have two local functions: one that is very general and fetches that number, and another that transforms that number. Your local transformation function is the one that is unit tested. The other function is part of integration tests and in general can't be unit tested.
piva00 19 days ago [-]
Exactly, there's a layer (integration testing) where I usually do need a test against a mock to validate that my function doing the external call is passing the request I expect. I can break it down into a parameter-supplier function that generates the proper request params and test that, but I don't see any gain in abstracting that away simply to avoid mocks.
Spivak 19 days ago [-]
You use a "VCR" module to record the real request and response and then play it back in your unit test. Boom you get the power of an integration test with the speed and determinism of a unit test. Then you turn off the VCR in CI to catch if/when the upstream API changes on you and now it's a real integration test.
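In Python, for instance, the vcrpy library implements this record/replay pattern; a rough sketch with a made-up URL and cassette path:

    import vcr
    import requests

    # The first run records the real HTTP exchange to the cassette; later runs
    # replay it. Changing record_mode (e.g. "all"/"none") turns it back into a
    # live call for CI.
    @vcr.use_cassette("fixtures/cassettes/usd_rate.yaml")
    def test_fetch_usd_rate():
        resp = requests.get("https://api.example.com/rates/USD")
        assert resp.status_code == 200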
The problem with "write tests, not too many, mostly integration" is that unit tests are too damn good at giving you instant feedback and a tight dev loop. So we turned integration tests into unit tests. Combined with SQLite :memory: for unit tests and Postgres for the final integration tests, we don't have to mock anything.
Source: Currently use this method at $dayjob with great success.
piva00 19 days ago [-]
My question was a rhetorical one, I do use request-replayers and other techniques (I've been through a bunch in the past 20+ years as they were developed).
I was challenging the absolutism of "never use mocks", it's just another technique and can be applied easily in integration tests if the guidelines are well set in the team on how to not use them.
Also, I do not do external calls in the CI/CD pipeline, it inevitably makes tests brittle and flaky.
ljm 19 days ago [-]
[flagged]
spawarotti 19 days ago [-]
Great advice. I follow it in my coding efforts and it has never failed me. Great book about this: Unit Testing Principles, Practices, and Patterns, Vladimir Khorikov, 2020
How do you test the imperative shell except with mocks? Suppose you're calling an external API.
ninetyninenine 19 days ago [-]
The imperative shell is untestable with unit tests. You don’t test it this way period.
Think about it. This IO layer should be so thin that all you're doing is fetching and requesting, so this layer makes a direct external call. When you mock, this entire thing is basically what gets mocked.
So whether you mock or don't mock, if you write your code with the pattern of functional core and imperative shell, the imperative shell is what would be mocked and is never tested via unit tests regardless. You cannot test this.
Another way to think about it, is like this. Write your code in a highly decoupled way such that all logic or transformations done in your code can operate independently from things you have to mock. Then for the things you have to mock, what’s the point of mocking them in the first place when your logic can be tested independently from the mocks?
What’s the point of unit testing code that touches IO and is supposed to be mocked? You’d just be writing mocks and testing mocks.
Don't write functions like this:

    fetchNumberFromIOAndAdd2()

Do write two functions like this:

    fetchNumberFromIO()
    addTwo(x)

The latter function can be unit tested. The former can't be unit tested and there's no point in unit testing it; it is what lives in the imperative shell.
What you see in a lot of books are authors promoting patterns like this:

    addTwo(database object)

This kind of pattern forces you to mock by coupling IO with logic.
throwawaymaths 19 days ago [-]
> what's the point?
suppose you are writing some software that interfaces with a personal finance SaaS, except you want your users to be able to interact with more than one. so you sure as shit better be testing that your abstractions are making the correct calls, especially if customer money is involved.
you also really want to be testing for unreliability: for example, if your SaaS call is taking too long, you don't double tap; if a tx looks like it failed but it didn't actually, you know that on the second attempt, etc, etc.
"the shell is untestable" is not an acceptable answer.
and you can't wriggle out of this by saying "well that's an integration test". you said in your OP "don't write mocks". that is terrible advice in general. I'm generally a big fan of imperative shell, functional core, but you MUST mock your shell, especially if absolute correctness is required (people's lives or money at stake).
ninetyninenine 19 days ago [-]
No. You completely missed what I said. I differentiated between unit tests and integration tests. Your imperative shell is not testable in the context of unit tests. However you must test it via integration tests.
Mocks are a utility for unit tests hence the topic of this post. We are talking about unit tests and mocking aspects of the unit test that can’t be tested via unit tests.
Integration tests are another topic. With integration tests or end to end tests you test everything. You don’t mock. Mocking is not really a valid concept in the context of integration tests.
throwawaymaths 19 days ago [-]
> However you must test it via integration tests.
that is not what you said. you wrote: "Don't code with mocks, period."
> I differentiated between unit tests and integration tests.
you did not. your original comment doesn't mention integration tests at all. i quote:
"Structure your code such that it has a functional core and imperative shell. All logic is unit testable (functional core). All IO and mutation Is not unit testable (imperative shell).
Mocks come from a place where logic is heavily intertwined with IO. It means your code is so coupled that you can’t test logic without touching IO.
IO should be a dumb layer as much as possible. It should be Extremely general and as simple as fetching and requesting in the stupidest way possible. Then everything else should be covered by unit tests while IO is fundamentally not unit testable.
If you have a lot of complex sql statements or logic going on in the IO layer (for performance reasons don’t do this if you can’t help it) it means your IO layer needs to be treated like a logical layer. Make sure you have stored procedures and unit tests written in your database IO language. You code should interface only with stored procedures as IO and those stored procedures should be tested as unit tests in the IO layer
> Mocking is not really a valid concept in the context of integration tests.
what the fresh hell are you talking about? it's common practice to set up mocks (using dependency injection with expects) for an external API call in an integration test. in 10 years of professional software development I've mocked in integration tests at five employers.
Rarely used unless necessary is what I meant. The point of the integration test is to avoid mocking, and I don't want to argue a pedantic point with you. I also don't like your attitude, so I'm ending this section of the thread; don't bother replying.
throwawaymaths 19 days ago [-]
hey man, the least you could say is "yeah I wrote something that I didn't mean, my bad" when I called you out. instead you dug in, moved goalposts, and claimed you wrote shit you didn't.
it started with literally bad professional advice. junior developers reading these forums need to not have professional habits influenced by the sort of behavior on display by you here.
and a ChatGPT transcript? really? just do a simple search, you will find tons of articles advocating for mocking external APIs in integration tests.
ninetyninenine 18 days ago [-]
Ok. My bad. I have no problem saying that. But I cut off the convo because I don’t like your attitude. Didn’t like this reply either. Let’s just end it here I have nothing further to say to you and I don’t care about your technical opinions. Good day.
tomjakubowski 19 days ago [-]
The jargon around all this stuff is terrible. What I would probably write is, like, a fake version of that API, one which runs out of process from the library or test, and which can be "programmed" from the test set up -- "expect these requests, serve these responses", etc.
Is that a mock? I don't know. It sure does resemble one, but it doesn't use dependency injection or other runtime tricks which I'd associate with classic "mocks"; it's just another program you write to support the tests. The imperative shell doing I/O needs to be configurable to support this (think a "widgets API base URL"), but that's it.
BoiledCabbage 19 days ago [-]
Yes, that's still a mock. And the point is to structure your code in such a way that there is little to no value left in testing that API call independently, because all of the related logic has been tested in a unit test.
Once you've done that, you can just run a simple integration or end-to-end test to cover that API. You never need to manually mock it (like you suggested), or use a mocking tool.
drewcoo 19 days ago [-]
> that still a mock
It's a fake, not a mock.
Mocks and fakes are both test doubles.
> And thr point is to structure your code in such a way that
Sometimes it's not your code. Sometimes you want to change legacy code and want tests to prove that your changes don't break anything.
Life would be so much easier if there were just one way to test, wouldn't it?
bluGill 19 days ago [-]
That is a fake. A mock is a device for verifying you called the expected functions. A fake emulates the API. Importantly, if you change functions the fake will still pass if they mean the same thing - think of write, which might take a buffer+length [C style], a list, or a string (often there would be more than 10 variations).
andrekandre 19 days ago [-]
i learned something new today, thanks
--
following in that theme, just doing some digging [1] turns up the following for those that might have been confused like i was:
Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in memory database is a good example).
Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what's programmed in for the test.
Spies are stubs that also record some information based on how they were called. One form of this might be an email service that records how many messages it was sent.
Mocks are what we are talking about here: objects pre-programmed with expectations which form a specification of the calls they are expected to receive.
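To make the fake/mock distinction concrete, a small Python sketch (hypothetical names): the fake is asserted on through observable state, the mock through the exact calls it received.

    from unittest.mock import Mock

    class InMemoryUserRepo:
        """Fake: a working substitute for the real repository, backed by a dict."""
        def __init__(self):
            self._users = {}
        def add(self, user_id, name):
            self._users[user_id] = name
        def get(self, user_id):
            return self._users.get(user_id)

    def register(repo, user_id, name):
        repo.add(user_id, name.strip())

    def test_register_with_fake():
        repo = InMemoryUserRepo()
        register(repo, 1, " Ada ")
        assert repo.get(1) == "Ada"                   # asserts on state

    def test_register_with_mock():
        repo = Mock()
        register(repo, 1, " Ada ")
        repo.add.assert_called_once_with(1, "Ada")    # asserts on the call itself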
The jargon is terrible because I think you misunderstood my point. "You don't mock, period" is my philosophy, but you're talking about mocking an entire API here.
blitzar 19 days ago [-]
How do you test that the external API returns are correct?
hinkley 19 days ago [-]
That’s not a unit test. That’s not even close to a unit test.
bdangubic 19 days ago [-]
mock it and then just assert of the mocked result - never fails :)
That's the only site whose dev content I ever paid for. Gary Bernhardt is very entertaining.
nrr 19 days ago [-]
I think it's probably worth mentioning that the principal concern for tests should be proving out the application's logic, and unless you're really leaning on your database to be, e.g., a source of type and invariant enforcement for your data, any sort of database-specific testing can be deferred to integration and UAT.
I use both the mocked and real database approaches illustrated here because they ultimately focus on different things: the mocked approach validates that the model is internally consistent with itself, and the real database approach validates that the same model is externally consistent with the real world.
It may seem like a duplication of effort to do that, but tests are where you really should Write Everything Twice in a world where it's expected that you Don't Repeat Yourself.
jcgrillo 19 days ago [-]
The database is often the thing that enforces the most critical application invariants, and is the primary source of errors when those invariants are violated. For example, "tenant IDs are unique" or "updates to the foobars are strictly serializable". The only thing enforcing these invariants in production is the interplay between your database schema and the queries you execute against it. So unless you exercise these invariants and the error cases against the actual database (or a lightweight containerized version thereof) in your test suite, it's your users who are actually testing the critical invariants.
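As a concrete example, the uniqueness invariant can be exercised against a real engine rather than asserted against a mock; a minimal sketch where Python's built-in sqlite3 stands in for whatever database you actually run:

    import sqlite3
    import pytest

    def test_tenant_names_are_unique():
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE tenants (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
        conn.execute("INSERT INTO tenants (name) VALUES ('acme')")
        # The database, not application code, is what rejects the duplicate.
        with pytest.raises(sqlite3.IntegrityError):
            conn.execute("INSERT INTO tenants (name) VALUES ('acme')")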
I'm pretty sure "don't repeat yourself" thinking has led to the vast majority of the bad ideas I've seen so far in my career. It's a truly crippling brainworm, and I wish computer schools wouldn't teach it.
nrr 19 days ago [-]
> The only thing enforcing these invariants in production is the interplay between your database schema and the queries you execute against it."
I'm unsure that I agree. The two examples you gave, establishing that IDs are unique and that updates to entities in the system are serializable (and linearizable while we're here), are plenty doable without having to touch the real database. (In fact, as far as the former is concerned, this dual approach to testing is what made me adopt having a wholly separate "service"[0] in my applications for doling out IDs to things. I used to work in a big Kafka shop that you've almost certainly heard of, and they taught me how to deal with the latter.)
That said, I'd never advocate for just relying on one approach over the other. Do both. Absolutely do both.
> I'm pretty sure "don't repeat yourself" thinking has led to the vast majority of the bad ideas I've seen so far in my career. It's a truly crippling brainworm, and I wish computer schools wouldn't teach it.
I brought up WET mostly to comment that, if there's one place in software development where copying and pasting is to be encouraged, testing is it. I'd like to shelve the WET vs. DRY debate as firmly out of scope for this thread if that's alright.
0: It's a service inasmuch as an instance of a class implementing an interface can be a service, but it opens up the possibility of more easily refactoring to cross over into running against multiple databases later.
jcgrillo 19 days ago [-]
I've often been tempted to make an "id service" also because you can potentially get compact integer ids that are globally unique. That'll likely save you more than a factor of 2 in your ID fields given varint encoding, which could be very significant in overall throughput depending on what your data look like. Never actually tried it IRL though.
I agree both approaches are important, and it's totally ok if they overlap. If your unit tests have some overlap on your integration tests, that's nbd especially seeing as you can run your unit tests in parallel.
EDIT: actually I'll make a much bolder claim: even if your unit tests are making flawed assumptions about the underlying dependencies, it's still pretty much fine so long as you also exercise those dependencies in integration tests. That is, even somewhat bit-rotted unit tests with flawed mocks and assertions are still valuable because they exercise the code. More shots on goal is a great thing even if they're not 100% reliable.
nrr 19 days ago [-]
> If your unit tests have some overlap on your integration tests, that's nbd especially seeing as you can run your unit tests in parallel.
Exactly.
Another upside I've run into while doing things this way is that it gets me out of being relational database-brained. Sometimes, you really do not need the full-blown relational data model when a big blob of JSON will work just fine.
marcosdumay 19 days ago [-]
There's something very, very wrong in the way we write programs nowadays.
Because yeah, the database is your main source of invariants. But there is no good reason for your application environment not to query the invariants from there and test or prove your code around them.
We do DRY very badly, and the most vocal proponents are the worst... But I don't think this is a good example of the principle failing.
nrr 18 days ago [-]
> There's something very, very wrong in the way we write programs nowadays.
I largely agree, but...
> ... the database is your main source of invariants.
I guess my upbringing through strict typing discipline leaves me questioning this in particular. I'm able to encode these things in my types without consulting my database at build time and statically verify that my data are as they should be as they traverse my system with not really any extra ceremony.
Encoding that in the database is nice (and necessary), but in the interest of limiting network round-trips (particularly in our cloud-oriented world), I really would prefer that my app can get its act together first before crossing the machine boundary.
jcgrillo 19 days ago [-]
> no good reason for you application environment not to query the invariants from there and test or prove your code around them
As a developer who primarily builds backend web applications in high level languages like golang and java I run the risk of sounding ignorant talking like this but.. I'm led to believe lower level systems and embedded software has a lot more invariant preserving runtime asserts and such in it. The idea being that if an invariant is violated better to fail hard and fast than to attempt to proceed as if everything is alright.
marcosdumay 18 days ago [-]
Hum... I'm not sure we are talking about the same thing. Of course system and embedded software won't have invariants stored in a database, the comment isn't about them.
But there isn't a faster way to fail on a violated invariant than to prove statically that your code violates it, or to test for it before deploying. I don't really understand your criticism.
pdimitar 19 days ago [-]
I thought this was common knowledge and that it became even easier after Docker became a thing?
Mocks are wishful thinking incarnate most of the time, though here and there they are absolutely needed (like 3rd party APIs without sandbox environments, or quite expensive APIs, or most of the time: both).
Just pick a task runner -- I use just[0] -- and make a task that brings up both Docker and your containers, then run your test task, done. Sure it's a bit fiddly the first time around but I've seen juniors conquer that in a day maximum and then your tests actually work with the real world 99% of the time.
Mocks in general are rarely worth it, the DB ones: 10x so.
testcontainers is great. I struggled a bit with testcontainers due to the nature of one container per test which just felt too slow for writing gray/blackbox tests. The startup time for postgres was > 10 seconds. After a bit of experimenting, I am now quite happy with my configuration which allows me to have a snappy, almost instant testing experience.
My current setup:
- generate a new psql testcontainer _or_ reuse an existing one by using a fixed name for the container
- connect to the psql container with no database selected
- create a new database using a random database name
- connect to the randomly generated database
- initialize the project's tables
- run a test
- drop the database
- keep the testcontainer up and running and reuse with next test
With this setup, most tests run sub-second.
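A rough pytest sketch of that flow (psycopg2, placeholder credentials, and a placeholder schema-init step; the container itself is assumed to already be running):

    import os
    import uuid
    import psycopg2
    import pytest

    PG = dict(host=os.environ.get("PGHOST", "localhost"),
              port=int(os.environ.get("PGPORT", "5432")),
              user="test", password="test")   # credentials of the test container

    @pytest.fixture
    def db_conn():
        # Connect with no project database selected, create a throwaway database
        # with a random name, hand it to the test, and drop it afterwards.
        admin = psycopg2.connect(dbname="postgres", **PG)
        admin.autocommit = True
        name = f"test_{uuid.uuid4().hex}"
        admin.cursor().execute(f"CREATE DATABASE {name}")
        conn = psycopg2.connect(dbname=name, **PG)
        # create_tables(conn)  # project-specific schema initialization goes here
        yield conn
        conn.close()
        admin.cursor().execute(f"DROP DATABASE {name}")
        admin.close()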
addled 19 days ago [-]
If your table setup process starts to get slow like ours, check out Postgres database templates (https://www.postgresql.org/docs/current/manage-ag-templatedb...). Do the setup once into a DB with a known name, then use it as the template when creating the DB for each test.
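With a template in place, per-test database creation collapses to one statement (hypothetical names):

    -- Run migrations/fixtures once into app_template, then clone it per test:
    CREATE DATABASE test_4f2a TEMPLATE app_template;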
paulryanrogers 19 days ago [-]
You can run Pg against a RAM disk too, if the container isn't effectively already doing that.
And tests can often use a DB transaction to rollback, unless the code under test already uses them.
kkarimi 19 days ago [-]
Yes, it's a shame that this is not leading the discussion; it works very well and is surprisingly efficient in our experience compared to manually setting up Docker containers or docker compose.
nerdponx 19 days ago [-]
I've had a nice experience using Docker Compose for this, although you might still want a task runner wrapper around it. Podman Compose should work fine as well.
sickblastoise 19 days ago [-]
My current setup on a personal project is docker compose with Postgres and pgadmin container.
I would never think of mocking a database in 2024/2025, just spin one up.
Also, keep your linchpin invariants in the database and not in your code.
pdimitar 18 days ago [-]
> Also, keep your linchpin invariants in the database and not in your code.
I do both. That is how you protect against stressed devs at 2:00 AM trying to insert crap into the production database, and how you also make good apps that don't blow up with database exceptions and instead give you neat validation errors in forms and/or JSON `errors` keys (when it's an API).
bdangubic 19 days ago [-]
in 2024/2025 you just might be working with a 985-petabyte database which just might have a few issues being "spun up" :)
nerdponx 18 days ago [-]
I am definitely not talking about a full copy of the data here. I had in mind more a small but representative sample, enough to run and exercise units of code with confidence.
sickblastoise 19 days ago [-]
Are you developing directly against your production database?
bdangubic 19 days ago [-]
replica in the staging environment
pdimitar 19 days ago [-]
Exactly what I do. Having a local `docker-compose.yml` that helps bring up dev and test environments is one of my first action items when hired. It has paid huge dividends, many times.
hinkley 19 days ago [-]
Too many people use fakes and call them mocks. Which makes the conversations quite tortured.
bluGill 19 days ago [-]
I've become a fan of good fakes. Good fakes closely resemble the real service but let me query useful things in tests. Mocks just tell me I called a function, without concern for the 10 different ways to do the thing.
hinkley 18 days ago [-]
If your mocks are hard to write you may have your fanout too wide and would do better with squaring up your dependency tree a bit. I have a rule of 5 for DI frameworks. If any module needs more than 5 other modules you should start looking to refactor to push dependencies down the stack.
What I see with Fakes too often is that they are difficult to write so people reuse them, end up writing too many tests that only test the fake, and couple a bunch of tests to each other which makes refactoring fraught. Anyone who makes refactoring difficult is an enemy to your project, not an asset.
Mocks make DAMP easier. You don’t care how the other function turns “foo” into “bar”, you just care what happens to this function when it gets “bar” as a response. That’s the surface area.
There’s a lot of people in this response chain who are very obviously conflating “unit tests” with all tests. Unit tests don’t care about integration. That’s integration tests. I liken testing to plumbing. Unit tests are similar to quality control at a pipe factory. You verify that the pipes aren’t damaged prior to assembly. The plumber inspects them before installing, and inspects their work as they bond or solder them together. But at the end of the day your plumber still has to turn on the faucet and make sure water 1) comes out the taps and 2) stays in the pipes everywhere but the sink, otherwise they haven’t done their job.
Unit testing the individual pipes is necessary but insufficient. You have to verify the connections and the end to end system still, but more work up front saves you some embarrassing rework at the end.
bluGill 17 days ago [-]
> Unit tests don’t care about integration.
NOBODY HAS DEFINED WHAT A UNIT IS! This needs to be shouted, because all your arguments fall apart when you realize this. Until you get down to atoms (or something in quantum mechanics) you can always go a level deeper. String.length is an integration test if you are the hardware engineer building the microcode on the CPU (do CPUs use microcode anymore?). If you are laying out the traces, tests for the microcode are integration tests for you. When I write a simple function that happens to use string.length, that is an integration test for that. Note that I skipped many, many levels in both directions.
What people care about is that things work. The terms unit test and integration test need to die, as they are not useful. Test at the right level. If you can write all your tests as full system tests (that is, via UI input) that run fast and are reliable, then do so. We generally say don't test via UI input because painful experience shows that UI changes often break 1000s of tests at a time, which is a good reason to inject tests below the UI. There are often other reasons to break tests up into smaller areas, but those are trade-offs.
Odds are you are not writing basic containers (i.e. string.length), and so your API is something that should be easy to refactor (you have no reason to think it won't change in the future as requirements change), and thus you should not test that API. Tests are an assertion that in the future this will not change, so if you are testing something that should change you are doing it wrong! (Of course you will never get this 100% right, but that should be your aim!)
---
> If your mocks are hard to write you may have your fanout too wide and would do better with squaring up your dependency tree a bit.
Mocks are not hard to write. They are the wrong solution to most testing problems, though. Mocks assert that you call a specific function as expected - it is really easy (in most languages) to automate writing them with a framework, so they are easy to write. The problem is that calling a function is not what you want to test in most cases. I used write as an example because file IO APIs generally have many different forms of write. You have WriteWithBufferAndLength, you have WriteAVector (or whatever your language calls the basic list container), then WriteWithSomeWeirdOptionsWeForgotToPutInOriginalAPIAndDidNotWantToRefactorEverythingSoWeAddedANewAPI... In some languages these are all named differently, some allow overloading - it doesn't matter; the important part is that mocks don't let you switch which one you use, because 100 tests will break if you try, even though switching might be better; it isn't enough better to be worth fixing all the tests that happen to use that mock. Making the above worse, write can often be called with one large buffer or many smaller buffers and the end result is the same, yet mocks assert how many times write is called, which is wrong.
> What I see with Fakes too often is that they are difficult to write so people reuse them, end up writing too many tests that only test the fake
When I write a fake, I write one fake that is close enough for everyone to use in their tests - I know how the thing it is faking works in production and I model that behavior, except for whatever it is that makes the real thing unsuitable for tests. I often do add verification helpers - you can ask what data was written at any point (including concatenating several writes), for example.
pdimitar 18 days ago [-]
Guilty as charged, I mix them both freely; I use fakes but also assert that a function has (not) been called [N times]. The latter is mostly testing implementation but there are rare cases when it's too much work to mock a quality 3rd party library (say, an HTTP client) but you still want to mock parts of it and ensure they get called.
Not sure why mixing up both things makes the conversations tortured though? Can you clarify?
shayonj 19 days ago [-]
+1 for this being common knowledge, but there is still a decent bit of mocking that happens IME.
withinboredom 19 days ago [-]
Mocks are great for testing specific code paths, but if you need to do that, there is usually a better way of doing it. Mocks hide contractual incongruities. For example, a function that returns either "fizz" or "buzz" that gets mocked out. If it gets changed to also return "bar", and you forget to update the mock, you've got a time-bomb in your code.
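A small illustration of that time bomb using Python's unittest.mock (hypothetical names): the mock still encodes the old contract, so the test keeps passing after the function changes.

    from unittest.mock import Mock

    def classify(n):
        # Originally returned only "fizz" or "buzz"; later changed to also return "bar".
        if n % 15 == 0:
            return "bar"
        return "fizz" if n % 3 == 0 else "buzz"

    def label(n, classifier=classify):
        return classifier(n).upper()

    def test_label_with_stale_mock():
        stale = Mock(return_value="fizz")      # nobody updated this after "bar" appeared
        assert label(15, stale) == "FIZZ"      # still passes, while production now yields "BAR"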
9rx 19 days ago [-]
> If it gets changed to also return "bar", and you forget to update the mock, you've got a time-bomb in your code.
In that case the problem isn't so much that you forgot to update your mock as much as the changed function no longer satisfies the interface. Ensuring that an implementation properly satisfies the interface is not within the scope of tests where mocking would be useful.
withinboredom 19 days ago [-]
> Ensuring that an implementation properly satisfies the interface is not within the scope of tests where mocking would be useful.
As I mentioned, if you need mocks to test a code path, you are probably "doing it wrong" as there are much better ways to do that. Such as refactoring the code so you can test just that code path. Mocks are a code smell, IMHO. In other words, if you need mocks, then the interface/contract is a part of your code -- and you should test it.
9rx 19 days ago [-]
> if you need mocks to test a code path, you are probably "doing it wrong" as there are much better ways to do that.
Most code paths you want to test are error branches, and creating real error conditions ranges from hard to completely unrealistic (e.g. causing a hardware malfunction on each test run). What is the better way in your mind?
Mocks can only exist at the integration points, fundamentally. There is a natural interface there, an interface that you are not supposed to be testing. Those tests already exist around the module that provides the integration. You don't need to test it again.
If a mock standing in for the real thing causes ill-effects, you've done something wrong. You're not supposed to be testing the mock while using the mock. If your mocks need tests (they probably do!), they should be tested outside its use.
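As an illustration of the error-branch case, a condition like a full disk can be simulated where producing it for real is impractical; a sketch with a hypothetical function and Python's unittest.mock (whether the caller should instead crash outright is the design question debated in the replies below):

    import errno
    from unittest.mock import patch

    def save_report(path, data):
        try:
            with open(path, "w") as f:
                f.write(data)
            return True
        except OSError:
            return False   # defined behaviour for IO failure: report it to the caller

    def test_save_report_when_disk_is_full():
        full_disk = OSError(errno.ENOSPC, "No space left on device")
        with patch("builtins.open", side_effect=full_disk):
            assert save_report("/tmp/report.txt", "hello") is False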
withinboredom 18 days ago [-]
Heh, when you are working on hardware, you usually build a real test device and yes, actually cause real hardware faults. Mocks or tests will not prepare you for the real thing, as the hardware fault you detect is usually just the surface of the problem. Let's examine a practical example where a disk becomes full. Suddenly, file creation will fail, as will writes; how do you handle that? In isolation, you may mock that condition out so you can test it. You handle the error in isolation, but this is actually quite dangerous. Your application should fail explosively - don't be like some popular databases that just continue as though nothing happened at all, corrupting their state so they will never start again.
Generally, if you can detect a hardware fault in your code, crash unless you know for certain that you can actually recover from it -- meaning you know what the problem is (can somehow detect the difference between a missing disk or a missing file). 99.9% of the time, you cannot recover from hardware issues with software, so it is pointless to actually test that case. Please, for the love of working software, just crash so the state doesn't get corrupted instead of trying to overengineer a solution.
9rx 18 days ago [-]
> Heh, when you are working on hardware, you usually build a real test device
All software works on hardware. Your web application doesn't need a test device, though. The hardware is already tested. You can treat it as an integration point. But even if you did create a test device for whatever reason, that's a mock! Which you say is to be avoided, and that there are better ways, without sharing what those better ways are...
> Please, for the love of working software, just crash so the state doesn't get corrupted instead of trying to overengineer a solution.
While you're not wrong, you need to test to ensure that it actually crashes. All defined behaviour needs to be tested, and you have defined behaviour here.
withinboredom 18 days ago [-]
> Which you say is to be avoided, and that there are better ways, without sharing what those better ways are...
That's because it would better fit in a book than an HN comment, not because I don't want to answer. Basically the gist is to write "obviously correct" code that doesn't need to be tested, along with an architecture that lends itself to being testable without mocks.
Most people tend to write an interface and then inject a concrete type that could also be a mock. I've seen tests written this way that need to mock out 20-60 things just to test the one thing they want to test.
In most web frameworks I've worked with, this is mostly unavoidable as most frameworks provide dependency injection that is natural to mock.
If you aren't using a framework that has an opinion on how things should work, mocks can be avoided completely through different techniques, such as test harnesses, replacing dependencies with alternative implementations (in-memory queues instead of cloud services, SQLite instead of heavy databases, etc.), and so on. Sometimes you don't have any choice but to not use mocks; for example, distributed systems usually avoid mocks for certain kinds of tests because they simply can't be emulated very well - either due to latency, network partitioning, or network failures that are just too numerous to mock out (similar to the disk issue I was referring to earlier). You don't know if a node is down or the cable got cut, and you need to behave appropriately to avoid split-brain scenarios. In these cases, test harnesses that can emulate specific scenarios are much better.
9rx 18 days ago [-]
> Basically the gist is to write "obviously correct" code that doesn't need to be tested
I don't see how that follows. The purpose of testing is to document the code for the understanding of future developers, not to prove correctness. The only 'correctness' a test proves is that the documentation is true. Which is still incredibly useful, as I am sure you are painfully aware if you have ever dealt with legacy forms of documentation (e.g. plain text files, Word documents, HTML, etc.) that quickly become out of date, but is not a statement about the code itself.
> such as test harnesses, replacing dependencies with alternative implementations (such as in-memory queues instead of cloud services, sqlite instead of heavy databases, etc), etc.
These are all mocks, ultimately. Some desperately try to give them different names, but it is all the same at the end of the day.
withinboredom 18 days ago [-]
> The purpose of testing is to document the code for the understanding of future developers, not to prove correctness.
Hmm. I've never seen tests with that goal in mind, except for behavioral tests that test the acceptance criteria.
> as I am sure you are painfully aware if you have ever dealt with legacy forms of documentation [...] that quickly become out of date
I have, but allowing that to happen is a culture-issue, not something that is guaranteed to happen. When I open PRs to open source software, I always include a PR to the docs if it changes anything. At work, updating the docs is part of the default acceptance criteria and is usually the thing we do before writing any code, and goes through a PR process just like the code. But, we service enterprise customers, so we aren't going to be giving them code or tests to understand how to use our product.
> These are all mocks, ultimately.
This is a software field and there are specific words with specific meaning; trying to shoehorn things that aren't those things to generalize a meaning is acceptable when teaching. It isn't acceptable when working on those things. In other words, I would accept this if trying to explain the concept to a junior engineer, but not from a senior engineer to a senior engineer.
> it is all the same at the end of the day.
No, not at all.
9rx 17 days ago [-]
> I've never seen tests with that goal in mind
Then you've never seen a test, I guess. That is the only goal they can serve, fundamentally.
> I have, but allowing that to happen is a culture-issue, not something that is guaranteed to happen.
Mistakes are guaranteed to happen given enough output/time. No matter how hard you try, you are going to make a mistake at some point. It is the human condition. In the olden days one might use a proofreader to try and catch the mistakes, but with the advent of testing a computer can do the "proofreading" automatically, leaving the human effort to be pointless.
Maybe in the age of LLMs we can go back to writing documentation in "natural" language while still using machines to do the validation work, but then again if you write code you probably would prefer to read code. I know I would! The best language is the one you are already using. Having to read code documentation in English is a horrible user experience.
> This is a software field and there are specific words with specific meaning
Sure, but in this case you won't find any real difference in meaning across the vast array of words we try to use here. The desperate attempts to try and find new words is to broach silly soundbites like "mocks are a code smell", so that one can say "I'm not mocking, I'm flabbergasting!", even though it is the same thing...
withinboredom 17 days ago [-]
> Then you've never seen a test, I guess. That [,not to prove correctness,] is the only goal they can serve, fundamentally.
I cannot wrap my head around this statement. It's literally in the name: "test" as in to prove something works... hopefully as designed.
> Mistakes are guaranteed to happen given enough output/time. No matter how hard you try, you are going to make a mistake at some point.
Yep, and they do. It's really easy to figure out which one is right: if the docs say that something happens, it happens. If the code doesn't do what the docs say, the code (and the tests) are wrong, and not the other way around.
> Having to read code documentation in English is a horrible user experience.
It's the difference between intention and action! I worked with a guy who opened PRs with totally empty descriptions. It was annoying. When I was reviewing his code, I had to first figure out his intention before I could understand why there was a PR in the first place. Was he fixing a bug, adding a new feature, or just writing code for the hell of it? ... nobody knew. Then, when you spotted a bug, you had to ask if it was a bug or on purpose, because you didn't know why the code was there in the first place.
Documentation is that living PR description. It doesn't just tell you WHAT exists, but WHY it exists, what purpose it serves, why that weird little line is the way it is, etc., etc.
9rx 16 days ago [-]
> It's literally in the name: "test" as in to prove something works...
The documentation is what is under test. It proves that what is documented is true. It does not prove that the implementation works. This should be obvious. Consider the simplest case: A passing test may not even call upon the implementation.
I have most definitely seen that in the wild before! More times than I wish I had. This is why TDD urges you to write tests first, so that you can be sure that the test fails without implementation support. But TDD and testing are definitely not synonymous.
> Its really easy to figure out which one is right: if the docs say that something happens, it happens.
Under traditional forms of documentation, you don't have much choice but to defer to the implementation. With modern documentation that is tested, typically the documentation is placed above the implementation. Most organizations won't deploy their code until the documentation is proven to be true. The implementation may not work, but the documentation will hold.
> I worked with a guy who opened PRs with totally empty descriptions.
I'm not sure PRs fit the discussion. PRs document human processes, not code. Human processes will already typically be in English (or similar natural language), so in the same vein the best language is the one you are already using. That is not what we were talking about earlier; but, granted, does do a good job of solidifying the premise.
pdimitar 19 days ago [-]
Completely agreed, that's why I have advocated for periodically (once a month) revisiting the test suites with them by doing re-captures of real data. There are frameworks that help with this.
Is it a chore? Absolutely. But peace of mind is important.
SvenL 19 days ago [-]
The person who changes it should write a test to verify their change, right? Ultimately they should write the test in the same file as the previous tests, so chances are high it will be seen there. And even if not, the person making the change should write a test, so no time bomb there?
And yes, people forget to write tests, sure, but even then it would be a time bomb with or without mocks.
I'm not a friend of mocks either, but most examples here are not really an issue with mocks.
withinboredom 19 days ago [-]
Generally, mocks are quite far away from what they are mocking. You might mock out a library, for example. I can guarantee you that the library author will not show up at your workplace to update your mocks.
btilly 19 days ago [-]
I've found that replacing the database with in memory SQLite for tests is a sweet spot. Almost as fast as a mock, catches a lot of database issues. And it's really easy to do if you're using something like Django that makes automatically generating database migrations easy.
It won't help you with database specific differences. But there should be very few of those if you're using a framework that abstracts away the database. Like Django.
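For anyone who hasn't tried it, a minimal sketch of what that looks like as a Django settings override (module name and project layout are assumptions, not taken from the comment above):

    # settings_test.py -- hypothetical test-only settings module
    from .settings import *  # start from the normal project settings

    # Point the default database at an in-memory SQLite instance. Django's
    # test runner builds the schema from your migrations automatically, so
    # every test run starts from a fresh database.
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": ":memory:",
        }
    }

Run with something like `python manage.py test --settings=myproject.settings_test` (the project name is illustrative).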
rikthevik 19 days ago [-]
> But there should be very few of those if you're using a framework that abstracts away the database.
But I really want that database-specific behaviour. :) PostgreSQL does so many amazing things (recursive CTEs, jsonb, etc) that actively make our system better. If there was a fork of Django that optimized for leveraging advanced postgres features, I'd use it.
Unit tests work well with PGlite, at least in the TS/JS world where it natively sits at the moment. You can have a unique Postgres instance for each unit test that's up and running in just a few ms.
It's possible to use PGlite from any language using native Postgres clients and pg-gateway, but you lose some of the nice test DX you get when it's embedded directly in the test code.
I'm hopeful that we can bring PGlite to other platforms, it's being actively worked on.
The other thing I hope we can look at at some point is instant forks of in-memory databases; it would make it possible to set up a test db once and then reset it to a known state for each test.
(I work on PGlite)
smackeyacky 19 days ago [-]
You can just simply run postgres, why bother with pglite?
postgres installs easily on WSL2 or whatever Linux distribution you're using.
fweimer 19 days ago [-]
You don't even need to install it. It doesn't take a lot of code to run initdb to create a temporary instance, write a suitable configuration file, launch the postmaster process, and delete the temporary database directory tree once testing is complete. On a not-too-fast Linux system, the time for all that is around half a second. Too much overhead for individual unit tests, but fast enough to run once per test suite run.
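A rough sketch of that flow driven from Python (initdb/pg_ctl are standard Postgres tooling, but treat the details as illustrative; run_test_suite is a hypothetical hook):

    # Throwaway Postgres instance for one test-suite run.
    import subprocess, tempfile, shutil

    datadir = tempfile.mkdtemp(prefix="pgtest-")
    subprocess.run(["initdb", "-D", datadir, "--no-sync"], check=True)
    # Unix-domain socket only, no TCP port to clash with anything else.
    opts = f"-c listen_addresses='' -c unix_socket_directories={datadir}"
    subprocess.run(["pg_ctl", "-D", datadir, "-o", opts, "-w", "start"], check=True)
    try:
        run_test_suite(pg_host=datadir, pg_dbname="postgres")  # hypothetical
    finally:
        subprocess.run(["pg_ctl", "-D", datadir, "-m", "immediate", "stop"], check=True)
        shutil.rmtree(datadir)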
rikthevik 19 days ago [-]
Agreed. Running docker-compose and postgres-alpine works just fine.
devoutsalsa 19 days ago [-]
I want to test with the same API as my database, which means using the same database. Setting up an instance of Postgres/MySQL/whatever isn't hard. Maybe it's harder if you're using an online database that doesn't fit into a container, but that's a different problem.
lmm 19 days ago [-]
Embedded/in-memory Postgres for testing is easier than you might think.
zeroonetwothree 19 days ago [-]
SQLite supports recursive CTEs
rikthevik 19 days ago [-]
I think you're missing the point.
I want to develop on postgres and test on postgres because I run postgres in production and I want to take advantage of all of its features. I don't understand why a person would 1) develop/test on a different database than production or 2) restrict one's self to the lowest common denominator of database features.
Test what you fly, fly what you test.
maxbond 19 days ago [-]
And JSONB, but you can't insert or update in a CTE like you can with Postgres.
cryptonector 19 days ago [-]
SQLite does have recursive CTEs, but yeah, everything else that's all about types other than the ones SQLite supports...
pdimitar 19 days ago [-]
I don't know what kind of magic fairy dust Django is but I've found the differences between SQLite and PostgreSQL too big to be worth it, in at least 3 other programming languages that are not Python.
Sounded good at first but we were quickly overwhelmed with false positives and just opted for Postgres in a VM (this was before Docker was a thing).
herpdyderp 19 days ago [-]
I've discovered the same thing: can't use SQLite to mock PostgreSQL. PGLite looks super promising for this, but I haven't tried it yet at scale.
btilly 19 days ago [-]
Django is a Python web framework with a built-in ORM that works really well and has a large community.
If I have to do anything CRUD like, I'll use Django. For reporting apps, I prefer native SQL.
phendrenad2 19 days ago [-]
This is a good thing to be aware of when choosing a database, but I think most of the time people just reach for Postgres because it's the "standard".
pdimitar 19 days ago [-]
PostgreSQL stops many data bugs at the door due to being so strict -- something many programmers start rediscovering is a good thing by choosing stricter programming languages with time.
I love SQLite to bits but the test harness I have to put around my apps with it is a separate project in itself.
phendrenad2 19 days ago [-]
Somehow this is my first time hearing about that benefit of Postgres! (despite having googled for explanations of why people were switching to it). Being on HN proves more valuable than google once again...
pdimitar 19 days ago [-]
Glad to help!
On a philosophical / meta level it's all quite simple: do your damnedest for the computer to do as much of your work for you as possible, really. Nothing much to it.
Strict technologies slap you hard when you inevitably make a mistake so they do in fact do more of your work for you.
DanielHB 19 days ago [-]
The problem with this approach is that the SQLite dialect is not the same as most production setups. Even if you use an ORM you often have manual queries, or you are using a feature that is only supported in some databases (like geospatial queries).
isbvhodnvemrwvn 19 days ago [-]
And sqlite is extremely lax by default (and even non-defaults are not great).
3pt14159 19 days ago [-]
I'm surprised you say so btilly. I've found the differences in between SQLite and Postgres to be large enough to bring up issues that are hard to find. Timestamp handling, non-trivial indexes, etc.
Too 19 days ago [-]
As many others already stated, there are more than just small subtle differences that will bite you with this approach. Eventually forcing you to develop unnecessary compatibility abstractions or downgrade to less efficient lowest common denominator feature set to make it work. It's not worth the effort.
Especially since launching postgres is about as easy and fast as launching sqlite. Docker can help with sandboxing. What is left to gain? 100ms shorter startup time or keeping your unit test executables as single binaries? Irrelevant.
anarazel 19 days ago [-]
> Especially since launching postgres is equally easy and fast as sqlite
It's definitely not as fast to start postgres as it is to start sqlite. Pretty much inherently - postgres has to fork a bunch of processes, establish network connectivity, etc. And running trivial queries will always be faster with sqlite, because executing queries via postgres requires inter-process context switches.
That's not to say postgres is bad (I've worked on it for most of my career), but there just are inherent advantages and disadvantages of in-process databases vs out-of-process databases. And lower startup time and lower "dispatch" overhead are advantages of in-process databases.
jmmv 19 days ago [-]
Came here to say this: I’ve found that using SQLite for tests is a good sweet spot, and maintaining dual DB implementations ensures my business logic remains generic. But there is no replacement for running the tests against the real database periodically (maybe only in CI, for example).
Why is maintaining dual DB implementations worth it? Like what is the tangible benefit of "my business logic remains generic"? Do you have to deploy the app in different environments with different DBs?
MetaWhirledPeas 19 days ago [-]
> I've found that replacing the database with in memory SQLite for tests is a sweet spot
Does that cover the first two examples brought up by the article? Constraint violations and default value handling.
chrisandchris 19 days ago [-]
I don't know...
What if you rely on, e.g., CASCADE and foreign keys, which are not on by default in SQLite? I think then things start getting complicated and testing that layer gets difficult.
btilly 19 days ago [-]
Django allows for the common use cases for that to be abstracted out. So the SQLite version will do things manually, while PostgreSQL can use the database feature.
That's the reason to use an ORM. It abstracts away things like that.
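For context, this is roughly what that abstraction looks like in Django model code (model names invented). By default Django's on_delete=CASCADE is carried out by the ORM's delete collector rather than by a database-level constraint, which is what keeps the behaviour consistent across backends:

    from django.db import models

    class Author(models.Model):
        name = models.CharField(max_length=100)

    class Book(models.Model):
        # The cascade is declared here and executed by the ORM when Author
        # rows are deleted through the ORM, so it works the same whether the
        # test backend is SQLite or the production backend is PostgreSQL.
        author = models.ForeignKey(Author, on_delete=models.CASCADE)
        title = models.CharField(max_length=200)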
A hobby project of mine (in Elixir) uses SQLite as primary database. Each test runs in its own fully isolated SQLite database. No mocking (or transaction rolling back) needed. Most of these tests take less than 1ms to run (and when they take longer, it's because of something else).
This kind of setup makes the usual Ecto Sandbox approach feel slow, but I do agree that the way Elixir approaches this is great!
ryan-duve 19 days ago [-]
Do you have a link you can share that demonstrates the details of this approach?
napsterbr 19 days ago [-]
The entire project is open source, so sure! :D
I actually have two projects that use this approach, FeebDB (which is the library I wrote to manage a "one SQLite database per client" approach) and HackerExperience (a game under development that uses FeebDB).
The overall idea is simple:
1. Before tests start running, create a prop of each database.
2. The prop contains the "starting database" for each test. It may contain seed data (optional).
3. For each test, copy the prop and assign it a unique shard identifier (say, cp /props/user.db /test_data/user/874125.db).
4. The test knows the `shard_id` and can do whatever it wants with it; no one else will bother it.
5. Once ExUnit is finished, delete all shards.
Both projects follow a similar approach (I wrote it first in FeebDB and copied into HackerExperience, which has some sections commented out -- I need to clean up this part of the codebase).
For both projects, you will find steps 1/5 in `test/support/db.ex`, step 2 in `test/support/db/prop.ex` and steps 3/4 in `test/support/case/db.ex`.
Email is in profile in case you have follow up questions/comments :)
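The prop/shard idea ports fairly directly to other stacks, too; here is a rough pytest/SQLite sketch of the same steps (file paths and fixture names are invented, this is not FeebDB's actual API):

    # conftest.py -- hypothetical translation of the prop/shard approach
    import shutil, sqlite3, uuid, pytest

    PROP = "props/user.db"  # pre-seeded "starting database", built once per run

    @pytest.fixture
    def user_db(tmp_path):
        # Steps 3/4: copy the prop into a shard only this test will ever touch.
        shard = tmp_path / f"user-{uuid.uuid4().hex}.db"
        shutil.copy(PROP, shard)
        conn = sqlite3.connect(shard)
        yield conn
        conn.close()  # step 5: pytest removes tmp_path (the shard) afterwards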
brightball 19 days ago [-]
Yea, I've been working on a side project for a while and I keep building it with Rails because there's so much about Rails that speeds up this particular type of project...
But the testing setup in Elixir is just exemplary compared to everything else I've worked with. I fight myself daily on just rebuilding the entire project in Elixir.
javcasas 19 days ago [-]
For tests, first use a real database. If you can't because the database just doesn't want to (ahem mongo ahem) then use a fake. If you can't use a fake, stop lying and use a fake.
Mocks for databases are extremely brittle and complicated.
btilly 19 days ago [-]
Ahem MongoDB? I must admit, I've never understood what MongoDB's real use case is supposed to be. Whenever I've looked at it, there has always been a better alternative. When I've had to use it, I've had to debug performance problems that wouldn't have existed with alternative solutions.
It sounds cool. But running software isn't about sounding cool.
A decade ago, it was really clear. As https://aphyr.com/posts/284-call-me-maybe-mongodb explains, MongoDB didn't really work. But they've fixed that. So now it runs acceptably accurately. I just don't know why I'd ever want to.
lelandbatey 19 days ago [-]
I worked full time on a pretty seriously trafficked product based on MongoDB for 3 years and I still don't know of anywhere I'd want to use MongoDB. I'd basically always want either a DB with a schema or a super fast opaque-store style cache.
Also, their hosted offerings (MongoDB Atlas) were not well operated and they took down our company for 2 days. Our MongoDB instance stopped accepting new connections and restarting our instance via their control panel didn't bring it back up and then their support literally said "we don't know why this happened or how to fix it" for like a day and a half, while we were on their highest tier ultra platinum support contract. Ultimately, we had to do 24 hours of research and then tell their own support how to fix the problem using their special shell access. If I recall correctly, it was some issue with WiredTiger (an internal component of the Mongo version we were using).
After that experience, I'd never use anything produced by MongoDB for anything; we moved everything that we possibly could out of Mongo and into more traditional RDBMS (PostgreSQL) and never had to deal with issues like that again.
winrid 19 days ago [-]
This is a pretty common experience with mongo tools and support, sadly.
Recently I ran into a tool that spat out "you probably want to use this option!". We paid for enterprise support so I asked why this option was not documented, and they said because it is dangerous. Can you imagine if the "-f" for rm wasn't in the man pages? Ridiculous.
computerfan494 19 days ago [-]
MongoDB ships with horizontal sharding out-of-the-box, has idiomatic and well-maintained drivers for pretty much every language you could want (no C library re-use), is reasonably vendor-neutral and can be run locally, and the data modeling it encourages is both preferential for some people as well as pushes users to avoid patterns that don't scale very well with other models. Whether these things are important to you is a different question, but there is a lot to like that alternatives may not have answers for. If you currently or plan on spending > 10K per month on your database, I think MongoDB is one of the strongest choices out there.
Also want to add that you can definitely use MongoDB (or any other database) in a way that doesn't scale well. I have personally run MongoDB at petabyte scale and had a relatively great experience.
btilly 19 days ago [-]
For most use cases, PostgreSQL is cheaper and faster to run.
A lot fewer are "web scale" than think they are. For ones who are, there are other competitors like Snowflake that work well.
As for scaling, it depends what you want to do. If you want to do things that look like joins, maybe it wasn't the right choice. Though I've definitely succeeded.
jcgrillo 19 days ago [-]
I prefer MangoDB from an operational standpoint. But I'll concede that, as you point out, MongoDB is after all WebScale.
nerdponx 19 days ago [-]
I used MongoDB at a company where engineering policy was strictly that MongoDB was our only allowable database, including in cases where it was clearly not the best choice.
Were there good aspects? Sure... kind of. It was super super easy to just throw data into the database. Need to add a field? Who cares, just add it. Need to remove a field? Who cares, just remove it -- as long as the calling code is null-safe. And it was likewise super easy to store denormalized data for fast lookups when it made sense to do so, as well as deeply-nested things and arbitrary JSON blobs synced from our CMS. And queries could be stored and manipulated as actual Python dicts instead of opaque strings, whereas you normally need an ORM or query builder to do that in SQL. And you could get the best-of-both-worlds-ish with the ODMantic framework, where you could spit out a warning (but not an error) and recover gracefully if there happened to be bad data in the database.
Basically it allows you to forego any foresight or serious design, and instead just throw shit together. Which is great for getting a prototype going really fast and giving the appearance of a highly productive team. That is, until you run out of new microservices to prototype and now you have to start adding features to existing code and fixing bugs. IMO it was completely "penny-wise and pound-foolish" with respect to developer time, but I can at least see the appeal if you're a particular type of engineer operating under a particular set of incentives.
As for using MongoDB for things it was actually meant for (writing a ton of schemaless JSON blobs really fast and figuring out reads later), I have no idea because we never used it for that and had nothing really resembling that use case.
karmakurtisaani 19 days ago [-]
In my company they use it to store a large number of varying-sized jsons. You can create indexes to make search queries lightning fast. It's also been extremely stable.
I wasn't involved in setting it up tho, so can't say anything about how difficult it is to work with on the technical side.
dakiol 19 days ago [-]
I don’t know. Like any other tool, it depends on how you use it. I worked for a unicorn with presence in multiple countries and they were using Mongo as main db in multiple microservices. Around 1500 engineers. It worked fine. I’m not saying it was justified, but never had perf. issues.
nerdponx 19 days ago [-]
We had some pretty bad performance with $lookup queries but it all magically went away after adding some indexes. I have a lot of grievances with teams that use MongoDB, but relatively few grievances with MongoDB itself along these lines.
javcasas 19 days ago [-]
Sometimes you cannot choose. Sometimes you are handed down some decisions already made, and reverting them may have a cost the business doesn't want to pay.
LtWorf 19 days ago [-]
They used it in a startup where I used to work.
Their data was fully relational, and they were doing the one thing that really really kills performance in mongodb: growing documents.
Also it had no constraints so the data was all fucked up by the various bugs that were in the code over the years.
Ah yes they used sharding. Probably there wouldn't have been a need for it if they had just used postgres, since the data was not that much.
winrid 19 days ago [-]
PG and Mongo will vertically scale about the same depending on your queries. They were probably using sharding with tiny instances, which is dumb. Also, large documents don't really hurt performance with mongo, except maybe with writes, or with large array fields due to replication implications.
LtWorf 19 days ago [-]
Not large documents, but documents where the most common operation is to add a number to a list of numbers.
winrid 18 days ago [-]
Yeah, that's a bad idea. You should instead just create a doc for each number. Each time you add a number the entire list needs to be copied to disk and to each secondary, so the cost grows quickly with each write.
LtWorf 18 days ago [-]
I know, but the "seniors" in that place didn't
cryptonector 19 days ago [-]
It's trivial to spin up a PG instance on an AF_LOCAL (Unix domain) socket, create the DB, populate it with schema and test data, then run your app's tests. If all the tests have different non-overlapping data then you can even share the one instance for all the tests.
a57721 19 days ago [-]
> If you can't because the database just doesn't want to (ahem mongo ahem)
Could you elaborate what is the problem with mongo here? You can run it with testcontainers like any other database.
nerdponx 19 days ago [-]
This is one of many reasons why "isolating" I/O within your application is so important, whether that means pushing it to the edges or otherwise carefully corralling it in known-critical areas where it can't be removed.
A static type system that assists with that design style is nice as well. It's really useful to be able to look at a function and know right away what I/O it performs, if any. This is probably my #1 missing feature in Python, now that static type hints work well enough for most purposes.
That said, setting up and maintaining a database test harness in the developer environment can be really annoying and time consuming. Even moreso when your database is a proprietary cloud thing (e.g. Snowflake). But I have never ever regretted spending the time to set it up, whereas I have definitely regretted not having it.
Sometimes it's unavoidable that your "unit" tests need to access the database to be able to properly work out the "unit" under test. Watching those tests go from failing to passing is such a sweet feeling, it makes the pain of maintaining the test harness feel worthwhile. And then of course when you hit the actual integration tests in CI, everything almost always passes, because your unit tests are that good.
lopatin 19 days ago [-]
Some valid points here. I’m awaiting a response post in 2 days called “Database mocks are worth it” which will also have some valid points. Perhaps I’m a little burned out by tech blogging lately.
breckenedge 19 days ago [-]
It’s healthy to be skeptical of patterns you read about. Don’t lose that. It’s worse to cargo-cult them.
diggan 19 days ago [-]
I mean, that's kind of engineering for you, choosing between different tradeoffs. I guess reading blog posts with various points is kind of the point, so you can more easily know when to apply what, and what tradeoffs come with it.
jaza 19 days ago [-]
You should do, and I usually do, both "pure" unit tests (with all I/O - database calls, reading/writing local files, 3rd party API calls, etc - being mocked), and integration tests (with ideally all I/O really happening). More of the former, and less of the latter, in line with the "testing pyramid" approach.
There is value in testing "assuming that the database returns 3 rows with Foo IDs and Foo Descriptions, my code should serve a JSON response containing the Foo IDs and the Foo Descriptions concatenated together". And there is also value in testing "when my code executes SELECT id, description FROM foo LIMIT 3, the real database should return 3 rows with Foo IDs and Foo Descriptions". Granted, there's also a big advantage in that the former can run much faster, with zero environment setup / teardown complexity required. But my main point is, different test suites should test different things.
However, if you're unable or unwilling to write two different test suites, and if you're able to easily and reliably have a database available in dev / CI, then ok, I concede, just write integration tests.
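A small sketch of the two kinds of tests being contrasted here (the serializer function and the db fixture are placeholders, not from the comment):

    # Pure unit test: "assuming the database returns 3 rows..."
    def test_serialize_foos_unit():
        rows = [(1, "red"), (2, "green"), (3, "blue")]
        assert serialize_foos(rows) == ["1:red", "2:green", "3:blue"]

    # Integration test: "...the real database should return 3 rows"
    def test_foo_query_integration(real_db_connection):
        cur = real_db_connection.cursor()
        cur.execute("SELECT id, description FROM foo LIMIT 3")
        assert len(cur.fetchall()) == 3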
peterldowns 19 days ago [-]
Well said, agreed. One point from my experience — once you can do "as fast as unit" or "as fast as pure" tests that exercise your database, you can basically stop doing integration tests, because using the real database gives you so much confidence in the correctness of your system.
jokesterfr 19 days ago [-]
We do both at work (~300k sloc, >70% unit test coverage).
There are unit tests with mocks on repository calls, and integration testing with a fresh pg database bootstrapped with docker and .sql seeds.
One is fast to code, one gives more realistic feedback on real-life execution. Both solve issues.
This is not the end of the game tho, as performance and back-pressure issues arise with a successful project... Testing is a long and interesting process.
nerdponx 19 days ago [-]
Mocking out a repository is IMO a lot less bad than mocking out an actual database connection. That's probably one of the biggest arguments in favor of the repository pattern: the ability to replace the repository with a test double of some kind.
senex 19 days ago [-]
Does anyone have experience making tests against real databases fast?
I resonate with the sentiment of this article, but have struggled to find an alternative that’s fast enough as the test suite grows, isn’t flakey in CI, and is able to share the production schema definition for relevant relations.
I’d love to hear more from anyone that’s solved for some of these constraints!
TobiasBales 19 days ago [-]
I think one helpful thing is what rails calls transactional tests (https://guides.rubyonrails.org/testing.html#transactions). It basically does the database setup (migrations and seeds) and then executes the tests in a transaction that is rolled back at the end (and thus never committed). This helps with speed and also ensuring that tests don't have as much accidental codependence as they might otherwise.
If you use read replicas in production code this can become tricky, though: since the transactions never commit, they never become visible to the reader, or even to other connections to the same database.
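Outside Rails the same trick is easy to reproduce; a hedged pytest/SQLAlchemy sketch (connection URL and fixture name are assumptions):

    import pytest
    from sqlalchemy import create_engine
    from sqlalchemy.orm import Session

    engine = create_engine("postgresql+psycopg2://localhost/app_test")

    @pytest.fixture
    def db_session():
        conn = engine.connect()
        tx = conn.begin()            # outer transaction, never committed
        session = Session(bind=conn)
        yield session
        session.close()
        tx.rollback()                # everything the test wrote disappears
        conn.close()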
lbriner 19 days ago [-]
A real database should not be slow. Even with our tests running against a hosted SQL Server on a separate server, the database is never the slow part. For other tests, we run with the same database in a local Docker container with Docker Compose and it is fast and isolated/resettable.
Most tests should be unit tests, which are super fast. Integration and UI tests that might use the database should be fewer, and if the database is slow, it might be related to your specific application or unoptimized database queries; our database calls are usually < 10ms.
nerdponx 19 days ago [-]
What I've done is make a clone of the real database, with a sample of data that has enough variety/size to test whatever it is you need to test, but no bigger. It definitely takes some thinking, planning, and writing of code, but it's worth doing.
Unfortunately I maintain an app where the database (read-only) is Snowflake, and being more of a "warehouse" than "database" there's always a lot of overhead in running any query at all. Even just `select 1` can take a few seconds. So there's only so much you can do with that, but setting up your data so that tests can be parallelized helps as well.
However your tests against a proper OLTP database should be plenty fast, unless your app itself is slow or your test fixtures require some really complicated setup.
pdimitar 19 days ago [-]
On my machine -- which is quite middle range at this point, not even high end -- I get by just fine up to 3000 tests, with Elixir at least. When I was at that contract the ~3200 tests ran in something like 40 seconds.
What kinds of troubles do you have with using a real DB for testing?
If you happen to use Postgres, the approach is ultimately portable: it uses Pg database templates (also, regarding perf, the author recommends using a ramdisk and turning off fsync on your test DBs; you'll see this in the project readme). But you’ll have to write the code yourself.
peterldowns 19 days ago [-]
Author here, thanks for linking my project — I hope it's been working well for you!
jph 19 days ago [-]
This is so great Peter-- first I've heard of pgtestdb and it's immediately useful for me. How can people donate money to the pgtestdb project? Or hire you for consulting for pgtestdb? I'm joel@joelparkerhenderson.com and would love to help fund your work.
peterldowns 19 days ago [-]
I sincerely appreciate the sentiment and the offer — but pgtestdb is MIT license, actual, for real, not kidding, open source. No payment necessary; please enjoy.
(I'm always open to discuss potential contracts or consulting opportunities. If you have one that you think might be a good fit, my email is in my profile here and on github and on my homepage.)
avg_dev 18 days ago [-]
How much of a commission do I make? ;)
thom 19 days ago [-]
This is what I do; it has an overhead of about 10-20ms per test and I've had zero flakiness. An absolute no-brainer from my point of view.
peterldowns 19 days ago [-]
Really glad to hear it's been working for you with zero flakiness! If you ever do run into any trouble, or have any suggestions for improvements, come on over to the github issues page :)
t-writescode 19 days ago [-]
My app: Kotlin, Ktor, Exposed
Databases:
- Production/Dev: Postgresql
- Test suite/CI: SQLite
Performance:
- ~1000 tests
- ~5 seconds
For testing anything below the ktor layer, I create and roll back transactions for everything, including db table creation (though I should probably fix that, just bigger fish to fry in this project)
For the SQLite / PostgreSQL differences that Exposed doesn't naturally handle, namely jsonb, during the CREATE TABLE phase I have my test harness create a regular json version of the table rather than a jsonb one. Exposed's ORM seamlessly handles that swap-out after table creation. There's a bit of manual upkeep in making sure the *Json test version of the tables is kept up-to-date with the production non-Json version; but that's the sort of thing that's caught on the very first test and fixed in minutes, most of the time (sometimes cascading table creation bites me).
I will eventually probably add a flag or something so the test suite can run against a separate testing partition in my PostgreSQL docker container, but I haven't done that yet.
jasonjmcghee 18 days ago [-]
I've been bitten by using an in-memory database instead of the one production uses. If you're using GitHub Actions it's as easy as using "hoverkraft-tech/compose-action" and providing a docker compose yaml with a postgres image as a step.
t-writescode 17 days ago [-]
Oh certainly! The problem is that it's yet another thing to add to the pile of things I need to configure, etc, at this point and it's of lower priority than many, many, many other things - and will easily take as long as so many other of the steps, which are all taking a long time due to learning curve.
maxmcd 19 days ago [-]
Tests are usually embarrassingly parallel. Instead of creating one test db (app_test) create many (app_test_0, app_test_1, ...). Run tests in many threads/processes, db per thread.
This works in a lot of cases. In some cases this might not address your bottleneck.
Also someone should write a real, performant, in-memory postgres storage driver. Then we can all be happy (with pg at least).
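With pytest-xdist, for example, the plugin's worker_id fixture makes the "database per process" part nearly free (the DSN details below are assumptions):

    import pytest

    @pytest.fixture(scope="session")
    def test_database_url(worker_id):
        # worker_id is "gw0", "gw1", ... under xdist, or "master" when not parallel.
        suffix = "" if worker_id == "master" else f"_{worker_id}"
        return f"postgresql://localhost/app_test{suffix}"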
pitah1 19 days ago [-]
I've taken a stab at making a solution for it via https://github.com/data-catering/data-caterer. It focuses on making integration tests easier by generating data across batch and real-time data sources, whilst maintaining any relationships across the datasets. You can automatically set it to pick up the schema definition from the metadata in your database to generate data for it. Once your app/job/data consumer(s) use the data, you can run data validations to ensure it runs as expected. Then you can clean up the data at the end (including data pushed to downstream data sources) if run in a shared test environment or locally. All of this runs within 60 seconds.
It also gives you the option of running other types of tests such as load/performance/stress testing via generating larger amounts of data.
theogravity 19 days ago [-]
We use Testcontainers (https://testcontainers.com/) in our node.js / vitest / kysely stack. Really fast to spin up a temporary postgres instance, and we use kysely migrations to init and seed the db.
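The same library family exists for Python; a minimal sketch assuming the Python testcontainers package (with its postgres extra) and SQLAlchemy are installed:

    from testcontainers.postgres import PostgresContainer
    import sqlalchemy

    def test_against_throwaway_postgres():
        # Spins up a disposable postgres container for the duration of the block.
        with PostgresContainer("postgres:16") as pg:
            engine = sqlalchemy.create_engine(pg.get_connection_url())
            with engine.connect() as conn:
                assert conn.execute(sqlalchemy.text("select 1")).scalar() == 1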
fatbird 19 days ago [-]
Not sure if this is still a valid approach, but on a large Django site with a lot of unit tests, the cumulative setup/teardown/reset cycles were killing us. We found that wrapping each test in a transaction that was aborted on teardown caused the per-test cleanup cost to drop radically. We also kept a canned database for testing so that running the test suite didn't have a large startup penalty to populate the database with test fixtures. Keeping that ready db between runs also sped things up a lot for devs.
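For what it's worth, current Django has both of those tricks built in: django.test.TestCase wraps each test in a transaction that is rolled back on teardown, and --keepdb reuses the test database between runs. A tiny sketch (app and model names invented):

    from django.test import TestCase
    from myapp.models import Widget  # hypothetical model

    class WidgetTests(TestCase):
        def test_create(self):
            Widget.objects.create(name="x")
            self.assertEqual(Widget.objects.count(), 1)
            # rolled back automatically when the test ends

    # Reuse the already-migrated test database between runs:
    #   python manage.py test --keepdb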
shayonj 19 days ago [-]
Doesn’t directly answer your question but at least in Postgres I am curious about UNLOGGED mode and see if it results in faster specs. Trade off being, crash recovery doesn’t work but that’s fine in CI.
There is also something to be said about keeping database transactions atomic (no 3rd-party network calls, etc.) to keep flaky specs to zero. I have some ad hoc thoughts on this; will try to frame them properly in a post.
If you come up with any better options please let me know so I can update this readme!
peibye 19 days ago [-]
Depends on how fast you need them to be. I've had success using docker postgres, set up the same as production, where each test creates a connection to its own test-specific database, or is written to run inside a transaction that can clean up after itself using a shared database. In my experience, slower tests that use a real db almost always save net positive dev time.
3pt14159 19 days ago [-]
Tests are embarrassingly parallel. If you can split up the load for CI then at the limit the slowest response is your slowest running test. I wish more tooling would expose this out of the box, but with some effort it is possible.
This is better than nothing but if you have to run migrations before your database is usable, you want an approach like the one I use in pgtestdb — use one server for all your tests, but give them each different databases by cloning from one that's already set up with all the schemas correctly migrated.
I did something similar with pg_tmp as well. I also had a lot of tests around migrations, so there were a lot of tests that would 1) ask for a DB of version x, 2) insert things into the DB, etc., 3) migrate to version y, 4) ensure the DB is appropriate (note: this was not a SaaS).
jayd16 19 days ago [-]
Setup tooling to start postgres in a local docker container and point at that. Works very well and there are some libraries that abstract it for you.
I think it was Testcontainers.
worik 19 days ago [-]
This risks unleashing a bit of a "religious" war.
I agree that tests should occur in the environment they are to be run in, when ever possible. Otherwise not...
There is no science in testing. There are no standardised definitions. We agree it is good, but we often ignore the downsides (they exist - there are costs to tests).
I have struck examples where the testing frameworks of a language (Dart in this case) ruled out using a live HTTPS client to test. I kept getting 404 errors running it against my (test) site. I made a bug report. The good folks at Google (who I am sure are very smart - if a little stupid) had decided that the HTTPS client would always return 404 when run as part of the testing framework.
I argued, but apparently I am not so smart (I am less stupid, IMO). The HTTPS client testing had to either occur outside the test framework, or I had to rewrite the HTTPS client, or write a mock-up of some sort - all things I would have to pay for with time and effort, for no gain.
I get the point about mocking. But I am in business. I am not interested in always testing in isolation (sometimes, sure); I want my tests to depend on the other parts of my business.
If the database goes wrong and my GUI tool test discovers it, that is a good thing, not a bad thing.
calmbonsai 19 days ago [-]
Replace "database mocks" with ANY <X> distributed API service calls in your execution flow and there's still ZERO excuse to NOT to be using SQLite or LocalStack mocks in your basic unit and pre-integration testing.
Sure, there's no substitute for optimizing things for a particular query engine and EVERY database engine has SUBSTANTIAL quirks (cough, cough Redshift), but you should be crawling and walking before you try running.
mrkeen 19 days ago [-]
> Replace "database mocks" with ANY <X> distributed API service calls in your execution flow and ...
so far so good - you're inviting the comparison of "use a real database" to "use a real X", where X could be anything - a bank, or a nuke.
> ... there's still ZERO excuse to NOT to be using SQLite or LocalStack mocks in your basic unit and pre-integration testing.
And I'm lost. Do we hope for a BankLite and NukeLite implementations to exist?
calmbonsai 17 days ago [-]
> so far so good - you're inviting the comparison of "use a real database" to "use a real X", where X could be anything - a bank, or a nuke.
I should've used different angle-brackets. "ANY <X distributed API service call>" would've been better.
> And I'm lost. Do we hope for a BankLite and NukeLite implementations to exist?
No, but it's still worth mocking BankLite and NukeLite with the same call interface and suitable test data, variance, and jitter.
The overriding point is nothing will ever replace "the real thing", but that's not what mocks are for, and it's even worth mocking 3rd party network services, which can be just as "quirky" as SQL query engines, if not more so.
dgunay 19 days ago [-]
In addition to what I've read here in the comments (that mocks are inaccurate, etc) I'll say that mocks also tend to just be plain overly verbose, at least in my stack. We do a lot of our heaviest testing on our usecase logic, which is almost all very imperative and calling the DB + 3rd party APIs in various ways.
Mocked tests here tend to be many times as long as the actual implementation, grow with the size of the implementation, and also tend to break immediately if you change the internals of your code (because now new things are being called, or not being called). Having hundreds of such tests can make refactors and serving new requirements a lot more annoying.
The little "trick" of using SQLite to speed up integration tests has also bit me multiple times. Turns out the various RDBMSes are not in fact, all the same, and the details start to matter a lot very quickly.
astine 19 days ago [-]
Testing against a real database is an example of integration testing. Using mocks is for unit testing. Ideally, you want to do both. Unit testing entails isolating particular components for testing, which is important because it lets you be more precise about what exactly you're testing. Using mocks also makes it easier to run automated tests because it means you don't need to have a database or credentials handy during the build process.
Integration testing is also important because the closer you get to running things in a production environment, the more likely you are to detect issues. Some things just won't be apparent until you start running code in production or near production. The thing to understand though, is that you want to do both if you can, not either-or.
nwatson 19 days ago [-]
Then just use a database for unit testing as well.
globular-toast 19 days ago [-]
Then you get a slow test suite. It's important to have a fast test suite to be able to do proper test driven development (which I still believe is the most efficient and effective way to write software, in general). Unit tests should be near 100% coverage. That means a lot of tests.
Databases can run hundreds of tests in a ms. Sure, without one you could get to thousands in that ms - who cares.
astine 19 days ago [-]
Because then you wouldn't be doing unit testing; you'd be doing integration testing. You'd also probably not be testing the database in a realistic configuration and thereby missing the whole point.
temporallobe 19 days ago [-]
The one thing I greatly appreciate about RSpec (in RoR apps) is its ability to fully test the data layer. I maintain several such apps and database mocks are completely unnecessary, even going so far as doing complete db setup and teardown between cases. Other apps I work on don't really have this paradigm, so we resort to a combination of unit testing (JUnit) - which does involve significant mocking - and behavioral testing (Selenium), which of course indirectly tests database interaction. In the end, both methods accomplish the same goal of database testing, but ideally you want "all of the above" to uncover edge cases, especially with critical infrastructure apps.
tpoacher 17 days ago [-]
> Mocking is faster and often feels more straightforward.
This makes it a unit test.
> However, testing against a real database uncovers hidden pitfalls that can appear as the application matures.
This makes real-database testing not a unit test. Maybe it's integration testing.
So basically the author is against unit tests (or isn't quite sure of the difference from integration testing)
Lvl999Noob 19 days ago [-]
I faced problems with flaky unit tests that used a common unit test database for the whole project. Since all the tests ran at once, the tests would sometimes fail because of concurrency issues. I never got the time to look too deeply into it.
I wonder if there's a database that works sorta like git, giving one "commit" point at the start of every test and then "branching off" so each test can do its own thing, make transactions, commit them, whatever, and finally be able to roll the whole thing back.
If that can be done then each unit test can use a full database clone with actual valid data and not just test data that was fast enough to set up before the test ran.
peterldowns 19 days ago [-]
Yes, with Postgres at least you can do this by using a "template" database — run your migrations on it, then create new databases from that template. It's a very fast operation. For more information, read the Postgres docs:
Dupe the database. That's what I do in one of my test suites (though that's in part because the tests happen across 4 different processes so it's just not possible to keep it within one transaction).
Creating a db from a template is not free, but if there's not too much stuff in the db it's not horrendous either. And if you're using sqlite, you might be able to backup an on-disk template to an in-memory test db.
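A bare-bones sketch of the template trick with psycopg2 (database names and the eventual cleanup are invented/left out):

    import uuid, psycopg2

    def create_db_from_template(template="app_test_template"):
        name = f"test_{uuid.uuid4().hex[:12]}"
        conn = psycopg2.connect(dbname="postgres")
        conn.autocommit = True  # CREATE DATABASE can't run inside a transaction
        with conn.cursor() as cur:
            cur.execute(f'CREATE DATABASE "{name}" TEMPLATE "{template}"')
        conn.close()
        return name  # hand this to the test; drop it when the test is done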
shayonj 19 days ago [-]
Starting each test in its own transaction and rolling it back at the end, and ensuring there are no 3rd-party network calls in application logic, has worked with a lot of success in my experience with Rails and Postgres apps.
Volundr 19 days ago [-]
Datomic more or less works like this. I wasn't terribly impressed in general, but it did make tests a breeze.
Unless you use Clojure though it's probably not a real option.
whatnotests2 19 days ago [-]
Transaction per test is how that works in ruby on rails. Also, the rspec parallel gem will create N test databases for running tests - one per process.
masklinn 19 days ago [-]
That doesn't work if you need to commit for one reason or another. And while in some cases you might be able to substitute subtransactions via some sort of test mode, it's not always an option depending on what you're doing / testing exactly.
whatnotests2 19 days ago [-]
Nested transactions handle commits. At the beginning of a test, a new transaction is begun, and it is rolled back at the end of the test. As others have pointed out in this discussion, it's pretty standard with Ruby on Rails apps, especially those which use rspec for testing.
ajuc 19 days ago [-]
I worked on a 2-layer C++ + PL/SQL system with lots of legacy business logic in PL/SQL that was slowly migrated to 3-layer architecture with Java EJB business logic using Hibernate for database operations.
The Java business code still often had to call the old PL/SQL stored procedures, and the state of the application was serialized on the same database using Hibernate. It made for pretty delicate db transaction handling - in Java it was mostly declarative EJB3 transactions + some custom AOP code, and in PL/SQL we had strict guidelines of never committing/rolling back the transaction globally - instead every stored procedure had a savepoint at the beginning and only rolled back to that savepoint in case of a problem.
Ensuring the system adhered to these guidelines with mocks would be very hard. Instead we wrote special test-mode-only AOP code that wrapped every Java -> PL/SQL call in test mode and checked whether the transactions were correctly handled (it added a programmatic savepoint before the call and checked that it was still there after the call). It caught some transaction handling errors that would be VERY hard to find otherwise.
It also let us find possible deadlocks during testing.
BTW, writing tests for PL/SQL made me love Postgres for its transactional DDL. It's very frustrating to make repeatable tests when you can't roll back DDL changes and have to undo them one by one after the test or re-create the whole database.
fcmgr 19 days ago [-]
I've had some good experience with a mix of those approaches - maybe not using mocks per se, but an "in-memory database implementation" (just a wrapper around a hash map that implements the same behaviors as the repository that deals with a real database) on one hand, and testcontainers on the other. (Still, using an in-memory db is way better than mocking; the tests are not coupled to the implementation details/the underlying model.)
For simple use cases where I mostly just read/write from the database and don't expect any of those issues mentioned in the article (constraint violations or concurrency issues, because the application is quite simple tbh, plus I already have some testcontainers based integration tests for the components that deal with a real db and I reuse them) writing tests with an in-memory db implementation is quite nice - the setup is simple and running the whole test suite is instantaneous (literally something like 1-2s for a couple thousand tests, I don't need any framework for those kind of tests).
And on the other hand, if I'm relying on something like an optimistic/pessimistic lock or a framework feature, I will write a test with the real thing, using testcontainers. I also have pretty good coverage of the components that deal with queues and databases specifically with testcontainers. And on top of that, just a few e2e flows written using testcontainers as well.
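For readers who haven't seen the pattern, a minimal sketch of such an in-memory repository (the interface here is invented for illustration):

    class InMemoryOrderRepository:
        """Test double that honours the same interface as the real repository."""

        def __init__(self):
            self._orders = {}  # the whole "database" is a dict

        def save(self, order):
            self._orders[order.id] = order

        def get(self, order_id):
            return self._orders.get(order_id)

        def find_by_customer(self, customer_id):
            return [o for o in self._orders.values() if o.customer_id == customer_id]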
mrkeen 19 days ago [-]
My take is that your business logic shouldn't know about your storage tech. (Like, dependency inversion 101 right?)
> Still, using an in-memory db is way better than mocking, the tests are not coupled to the implementation details/the underlying model.
Isn't this backwards? The fact that you've backed your storage with a HashMap (which is 100% what I shoot for too) means your service-under-test cannot know if it's talking to an SQL database.
fcmgr 19 days ago [-]
I think we're talking about the same thing :D? The service/business logic/whatever you might want to call it interacts with the storage via an interface - in production that interface is implemented by a component that can talk to a real SQL database; in tests I can just create a wrapper around a hash map and use that.
EDIT: What I meant when I wrote that tests are not coupled to the underlying model/details is that with a mock you have to explicitly specify "when called with this return that". With an in memory database implementation you don't need to do anything, the code will just use the interface methods like "getX" or "saveY".
mrkeen 19 days ago [-]
> with a mock you have to explicitly specify "when called with this return that".
Riiiight, no I've always hated that Mockito way of doing things.
I find wrapping a hashmap avoids the need for explicit when-this-then-that bindings, because a hashmap already does the expected behaviour natively.
You can even 'integrate' your tests as deeply as you like, all standing on top of your poor hashmap. I.e. Http tests with unit test speed.
var controller = new Controller(new Service(new Repo(new HashMap())))
controller.POST(...);
assert(controller.GET(...));
brettgriffin 19 days ago [-]
I'd be really surprised if most applications that rely on databases can't just use a containerized image of their production database in any environment, including CI. In fact, the only example I can really think of is when the production database can't be Dockerized (e.g. a proprietary SaaS database like BigQuery or Snowflake).
I'm working on a project that has ~22,000 tests that operate on Docker container of the same version of Postgres used in production. In CI it completes all the database tests in under a minute.
thom 19 days ago [-]
Are these parallelised? <3ms overhead for a clean database and test code to interact with it is very good!
peterldowns 19 days ago [-]
How are you doing it? Asking because I wrote an open source project (see me ~shilling~ linking it elsewhere in this comment section) for doing this, but it sounds like you've maybe come up with an even faster way.
kerblang 19 days ago [-]
Was dealing with mocking voodoo garbage this morning as a result of fixing a bug. What a horrible mess just to claim a few lines of "test coverage" while ignoring critical db functionality. I wouldn't mind if the mocking frameworks weren't absurdly complex in their own right.
Unfortunately our "integration" tests are even worse, corrupting data all over and flaky as hell.
elevatedastalt 19 days ago [-]
This article seems to be implying that there's something special about DBs that makes them only testable directly instead of through mocks.
There's a continuum between extreme integration testing and extreme unit testing, each with their strengths and weaknesses.
There's a reason we don't do all testing through integration testing: it's slower, more flaky, and cripples developer velocity to a near standstill.
There's a reason we don't do all testing through pure unit testing: you have no idea whether the e2e plumbing works as expected.
DBs aren't special in this regard. That being said, it is true that investing in a lightweight unit-testable setup for your DB is a good idea.
theogravity 19 days ago [-]
Just use Testcontainers (https://testcontainers.com/). We use it for quickly spinning up a temporary postgres instance to run our db tests against.
hamandcheese 19 days ago [-]
My #1 rule: tests should "just work". A new contributor should not have to run any setup, just a single test command. I don't care what kind of tests they are or what you prefer to call them as long as they work (in CI AND locally).
The vast majority of projects fail at this, often because some external dependency like a database needs some manual setup.
To me, mocking out the db is a relatively cheap way to achieve rule #1. Yes, it has pitfalls, but I am unwilling to compromise on rule #1 so it's purely a question of cost vs fully automating db setup and teardown.
arp242 19 days ago [-]
I agree on the "just work" principle, but stuff like "docker compose" is easy enough to use, and "standard enough". Typing "docker compose up -d" once isn't that much effort. And make sure the connection string is configurable through an environment variable so people can use $something_else if they really want/need to.
Practicality beats purity, and all of that. Especially when it saves quite of complexity in the code itself – new contributors are going to struggle with that, too.
In short, the setup doesn't need to be manual.
hamandcheese 18 days ago [-]
The problem with this attitude (and please don't take this as a personal attack) is once a few people with this attitude come through a codebase and make their "practicality" compromises, the codebase devolves into mediocrity.
For a little more context, I work on dev tools for an org of ~400 engineers. There is always someone trying to do things the quick and dirty way and move on without consideration for how the other 399 people may be impacted.
Threadbare 19 days ago [-]
Tell me you've never heard of test containers without telling me you've never heard of..
perrygeo 19 days ago [-]
Mocking the database can be done, but only if you use the database as a generic storage layer. It's not trivial to keep the mocks updated in most applications.
Some ORMs and frameworks provide magic to do this, but as soon as you deviate from the framework it gets complex. You can build geospatial applications, time series, document, and full text search - iow leverage advanced features of Postgres extensions that most ORMs can't touch. Their behaviors can be complex and are part of the application, not simply a storage layer.
isotropy 19 days ago [-]
I've seen a db mock work when 1) there was a small subteam in charge of the OR-mapping, schema structure, and included a DBA; and 2) also a design policy from the architect that all objects had to come out of factories. Under those specific circumstances, having the mock - used solely as a per-developer object cache imitating the factory interface - was critical for unblocking the people working on business logic and front-end.
I wouldn't structure a team that way now.
kapperchino 19 days ago [-]
Just mock the repository which calls the db, and to test the queries themselves, use the real db as a docker container, or integration test with an existing db
binary132 19 days ago [-]
I think the sweet spot is simple relational abstractions that do not necessarily depend on a SQL implementation. Those are extremely quick and easy to write unit tests against and to use, and don’t burden the developer. Then the library implementing them can itself be well-tested and straightforward, and even portable to other backing stores. I’m not saying “use an ORM”.
parasense 19 days ago [-]
No... for Pete's sake, no!
The whole point of mocks for a database is not to test schema changes, it's to test a unit of code. They most certainly do not remove the additional requirement of integration testing against real database tables. That usually happens after all the unit tests, obviously, because why spin up a big heavy database test pipeline if the code logic fails?
bluGill 19 days ago [-]
Nobody cares about your precious units; they care that the system works. Thus you should test units that are as large as possible. That way, if you add a feature, you know the old ones didn't break. I've seen too many bugs where all the unit tests passed. If you are doing TDD, 1000 integration test failures are no problem - the last thing you changed broke them all: either undo it or fix it.
The only reason to write a unit test is if the integration test would run too long.
viccis 19 days ago [-]
Can anyone here who might know more than me explain the difference between dependency injection and the author's suggested pattern at the end? Is it just the same thing? Seems like that's the most common way I see to isolate external services and stub them for testing.
globular-toast 19 days ago [-]
Dependency injection is where functions explicitly receive the abstract dependencies they need to do their job. This is usually something like an abstract repository or factory. In statically typed languages you'll see things like AbstractRepository etc. When the function is called it is passed a concrete implementation, like a PostgresRepository, while in testing it might be passed a fake like a FakeRepository.
This is opposed to when modules have implicit dependencies via importing concrete dependencies directly. In this case the only way to mock is to "patch" the import somehow. The trouble is software written this way usually has very strong ties to the concrete implementation and doesn't expect to be using a mock, which is why the author argues against it.
So, no, the article doesn't do dependency injection. It's gone out of fashion for some reason even though I think it's a much better way to write software (you depend on abstractions not concrete implementations).
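A compact sketch of that shape in Python (the names mirror the description above; nothing here comes from the article itself):

    from typing import Protocol

    class Repository(Protocol):            # the abstraction the logic depends on
        def get_user(self, user_id: int) -> dict: ...

    class PostgresRepository:              # concrete implementation for production
        def get_user(self, user_id: int) -> dict:
            raise NotImplementedError("real SQL would live here")

    class FakeRepository:                  # concrete implementation for tests
        def __init__(self, users: dict):
            self._users = users
        def get_user(self, user_id: int) -> dict:
            return self._users[user_id]

    def greeting(repo: Repository, user_id: int) -> str:
        return f"Hello, {repo.get_user(user_id)['name']}!"

    # In a test: greeting(FakeRepository({1: {"name": "Ada"}}), 1) == "Hello, Ada!"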
viccis 18 days ago [-]
Makes sense. I do have a working understanding of DI and strongly prefer to use it (especially after having to write some unit tests for a external dependency heavy code base not designed with it earlier in the month), but I was just not quite understanding how those RSpec tests at the bottom were working with the instance_double. But looking at it closer now, there's no providing that stub to a function, it's just being created and used immediately.
It seems like you could use instance_double with DI though, as basically an alternative factory for your dependencies when you want to give a faked implementation to your function.
It's been ages since I've done much Ruby, but I recall it being particularly prone to monkeypatching and similar metaprogramming mocking strategies over DI or inversion of control in general, though...
elevatedastalt 19 days ago [-]
Like you explained, DI simply means that the function would depend on an abstract dependency which is supplied at runtime or test time.
In runtime you can supply the actual DB, in test, you can supply a mock DB, OR an actual DB. Both are possible approaches. It's about how you want to wire up your tests. The author seems to be arguing for using the actual DB.
elevatedastalt 19 days ago [-]
Dependency Injection is orthogonal to what the author is talking about.
Like the other commenter explained, DI simply means that the function would depend on an abstract dependency which is supplied at runtime or test time.
In runtime you can supply the actual DB, in test, you can supply a mock DB, OR an actual DB. Both are possible approaches. It's about how you want to wire up your tests.
mrkeen 19 days ago [-]
Two points:
1) The subtle issues with constraint violations, default values and indexes are part of the RDBMS, not the testing strategy. The article suggests stepping on these rakes in test so that you hopefully don't step on them in prod. Another way to avoid them in prod (and to also regain the speed/reliability/simplicity of your test suite) is to not use an RDBMS in prod.
2) If you want to persist something, ask yourself if what you really have is tabular data. Do you have a lot of nulls and/or default values? Catch-all json fields? You probably have objects, not relations. Postgres may have a lot of wonderful features that relate to relations, and an ORM might have a lot of wonderful features (which appear to save you time). But if you stop requiring your objects to be relations at rest, you don't need the features, which is even faster.
Volundr 19 days ago [-]
> constraint violations
You call these rakes and suggest removing them from prod, I call these vital safeguards. Defining constraints at the database layer gives me confidence that these will truly be invariants, and I won't one day discover that I've got millions of rows in my database missing critical data, with no way of knowing what should be there. Been there, done that, no desire to revisit it.
mrkeen 19 days ago [-]
My coworkers call it "data integrity". At best it's a local maximum for consistency. Right now it's how we reject true facts about the business. Our partner bank tells us a customer deactivated? Our database says "no", someone gets a 500, and we get a support ticket when the customer can't sign up again (because they're still in our system).
oasisaimlessly 19 days ago [-]
If this is a common problem, you should have some sort of "dead letter queue" where events like this can be held until your data model is updated to accept them.
mrkeen 19 days ago [-]
Yep, I keep the dead events around, for when the constraints behave incorrectly.
I also keep the live events around, in case there isn't a constraint exception, but something else goes wrong - upstream/user error causing deletion, or programmer error causing the event to have the wrong effect.
Now that I have the events, I can just throw away the schemas and constraints. I mean, if the events and the DB disagree, the DB is wrong.
selimnairb 19 days ago [-]
At least in the Python world, it’s trivial to bring up an isolated DB instance using a docker compose file and a simple pytest plugin. Hardly seems worth the effort to mock the DB in cases like this.
p0w3n3d 19 days ago [-]
Of course one can do the query testing separately, but I have run once or twice into problems masked by Hibernate that only showed up in the SQL dialect of the real database. Hence, as always, the answer is: 'it depends'.
seeknotfind 19 days ago [-]
This is not only about databases. Mocks are almost always a compromise.
esafak 19 days ago [-]
I suggest using real databases with testcontainers and dagger.
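For example, a rough pytest sketch assuming the testcontainers-python package and SQLAlchemy (in practice you'd hoist the container into a session-scoped fixture so it isn't restarted per test):

    import sqlalchemy
    from testcontainers.postgres import PostgresContainer

    def test_against_real_postgres():
        # Spins up a throwaway Postgres in Docker for the duration of the test.
        with PostgresContainer("postgres:16") as pg:
            engine = sqlalchemy.create_engine(pg.get_connection_url())
            with engine.begin() as conn:
                conn.execute(sqlalchemy.text(
                    "CREATE TABLE widgets (id serial PRIMARY KEY, name text NOT NULL)"
                ))
                conn.execute(sqlalchemy.text(
                    "INSERT INTO widgets (name) VALUES ('a'), ('b')"
                ))
            with engine.connect() as conn:
                count = conn.execute(
                    sqlalchemy.text("SELECT count(*) FROM widgets")
                ).scalar()
            assert count == 2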
Deukhoofd 19 days ago [-]
Yeah, I ran into very much the same issues. Setting up a database for unit tests can be even more of a bother though (especially for CI pipelines, where I don't want to have to run a full database), so I took a middle road and use my ORM to spin up a temporary in-memory SQLite database that is mostly similar to our actual database. Each unit test scaffolds its own database and deletes it when it's done.
That allows me to somewhat mock the database within unit tests, while still being able to test database-specific errors that might occur, like foreign key errors.
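A rough Python/SQLAlchemy analogue of the same idea, assuming your project exposes a declarative base (`myapp.models.Base` here is hypothetical):

    import pytest
    from sqlalchemy import create_engine, event
    from sqlalchemy.orm import Session
    from myapp.models import Base  # hypothetical: your project's declarative base

    @pytest.fixture
    def db_session():
        # A fresh in-memory SQLite database per test, scaffolded from the ORM metadata.
        engine = create_engine("sqlite+pysqlite:///:memory:")

        # SQLite only enforces foreign keys if you ask it to, per connection.
        @event.listens_for(engine, "connect")
        def _enable_fks(dbapi_conn, _record):
            dbapi_conn.execute("PRAGMA foreign_keys=ON")

        Base.metadata.create_all(engine)
        with Session(engine) as session:
            yield session
        engine.dispose()  # the in-memory database vanishes with the engine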
pdimitar 19 days ago [-]
Why a separate DB for each test? Just have only one DB and each test opens a transaction and then rolls it back when it's done. That way you also achieve isolation of tests so they don't interfere with each other.
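A sketch of that pattern, assuming SQLAlchemy 2.x and a hypothetical test DSN; code under test can still commit internally, because inner commits become savepoints:

    import pytest
    from sqlalchemy import create_engine
    from sqlalchemy.orm import Session

    # Hypothetical DSN for the one shared test database.
    engine = create_engine("postgresql+psycopg2://app:app@localhost:5432/app_test")

    @pytest.fixture
    def db_session():
        # Every test runs inside an outer transaction that is always rolled back,
        # so tests share one database but never see each other's writes.
        conn = engine.connect()
        outer = conn.begin()
        # join_transaction_mode turns commit() calls in the code under test into
        # SAVEPOINTs instead of real commits.
        session = Session(bind=conn, join_transaction_mode="create_savepoint")
        try:
            yield session
        finally:
            session.close()
            outer.rollback()
            conn.close()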
Deukhoofd 19 days ago [-]
The code being tested also uses transactions internally at times, so it'd mean additional complexity in the code being tested to allow for unit testing, which is not great. In my experience throwing up a database including all tables in an in-memory SQLite db is extremely fast, so it's not really a major concern.
pdimitar 19 days ago [-]
OK, but is your production DB also SQLite? If not, I would not. I found the differences between it and PostgreSQL too big and was getting too many false positives.
Also, the code complexity of a few levels of nested transactions is an easy thing to abstract away with almost zero performance penalty -- depending on your programming language of choice.
Deukhoofd 19 days ago [-]
The production DB is MSSQL, and we scaffold it in SQLite through EF Core. The resulting SQLite DB is close enough to our production DB that we are able to catch invalid defaults, missing foreign keys, etc. in unit tests instead of later on in our testing pipeline, which helps massively in accelerating our development. It could be even better if SQLite would actually tell you which foreign key constraint failed instead of its somewhat unhelpful 'foreign key constraint failed' error, but as it is we at least know something is wrong in our code.
And sure, we could probably refactor it to use transactions to shave a few seconds off running our test suite, but it'd add some additional mental complexity to our codebase to do so. In general, I prefer to keep the actual running code as simple as possible, and have any complexity that is required for unit tests be handled by the unit tests. By just recreating the database, the unit tests currently handle all that complexity.
The idea is definitely not a bad one though, and if your scaffolds are big enough to actually cause performance issues with your unit tests it might definitely be a consideration to not recreate the database every time.
pdimitar 19 days ago [-]
Well, if you found a productive workflow then who am I to judge, right?
However, if I was hired into your team tomorrow you'll have to fiercely fight with me over this:
> And sure, we could probably refactor it to use transactions to shave a few seconds of running our test suite, but it'd add some additional mental complexity to our codebase to do so.
Various languages and frameworks demonstrate that abstracting this away and using convenient wrappers is more or less trivial, and definitely a solved problem. And the resulting code is simple; you are (mostly) none the wiser that you are actually using a temporary transaction that would be ultimately rolled back.
...Though that is extremely sensitive to which programming language and DB library you are using, of course.
Your approach works, so keep using it but IMO it's a temporary one and it could stop being fit-for-purpose Soon™. You yourself are already aware of its limitations so just keep an eye out for this moment and be decisive when the time comes to migrate away from it.
My $0.02.
AlexSW 19 days ago [-]
Could you give an example or two of languages/frameworks that have demonstrated abstracting the transaction blocks away? I'm not sure I'm following so I think this will help.
pdimitar 19 days ago [-]
I lost the link to one Golang library that I liked very much but here's the link to Elixir's Ecto: https://hexdocs.pm/ecto_sql/Ecto.Adapters.SQL.Sandbox.html (not the perfect one, sorry, but can't be bothered to look for a better resource). The TL;DR is as above: it uses the same DB but does parallel connections to it and in each you have a transaction that's ultimately rolled back.
nabn 19 days ago [-]
yes, integration tests cover more surface area than unit tests
diddid 19 days ago [-]
The whole point of the mocking the database is to not test the database! If you need to test the database then test the database! Just like mocking an API can hide hidden issues with the API… which is again the exact point of mocking the API. This article should really be named “mocking your database isn’t testing your database” which seems like it should be obvious.
bccdee 19 days ago [-]
The point of mocking the database is to avoid the hassle of spinning up a database to test against. If we ignore that, why not test our use of the database? Are we disinterested in the schema and query errors we might catch?
Unit testing is useful because (a) it's easier to test all the code paths in a unit when that unit is isolated, and (b) unit tests don't need to be modified in response to changes in unrelated modules. But (a) is irrelevant since we can easily put dummy data in the DB, and (b) isn't a concern since the relevant DB schemas and queries are tightly coupled to the module in question anyway.
wesselbindt 19 days ago [-]
> why not test our use of the database
Primarily slowness and brittleness. Tests with side effects are much more likely to be flaky, and flaky tests are the worst. Especially if they're slow and hence hard to debug.
Of course, you do test your use of the database, but in a much smaller test suite. For the business logic tests you just spin up a copy of your app with fake adapters, and you're done. Quick, deterministic tests.
bccdee 19 days ago [-]
I agree that integrated tests tend to be brittle, but if we need database tests either way, the database harness can't be flaky. So any flakiness there is something that has to be fixed regardless. Slowness, I agree, but unless you're spinning up new containers for every test, the overhead of each query being real is going to be small.
If the database interactions are trivial, I agree—just use stubs. But if you've got non-trivial joins, then you'll need a separate database test suite anyway. And if there's stateful logic involving the database and you want to use fakes in unit tests, you need a whole extra set of tests to validate that the fakes behave the same way that the real database does.
I do actually prefer to avoid relying on the database for unit tests—it's cleaner, it's faster—but often testing with the database is simpler & reliable, and I can't justify the extra lines of code required for fully isolated unit tests. That's extra surface area for tech debt that could lead to stuff like fakes behaving differently from their production equivalent.
diddid 18 days ago [-]
The point of mocking the DB isn't to avoid hassle, it's to not test your dependencies. Technically the calling code might not even know it's a DB, and it might not even care; it might be a DB sometimes, an API other times, or even a command line once in a while. They are only tightly coupled if you write it that way. And yes, we would be disinterested in the schema and query errors we might catch, because that's not the point of the test.
liontwist 19 days ago [-]
Which is why any testing involving mocking should have low purity.
You get 90% of benefit from two kinds of tests:
- verifying components work (unit tests)
- verifying the system works from its interface (automation)
Clearly this is harder to get right than expected, since there aren't many places doing stuff like that, that I'm aware of at least. Maybe if some of these database vendors built out better APIs for things like that, with the expectation that other language runtimes would be calling them, the situation would be better. But it seems a lot of the old-school DB world is perfectly happy to build their products as if they were a black box from the application's perspective.
As a lead for mostly green field applications over the past decade, my one no compromise architectural edict is no stored procedures for online production code.
I might use it for background utilities or one off maintenance.
Would splitting off some of the logic into a separate service (that does its own database accesses) have the same issues?
How do you roll back a dozen stored procedures quickly? How do you just create a new branch or test a regression?
With all of the logic in code, it’s a simple matter of checking out a previous commit.
Triggers are even worse - “spooky actions at a distance”.
Liquibase rollback feature. The SQL changesets and rollback strategy are all defined in my .sql Liquibase files and go inside my git repo alongside my Scala/Python code and the associated CI/CD automated integration tests + deployments.
Blue-green deployment can be handled via bidirectional active-active database replication across your two disaster recovery databases; you deploy the newer stored procedures on one but not the other. Or you have sharded databases for some but not all customers and deploy blue/green with appropriate blue/green sharding. Or you could have separate version numbers for the stored procedure names (perhaps with a wrapper stored procedure that works differently on different shards/database names to isolate the impact from your non-SQL codebase). Or just put a form of feature flags in your stored procedures; it's not 100% blue-green, but with minor integration test coverage it can be quite safe and allows you to run two code bases at once even in a single database.
Agree with you on triggers (and I agree to a point that stored procedures are often not the right approach despite my above rebuttal... but it is quite straightforward with the right tooling/mindset. For high-volume analytics with aggregations you might be better off leaving the data in columnar database SQL and doing some stored procedures if the code was complex enough; but for CRUD apps I would strongly discourage stored procedures.)
But you don't see how much harder this is for developers with feature flags in stored procedures, etc., over standard git? There is tooling around feature flags for code.
I think we are in violent agreement though, for OLAP and analytics, I wouldn’t care as much. Because of the way that Redshift works (columnar store) and how it compiles and optimizes queries across a cluster, I wouldn’t care about stored procedures.
I bet the same can be done for SPs, even if it is a fork-and-add-suffix approach.
The calling code can call either.
On a phone, hope I'm clear enough.
And then do I need to change all of my stored procedure references in my code? Do I merge my changes back to the main one when done?
Isn’t just doing a “git branch” easier?
Rolling back is just doing a git revert in your separate Kubernetes repository (you are doing proper GitOps, aren't you?), recommitting your code, and your cluster is now using your old Docker container.
If you aren’t using K8s, you just push your old code.
I'm not really following what you are saying. Sprocs are in the same repo as any other code, a branch is a branch, why do you think you can't create a branch that contains sprocs?
If I have logic that returns customer data in code, I can branch the code, make and test my changes locally and other developers can do the same without any conflicts.
Now imagine the next major release requires 50 different sql changes. Would that be easier to push and maintain one git branch or 50 stored procedures?
Again how do you rollback easily and effectively when your release depends on multiple store procedures?
Adding a nullable field shouldn’t be a breaking change, removing a field would be.
Why aren't those 50 stored procedure changes also in your one branch?
It kind of sounds like you only put some types of code in git, as opposed to everything. Is that correct?
Are you going to use a shared dev database? What happens when other developers are also making changes to other branches?
I’m assuming you need to test with a representative size data set with a representative size database server.
Just like with traditional Kubernetes Gitops where you have your source code in one repo and a pipeline to create your docker containers and your K8s configuration in another repo to manage your cluster and update the referenced Docker images, even if you don’t have stored procs, you would keep your schema change sql files in another repo and “version” your database separately with a separate pipeline.
Valid questions, but there are also valid answers, the primary one is of course "it depends".
Sometimes a shared DB works fine, as long as you manage other work being done to avoid conflicts.
Sometimes an isolated DB is needed for either just this work, or this work plus related project work isolated from other projects.
We do all of the above, and yes it takes effort, but that effort exists regardless of sprocs or not due to the nature of most of our projects and changes, rarely do we have small isolated work.
But I'm guessing our environment is not the same as yours based on your statements; this is our environment:
1-Medium sized enterprise which means we run many apps from vendors, some on prem, some in cloud, as well as internally developed apps - in other words it's not a homogeneous environment and we don't control all aspects.
2-Functionality that frequently either sits on the side of operational systems, or in between the operational systems or as a layer on top that is unifying the business activity across systems and realizing a higher level of abstraction and activity.
3-Focus on end to end testing due to higher bug detection rate (i.e. around 70% bug detection) vs unit testing (i.e. around 30% bug detection), and due to work flowing through multiple apps frequently.
> even if you don’t have stored procs, you would keep your schema change sql files in another repo
Our schema changes are typically in the branch being merged and deployed. Our deployment process has pre and post DB actions sections.
But some changes are too big for that type of thing and require multiple day deployments with various business and technical actions happening in a specific sequence across multiple different apps (e.g. hold new activity for set of skus, stores and web site complete final transactions, clear DCs, switch to new process where those skus have a new unique flow through the systems and the enterprise).
I think OP's issue is that he didn't understand what mock tests actually tested against. They are called mocks for a reason. There should be a better way to provision a database when running tests and have a full simulated environment.
I would never have my DB trigger a webhook for me but I would have it act as a message broker back to my own code that triggers a webhook. I don't want my DB ever initiating an outbound connection.
The whole point of writing maintainable software is to throw away most of what technologies can do, because you are way better off if you do not make the mistake of following that path.
For example, at a first glance doing auth at the database layer is a huge mistake. You design your systems with a few nodes having exclusive access to the DB to ensure your access is secure without having to waste the overhead of checking credentials in each db access. The main design trait is speed. But you do you.
> There are a tons of other things a DB can do, like trigger a webhook.
There are plenty of reasons and hard learned lessons behind principles such as separation of concerns.
Unit tests don't test impure functions; all they do is test pure functions. Impure functions aren't unit tested, period. So when someone writes an impure function I don't like, what happens? Nothing. It wasn't part of the unit tests anyway, understand?
If you’re referring to integration tests well a newly written impure function isn’t part of integration tests anyway so nothing happens in this case either.
I found your last comment about how I’m triggered to write such a comment rude and lacking in even understanding what I wrote. Please try to understand my point before saying something like that… if you can’t please refrain from replying in such a way as it’s against the rules here. Thanks for reading.
The problem with "write tests, not too many, mostly integration" is that unit tests are too damn good at giving you instant feedback and a tight dev loop. So we turned integration tests into unit tests. Combined with SQLite :memory: for unit tests and Postgres for the final integration tests, we don't have to mock anything.
Source: Currently use this method at $dayjob with great success.
I was challenging the absolutism of "never use mocks"; it's just another technique, and it can be applied easily in integration tests if the guidelines on how not to use them are well set in the team.
Also, I do not do external calls in the CI/CD pipeline, it inevitably makes tests brittle and flaky.
https://www.manning.com/books/unit-testing
Think about it. This IO layer should be so thin that all you're doing is fetching and getting, so this layer does a direct external call. When you mock, this entire thing is basically what is mocked.
So whether you mock or don't mock, if you write your code with the functional-core/imperative-shell pattern, the imperative shell is what gets mocked and is never tested via unit tests regardless. You cannot test this.
Another way to think about it, is like this. Write your code in a highly decoupled way such that all logic or transformations done in your code can operate independently from things you have to mock. Then for the things you have to mock, what’s the point of mocking them in the first place when your logic can be tested independently from the mocks?
What’s the point of unit testing code that touches IO and is supposed to be mocked? You’d just be writing mocks and testing mocks.
Don’t write functions like this:
fetchNumberFromIOAndAdd2
Do write two functions like this:
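For example, something like (hypothetical names):

    import requests  # illustrative; any IO source works

    def fetchNumberFromIO() -> int:
        # Imperative shell: dumb IO, nothing here worth unit testing.
        return int(requests.get("https://example.com/number").text)

    def add2(n: int) -> int:
        # Functional core: pure logic, trivially unit testable.
        return n + 2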
The latter function can be unit tested. The former can't be unit tested and there's no point in unit testing it. The former function is what lives in the imperative shell. What you see in a lot of books are authors promoting patterns like this:
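Something along these lines (again hypothetical):

    import requests

    def fetchNumberFromIOAndAdd2() -> int:
        # IO and logic fused together: the only way to unit test the "+ 2"
        # is to mock the HTTP call.
        return int(requests.get("https://example.com/number").text) + 2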
This kind of pattern forces you to mock by coupling IO with logic.

Suppose you are writing some software that interfaces with a personal finance SaaS, except you want your users to be able to interact with more than one. So you sure as shit better be testing that your abstractions are making the correct calls, especially if customer money is involved.
You also really want to be testing for unreliability: for example, that if your SaaS call is taking too long you don't double-tap, that if a tx looks like it failed but it didn't actually, you know that on the second attempt, etc., etc.
"the shell is untestable" is not an acceptable answer.
And you can't wriggle out of this by saying "well, that's an integration test". You said in your OP "don't write mocks". That is terrible advice in general. I'm generally a big fan of imperative shell, functional core, but you MUST mock your shell, especially if absolute correctness is required (people's lives or money at stake).
Mocks are a utility for unit tests hence the topic of this post. We are talking about unit tests and mocking aspects of the unit test that can’t be tested via unit tests.
Integration tests are another topic. With integration tests or end to end tests you test everything. You don’t mock. Mocking is not really a valid concept in the context of integration tests.
That is not what you said. You wrote: "Don't code with mocks period".
> I differentiated between unit tests and integration tests.
You did not. Your original comment doesn't mention integration tests at all. I quote:
"Structure your code such that it has a functional core and imperative shell. All logic is unit testable (functional core). All IO and mutation Is not unit testable (imperative shell). Mocks come from a place where logic is heavily intertwined with IO. It means your code is so coupled that you can’t test logic without touching IO.
IO should be a dumb layer as much as possible. It should be Extremely general and as simple as fetching and requesting in the stupidest way possible. Then everything else should be covered by unit tests while IO is fundamentally not unit testable.
If you have a lot of complex sql statements or logic going on in the IO layer (for performance reasons don’t do this if you can’t help it) it means your IO layer needs to be treated like a logical layer. Make sure you have stored procedures and unit tests written in your database IO language. You code should interface only with stored procedures as IO and those stored procedures should be tested as unit tests in the IO layer
> Mocking is not really a valid concept in the context of integration tests.
What the fresh hell are you talking about? It's common practice to set up mocks (using dependency injection with expects) for an external API call in an integration test. In 10 years of professional software development I've mocked in integration tests at five employers.
Rarely used unless necessary is what I meant. The point of the integration test is to avoid mocking, and I don't want to argue a pedantic point with you. I also don't like your attitude, so I'm ending this section of the thread; don't bother replying.
It started with literally bad professional advice. Junior developers reading these forums need to not have professional habits influenced by the sort of behavior on display by you here.
And a ChatGPT transcript? Really? Just do a simple search; you will find tons of articles advocating for mocking external APIs in integration tests.
Is that a mock? I don't know. It sure does resemble one, but it doesn't use dependency injection or other runtime tricks which I'd associate with classic "mocks"; it's just another program you write to support the tests. The imperative shell doing I/O needs to be configurable to support this (think a "widgets API base URL"), but that's it.
Once you've done that, you can just run a simple integration or end-to-end test to cover that API. You never need to manually mock it (like you suggested), or use a mocking tool.
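A minimal sketch of that kind of fake in Python, assuming a hypothetical GET /widgets endpoint:

    import json
    import threading
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

    class FakeWidgetsAPI(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/widgets":
                body = json.dumps([{"id": 1, "name": "sprocket"}]).encode()
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.end_headers()
                self.wfile.write(body)
            else:
                self.send_response(404)
                self.end_headers()

    # In a test: start the fake on an ephemeral port and point the app's
    # "widgets API base URL" at it.
    server = ThreadingHTTPServer(("127.0.0.1", 0), FakeWidgetsAPI)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = f"http://127.0.0.1:{server.server_address[1]}"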
It's a fake, not a mock.
Mocks and fakes are both test doubles.
> And thr point is to structure your code in such a way that
Sometimes it's not your code. Sometimes you want to change legacy code and want tests to prove that your changes don't break anything.
Life would be so much easier if there were just one way to test, wouldn't it?
--
Following in that theme, just doing some digging [1] turns up the following for those that might have been confused like I was:
[1] https://www.martinfowler.com/articles/mocksArentStubs.html
It can work very well in practice.
I use both the mocked and real database approaches illustrated here because they ultimately focus on different things: the mocked approach validates that the model is internally consistent with itself, and the real database approach validates that the same model is externally consistent with the real world.
It may seem like a duplication of effort to do that, but tests are where you really should Write Everything Twice in a world where it's expected that you Don't Repeat Yourself.
I'm pretty sure "don't repeat yourself" thinking has led to the vast majority of the bad ideas I've seen so far in my career. It's a truly crippling brainworm, and I wish computer schools wouldn't teach it.
I'm unsure that I agree. The two examples you gave, establishing that IDs are unique and that updates to entities in the system are serializable (and linearizable while we're here), are plenty doable without having to touch the real database. (In fact, as far as the former is concerned, this dual approach to testing is what made me adopt having a wholly separate "service"[0] in my applications for doling out IDs to things. I used to work in a big Kafka shop that you've almost certainly heard of, and they taught me how to deal with the latter.)
That said, I'd never advocate for just relying on one approach over the other. Do both. Absolutely do both.
> I'm pretty sure "don't repeat yourself" thinking has led to the vast majority of the bad ideas I've seen so far in my career. It's a truly crippling brainworm, and I wish computer schools wouldn't teach it.
I brought up WET mostly to comment that, if there's one place in software development where copying and pasting is to be encouraged, testing is it. I'd like to shelve the WET vs. DRY debate as firmly out of scope for this thread if that's alright.
0: It's a service inasmuch as an instance of a class implementing an interface can be a service, but it opens up the possibility of more easily refactoring to cross over into running against multiple databases later.
I agree both approaches are important, and it's totally ok if they overlap. If your unit tests have some overlap on your integration tests, that's nbd especially seeing as you can run your unit tests in parallel.
EDIT: actually I'll make a much bolder claim: even if your unit tests are making flawed assumptions about the underlying dependencies, it's still pretty much fine so long as you also exercise those dependencies in integration tests. That is, even somewhat bit-rotted unit tests with flawed mocks and assertions are still valuable because they exercise the code. More shots on goal is a great thing even if they're not 100% reliable.
Exactly.
Another upside I've run into while doing things this way is that it gets me out of being relational database-brained. Sometimes, you really do not need the full-blown relational data model when a big blob of JSON will work just fine.
Because yeah, the database is your main source of invariants. But there is no good reason for your application environment not to query the invariants from there and test or prove your code around them.
We do DRY very badly, and the most vocal proponents are the worst... But I don't think this is a good example of the principle failing.
I largely agree, but...
> ... the database is your main source of invariants.
I guess my upbringing through strict typing discipline leaves me questioning this in particular. I'm able to encode these things in my types without consulting my database at build time and statically verify that my data are as they should be as they traverse my system with not really any extra ceremony.
Encoding that in the database is nice (and necessary), but in the interest of limiting network round-trips (particularly in our cloud-oriented world), I really would prefer that my app can get its act together first before crossing the machine boundary.
As a developer who primarily builds backend web applications in high-level languages like Golang and Java, I run the risk of sounding ignorant talking like this, but... I'm led to believe lower-level systems and embedded software have a lot more invariant-preserving runtime asserts and such. The idea being that if an invariant is violated, better to fail hard and fast than to attempt to proceed as if everything is alright.
But there isn't a faster way to fail on an invariant than to prove statically that your code fails it, or to test it before deploying. I don't really understand your criticism.
Mocks are wishful thinking incarnate most of the time, though here and there they are absolutely needed (like 3rd party APIs without sandbox environments, or quite expensive API, or most of the time: both).
Just pick a task runner -- I use just[0] -- and make a task that brings up both Docker and your containers, then run your test task, done. Sure it's a bit fiddly the first time around but I've seen juniors conquer that in a day maximum and then your tests actually work with the real world 99% of the time.
Mocks in general are rarely worth it, the DB ones: 10x so.
[0] https://github.com/casey/just
My current setup:
- generate a new psql testcontainer _or_ reuse an existing one by using a fixed name for the container
- connect to the psql container with no database selected
- create a new database using a random database name
- connect to the randomly generated database
- initialize the project's tables
- run a test
- drop the database
- keep the testcontainer up and running and reuse with next test
With this setup, most tests run sub-second.
And tests can often use a DB transaction to rollback, unless the code under test already uses them.
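Roughly, in Python with psycopg2 (connection strings are placeholders):

    import uuid
    import psycopg2
    import pytest

    ADMIN_DSN = "postgresql://postgres:postgres@localhost:5432/postgres"  # the reused container

    @pytest.fixture
    def fresh_db():
        # Create a uniquely named database inside the long-lived test container.
        dbname = f"test_{uuid.uuid4().hex}"
        admin = psycopg2.connect(ADMIN_DSN)
        admin.autocommit = True  # CREATE/DROP DATABASE cannot run inside a transaction
        with admin.cursor() as cur:
            cur.execute(f'CREATE DATABASE "{dbname}"')
        conn = psycopg2.connect(f"postgresql://postgres:postgres@localhost:5432/{dbname}")
        try:
            # ...initialize the project's tables here, then hand the connection to the test.
            yield conn
        finally:
            conn.close()
            with admin.cursor() as cur:
                cur.execute(f'DROP DATABASE "{dbname}"')
            admin.close()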
I would never think of mocking a database in 2024/2025, just spin one up.
Also, keep your linchpin invariants in the database and not in your code.
I do both. That is how you protect against stressed devs at 2:00 AM trying to insert crap in the production database and how you also make good apps that don't blow up with database exceptions and give you neat validation errors in forms and/or JSON `errors` keys (when it's an API).
What I see with Fakes too often is that they are difficult to write so people reuse them, end up writing too many tests that only test the fake, and couple a bunch of tests to each other which makes refactoring fraught. Anyone who makes refactoring difficult is an enemy to your project, not an asset.
Mocks make DAMP easier. You don’t care how the other function turns “foo” into “bar”, you just care what happens to this function when it gets “bar” as a response. That’s the surface area.
There’s a lot of people in this response chain who are very obviously conflating “unit tests” with all tests. Unit tests don’t care about integration. That’s integration tests. I liken testing to plumbing. Unit tests are similar to quality control at a pipe factory. You verify that the pipes aren’t damaged prior to assembly. The plumber inspects them before installing, and inspects their work as they bond or solder them together. But at the end of the day your plumber still has to turn on the faucet and make sure water 1) comes out the taps and 2) stays in the pipes everywhere but the sink, otherwise they haven’t done their job.
Unit testing the individual pipes is necessary but insufficient. You have to verify the connections and the end to end system still, but more work up front saves you some embarrassing rework at the end.
NOBODY HAS DEFINED WHAT A UNIT IS! This needs to be shouted because all your arguments fall apart when you realize this. Until you get down to atoms (or something in quantum mechanics) you can go a level deeper. String.length is an integration test if you are the hardware engineer building the microcode on the CPU (do CPUs use microcode anymore?). If you are laying out the traces, tests for the microcode are integration tests for you. When I write a simple function that happens to use string.length, my test is an integration test of that. Note that I skipped many, many levels in both directions.
What people care about is that things work. The terms unit test and integration test need to die, as they are not useful. Test at the right level. If you can write all your tests as full-system tests (that is, via UI input) that run fast and are reliable, then do so. We generally say don't test via UI input because painful experience shows UI changes often break 1000s of tests at a time, which is a good reason to inject tests below the UI. There are often other reasons to break tests up into smaller areas, but those are trade-offs.
Odds are you are not writing basic containers (ie string.length) and so your API is something that should be easy to refactor (you have no reason to think it won't change in the future as requirements change) and thus you should not test that API. Tests are an assertion that in the future this will not change, so if you are testing something that should change you are doing it wrong! (of course you will never get this 100% right, but that should be your aim!)
---
If your mocks are hard to write you may have your fanout too wide and would do better with squaring up your dependency tree a bit.
Mocks are not hard to write. They are the wrong solution to most testing problems, though. Mocks assert that you call a specific function as expected - in most languages it is really easy to automate writing them with a framework, so they are easy to write. The problem is that calling a function is not what you want to test in most cases. I used write as an example because file IO APIs generally have many different forms of write. You have WriteWithBufferAndLength, you have WriteAVector (or whatever your language calls the basic list container), then WriteWithSomeWeirdOptionsWeForgetToPutInOriginalAPIAndDidNotWantToRefactorEverythingSoWeAddedANewAPI... In some languages these are all named differently, some allow overloading - it doesn't matter; the important part is that mocks don't let you switch which one you use, because 100 tests will break if you try, and even if switching might be better it isn't enough better to be worth fixing every test that happens to use that mock. Making the above worse, write can often be called with one large buffer or many smaller buffers and the end result is the same, yet mocks assert how many times write is called, which is wrong.
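A small Python illustration of that failure mode (hypothetical save_report):

    from unittest.mock import Mock

    def save_report(f, lines):
        # Version 1: one buffered write.
        f.write("\n".join(lines))

    def test_save_report_calls_write_once():
        f = Mock()
        save_report(f, ["a", "b"])
        f.write.assert_called_once_with("a\nb")  # asserts *how*, not *what*

    # If save_report is later changed to write line by line (same bytes in the end),
    # the mock-based test breaks even though observable behaviour is identical.
    # A fake that just accumulates what was written would not care:
    class FakeFile:
        def __init__(self):
            self.data = ""
        def write(self, chunk):
            self.data += chunk

    def test_save_report_writes_expected_content():
        f = FakeFile()
        save_report(f, ["a", "b"])
        assert f.data == "a\nb"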
> What I see with Fakes too often is that they are difficult to write so people reuse them, end up writing too many tests that only test the fake
When I write a fake, I write one fake that is close enough for everyone to use in their tests - I know how the thing it is faking works in production and model that behavior, except for whatever it is that makes the real thing unsuitable to use in tests. I often do add verification helpers - you can ask what data was written at any point (including concatenating several writes), for example.
Not sure why mixing up both things makes the conversations tortured though? Can you clarify?
In that case the problem isn't so much that you forgot to update your mock as much as the changed function no longer satisfies the interface. Ensuring that an implementation properly satisfies the interface is not within the scope of tests where mocking would be useful.
As I mentioned, if you need mocks to test a code path, you are probably "doing it wrong" as there are much better ways to do that. Such as refactoring the code so you can test just that code path. Mocks are a code smell, IMHO. In other words, if you need mocks, then the interface/contract is a part of your code -- and you should test it.
Most code paths you want to test are error branches, and creating real error conditions ranges from hard to completely unrealistic (e.g. causing a hardware malfunction on each test run). What is the better way in your mind?
Mocks can only exist at the integration points, fundamentally. There is a natural interface there, an interface that you are not supposed to be testing. Those tests already exist around the module that provides the integration. You don't need to test it again.
If a mock standing in for the real thing causes ill-effects, you've done something wrong. You're not supposed to be testing the mock while using the mock. If your mocks need tests (they probably do!), they should be tested outside its use.
Generally, if you can detect a hardware fault in your code, crash unless you know for certain that you can actually recover from it -- meaning you know what the problem is (can somehow detect the difference between a missing disk or a missing file). 99.9% of the time, you cannot recover from hardware issues with software, so it is pointless to actually test that case. Please, for the love of working software, just crash so the state doesn't get corrupted instead of trying to overengineer a solution.
All software works on hardware. Your web application doesn't need a test device, though. The hardware is already tested. You can treat it as an integration point. But even if you did create a test device for whatever reason, that's a mock! Which you say is to be avoided, and that there are better ways, without sharing what those better ways are...
> Please, for the love of working software, just crash so the state doesn't get corrupted instead of trying to overengineer a solution.
While you're not wrong, you need to test to ensure that it actually crashes. All defined behaviour needs to be tested, and you have defined behaviour here.
That's because it would better fit in a book than an HN comment, not because I don't want to answer. Basically the gist is to write "obviously correct" code that doesn't need to be tested, along with an architecture that lends itself to being testable without mocks.
Most people tend to write an interface and then inject a concrete type that could also be a mock. I've seen tests written this way that need to mock out 20-60 things just to test the one thing they want to test.
In most web frameworks I've worked with, this is mostly unavoidable as most frameworks provide dependency injection that is natural to mock.
If you aren't using a framework that has an opinion on how things should work, mocks can be avoided completely through different techniques, such as test harnesses, replacing dependencies with alternative implementations (such as in-memory queues instead of cloud services, sqlite instead of heavy databases, etc), etc. Sometimes you don't have any choice but to not use mocks, for example, distributed systems usually avoid mocks for certain kinds of tests because they simply can't be emulated very well -- either due to latency, network partitioning, or network failures that are just too numerous to mock out (similar to the disk issue I was referring too earlier). You don't know if a node is down, or the cable got cut, and need to behave appropriately to avoid split-brain scenarios. In these cases, test harnesses that can emulate specific scenarios are much better.
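For instance, a sketch of the "alternative implementation" idea in Python (the interface and names are made up; the production class would wrap the real client):

    import queue
    from typing import Protocol

    class MessageQueue(Protocol):
        def publish(self, message: str) -> None: ...
        def receive(self) -> str | None: ...

    class SQSQueue:
        """Production implementation wrapping the real cloud client (details elided)."""
        ...

    class InMemoryQueue:
        """Drop-in replacement for tests: same interface, no network, no mocks."""
        def __init__(self):
            self._q = queue.Queue()
        def publish(self, message: str) -> None:
            self._q.put(message)
        def receive(self) -> str | None:
            try:
                return self._q.get_nowait()
            except queue.Empty:
                return None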
I don't see how that follows. The purpose of testing is to document the code for the understanding of future developers, not to prove correctness. The only 'correctness' a test proves is that the documentation is true. Which is still incredibly useful, as I am sure you are painfully aware if you have ever dealt with legacy forms of documentation (e.g. plain text files, Word documents, HTML, etc.) that quickly become out of date, but is not a statement about the code itself.
> such as test harnesses, replacing dependencies with alternative implementations (such as in-memory queues instead of cloud services, sqlite instead of heavy databases, etc), etc.
These are all mocks, ultimately. Some desperately try to give them different names, but it is all the same at the end of the day.
Hmm. I've never seen tests with that goal in mind, except for behavioral tests that test the acceptance criteria.
> as I am sure you are painfully aware if you have ever dealt with legacy forms of documentation [...] that quickly become out of date
I have, but allowing that to happen is a culture-issue, not something that is guaranteed to happen. When I open PRs to open source software, I always include a PR to the docs if it changes anything. At work, updating the docs is part of the default acceptance criteria and is usually the thing we do before writing any code, and goes through a PR process just like the code. But, we service enterprise customers, so we aren't going to be giving them code or tests to understand how to use our product.
> These are all mocks, ultimately.
This is a software field and there are specific words with specific meaning; trying to shoehorn things that aren't those things to generalize a meaning is acceptable when teaching. It isn't acceptable when working on those things. In other words, I would accept this if trying to explain the concept to a junior engineer, but not from a senior engineer to a senior engineer.
> it is all the same at the end of the day.
No, not at all.
Then you've never seen a test, I guess. That is the only goal they can serve, fundamentally.
> I have, but allowing that to happen is a culture-issue, not something that is guaranteed to happen.
Mistakes are guaranteed to happen given enough output/time. No matter how hard you try, you are going to make a mistake at some point. It is the human condition. In the olden days one might use a proofreader to try and catch the mistakes, but with the advent of testing a computer can do the "proofreading" automatically, leaving the human effort to be pointless.
Maybe in the age of LLMs we can go back to writing documentation in "natural" language while still using machines to do the validation work, but then again if you write code you probably would prefer to read code. I know I would! The best language is the one you are already using. Having to read code documentation in English is a horrible user experience.
> This is a software field and there are specific words with specific meaning
Sure, but in this case you won't find any real difference in meaning across the vast array of words we try to use here. The desperate attempts to try and find new words is to broach silly soundbites like "mocks are a code smell", so that one can say "I'm not mocking, I'm flabbergasting!", even though it is the same thing...
I cannot wrap my head around this statement. It's literally in the name: "test" as in to prove something works... hopefully as designed.
> Mistakes are guaranteed to happen given enough output/time. No matter how hard you try, you are going to make a mistake at some point.
Yep, and they do. It's really easy to figure out which one is right: if the docs say that something happens, it happens. If the code doesn't do what the docs say, the code (and the tests) are wrong, and not the other way around.
> Having to read code documentation in English is a horrible user experience.
It's the difference between intention and action! I worked with a guy who opened PRs with totally empty descriptions. It was annoying. When I was reviewing his code, I had to first figure out his intention before I could understand why there was a PR in the first place. Was he fixing a bug, adding a new feature, or just writing code for the hell of it? ... nobody knew. Then, when you spotted a bug, you had to ask if it was a bug or on purpose, because you didn't know why the code was there in the first place.
Documentation is that living PR description. It doesn't just tell you WHAT exists, but WHY it exists, what purpose it serves, why that weird little line is the way it is, etc., etc.
The documentation is what is under test. It proves that what is documented is true. It does not prove that the implementation works. This should be obvious. Consider the simplest case: A passing test may not even call upon the implementation.
I have most definitely seen that in the wild before! More times than I wish I had. This is why TDD urges you to write tests first, so that you can be sure that the test fails without implementation support. But TDD and testing are definitely not synonymous.
> Its really easy to figure out which one is right: if the docs say that something happens, it happens.
Under traditional forms of documentation, you don't have much choice but to defer to the implementation. With modern documentation that is tested, typically the documentation is placed above the implementation. Most organizations won't deploy their code until the documentation is proven to be true. The implementation may not work, but the documentation will hold.
> I worked with a guy who opened PRs with totally empty descriptions.
I'm not sure PRs fit the discussion. PRs document human processes, not code. Human processes will already typically be in English (or similar natural language), so in the same vein the best language is the one you are already using. That is not what we were talking about earlier; but, granted, does do a good job of solidifying the premise.
Is it a chore? Absolutely. But peace of mind is important.
And yes, people forget to write tests, sure, but even then it would be a time bomb without mocks.
I'm not a friend of mocks either, but most examples here are not really an issue with mocks.
It won't help you with database specific differences. But there should be very few of those if you're using a framework that abstracts away the database. Like Django.
But I really want that database-specific behaviour. :) PostgreSQL does so many amazing things (recursive CTEs, jsonb, etc) that actively make our system better. If there was a fork of Django that optimized for leveraging advanced postgres features, I'd use it.
It's possible to use PGlite from any language using native Postgres clients and pg-gateway, but you lose some of the nice test DX when it's embed directly in the test code.
I'm hopeful that we can bring PGlite to other platforms, it's being actively worked on.
The other thing I hope we can look at at some point is instant forks of in memory databases, it would make it possible to setup a test db once and then reset it to a known state for each test.
(I work on PGlite)
postgres installs easily on WSL2 or whatever Linux distribution you're using.
I want to develop on postgres and test on postgres because I run postgres in production and I want to take advantage of all of its features. I don't understand why a person would 1) develop/test on a different database than production or 2) restrict one's self to the lowest common denominator of database features.
Test what you fly, fly what you test.
Sounded good at first but we were quickly overwhelmed with false positives and just opted for Postgres in a VM (this was before Docker was a thing).
See https://www.djangoproject.com/.
If I have to do anything CRUD like, I'll use Django. For reporting apps, I prefer native SQL.
I love SQLite to bits but the test harness I have to put around my apps with it is a separate project in itself.
On a philosophical / meta level it's all quite simple: do your damnedest for the computer to do as much of your work for you as possible, really. Nothing much to it.
Strict technologies slap you hard when you inevitably make a mistake so they do in fact do more of your work for you.
Especially since launching postgres is equally easy and fast as sqlite. Docker can help with sandboxing. What is left to gain? 100ms shorter startup time or keeping your unit test executables as single binaries? Irrelevant.
It's definitely not as fast to start postgres as it is to start sqlite. Pretty much inherently - postgres has to fork a bunch of processes, establish network connectivity, etc. And running trivial queries will always be faster with sqlite, because executing queries via postgres requires inter-process context switches.
That's not to say postgres is bad (I've worked on it for most of my career), but there just are inherent advantages and disadvantages of in-process databases vs out-of-process databases. And lower startup time and lower "dispatch" overhead are advantages of in-process databases.
I wrote about this and some more in https://jmmv.dev/2023/07/unit-testing-a-web-service.html
Does that cover the first two examples brought up by the article? Constraint violations and default value handling.
What if you rely, e.g., on CASCADE and foreign keys, which are not on by default in SQLite? I think then things start getting complicated and testing that layer gets difficult.
That's the reason to use an ORM. It abstracts away things like that.
This kind of setup makes the usual Ecto Sandbox approach feel slow, but I do agree that the way Elixir approaches this is great!
I actually have two projects that use this approach, FeebDB (which is the library I wrote to manage a "one SQLite database per client" approach) and HackerExperience (a game under development that uses FeebDB).
The overall idea is simple:
1. Before tests start running, create a prop of each database.
2. The prop contains the "starting database" for each test. It may contain seed data (optional).
3. For each test, copy the prop and assign it a unique shard identifier (say, cp /props/user.db /test_data/user/874125.db).
4. The test knows the `shard_id` and can do whatever it wants with it; no one else will bother it.
5. Once ExUnit is finished, delete all shards.
Both projects follow a similar approach (I wrote it first in FeebDB and copied into HackerExperience, which has some sections commented out -- I need to clean up this part of the codebase).
For both projects, you will find steps 1/5 in `test/support/db.ex`, step 2 in `test/support/db/prop.ex` and steps 3/4 in `test/support/case/db.ex`.
- FeebDB: https://github.com/renatomassaro/FeebDB/
- HackerExperience: https://github.com/HackerExperience/HackerExperience/
Email is in profile in case you have follow up questions/comments :)
But the testing setup in Elixir is just exemplary compared to everything else I've worked with. I fight myself daily on just rebuilding the entire project in Elixir.
Mocks for databases are extremely brittle and complicated.
It sounds cool. But running software isn't about sounding cool.
A decade ago, it was really clear. As https://aphyr.com/posts/284-call-me-maybe-mongodb explains, MongoDB didn't really work. But they've fixed that. So now it runs acceptably accurately. I just don't know why I'd ever want to.
Also, their hosted offerings (MongoDB Atlas) were not well operated and they took down our company for 2 days. Our MongoDB instance stopped accepting new connections and restarting our instance via their control panel didn't bring it back up and then their support literally said "we don't know why this happened or how to fix it" for like a day and a half, while we were on their highest tier ultra platinum support contract. Ultimately, we had to do 24 hours of research and then tell their own support how to fix the problem using their special shell access. If I recall correctly, it was some issue with WiredTiger (an internal component of the Mongo version we were using).
After that experience, I'd never use anything produced by MongoDB for anything; we moved everything that we possibly could out of Mongo and into more traditional RDBMS (PostgreSQL) and never had to deal with issues like that again.
Recently I ran into a tool that spat out "you probably want to use this option!". We paid for enterprise support so I asked why this option was not documented, and they said because it is dangerous. Can you imagine if the "-f" for rm wasn't in the man pages? Ridiculous
Also want to add that you can definitely use MongoDB (or any other database) in a way that doesn't scale well. I have personally run MongoDB at petabyte scale and had a relatively great experience.
A lot fewer are "web scale" than think they are. For ones who are, there are other competitors like Snowflake that work well.
As for scaling, it depends what you want to do. If you want to do things that look like joins, maybe it wasn't the right choice. Though I've definitely succeeded.
Were there good aspects? Sure... kind of. It was super super easy to just throw data into the database. Need to add a field? Who cares, just add it. Need to remove a field? Who cares, just remove it -- as long as the calling code is null-safe. And it was likewise super easy to store denormalized data for fast lookups when it made sense to do so, as well as deeply-nested things and arbitrary JSON blobs synced from our CMS. And queries could be stored and manipulated as actual Python dicts instead of opaque strings, whereas you normally need an ORM or query builder to do that in SQL. And you could get the best-of-both-worlds-ish with the ODMantic framework, where you could spit out a warning (but not an error) and recover gracefully if there happened to be bad data in the database.
Basically it allows you to forego any foresight or serious design, and instead just throw shit together. Which is great for getting a prototype going really fast and giving the appearance of a highly productive team. That is, until you run out of new microservices to prototype and now you have to start adding features to existing code and fixing bugs. IMO it was completely "penny-wise and pound-foolish" with respect to developer time, but I can at least see the appeal if you're a particular type of engineer operating under a particular set of incentives.
As for using MongoDB for things it was actually meant for (writing a ton of schemaless JSON blobs really fast and figuring out reads later), I have no idea because we never used it for that and had nothing really resembling that use case.
I wasn't involved in setting it up tho, so can't say anything about how difficult it is to work with on the technical side.
Their data was fully relational, and they were doing the one thing that really really kills performance in mongodb: growing documents.
Also it had no constraints so the data was all fucked up by the various bugs that were in the code over the years.
Ah yes, they used sharding. There probably wouldn't have been a need for it if they had just used postgres, since there wasn't that much data.
Could you elaborate what is the problem with mongo here? You can run it with testcontainers like any other database.
A static type system that assists with that design style is nice as well. It's really useful to be able to look at a function and know right away what I/O it performs, if any. This is probably my #1 missing feature in Python, now that static type hints work well enough for most purposes.
That said, setting up and maintaining a database test harness in the developer environment can be really annoying and time consuming. Even more so when your database is a proprietary cloud thing (e.g. Snowflake). But I have never ever regretted spending the time to set it up, whereas I have definitely regretted not having it.
Sometimes it's unavoidable that your "unit" tests need to access the database to be able to properly work out the "unit" under test. Watching those tests go from failing to passing is such a sweet feeling, it makes the pain of maintaining the test harness feel worthwhile. And then of course when you hit the actual integration tests in CI, everything almost always passes, because your unit tests are that good.
There is value in testing "assuming that the database returns 3 rows with Foo IDs and Foo Descriptions, my code should serve a JSON response containing the Foo IDs and the Foo Descriptions concatenated together". And there is also value in testing "when my code executes SELECT id, description FROM foo LIMIT 3, the real database should return 3 rows with Foo IDs and Foo Descriptions". Granted, there's also a big advantage in that the former can run much faster, with zero environment setup / teardown complexity required. But my main point is, different test suites should test different things.
However, if you're unable or unwilling to write two different test suites, and if you're able to easily and reliably have a database available in dev / CI, then ok, I concede, just write integration tests.
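To make the two-suite point concrete, a rough pytest sketch (render_foos is a made-up name for the pure logic; db_conn is assumed to be a fixture yielding a real connection):

    # Fast, pure unit test: no database involved at all.
    def render_foos(rows):
        return {"foos": [f"{id_}: {desc}" for id_, desc in rows]}

    def test_render_foos_concatenates_ids_and_descriptions():
        rows = [(1, "a"), (2, "b"), (3, "c")]
        assert render_foos(rows) == {"foos": ["1: a", "2: b", "3: c"]}

    # Slower integration test: does the real database actually give us those rows?
    def test_select_returns_ids_and_descriptions(db_conn):
        rows = db_conn.execute("SELECT id, description FROM foo LIMIT 3").fetchall()
        assert len(rows) == 3
        assert all(len(row) == 2 for row in rows)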
There are unit tests with mocks on repository calls, and integration testing with a fresh pg database bootstrapped with docker and .sql seeds.
One is fast to code, the other gives more realistic feedback on real-life execution. Both solve issues.
This is not the end of the game tho, as performance and back-pressure issues arise with a successful project... Testing is a long and interesting process.
I resonate with the sentiment of this article, but have struggled to find an alternative that’s fast enough as the test suite grows, isn’t flakey in CI, and is able to share the production schema definition for relevant relations.
I’d love to hear more from anyone that’s solved for some of these constraints!
If you use read replicas in production code, this can become tricky though: since the transactions never commit, the writes never become visible to the read replica, or even to different connections to the same database.
Most tests should be unit tests, which are super fast. Integration and UI tests that might use the database should be fewer, and if the database is slow, it might be related to your specific application or unoptimized database queries; our database calls are usually < 10ms.
Unfortunately I maintain an app where the database (read-only) is Snowflake, and being more of a "warehouse" than "database" there's always a lot of overhead in running any query at all. Even just `select 1` can take a few seconds. So there's only so much you can do with that, but setting up your data so that tests can be parallelized helps as well.
However your tests against a proper OLTP database should be plenty fast, unless your app itself is slow or your test fixtures require some really complicated setup.
What kinds of troubles do you have with using a real DB for testing?
If you happen to use Postgres, the approach is ultimately portable: it uses Pg database templates (also, regarding perf, the author recommends using a ramdisk and turning off fsync on your test DBs; you'll see this in the project readme). But you’ll have to write the code yourself.
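For Postgres, the bare-bones version of the trick looks roughly like this (a sketch with psycopg and pytest, not pgtestdb's actual API; the template name and DSN are placeholders):

    import uuid
    import psycopg
    import pytest

    ADMIN_DSN = "postgresql://postgres@localhost/postgres"

    @pytest.fixture
    def test_db_dsn():
        # Each test gets its own database cloned from a pre-migrated template.
        name = f"test_{uuid.uuid4().hex}"
        with psycopg.connect(ADMIN_DSN, autocommit=True) as admin:
            admin.execute(f'CREATE DATABASE "{name}" TEMPLATE myapp_template')
        try:
            yield f"postgresql://postgres@localhost/{name}"
        finally:
            with psycopg.connect(ADMIN_DSN, autocommit=True) as admin:
                admin.execute(f'DROP DATABASE "{name}"')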
(I'm always open to discuss potential contracts or consulting opportunities. If you have one that you think might be a good fit, my email is in my profile here and on github and on my homepage.)
For the SQLite / PostgreSQL differences that Exposed doesn't naturally handle, namely jsonb, during the CREATE TABLE phase, I have my test harness create a regular json version of the table rather than a jsonb one. Exposed's ORM seamlessly handles that swap out after table creation. There's a bit of manual upkeep in making sure the *Json test versions of the tables are kept up to date with the production non-Json versions; but that's the sort of thing that's caught on the very first test and fixed in minutes, most of the time (sometimes cascading table creation bites me).
I will eventually probably add a flag or something so the test suite can run against a separate testing partition in my PostgreSQL docker container, but I haven't done that yet.
This works in a lot of cases. In some cases this might not address your bottleneck.
Also someone should write a real, performant, in-memory postgres storage driver. Then we can all be happy (with pg at least).
It also gives you the option of running other types of tests such as load/performance/stress testing via generating larger amounts of data.
There is also something to be said for keeping database transactions atomic (no 3rd-party network calls, etc.) to keep flaky specs to a minimum. I have some ad hoc thoughts on this; I'll try to frame them properly in a post.
If you come up with any better options please let me know so I can update this readme!
https://github.com/peterldowns/pgtestdb
I think it was Testcontainers.
I agree that tests should occur in the environment they are to be run in, when ever possible. Otherwise not...
There is no science in testing. There are no standardised definitions. We agree it is good, but we often ignore the downsides (they exist - there are costs to tests).
I have struck examples where the testing frameworks of a language (Dart in this case) ruled out using a live HTTPS client to test. I kept getting 404 errors running it against my (test) site. I made a bug report. The good folks at Google (who I am sure are very smart - if a little stupid) had decided that the HTTPS client would always return 404 when run as part of the testing framework.
I argued, but apparently I am not so smart (I am less stupid, IMO). The HTTPS client testing had to either occur outside the test framework, or I had to rewrite the HTTPS client, or write a mock-up of some sort - all things I would have to pay for with time and effort, for no gain.
I get the point about mocking. But I am in business. I am not interested in always testing in isolation, (sometimes, sure), I want my tests to depend on the other parts of my business.
If the database goes wrong and my Gui tool test discovers it, that is a good thing, not a bad thing
Sure, there's no substitute for optimizing things for a particular query engine and EVERY database engine has SUBSTANTIAL quirks (cough, cough Redshift), but you should be crawling and walking before you try running.
so far so good - you're inviting the comparison of "use a real database" to "use a real X", where X could be anything - a bank, or a nuke.
> ... there's still ZERO excuse to NOT to be using SQLite or LocalStack mocks in your basic unit and pre-integration testing.
And I'm lost. Do we hope for a BankLite and NukeLite implementations to exist?
I should've used different angle-brackets. "ANY <X distributed API service call>" would've been better.
> And I'm lost. Do we hope for a BankLite and NukeLite implementations to exist?
No, but it's still worth mocking BankLite and NukeLite with the same call interface and suitable test data, variance, and jitter.
The overriding point is nothing will ever replace "the real thing", but that's not what mocks are for and it's even worth mocking 3rd party network services which can be, if not more, "quirky" than SQL query engines.
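Something like this toy fake, say (the charge interface is entirely made up -- shape it to whatever the real third-party client actually exposes):

    import random
    import time
    from dataclasses import dataclass

    @dataclass
    class ChargeResult:
        ok: bool
        error: str | None = None

    class FakeBankClient:
        """Stands in for the real client, with latency jitter and occasional failures."""

        def __init__(self, failure_rate=0.05, max_jitter_s=0.2):
            self.failure_rate = failure_rate
            self.max_jitter_s = max_jitter_s

        def charge(self, account_id: str, cents: int) -> ChargeResult:
            time.sleep(random.uniform(0, self.max_jitter_s))  # simulated network latency
            if random.random() < self.failure_rate:
                return ChargeResult(ok=False, error="upstream_timeout")
            return ChargeResult(ok=True)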
Mocked tests here tend to be many times as long as the actual implementation, grow with the size of the implementation, and also tend to break immediately if you change the internals of your code (because now new things are being called, or not being called). Having hundreds of such tests can make refactors and serving new requirements a lot more annoying.
The little "trick" of using SQLite to speed up integration tests has also bit me multiple times. Turns out the various RDBMSes are not in fact, all the same, and the details start to matter a lot very quickly.
Integration testing is also important because the closer you get to running things in a production environment, the more likely you are to detect issues. Some things just won't be apparent until you start running code in production or near production. The thing to understand though, is that you want to do both if you can, not either-or.
This makes it a unit test.
> However, testing against a real database uncovers hidden pitfalls that can appear as the application matures.
This makes real-database testing not a unit test. Maybe it's integration testing.
So basically the author is against unit tests (or isn't quite sure of the difference from integration testing)
I wonder if there's a database that works sorta like git, giving one "commit" point at the start of every test and then "branching off" so each test can do its own thing, make transactions, commit them, whatever, and finally be able to roll the whole thing back.
If that can be done then each unit test can use a full database clone with actual valid data and not just test data that was fast enough to set up before the test ran.
https://www.postgresql.org/docs/current/manage-ag-templatedb...
Creating a db from a template is not free, but if there's not too much stuff in the db it's not horrendous either. And if you're using sqlite, you might be able to back up an on-disk template to an in-memory test db.
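The sqlite variant is pretty painless in Python, for what it's worth (template.db is whatever pre-seeded file you keep around):

    import sqlite3

    def fresh_test_db(template_path="template.db"):
        # Copy the on-disk template into a brand-new in-memory database.
        template = sqlite3.connect(template_path)
        test_db = sqlite3.connect(":memory:")
        template.backup(test_db)  # Connection.backup is in the stdlib since 3.7
        template.close()
        return test_db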
Unless you use Clojure though it's probably not a real option.
The Java business code still often had to call the old PL/SQL stored procedures, and the state of the application was serialized on the same database using Hibernate. It made for pretty delicate db transaction handling - in Java it was mostly declarative EJB3 transactions + some custom AOP code, and in PL/SQL we had strict guidelines of never committing/rolling back the transaction globally - instead every stored procedure had a savepoint at the beginning and only rolled back to that savepoint in case of a problem.
Ensuring the system adhered to these guidelines with mocks would be very hard. Instead we wrote special test-mode-only AOP code that wrapped around every Java -> PL/SQL call in test mode and checked that the transactions were correctly handled (it added a programmatic savepoint before the call and checked if it was still there after the call). It caught some transaction handling errors that would be VERY hard to find any other way.
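Very roughly, the same idea in Python with psycopg (the original was Java AOP around PL/SQL calls; call_proc here is just a stand-in for however you invoke the procedure):

    from contextlib import contextmanager

    @contextmanager
    def assert_transaction_respected(conn, name="txn_guard"):
        conn.execute(f"SAVEPOINT {name}")
        try:
            yield
        finally:
            try:
                # If the procedure committed or rolled back the whole transaction,
                # the savepoint no longer exists and this raises.
                conn.execute(f"RELEASE SAVEPOINT {name}")
            except Exception as exc:
                raise AssertionError("procedure broke transaction discipline") from exc

    # in a test:
    #   with assert_transaction_respected(conn):
    #       call_proc(conn, "some_package.some_proc", args)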
It also let us find possible deadlocks during testing.
BTW writing tests for PL/SQL made me love Postgres for its transactional DDL. It's very frustrating to make repeatable tests when you can't rollback DDL changes and have to undo them one by one after the test or re-create the whole database.
For simple use cases where I mostly just read/write from the database and don't expect any of those issues mentioned in the article (constraint violations or concurrency issues, because the application is quite simple tbh, plus I already have some testcontainers based integration tests for the components that deal with a real db and I reuse them) writing tests with an in-memory db implementation is quite nice - the setup is simple and running the whole test suite is instantaneous (literally something like 1-2s for a couple thousand tests, I don't need any framework for those kind of tests).
And on the other hand, if I'm relying on something like an optimistic/pessimistic lock or a framework feature, I will write a test with the real thing, using testcontainers. I also have pretty good coverage of the components that deal with queues and databases specifically with testcontainers. And on top of that, just a few e2e flows written using testcontainers as well.
> Still, using an in-memory db is way better than mocking, the tests are not coupled to the implementation details/the underlying model.
Isn't this backwards? The fact that you've backed your storage with a HashMap (which is 100% what I shoot for too) means your service-under-test cannot know if it's talking to an SQL database.
EDIT: What I meant when I wrote that tests are not coupled to the underlying model/details is that with a mock you have to explicitly specify "when called with this return that". With an in memory database implementation you don't need to do anything, the code will just use the interface methods like "getX" or "saveY".
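Roughly the difference between these two, with made-up names:

    class InMemoryUserRepo:
        """Implements the same interface as the real repo; the 'database' is a dict."""

        def __init__(self):
            self._users = {}

        def save(self, user_id, user):
            self._users[user_id] = user

        def get(self, user_id):
            return self._users.get(user_id)

    # ...versus a mock, where every interaction has to be scripted up front:
    #   repo = Mock()
    #   repo.get.return_value = {"name": "Ada"}  # "when called with this, return that"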
Riiiight, no I've always hated that Mockito way of doing things.
I find wrapping a hashmap avoids the need for explicit when-this-then-that bindings, because a hashmap already does the expected behaviour natively.
You can even 'integrate' your tests as deeply as you like, all standing on top of your poor hashmap. I.e. Http tests with unit test speed.
    var controller = new Controller(new Service(new Repo(new HashMap())));
    controller.POST(...);
    assert(controller.GET(...));
I'm working on a project that has ~22,000 tests that operate on Docker container of the same version of Postgres used in production. In CI it completes all the database tests in under a minute.
Unfortunately our "integration" tests are even worse, corrupting data all over and flaky as hell.
There's a continuum between extreme integration testing and extreme unit testing, each with their strengths and weaknesses.
There's a reason we don't do all testing through integration testing: it's slower, more flaky, and cripples developer velocity to a near standstill.
There's a reason we don't do all testing through pure unit testing: you have no idea whether the e2e plumbing works as expected.
DBs aren't special in this regard. That being said, it is true that investing in a lightweight, unit-testable setup for your DB is a good idea.
The vast majority of projects fail at this, often because of some external dependency, like a database, that needs some manual setup.
To me, mocking out the db is a relatively cheap way to achieve rule #1. Yes, it has pitfalls, but I am unwilling to compromise on rule #1 so it's purely a question of cost vs fully automating db setup and teardown.
Practicality beats purity, and all of that. Especially when it saves quite a bit of complexity in the code itself – new contributors are going to struggle with that, too.
In short, the setup doesn't need to be manual.
For a little more context, I work on dev tools for an org of ~400 engineers. There is always someone trying to do things the quick and dirty way and move on without consideration for how the other 399 people may be impacted.
Some ORMs and frameworks provide magic to do this, but as soon as you deviate from the framework it gets complex. You can build geospatial applications, time series, document, and full-text search - in other words, leverage advanced features of Postgres extensions that most ORMs can't touch. Their behaviors can be complex and are part of the application, not simply a storage layer.
I wouldn't structure a team that way now.
The whole point of mocks for a database is not to test schema changes; it's to test a unit of code. Mocks most certainly do not remove the additional requirement of integration testing with real database tables. That usually happens after all the unit tests, obviously, because why spin up a big heavy database test pipeline if the code logic fails?
the only reason to write a unit test is if the integration test would run too long.
This is opposed to when modules have implicit dependencies via importing concrete dependencies directly. In this case the only way to mock is to "patch" the import somehow. The trouble is software written this way usually has very strong ties to the concrete implementation and doesn't expect to be using a mock, which is why the author argues against it.
So, no, the article doesn't do dependency injection. It's gone out of fashion for some reason even though I think it's a much better way to write software (you depend on abstractions not concrete implementations).
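To illustrate the contrast (everything below is made up; the patching style is what you're stuck with when the import is baked in):

    from types import SimpleNamespace
    from unittest.mock import patch

    # Implicit dependency: the function reaches for a module-level client directly.
    db_client = SimpleNamespace(fetch_amounts=lambda user_id: [])  # stand-in for a real driver

    def total_spend(user_id):
        return sum(db_client.fetch_amounts(user_id))

    # Testing it means patching the import site:
    with patch(f"{__name__}.db_client") as fake:
        fake.fetch_amounts.return_value = [10, 20]
        assert total_spend("u1") == 30

    # Dependency injection: the dependency is a parameter, so no patching is needed.
    def total_spend_di(user_id, repo):
        return sum(repo.fetch_amounts(user_id))

    class InMemoryRepo:
        def __init__(self, amounts):
            self.amounts = amounts

        def fetch_amounts(self, user_id):
            return self.amounts

    assert total_spend_di("u1", InMemoryRepo([10, 20])) == 30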
It seems like you could use instance_double with DI though, as basically an alternative factory for your dependencies when you want to give a faked implementation to your function.
It's been ages since I've done much Ruby, but I recall the ecosystem being particularly prone to monkeypatching and similar metaprogramming mocking strategies over DI or inversion of control in general, though...
In runtime you can supply the actual DB, in test, you can supply a mock DB, OR an actual DB. Both are possible approaches. It's about how you want to wire up your tests. The author seems to be arguing for using the actual DB.
Like the other commenter explained, DI simply means that the function would depend on an abstract dependency which is supplied at runtime or test time.
1) The subtle issues with constraint violations, default values and indexes are part of the RDBMS, not the testing strategy. The article suggests stepping on these rakes in test so that you hopefully don't step on them in prod. Another way to avoid them in prod (and to also regain the speed/reliability/simplicity of your test suite) is to not use an RDBMS in prod.
2) If you want to persist something, ask yourself if what you really have is tabular data. Do you have a lot of nulls and/or default values? Catch-all json fields? You probably have objects, not relations. Postgres may have a lot of wonderful features that relate to relations, and an ORM might have a lot of wonderful features (which appear to save you time). But if you stop requiring your objects to be relations at rest, you don't need the features, which is even faster.
You call these rakes and suggest removing them from prod, I call these vital safeguards. Defining constraints at the database layer gives me confidence that these will truly be invariants, and I won't one day discover that I've got millions of rows in my database missing critical data, with no way of knowing what should be there. Been there, done that, no desire to revisit it.
I also keep the live events around, in case there isn't a constraint exception, but something else goes wrong - upstream/user error causing deletion, or programmer error causing the event to have the wrong effect.
Now I have the events, I can just throw away the schemas and constraints. I mean, if the events and the DB disagree, the DB is wrong.
That allows me to somewhat mock the database within unit tests, while still being able to test database-specific errors that might occur, like foreign key errors.
Also code complexity on a few levels of recursion of transactions is an easy thing to abstract away with almost zero performance penalty -- depending on your programming language of choice.
And sure, we could probably refactor it to use transactions to shave a few seconds of running our test suite, but it'd add some additional mental complexity to our codebase to do so. In general, I prefer to keep the actual running code as simple as possible, and have any complexity that is required for unit tests be handled by the unit tests. By just recreating the database the unit tests currently handle all that complexity.
The idea is definitely not a bad one though, and if your scaffolds are big enough to actually cause performance issues with your unit tests it might definitely be a consideration to not recreate the database every time.
However, if I was hired into your team tomorrow you'll have to fiercely fight with me over this:
> And sure, we could probably refactor it to use transactions to shave a few seconds of running our test suite, but it'd add some additional mental complexity to our codebase to do so.
Various languages and frameworks demonstrate that abstracting this away and using convenient wrappers is more or less trivial, and definitely a solved problem. And the resulting code is simple; you are (mostly) none the wiser that you are actually using a temporary transaction that would be ultimately rolled back.
...Though that is extremely sensitive to which programming language and DB library you are using, of course.
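For example, in Python with SQLAlchemy 2.x the usual shape is a fixture like this (the DSN is a placeholder; this follows the "external transaction" recipe from SQLAlchemy's docs):

    import pytest
    from sqlalchemy import create_engine
    from sqlalchemy.orm import Session

    engine = create_engine("postgresql+psycopg://app@localhost/app_test")

    @pytest.fixture
    def db_session():
        connection = engine.connect()
        outer = connection.begin()  # outer transaction, never committed
        # create_savepoint makes commits in the code under test release a SAVEPOINT
        # instead of committing for real, so the rollback below still discards them.
        session = Session(bind=connection, join_transaction_mode="create_savepoint")
        try:
            yield session
        finally:
            session.close()
            outer.rollback()        # throw away everything the test did
            connection.close()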
Your approach works, so keep using it but IMO it's a temporary one and it could stop being fit-for-purpose Soon™. You yourself are already aware of its limitations so just keep an eye out for this moment and be decisive when the time comes to migrate away from it.
My $0.02.
Unit testing is useful because (a) it's easier to test all the code paths in a unit when that unit is isolated, and (b) unit tests don't need to be modified in response to changes in unrelated modules. But (a) is irrelevant since we can easily put dummy data in the DB, and (b) isn't a concern since the relevant DB schemas and queries are tightly coupled to the module in question anyway.
Primarily slowness and brittleness. Tests with side effects are much more likely to be flaky, and flaky tests are the worst. Especially if they're slow and hence hard to debug.
Of course, you do test your use of the database, but in a much smaller test suite. For the business logic tests you just spin up a copy of your app with fake adapters, and you're done. Quick, deterministic tests.
If the database interactions are trivial, I agree—just use stubs. But if you've got non-trivial joins, then you'll need a separate database test suite anyway. And if there's stateful logic involving the database and you want to use fakes in unit tests, you need a whole extra set of tests to validate that the fakes behave the same way that the real database does.
I do actually prefer to avoid relying on the database for unit tests—it's cleaner, it's faster—but often testing with the database is simpler & reliable, and I can't justify the extra lines of code required for fully isolated unit tests. That's extra surface area for tech debt that could lead to stuff like fakes behaving differently from their production equivalent.
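When I do go the fake route, that extra set of tests usually ends up being a small "contract" suite run against both implementations (names here are hypothetical; PostgresUserRepo and the db_conn fixture are assumed to exist elsewhere):

    import pytest

    class InMemoryUserRepo:
        def __init__(self):
            self._users = {}

        def save(self, user_id, user):
            self._users[user_id] = user

        def get(self, user_id):
            return self._users.get(user_id)

    @pytest.fixture(params=["memory", "postgres"])
    def repo(request):
        if request.param == "memory":
            return InMemoryUserRepo()
        return PostgresUserRepo(request.getfixturevalue("db_conn"))

    def test_get_returns_what_was_saved(repo):
        repo.save("u1", {"name": "Ada"})
        assert repo.get("u1") == {"name": "Ada"}

    def test_get_missing_returns_none(repo):
        assert repo.get("missing") is None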
You get 90% of the benefit from two kinds of tests:
- verifying components work (unit tests)
- verifying the system works from its interface (automation)