Hmm, I found it quite debatable. IMO, they are too wedded to the opinion that database UIs and use cases will remain almost the same. Seeing how eagerly analysts employ AI-generated queries, I wonder if they have any arguments to support this standpoint.
refset 43 days ago [-]
Good overview, although a rather important aspect of the 'resilience' that's not really covered here is the way that SQL technologies navigate and sustain performance trends. Relational databases are sticky in large part because the implementations are always getting faster and more sophisticated.
SQL being a declarative language means that applications built on top of it can relatively easily take advantage of new hardware and increasing parallelization without changing any code (in addition to all manner of new software tricks, compression/optimization/joins/etc).
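A minimal sketch of that point, using Python's built-in sqlite3 module: the query text stays fixed while the engine chooses a different plan once an index exists. The purchase table and the index name are made up for illustration, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase ("
    "purchase_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO purchase (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

# The application-level query; it is never edited below.
QUERY = "SELECT * FROM purchase WHERE customer_id = ?"


def plan(connection, sql, params):
    """Return the plan steps SQLite reports for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the detail text


plan_before = plan(conn, QUERY, (7,))
rows_before = conn.execute(QUERY, (7,)).fetchall()

# Only the physical design changes; the query string stays the same.
conn.execute("CREATE INDEX idx_purchase_customer ON purchase (customer_id)")

plan_after = plan(conn, QUERY, (7,))
rows_after = conn.execute(QUERY, (7,)).fetchall()

print(plan_before)  # a full table scan
print(plan_after)   # a search that uses the new index
```

The results are identical before and after; only the access path changes, which is the property the comment attributes to declarative queries.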
j-pb 43 days ago [-]
Reading this gave me a weird idea. What if SQL is so successful exactly because its syntax makes JOIN operations cumbersome? Other languages where joins are syntactically convenient (e.g. Datalog) make it much easier to write slow queries, whereas SQL forces you to denormalise tables from a syntactical standpoint already, which later translates to performance gains when it comes to semantics.
In a sense SQL is the (typewriter) QWERTY of query languages. Inconvenient by design.
hipadev23 42 days ago [-]
join purchase using (customer_id)
They only get cumbersome when you have schemas without consistent key names, complex join conditions, insist on uppercasing reserved words, or do all sorts of formatting acrobatics.
So I guess I agree with you. A well-designed schema makes reading and writing SQL a pleasure.
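A small sketch of the join above, run through Python's sqlite3 module. The customer and purchase tables and their rows are assumptions for illustration; the only thing taken from the comment is the shared customer_id key name, which is what makes USING possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables; both use the same key name, customer_id.
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL
    );
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO purchase VALUES (10, 1, 9.5), (11, 1, 20.0), (12, 2, 5.0);
""")

# Short form: works because the key has the same name in both tables.
using_rows = conn.execute(
    "select name, amount from customer "
    "join purchase using (customer_id) order by purchase_id"
).fetchall()

# Long form: what you write when the key names differ or need qualifying.
on_rows = conn.execute(
    "select c.name, p.amount from customer c "
    "join purchase p on p.customer_id = c.customer_id order by p.purchase_id"
).fetchall()

print(using_rows)
```

Both queries return the same rows; the USING form just drops the aliases and the repeated column names.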
arkh 42 days ago [-]
> customer_id
It's hard to fight for those when surrounded by ORM-first people. "Why call it customer_id in the customer table, call it id". Because I like using "using". "But we don't care, we rarely do SQL queries, why optimize for the rare cases?"
Enjoy your ORM-only N+1 problem then.
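For anyone who hasn't run into it, here is a rough sketch of the N+1 pattern next to the single-join alternative, again with Python's sqlite3 module. The tables, the CountingConnection wrapper, and both functions are hypothetical; a real ORM issues these queries behind the scenes rather than in an explicit loop.

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper that counts the statements sent to the database."""

    def __init__(self, connection):
        self.connection = connection
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.connection.execute(sql, params)


raw = sqlite3.connect(":memory:")
raw.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL
    );
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edgar');
    INSERT INTO purchase VALUES
        (10, 1, 9.5), (11, 1, 20.0), (12, 2, 5.0), (13, 3, 1.5), (14, 3, 2.5);
""")


def totals_n_plus_one(db):
    """One query for the customers, then one more query per customer."""
    totals = {}
    customers = db.execute("select customer_id, name from customer").fetchall()
    for customer_id, name in customers:
        amounts = db.execute(
            "select amount from purchase where customer_id = ?", (customer_id,)
        ).fetchall()
        totals[name] = sum(amount for (amount,) in amounts)
    return totals


def totals_single_query(db):
    """The same totals from one join; every customer here has purchases."""
    rows = db.execute(
        "select name, sum(amount) from customer "
        "join purchase using (customer_id) group by customer_id, name"
    ).fetchall()
    return dict(rows)


looped = CountingConnection(raw)
joined = CountingConnection(raw)
totals_looped = totals_n_plus_one(looped)
totals_joined = totals_single_query(joined)

print(looped.count, joined.count)
```

With three customers the loop sends four statements and the join sends one; the gap grows linearly with the number of customers.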
euroderf 38 days ago [-]
This one is on the ground floor of my Dept of Pet Peeves.
hipadev23 42 days ago [-]
Yeah exactly. A very different way of thinking. While there are always exceptions (a friend table composed of user_id pairs), I view my database/schema as a namespace. So customer_id refers to the same entity across all my tables.
rbanffy 42 days ago [-]
> insist on uppercasing reserved words,
It makes code a lot more readable
williamdclt 42 days ago [-]
As someone who does uppercase SQL, I’m not really convinced it improves readability. We don’t do it for other languages after all, syntax coloration is deemed enough.
(On the other hand, sql is often embedded in other languages where it might not get properly coloured so… maybe)
rbanffy 42 days ago [-]
Not all languages allow it, and most sexy languages forbid it by being case sensitive with their keywords.
I've used this feature in Pascal and Basic and my feeling is that it improves readability a lot, as it better highlights structure than just coloring. OTOH, I've used it mostly on monochrome 1-bit-per-pixel systems where coloring wasn't available. I believe it could be a matter of taste.
danolivo 42 days ago [-]
For me, the key idea of this paper was that relational model & SQL are successful because of the key feature: 'compact and accessible form' of a query.
gregw2 42 days ago [-]
How/when/why did the word "base" get added to "data" to form the concept of a "Data Base"?
The article mentioned that the term was in use in the CODASYL era, but I'm wondering where it started...
refset 42 days ago [-]
> The origins of the term “data base” and subsequently “database” go back a long way. The first sighting of the term was its use in 1963 by the System Development Corporation who sponsored a symposium with the title “Development and Management of a Computer-centered Data Base”. The term “data base” was picked up by the contributors to the symposium in the titles of their papers.
Huh. "Development and Management of a Computer-centered Data Base" was published in November 1963 while a Google Books search finds the 1962 DoD "annual report of the Office of Civil Defense" at https://www.google.com/books/edition/Annual_Report_of_the_Of... (so one year older) with cover letter date "November 24, 1962" with:
> Necessary weather forecasts are furnished by the U.S. Weather Bureau. The so-called "data base" used in the computer assessment process is stored at computer locations on magnetic tapes.
On page 64 we can read about "Data Base Improvements"
> Resource data are obtained manually and by use of automatic data processing systems. ... During fiscal year 1963, OCD, through contracts and agreements with other agencies, completed or continued projects to materially strengthen the data base for these purposes.
(FY 1963 was July 1 1962-June 30 1963.)
And apparently there are three uses of "data base" in "Automation and Scientific Communication", Short Papers Contributed to the Theme Sessions of the 26th Annual Meeting of the American Documentation Institute at Chicago, Pick-Congress Hotel, October 6-11, 1963, including
> The data base for the current experimental system is derived from "Current Research and Development in Scientific Documentation, No. 10" (10).
On the other hand, https://www.google.com/books/edition/Air_Force_AFM/J8f73BV6e... tells me that in December of 1963 (after the November citation you gave), "Data Base" was defined as an Air Force term for "A group of data elements or related features arranged in a logical sequence", and an "Intelligence Data Base" is "An aggregation of finished or initially processed intelligence data ... which can be exploited to provide information to augment specific analyses and validate decisions."
That may explain why "Organization and Presentation of Environmental Data for Office of Civil Defense Use (A Feasibility Study)" dated April 1963, at https://www.google.com/books/edition/Organization_and_Presen... has a few uses of "data base" including, on page 19, "A reference library is another type of nonautomated data base."
This makes it seem like "data base" was a term in use predating that 1963 publication.
refset 42 days ago [-]
> "data base" was a term in use predating that 1963 publication
I agree, and actually reading through some of it only reinforces the feeling. I suppose the real answer could well still be CLASSIFIED, if it was even recorded in the first place.
The term "base" used with data would correlate well with the military term for "base" and perhaps adds some color to its original context.
Perhaps a "data base" originated as "A safe refuge of facts from which you would learn, train, and launch a defense or an attack?"
Or perhaps it's a "base" (basis) of facts from which you operate in a military operation?
Or perhaps someone else non-military used the term first but it picked up early resonance in the military community due to its additional connotations for them or their management?
Anyway, TIL. I wonder whether a FOIA request to NSA or DoD would turn up anything.
gregw2 42 days ago [-]
Counterargument:
Actually, Oxford English Dictionary says: "OED's earliest evidence for [the term] database is from 1953, in Sociometry."
I could also believe many groups arrived at the term independently, deriving naturally from conversation along the lines of: "These are the data on which decisions are based"
See also Rich Hickey's definition of "basis". I bet he spent a good while digging into the etymology.
abetaha 43 days ago [-]
A very enjoyable read of the history of databases through the lens of query language evolution, and the research done at IBM and Berkeley for System R and Ingres.
theanonymousone 42 days ago [-]
The concept of a "table" is probably one of the more underrated great achievements of humanity, judging by its pervasiveness in physical world and computing.
euroderf 38 days ago [-]
If only we had an equally clarified model for hierarchical data...
pmcjones 42 days ago [-]
More early history about System R, DB2, and some related systems here:
I hadn't seen the name Jim Melton in quite some time. I read one of his books, probably Understanding the New SQL: A Complete Guide, something like thirty years ago. (If that was the book, Alan Simon was the co-author, I see.)
jll29 42 days ago [-]
Wishful thinking: I can't wait until 2050 when the ACM will post an article entitled "50 Years of Queries", which will be about the public release of the first 50 years of ACM Digital Library search queries.
It was written 20 years ago, but even today, the relational model and SQL are still the prevailing choices.
From Section 1 of "Nineteen Sixties History of Data Base Management" https://dl.ifip.org/db/conf/ifip3/histedu2006/Olle06.pdf
(Source: https://www.oed.com/dictionary/database_n#:~:text=Where%20do....)
(Perhaps this Sociometry? : https://www.jstor.org/stable/i329357 ?)
He talks about how System R got its name:
> Two of the important figures of System R were Leonard Liu, who was the person who hired me into IBM, and Frank King, who was the manager of the Relational Database project. ... One of the things that Frank thought was important was for our project to have a name so that we could make slides about it, write papers about it, and get some recognition. In order to get recognition for something it's good for it to have a name. So Frank called a meeting and said, "You guys are going to have to think of a name for your project." We thought, "Well, that's a waste of time. Don't bother us." But he persisted and said, "You guys are going to have to come up with a name." So for lack of anything else we said, "Well, we'll call ourselves System R." R stood for relations, or maybe it stood for research, or Franco Putzolu even thought it stood for Rufus, which was the name of his dog. It was a little bit of artful ambiguity what the R stood for, but that was the name of our project.
Also his earliest interaction with Larry Ellison:
> I’d been seeing some things in the trade press once in a while about a company called Software Development Laboratories that claimed to be developing a relational database system. I hadn’t paid much attention to it, but in the summer of 1978, I got a phone call. It was from a guy named Larry Ellison, and he said he was the president of Software Development Laboratories, and they were developing an implementation of the SQL language. Since we were in the research division of IBM, our philosophy of research was to publish our results in the open literature. As you know, many papers came out of the System R project that were published in conferences and journals, describing the language and the internal interfaces of the system and some of the optimization technology and so on. The project was not a secret and, in fact, we’d been telling everybody about it that would listen. And one of the people that had listened and had read some of our papers was Larry. So he called me up and said that he was interested in implementing the SQL language in the UNIX environment. IBM wasn’t interested in UNIX at all. We were primarily a mainframe company at that time. We had some minicomputer products, but they really weren’t robust enough to manage a relational database, and there was little, if any, attention being paid to the UNIX platform. But Larry was really interested in the UNIX platform. He had a PDP-11, I think, that he was using as the basis for his SQL implementation, and he wanted to exchange visits with us and learn whatever he could about what we were doing, and in particular, he wanted to make sure that his implementation was consistent with ours so that there would be a common interface with compatible error codes and everything else. I was very pleased to get this call. I thought, “Terrific. This is somebody in the world who is interested in our work.” But I had some constraints on what I could do because of my position in IBM.
I had to get management approval to talk to somebody on the outside, even though there was nothing secret about our project. Everything that Larry had access to was perfectly available in the open literature. So I went to talk to my boss, Frank King, and Frank talked to some lawyers—there were always plenty of lawyers in IBM that could think of a reason not to do most anything. Sure enough, they said, “You better not talk to other companies who are building products that are competitive with ours. We really don’t want you exchanging visits with these guys, so just tell them ‘Thanks for your interest, and have a nice day.’” So that’s what I did. I told Larry that, unfortunately, due to the constraints of the company, we wouldn’t be able to exchange information other than in the public literature. But that didn’t slow down Software Development Laboratories. They released their implementation of SQL. In fact, it was the first commercial implementation of SQL to go on the market. It was delivered by Larry Ellison’s company, initially called Software Development Laboratories, which later changed its name, I think, to Relational Software Incorporated, and later took on the name of the product, which was called Oracle. As you know, if you drive along Highway 101 in Redwood City and look at the giant 100-foot-tall disk drives over there on the edge of the bay, Larry’s had some success with these ideas. And rightly so. He brought a lot of energy and marketing expertise and a completely independent implementation, and was very successful with it, and had a major impact on the industry.
The 1995 SQL Reunion: People, Projects, and Politics https://mcjones.org/System_R/SQL_Reunion_95/
- CJ Date: https://www.computerhistory.org/collections/catalog/10265816...
- Don Chamberlin: https://www.computerhistory.org/collections/catalog/10270211...