What would SQL look like if...

For a change of pace, lets do something more practical.

I have little doubt that whatever solution I push for CRUD, it will be based on a language that behaves like SQL.

But SQL has a large share of issues, some coming from semantics, and some from syntax. This article will focus on syntax, and experiment with a few alternatives.

The SQL syntax has some well known problems, most of them common in commercial languages from the 80s:

  • it's way too complex and inflexible

  • it's way too verbose

  • it's not composed of small blocks

  • small blocks do not compose into large instructions

  • the order it's written is inconsistent, doesn't match the semantics, doesn't work for partial program analysis, and allows for common and very harmful mistakes

  • extending the semantics requires adding complexity to the syntax

So, let's try changing it and see what it becomes.

Here is a simple select query (on a Postgres idiom), an update, an insertion and a deletion:

select person.id as person_id, person.name, email.id as email_id, email.address as email
from person join email on person.id = email.person_id
where person.is_active;

update person
set is_active = false
where id = 4;

insert into email (person_id, address)
values (3, 'totally_not_fake@example.com');

delete from email
where person_id = 3;

Let's write those queries in a language that resembles the Haskell syntax:

select person
	`join` email `on` (person.id = email.person_id)
	`where` person.is_active
	`values` [
		  person.id `as` person_id
		, person.name
		, email.id `as` email_id
		, email.address `as` email
		]
	
update person 
	[is_active = False]
	`where` id = 4
	
insert email
	[person_id = 3, address = "totallynot_fake@example.com"]
	
delete email
	`where` person_id = 3

It may not look like much changed, but this syntax avoids the problems from the begining of this article.

For example, it's implicit there that select person is a query by itself, that is being further composed. And at the end of each line, it has a clear meaning that only depends on the previous lines.

On the other hand, the backticks are horrible. They would go away if the words were replaced by what the language understands as operators.

Allowing for completely custom operators does lead into an incredibly bad developer experience, where people must check the properties of each and every symbol they use. Adding Haskell-like operators would lead to a very Haskell-like code, that IMO, is not an improvement either. A middle-way would be to use operator to compose text-led definitions:

select person
	<> join email `on` person.id = email.person_id
	<> where person.is_active
	<> values [person.id `as` person_id]

That is still very Haskell-like (rename the (=), (.), and where, and it could be Haskell), has very little operator noise (rename (<>) and it would be even better), and very little backticks.

Create some operators for on and as and the backticks would be gone at the expensive of a small increase in operator noise.

Still, those are many more operators than the minimum. Besides, the Haskell implementation of both semantic whitespace and baces/semicolon works very badly in shell-like REPLs (and database access really need those).


There's a niche data encoding language created exactly to be usable in a shell-like environment, named KDL. Let's try to write those queries as data (the Haskell one is data too) in a syntax that would be the result of adding expressions to it:

select person {
	join email on = (person.id = email.person_id)
	where person.is_active
	values {
		get person.id as = person_id
		get person.name
		get email.id as = email_id
		get email.address as = email
	}
}

update person {
	set is_active = false name = "DELETED"
	where (id = 4)
}

insert email {
	set person_id = 3 address = "tottally_not_fake@example.com"
}

delete email {
	where (person_id = 3)
}

That's almost good. The language is missing bare values and inline collections. The result becomes verbose as a consequence. So let's add those.

The attribute-value separator and value equality being the same symbol is also a problem. Let's pick another separator and use colon as term separator (instead of the orignal semicolon):

select person {
	join email on: (person.id = email.person_id)
	where person.is_active
	values [
		person.id as: person_id
		person.name
		email.id as: email_id
		email.address as: email
	]
}

update person {
	set [is_active = false, name = "DELETED"]
	where (id = 4)
}

insert email {
	set [person_id = 3, address = "tottally_not_fake@example.com"]
}

delete email {
	where (person_id = 3)
}

That loses compatibility with the original language, but is quite good.

There are more parenthesis than it should, but I see no way to fix those right now.


Another common data representation language is Lisp (the dots have to compose the symbols, otherwise it would be awful):

(select
	(join
		person
		(on email (= person.id email.person_id))
	)
	(where person.is_active)
	(values
		(as person.id 'person_id)
		person.name
		(as email.id 'email_id)
		(as email.address 'email)
	)
)

(update person
	(set
		(is_active false)
	)
	(where (= id 4))
)

(insert email
	(set
		(person_id 3)
		(address "tottally_not_fake@example.com")
	)
)

(delete email
	(where (= person_id 3))
)

And well, parenthesis abound.

Otherwise it's quite nice. Except for the dots thing, this is valid Lisp. (Should I just use Lisp?)