vx company
terug naar overzicht


Insight insight

rob van den berg, vx company

Rob van den Berg

+31 6 22 98 69 27


Humble SQL

For those who love maths, there’s no escape. In his pleasantly entertaining style, Matt Parker has conquered the world’s stages. Many an after-dinner talk has been cheered up by his show which mixes up jokes on math with actual mathematical knowledge. On a more abstract level, the book is about sharing what can be learned from mistakes.

Do find more about him on his Stand-Up Maths Youtube channel.

Humble PI Transparent BG.pngHumble Pi by Matt Parker

Also, have a look at his recent book Humble Pi. Maybe even buy yourself a copy. I did so and can’t stop reading. OK, well, enough of this free advertising… #notspon. You now should have picked up the reason why I went for the remarkable blog post title though.

I guess I love both maths and SQL. And coming to think of it, when visiting Oracle conferences like Tech Experience or UKOUG Technology Conference, sessions tending to a comedy act on real-world SQL problems do appear to stick the best. And luckily, the SQL-subject presenters of these conferences have been doing quite well  on that!

What does it take to present on a SQL subject in a comedian style? Besides the obvious presenter skills, you would need real world SQL experience, like actually having been involved in a SQL problem, preferably in context of a bug in an application in production. Also, you would need to be able to camouflage more than a bit. That’s because problem owners obviously don’t like te be discomforted by being confronted with mistakes in their organisation in public. This is not an easy task. Let me try with one realistic example though.

The customer demanded a report with a preface mentioning the total number of “inhabitant registrations” (note the camouflage) followed by a more detailed paragraph. Which had to feature a table of subtotals representing the amount of people in a group of age ranges. Just like the example below:

The research includes 127,015,265 inhabitant registrations. These can be divided into the following age ranges:

Ages Subtotal
0-1 170023
2-5 2396859
6-10 3605589
11-20 8248278
21-50 48913234
51-100 52634295
101-200 5337274

The number mentioned in the first paragraph had been calculated by counting all the records in the REGISTRATIONS table. The subsequent table had been calculated by a query like this

SELECT  rng.range
,       COUNT(*) subtotal
FROM    registrations reg
JOIN    ranges rng
ON      (MONTHS_BETWEEN(SYSDATE, prs.date_of_birth)/12)
        BETWEEN rng.lower_bound AND rng.upper_bound
WHERE   rng.range_type = 'Age ranges'
GROUP   BY rng.range

At first it seemed there were no problems. But soon it turned out that the numbers in the table didn’t add up to the total mentioned in the preface. In fact it was off by an amount of 5,709,713 ‘missing values’, which couldn’t sufficiently be explained by the usual “bad data” excuse (like nullified or out of range values for the referenced date_of_birth column).

To give you a hint on what had happened: suppose one calculates the age of a person born just over fifty years ago. That would result in a number between 50 and 51. But wait, that individual would not fall into the range of ages between 21 and 50, nor between 51 and 100… Of course, the query join expression calculating the age suffered from a missing TRUNCATE function.

ON      TRUNC(MONTHS_BETWEEN(SYSDATE, prs.date_of_birth)/12)
        BETWEEN rng.lower_bound AND rng.upper_bound

Not so much a hilarious example, however it does demonstrate how sharing your mistakes can have a learning effect on others!


Meer weten over dit onderwerp?

rob van den berg, vx company
Neem contact op met Rob van den Berg
gang van het kantoor, vx company