This article is about sorting and min-max aggregations on character data. When sorting matters, you end your query with an order by. You might think that if you’re sorting on character data, the first value listed would equal the aggregated minimum, and the last value listed would equal the aggregated maximum.
Sorting on character data is different from sorting on numeric data, in that there are a lot less assumptions involved in the latter. Numeric sorting is based on simple mathematic conventions which are the same anywhere on the planet. Character sorting is someting else, which will be demonstrated below.
Maximum tree height manifestation in Redwood CA
Consider this select statement:
alter session set nls_language = 'AMERICAN'; prompt American select char_value "Chr" , min (char_value) over () "Min" , max (char_value) over () "Max" , first_value(char_value) over (order by char_value) "First" , last_value(char_value) over (order by char_value rows between unbounded preceding and unbounded following) "Last" from ( select cast(column_value as varchar2(5)) char_value from table(dbmsoutput_linesarray('A','3','B','a')) ) order by char_value / Chr Min Max First Last ----- ----- ----- ----- ----- 3 3 a 3 a A 3 a 3 a B 3 a 3 a a 3 a 3 a select parameter, value from v$nls_parameters where parameter in ('NLS_SORT') order by 1 / PARAMETER VALUE ---------- ------ NLS_SORT BINARY
Now, do note I started my session with setting my NLS_LANGUAGE to America, which happens to default my sorting method to ‘BINARY’. This might not be the case for other languages, like it won’t for Dutch. Dutch sorting means that a an uppercase “B” comes after lowercase “a”, and numbers follow alfabetic characters.
As a sidenote, I converted the character found in the array [A,3,B,a] to VARCHAR2(5) just for the convenience of listing the results in an equally “spaced” fashion. No magic intended there.
To make sure that “Last_value” considers all values found in the array, I added the “unbounded both ways” analytical window.
Now, if your development environment is set up to initialize your NLS settings in “DUTCH” and your production environment in (for example) “American”, you’re in for a surprise.
When setting the language to “DUTCH”, the script above returns:
Chr Min Max First Last ----- ----- ----- ----- ----- A 3 a A 3 a 3 a A 3 B 3 a A 3 3 3 a A 3 PARAMETER VALUE --------- ------ NLS_SORT DUTCH
What we see is that the order of the rows listed confirm that in “DUTCH” sorting, numbers follow alfabetic characters, and lowercase characters and uppercase characters appear ‘mixed up’. Consistent with that order are the values return for the first_value and last_value functions. However, the aggregated maximum and minumum remain the same as in the previous example! It means BINARY SORTING remains in effect with respect to aggregation, regardless of the language setting.
To conclude, this article has been set up to stress the importance of consistent language settings within your development infrastructure.