The Winner Takes It All from A3Ba

Door Rob van den Berg In Oracle

This article is about sorting and min-max aggregations on character data. When sorting matters, you end your query with an order by. You might think that if you’re sorting on character data, the first value listed would equal the aggregated minimum, and the last value listed would equal the aggregated maximum.

Sorting on character data is different from sorting on numeric data, in that there are a lot less assumptions involved in the latter. Numeric sorting is based on simple mathematic conventions which are the same anywhere on the planet. Character sorting is someting else, which will be demonstrated below.

California-Redwoods

Maximum tree height manifestation in Redwood CA

Consider this select statement:

alter session set nls_language = 'AMERICAN';

prompt American
select char_value "Chr"
, min (char_value) over () "Min"
, max (char_value) over () "Max"
, first_value(char_value) over (order by char_value) "First"
, last_value(char_value) over (order by char_value 
                               rows between unbounded preceding
                               and unbounded following) "Last"
from   ( select cast(column_value as varchar2(5)) char_value 
         from table(dbmsoutput_linesarray('A','3','B','a'))
       )
order by char_value
/

Chr   Min   Max   First Last 
----- ----- ----- ----- -----
3     3     a     3     a    
A     3     a     3     a    
B     3     a     3     a    
a     3     a     3     a    

select parameter, value
from v$nls_parameters
where parameter in ('NLS_SORT')
order by 1
/

PARAMETER  VALUE 
---------- ------
NLS_SORT   BINARY 

Now, do note I started my session with setting my NLS_LANGUAGE to America, which happens to default my sorting method to ‘BINARY’. This might not be the case for other languages, like it won’t for Dutch. Dutch sorting means that a an uppercase “B” comes after lowercase “a”, and numbers follow alfabetic characters.

As a sidenote, I converted the character found in the array [A,3,B,a] to VARCHAR2(5) just for the convenience of listing the results in an equally “spaced” fashion. No magic intended there.

To make sure that “Last_value” considers all values found in the array, I added the “unbounded both ways” analytical window.

Now, if your development environment is set up to initialize your NLS settings in “DUTCH” and your production environment in (for example) “American”, you’re in for a surprise.

When setting the language to “DUTCH”, the script above returns:

Chr   Min   Max   First Last 
----- ----- ----- ----- -----
A     3     a     A     3    
a     3     a     A     3    
B     3     a     A     3    
3     3     a     A     3    

PARAMETER VALUE                                                           
--------- ------
NLS_SORT  DUTCH

What we see is that the order of the rows listed confirm that in “DUTCH” sorting, numbers follow alfabetic characters, and lowercase characters and uppercase characters appear ‘mixed up’. Consistent with that order are the values return for the first_value and last_value functions. However, the aggregated maximum and minumum remain the same as in the previous example! It means BINARY SORTING remains in effect with respect to aggregation, regardless of the language setting.

To conclude, this article has been set up to stress the importance of consistent language settings within your development infrastructure.

Meer informatie

rob-van-den-berg1

Rob van den Berg

Oracle Database Consultant

+31 6 22 98 69 27 Stuur Rob een e-mail

Reacties

Er zijn nog geen reacties op dit bericht.

Plaats een reactie

Dit veld is verplicht.

Vul een geldig e-mailadres in.

Dit veld is verplicht.