Understanding SQL Server Collation Sequences

A DBA Primer : Collations

Collation is concerned with how character data is interpreted by SQL Server. Until you run into a problem concerning them, you’re probably blissfully unaware of their existence. The following error is easy to generate by joining columns of different collations.

Msg 468, Level 16, State 9, Line 1

Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "Latin1_General_CI_AS" in the equal to operation.

A collation ties together :

1) Code Page

This is a single-byte character set that represents the alphabet, punctuation & symbols of a language. Data types using a code page are char, varchar & ntext.

Code page does not apply to double-byte (Unicode) characters.

2) Sort Order for Unicode data types.

Unicode file types require double byte storage. Data types are nchar , nvarchar and ntext.

3) Sort order for code page characters

Sort order itself is affected by :

Case Sensitivity

In comparison and sorting operations, are uppercase and lowercase characters considered equal? In a Case Sensitive collation 'A' <> 'a', 'Cat' <> 'CAT' etc.

Accent Sensitivity

In comparison and sorting operations, are accented and unaccented characters considered equal? Does 'a' = '?', 'Bronte' = 'Brontë'?

Kana sensitivity

Japan has 2 types of characters Hiragana and Katakana.

In comparison and sorting operations, are Kana sensitivite collation treated them as equal?

Width sensitivity

In comparison and sorting operations, is a single byte character equal to it's double-byte representation?

Collation naming reveals all this information.

For example, the Latin1_General_CI_AS collation is a Latin code page that is Case Insensitive (CI) and Accent Sensitive (AS).

Exploring Collations

Step 1 :  What Collation is SQL installed under?

SELECT SERVERPROPERTY('COLLATION')

Step 2 : Which collations are available to me?

SELECT Name, Description FROM fn_helpcollations()

Step 3 : Which databases have a different collation to the server default?

SELECT

 NAME AS DATABASE_NAME

 , DATABASEPROPERTYEX(NAME,'COLLATION') AS DBCOLLATION

 , SERVERPROPERTY('COLLATION') AS SERVERCOLLATION

FROM SYS.DATABASES

WHERE CONVERT(SYSNAME,DATABASEPROPERTYEX(NAME,'COLLATION')) <> SERVERPROPERTY('COLLATION')

 Step 4 : Show me the collation for each column in my database

SELECT

 C.TABLE_CATALOG AS DATABASE_NAME

 ,C.TABLE_SCHEMA

 ,C.TABLE_NAME

 ,C.COLUMN_NAME

 ,DATA_TYPE

 ,SERVERPROPERTY('COLLATION') AS SERVER_COLLATION

 ,CONVERT(SYSNAME,DATABASEPROPERTYEX(D.NAME,'COLLATION')) AS DATABASE_COLLATION

 ,C.COLLATION_NAME AS COLUMN_COLLATION

FROM INFORMATION_SCHEMA.COLUMNS C

INNER JOIN SYS.DATABASES D

ON DB_ID(C.TABLE_CATALOG) = DB_ID(D.NAME)

WHERE DATA_TYPE IN ('VARCHAR' ,'CHAR','NVARCHAR','NCHAR','TEXT','NTEXT')

 Step 5 : Show me differences in collation settings on my server.

This produces two results sets :

1) Databases where collation is different from the server setting

2) Columns where collation is different from the database setting

 IF EXISTS (SELECT * FROM TEMPDB.DBO.SYSOBJECTS WHERE ID = OBJECT_ID('tempdb.dbo.#CollationComparison')) DROP TABLE #CollationComparison

 CREATE TABLE #CollationComparison

(Database_Name SYSNAME

,Table_Schema SYSNAME

,Table_Name SYSNAME

,Column_Name SYSNAME

,Server_Collation SYSNAME

,Database_Collation SYSNAME

,Column_Collation SYSNAME)

 DECLARE @SQL NVARCHAR(MAX)

 DECLARE @dbname NVARCHAR(200)

DECLARE dbcursor CURSOR FOR

select name from sys.databases

OPEN dbcursor

FETCH NEXT FROM dbcursor INTO @dbname

WHILE @@FETCH_STATUS = 0

BEGIN

print @dbname

 SET @SQL = 'INSERT INTO #CollationComparison

 (Database_Name

 ,Table_Schema

 ,Table_Name

 ,Column_Name

 ,Server_Collation

 ,Database_Collation

 ,Column_Collation)

 SELECT

 C.TABLE_CATALOG AS DATABASE_NAME

 ,C.TABLE_SCHEMA

 ,C.TABLE_NAME

 ,C.COLUMN_NAME

 ,CONVERT(VARCHAR,SERVERPROPERTY(''COLLATION'')) AS SERVER_COLLATION

 ,CONVERT(SYSNAME,DATABASEPROPERTYEX(D.NAME,''COLLATION'')) AS DATABASE_COLLATION

 ,C.COLLATION_NAME AS COLUMN_COLLATION

FROM [' + @dbname + '].INFORMATION_SCHEMA.COLUMNS C

INNER JOIN SYS.DATABASES D

ON DB_ID(C.TABLE_CATALOG) = DB_ID(D.NAME)

WHERE DATA_TYPE IN (''VARCHAR'' ,''CHAR'',''NVARCHAR'',''NCHAR'',''TEXT'',''NTEXT'')

'

exec sp_executesql @SQL

print @sql

FETCH NEXT FROM dbcursor INTO @dbname

END

 CLOSE dbcursor

DEALLOCATE dbcursor

 SELECT DISTINCT Server_Collation,Database_Collation,Database_Name FROM #CollationComparison WHERE Server_Collation <> Database_Collation

SELECT DISTINCT * FROM #CollationComparison WHERE Column_Collation <> Database_Collation

So, you've got collation differences. Now what?

Well you can deal with them at the query level, e.g;

 in a WHERE clause ...

SELECT columnlist

FROM table

WHERE searchedcolumn COLLATE SQL_Latin1_General_CP1_CS_AS = 'Searched Text'

 Or  in a JOIN

 SELECT columnlist

FROM table1

LEFT OUTER JOIN table2

ON table1.textid = table2.textid COLLATE SQL_Latin1_General_CP1_CI_AI

NB : Using Collate in this way will prevent the query optimiser from using optimal indexes and creating an efficient execution plan.

If you can change the database schema,  you can ....

Change Collation of a column -

ALTER TABLE tablename

ALTER COLUMN columnname datatype

COLLATE collationname

e.g,

ALTER TABLE Person.Contact

ALTER COLUMN Lastname

COLLATE Latin1_General_CS_AS

Change Collation of a database -

There are 2 approaches to this problem

1) Create a second database by scripting the first and transferring the data. Knowledgebase Article 325335 describes this.

2) IF you are Using Sql 2005 SP2+, you can change your existing database directly.

Firstly change the collation setting –

ALTER DATABASE [adventureworks] COLLATE Latin1_General_CS_AS

Then change each column individually using the ALTER TABLE ... ALTER COLUMN... syntax.

Thankfully, there are plenty of scripts around to help automate this  e.g. here

Change the default collation of a server

You really don't want to do this (though instructions are here). Reinstalling Sql Server is less time consuming in my opinion.

Hopefully this will help you get an insight into Collation settings in your environment and go some way towards locating the cause of collation issues.

End of Technical Article

SQL Server Club is a free community service from Norb Technologies - Making SQL Server Faster - www.norbtechnologies.com