Understanding SQL Server Collation Sequences
A DBA Primer : Collations
Collation is concerned with how character data is interpreted by SQL Server. Until you run into a problem concerning them, you’re probably blissfully unaware of their existence. The following error is easy to generate by joining columns of different collations.
Msg 468, Level 16, State 9, Line 1
Cannot resolve the collation conflict
between "SQL_Latin1_General_CP1_CI_AS" and "Latin1_General_CI_AS" in the equal
to operation.
A collation ties together :
1) Code Page
This is a single-byte character set that represents the alphabet, punctuation & symbols of a language. Data types using a code page are char, varchar & ntext.
Code page does not apply to double-byte (Unicode) characters.
2) Sort Order for
Unicode data types.
Unicode file types require double byte storage. Data types are nchar , nvarchar and ntext.
3) Sort order for code
page characters
Sort order itself is affected by :
Case SensitivityIn comparison and sorting operations, are uppercase and lowercase characters considered equal? In a Case Sensitive collation 'A' <> 'a', 'Cat' <> 'CAT' etc.
Accent Sensitivity
In comparison and sorting operations, are accented and unaccented characters considered equal? Does 'a' = '?', 'Bronte' = 'Brontë'?
Kana sensitivity
Japan has 2 types of characters Hiragana and Katakana.
In comparison and sorting operations, are Kana sensitivite collation treated them as equal?
Width sensitivity
In comparison and sorting operations, is a single byte character equal to it's double-byte representation?
Collation naming reveals all this information.
For example, the Latin1_General_CI_AS collation is a Latin code page that is Case Insensitive (CI) and Accent Sensitive (AS).
Exploring CollationsStep 1 : What Collation is SQL installed under?
SELECT SERVERPROPERTY('COLLATION')
Step 2 : Which collations are available to me?
SELECT Name, Description FROM fn_helpcollations()
Step 3 : Which databases have a different collation to the server default?
SELECT
NAME
AS DATABASE_NAME
,
DATABASEPROPERTYEX(NAME,'COLLATION') AS DBCOLLATION
,
SERVERPROPERTY('COLLATION') AS SERVERCOLLATION
FROM SYS.DATABASES
WHERE
CONVERT(SYSNAME,DATABASEPROPERTYEX(NAME,'COLLATION')) <>
SERVERPROPERTY('COLLATION')
SELECT
C.TABLE_CATALOG
AS DATABASE_NAME
,C.TABLE_SCHEMA
,C.TABLE_NAME
,C.COLUMN_NAME
,DATA_TYPE
,SERVERPROPERTY('COLLATION')
AS SERVER_COLLATION
,CONVERT(SYSNAME,DATABASEPROPERTYEX(D.NAME,'COLLATION'))
AS DATABASE_COLLATION
,C.COLLATION_NAME
AS COLUMN_COLLATION
FROM INFORMATION_SCHEMA.COLUMNS C
INNER JOIN SYS.DATABASES D
ON DB_ID(C.TABLE_CATALOG) =
DB_ID(D.NAME)
WHERE DATA_TYPE IN ('VARCHAR'
,'CHAR','NVARCHAR','NCHAR','TEXT','NTEXT')
This produces two results sets :
1) Databases where collation is different from the server setting
2) Columns where collation is different from the database setting
(Database_Name SYSNAME
,Table_Schema SYSNAME
,Table_Name SYSNAME
,Column_Name SYSNAME
,Server_Collation SYSNAME
,Database_Collation SYSNAME
,Column_Collation SYSNAME)
DECLARE dbcursor CURSOR FOR
select name from sys.databases
OPEN dbcursor
FETCH NEXT FROM dbcursor INTO @dbname
WHILE @@FETCH_STATUS = 0
BEGIN
print @dbname
(Database_Name
,Table_Schema
,Table_Name
,Column_Name
,Server_Collation
,Database_Collation
,Column_Collation)
C.TABLE_CATALOG
AS DATABASE_NAME
,C.TABLE_SCHEMA
,C.TABLE_NAME
,C.COLUMN_NAME
,CONVERT(VARCHAR,SERVERPROPERTY(''COLLATION''))
AS SERVER_COLLATION
,CONVERT(SYSNAME,DATABASEPROPERTYEX(D.NAME,''COLLATION''))
AS DATABASE_COLLATION
,C.COLLATION_NAME
AS COLUMN_COLLATION
FROM [' + @dbname +
'].INFORMATION_SCHEMA.COLUMNS C
INNER JOIN SYS.DATABASES D
ON DB_ID(C.TABLE_CATALOG) =
DB_ID(D.NAME)
WHERE DATA_TYPE IN (''VARCHAR''
,''CHAR'',''NVARCHAR'',''NCHAR'',''TEXT'',''NTEXT'')
'
exec sp_executesql @SQL
print @sql
FETCH NEXT FROM dbcursor INTO @dbname
END
DEALLOCATE dbcursor
SELECT DISTINCT * FROM
#CollationComparison WHERE Column_Collation <> Database_Collation
So, you've got collation differences. Now what?
Well you can deal with them at the query level, e.g;
SELECT columnlist
FROM table
WHERE searchedcolumn COLLATE
SQL_Latin1_General_CP1_CS_AS = 'Searched Text'
FROM table1
LEFT OUTER JOIN table2
ON table1.textid = table2.textid
COLLATE SQL_Latin1_General_CP1_CI_AI
NB : Using Collate in this way will
prevent the query optimiser from using optimal indexes and creating an efficient
execution plan.
If you can change the database schema, you can ....
Change Collation of a
column -
ALTER TABLE tablename
ALTER COLUMN columnname datatype
COLLATE collationname
e.g,
ALTER TABLE Person.Contact
ALTER COLUMN Lastname
COLLATE Latin1_General_CS_AS
Change Collation of a
database -
There are 2 approaches to this problem
1) Create a second database by scripting the first and transferring the data. Knowledgebase Article 325335 describes this.
2) IF you are Using Sql 2005 SP2+, you can change your existing database directly.
Firstly change the collation setting –
ALTER DATABASE [adventureworks] COLLATE
Latin1_General_CS_AS
Then change each column individually using the ALTER TABLE ... ALTER COLUMN... syntax.
Thankfully, there are plenty of scripts around to help
automate this e.g.
here
Change the default
collation of a server
You really don't want to do this (though instructions are here). Reinstalling Sql Server is less time consuming in my opinion.
Hopefully this will help you get an insight into Collation settings in your environment and go some way towards locating the cause of collation issues.
End of Technical Article
SQL Server Club is a free community service from Norb Technologies - Making SQL Server Faster - www.norbtechnologies.com