Friday, March 9, 2012

NEWSEQUENTIALID sample code

We're currently converting the database of our asset management system to a SQL2005-database. The SQL2000 database uses an INT32 as the primary key with a clustered index on that column on most tables. Needless to say that the current environment is not ideal for replication, so we're investigating the use of GUIDs as primary keys.

We know the usual recommendations as far as GUIDs are concerned (4 times bigger, slower, not good for clustered indexes, yet better for replication). However, on one of the ascend-program training days one of my colleagues talked to one of the instructors and that person suggested that the use and performance of GUIDs is significantely improved in SQL2005 and that they have become the preferred datatype for primary keys if the database is used in replication.

I have tried to find some information about how GUIDs are improved in SQL2005 but I don't find any documents containing anything relevant. The only thing I've found is the documentation of the NEWSEQUENTIALID function which doesn't really tell me much. It just looks like this function improves the performance during an insert if a GUID is used on a clustered index. I have noticed for example that SharePoint only uses GUIDs for most tables; so the performance should be all that bad.

So I was wondering:

1. has the performance of GUIDs improved?

2. are there new suggestions concerning the datatype of the primary key column if the database might be used in a merge replication environment (all sites can do modifications) on SQL2005?

3. what about the performance of NEWSEQUENTIALID? Is there a bigger chance of running into duplicates using NEWSEQUENTIALID?

Regards,

Michael

NEWSEQUENTIALID is derived based on several hashes, one of which is the network card. Since no two network cards are identical, duplicates should not occur. However should you remove your network card, there's a slight chance you can get the same value as another machine whose network card has been removed, but this is highly unlikely.

Since NEWSEQUENTIALID values are incrementing as they're created, clustered index performance should improve as they'll be inserted ascending and in sort order. You also have less page splits, thus less fragmentation.

So, is NEWSEQUENTIALID a good candidate for a primary key? NEWSEQUENTIALID can only be used as a default constraint. that means if this is your primary key, and you have to update it, you can only use NEWID as the new value. Or, you can do delete/insert. If you do a lot of updates to your primary key, you'll be doing a lot of page splits introducing fragmentation.

|||How does the program get the NEWSEQUENTIALID value? In other words, the code has:
sqlCmd = "insert ..."
ExecuteNonQuery
If you have foreign key references, you'll need to retrieve your NEWSEQUENTIALID value and pass it as the foreign key value into the child table(s).|||Just use the OUTPUT clause of the INSERT statement.|||

For a parent child relationship, you can do this:

CREATE TABLE Employee(
EmployeeID uniqueidentifier NOT NULL DEFAULT NEWSEQUENTIALID(),
EmployeeName nchar(10) NOT NULL
)

insert Employee (EmployeeName)
output Inserted.EmployeeID
values ('Ima Person')

The OUTPUT clause essentially works like a SELECT statement. You can retrieve multiple values with an OUTPUT clause.

Alternatively, you may wish to use this syntax for CREATE TABLE:

CREATE TABLE Employee(
EmployeeID uniqueidentifier NOT NULL DEFAULT NEWSEQUENTIALID() ROWGUIDCOL,
EmployeeName nchar(10) NOT NULL
)

No comments:

Post a Comment