Monday, January 6, 2014

TSQL - Indexing Notes

These are a few ideas I wanted to jot down while designing a snowflake data mart.

A clustered index determines the physical order of data in a table. Ordered values received non-contiguously can cause fragmentation.

A clustered index is efficient when included columns are searched for ranges of values, or when finding a specific row when the indexed value is unique.

There can only be one clustered index to rule the table.

A clustered index cannot be applied to a table variable.

Recall that all columns in the clustered index are always included in all other indexes on that table. The columns defined in the clustered index are appended to the end if the other indices.

The PK creates a clustered index by default, as long as no other clustered index resides on the table at the time of the PK creation. You can use the NONCLUSTERED key word in the PK definition to override this.

In a snow-flake data mart, the fact PK can be comprised of the combination of dimensional FKs. A fact can have up to 20 dimensions generally. You would not want to have the PK clustered in this case, as it would be too wide. It would take too much space and may limit other indexing strategies for other query requirements.

Create the PK as such

CONSTRAINT [PK_FacilityFact] PRIMARY KEY NONCLUSTERED ([ClientID], [EfftiveDate], [FacilityID], [AddressID], [PatientID])

And add the clustered index as such
CREATE CLUSTERED INDEX [IX_FacilityFact_DateClient] ON [dbo].[FacilityFact] ([EffectiveDate], [ClientID])

Avoid adding the columns in the clustered index to additional indices.
--will include [EffectiveDate], [ClientID] from clustered index
CREATE INDEX [IX_FacilityFact_Address] ON [dbo].[FacilityFact] (AddressID)