Friday, February 24, 2012

Index Column Order of Operations

I am trying to understand proper index column ordering for a covering index
when the columns are used in a WHERE, JOIN, and SELECT clauses. I understand
the need for proper index column ordering, but I do not understand if there
is an order of operations, meaning do the WHERE columns are first, followed
by JOIN columns, and lastly by SELECT list columns? Consider the following:
For example, let's say this is my table in it's original form, and I want to
create a nonclustered covering index for the SQL Statement below.
CREATE TABLE dbo.MyTable(
MyTable_UID int IDENTITY(1,1) NOT NULL,
Note varchar(25) NOT NULL,
Display int NOT NULL,
DetailFID int NOT NULL,
CodeFID int NOT NULL,
Sequence int NULL
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX IDX_CL_MyTable ON dbo.MyTable(MyTable_UID) ON
PRIMARY
GO
SQL STATEMENT:
SELECT mt.Note
FROM dbo.MyTable mt
JOIN dbo.ChargeCodes cc
ON cc.Detail_UID = mt.DetailFID
AND cc.Display = mt.Display
JOIN dbo.MasterCodes mc
ON mc.Code_UID = mt.CodeFID
AND mc.Display = mt.Display
WHERE
mt.Sequence = 1
When determining column ordering for a nonclustered covering index, should
you:
1. Make the WHERE clause column first, followed by JOIN columns, followed by
columns in the SELECT list, resulting in
CREATE NONCLUSTERED INDEX IDX_NCL_MyTable_Sequence
ON dbo.MyTable(Sequence, DetailFID, Display, CodeFID, Note) ON PRIMARY
2. Or, should it be by the most selective column to the least selective?
3. Or, does it depend? And in that case, what are the dependencies?
Message posted via droptable.com
http://www.droptable.com/Uwe/Forums...erver/200610/1
cbrichards via droptable.com wrote:
> I am trying to understand proper index column ordering for a covering index
> when the columns are used in a WHERE, JOIN, and SELECT clauses. I understand
> the need for proper index column ordering, but I do not understand if there
> is an order of operations, meaning do the WHERE columns are first, followed
> by JOIN columns, and lastly by SELECT list columns? Consider the following:
> For example, let's say this is my table in it's original form, and I want to
> create a nonclustered covering index for the SQL Statement below.
> CREATE TABLE dbo.MyTable(
> MyTable_UID int IDENTITY(1,1) NOT NULL,
> Note varchar(25) NOT NULL,
> Display int NOT NULL,
> DetailFID int NOT NULL,
> CodeFID int NOT NULL,
> Sequence int NULL
> ) ON [PRIMARY]
> GO
> CREATE CLUSTERED INDEX IDX_CL_MyTable ON dbo.MyTable(MyTable_UID) ON
> PRIMARY
> GO
> SQL STATEMENT:
> SELECT mt.Note
> FROM dbo.MyTable mt
> JOIN dbo.ChargeCodes cc
> ON cc.Detail_UID = mt.DetailFID
> AND cc.Display = mt.Display
> JOIN dbo.MasterCodes mc
> ON mc.Code_UID = mt.CodeFID
> AND mc.Display = mt.Display
> WHERE
> mt.Sequence = 1
>
> When determining column ordering for a nonclustered covering index, should
> you:
> 1. Make the WHERE clause column first, followed by JOIN columns, followed by
> columns in the SELECT list, resulting in
> CREATE NONCLUSTERED INDEX IDX_NCL_MyTable_Sequence
> ON dbo.MyTable(Sequence, DetailFID, Display, CodeFID, Note) ON PRIMARY
> 2. Or, should it be by the most selective column to the least selective?
> 3. Or, does it depend? And in that case, what are the dependencies?
>
The goal is to eliminate unwanted rows as quickly as possible.
Generally, the columns referenced in the WHERE clause should be the
leading edge columns of your index(es). Experiment a little bit,
compare the execution plan that is produced with different combinations
of indexes and query structures.
Tracy McKibben
MCDBA
http://www.realsqlguy.com
|||"cbrichards via droptable.com" wrote:
> I am trying to understand proper index column ordering for a covering index
> when the columns are used in a WHERE, JOIN, and SELECT clauses. I understand
> the need for proper index column ordering, but I do not understand if there
> is an order of operations, meaning do the WHERE columns are first, followed
> by JOIN columns, and lastly by SELECT list columns? Consider the following:
> For example, let's say this is my table in it's original form, and I want to
> create a nonclustered covering index for the SQL Statement below.
> CREATE TABLE dbo.MyTable(
> MyTable_UID int IDENTITY(1,1) NOT NULL,
> Note varchar(25) NOT NULL,
> Display int NOT NULL,
> DetailFID int NOT NULL,
> CodeFID int NOT NULL,
> Sequence int NULL
> ) ON [PRIMARY]
> GO
> CREATE CLUSTERED INDEX IDX_CL_MyTable ON dbo.MyTable(MyTable_UID) ON
> PRIMARY
> GO
> SQL STATEMENT:
> SELECT mt.Note
> FROM dbo.MyTable mt
> JOIN dbo.ChargeCodes cc
> ON cc.Detail_UID = mt.DetailFID
> AND cc.Display = mt.Display
> JOIN dbo.MasterCodes mc
> ON mc.Code_UID = mt.CodeFID
> AND mc.Display = mt.Display
> WHERE
> mt.Sequence = 1
> When determining column ordering for a nonclustered covering index, should
> you:
> 1. Make the WHERE clause column first, followed by JOIN columns, followed by
> columns in the SELECT list, resulting in
> CREATE NONCLUSTERED INDEX IDX_NCL_MyTable_Sequence
> ON dbo.MyTable(Sequence, DetailFID, Display, CodeFID, Note) ON PRIMARY
> 2. Or, should it be by the most selective column to the least selective?
> 3. Or, does it depend? And in that case, what are the dependencies?
What Tracy said is true. The optimizer will try to elimate as many rows
as quickly as possible. Whether this is done with the WHERE clause or
with one of the two joins depends on your data.
If your WHERE clause tries to do an exact match (such as in your
example), then it can be useful to use that column (or columns) followed
by the join columns of most restricting join that will follow. In that
case, make sure that the join columns are in order of the index defined
on the referenced table. So in your example, the join with ChargeCodes
should be more restricting that the join with MasterCodes. Also, the
primary on ChargeCode should be on (DetailFID, Display), in that
particular order.
If your WHERE clause will result in a selection or range of values (for
example WHERE ms.Sequence > 100) then the columns that follow in the
index definition will probably not be used for a loop join or merge
join.
In general, once the set of columns from left to right will result in
multiple values, then the order of the remaining columns are no longer
important, because for fast execution SQL Server will have to reorder
the values anyway.
The simplest approach is to create several indexes and check the query
plan to see which is used.
If you use SQL Server 2005, then note that you can include the
additional columns, instead of adding them to each key.
For example:
CREATE INDEX .. ON MyTable(Sequence, DetailFID, Display) INCLUDE
(CodeFID, Note)
CREATE INDEX .. ON MyTable(Sequence, CodeFID, Display) INCLUDE
(DetailFID, Note)
HTH,
Gert-Jan

No comments:

Post a Comment