Wednesday, March 28, 2012

Index scan

If i have only one index and that being a clustered index on say a datetime
column namely date1 and if my query is
select col1,col2 from table1 where date1 between '2/1/2008' and '2/3/2008'
If I look at the query plan and it says its doing an index scan, does that
mean its actually touching each and every page of that table or will it
somehow start at the page that has the first entry for '2/1/2008' and
continues through the linked list at the leaf level of the pages and stops
after it reaches '2/3/2008' ?
How is this different if instead of a clustered index, its a non clustered
index ?
Thanks
John
It is doing Clustered Index Scan , you meant? If you have CI on the table
that means SQL Server logicaly orders all data by Clustered Index Key.
It depends on the query , an optimizer may or may not decide to do a scan
, for example the table is pretty small .
In this case it scans index pages toread the data which ordered (logicaly)
by date column
In you case I'd suggest t create an index on col1,col2 and dt columns called
COVERING index.

> How is this different if instead of a clustered index, its a non clustered
> index ?
The difference is that clusterd index contains at the bottom level the
actual data , while noclustetred contains pointers to the data pages.
"John Doe" <Johndoe@.jd.com> wrote in message
news:eXI36LscIHA.484@.TK2MSFTNGP06.phx.gbl...
> If i have only one index and that being a clustered index on say a
> datetime column namely date1 and if my query is
> select col1,col2 from table1 where date1 between '2/1/2008' and '2/3/2008'
> If I look at the query plan and it says its doing an index scan, does that
> mean its actually touching each and every page of that table or will it
> somehow start at the page that has the first entry for '2/1/2008' and
> continues through the linked list at the leaf level of the pages and stops
> after it reaches '2/3/2008' ?
> How is this different if instead of a clustered index, its a non clustered
> index ?
> Thanks
>
|||Uri,
In my case, its doing a clustered index scan and wanted to know if as a
result, its touching all the pages that may have dates prior to '2/1' and
after '2/3' as my query is only seeking to obtain data between '2/1/2008 and
'2/3/2008' and as you mentioned that data in the CI is ordered. Let me know
how the storage engine fetches the pages.
Also if it was a non clustered index on date1 instead of a clustered index,
i take it that the leaf level of the non clustered index is also
ordered..right ?
"Uri Dimant" <urid@.iscar.co.il> wrote in message
news:OBuZpSscIHA.3932@.TK2MSFTNGP05.phx.gbl...
> John
> It is doing Clustered Index Scan , you meant? If you have CI on the table
> that means SQL Server logicaly orders all data by Clustered Index Key.
> It depends on the query , an optimizer may or may not decide to do a scan
> , for example the table is pretty small .
> In this case it scans index pages toread the data which ordered (logicaly)
> by date column
> In you case I'd suggest t create an index on col1,col2 and dt columns
> called COVERING index.
>
> The difference is that clusterd index contains at the bottom level the
> actual data , while noclustetred contains pointers to the data pages.
>
> "John Doe" <Johndoe@.jd.com> wrote in message
> news:eXI36LscIHA.484@.TK2MSFTNGP06.phx.gbl...
>
|||John
> Also if it was a non clustered index on date1 instead of a clustered
> index, i take it that the leaf level of the non clustered index is also
> ordered..right ?
Think about CI as phone book which is 'ordered' by LastName ,so if you
want to search by Lastname is easy to traverse and get it
But what if you want to search by NCI -FirstName, then you will have to
page by page which requires an 'extra work'
I'd suggest to take some course or buying a book to underastand the
structure and behaviour.
"John Doe" <Johndoe@.jd.com> wrote in message
news:uT7oOcscIHA.748@.TK2MSFTNGP04.phx.gbl...
> Uri,
> In my case, its doing a clustered index scan and wanted to know if as a
> result, its touching all the pages that may have dates prior to '2/1' and
> after '2/3' as my query is only seeking to obtain data between '2/1/2008
> and '2/3/2008' and as you mentioned that data in the CI is ordered. Let me
> know how the storage engine fetches the pages.
> Also if it was a non clustered index on date1 instead of a clustered
> index, i take it that the leaf level of the non clustered index is also
> ordered..right ?
>
> "Uri Dimant" <urid@.iscar.co.il> wrote in message
> news:OBuZpSscIHA.3932@.TK2MSFTNGP05.phx.gbl...
>
|||On Feb 19, 12:45Xpm, "John Doe" <John...@.jd.com> wrote:
> If i have only one index and that being a clustered index on say a datetime
> column namely date1 and if my query is
> select col1,col2 from table1 where date1 between '2/1/2008' and '2/3/2008'
> If I look at the query plan and it says its doing an index scan, does that
> mean its actually touching each and every page of that table or will it
> somehow start at the page that has the first entry for '2/1/2008' and
> continues through the linked list at the leaf level of the pages and stops
> after it reaches '2/3/2008' ?
> How is this different if instead of a clustered index, its a non clustered
> index ?
> Thanks
With clustered indexes records are stored on the leaf level and
therefore faster but non-clustered indexes leaf level pages have
locations to the page of searched record and therefore slower. HTH.
|||If you have a clustered index, yes, it will go to the first entry in the
table with a datetime between '2/1/2008' and '2/3/2008', do a partial scan
of the clustered index and stop as soon as it finds an datetime >
'2/3/2008'. BTW, it is best in SQL Server to specify datetimes as
'yyyy-mm-ddThh:mm:ss.mmm'
or 'yyyy-mm-ddThh:mm:ss'
or 'yyyymmdd'
When you use a date format like '2/1/2008', that is Feb 1 in some locations
and Jan 2 in others. But '20080201' is Feb 1 everywhere.
For a nonclustered index, the answer is, "it depends". If the index is
nonclustered, then SQL Server knows the rows for any given value of dt might
be scattered throughout the physical table. So, for example, the first row
might be in page 1000, followed by a bunch of rows in other pages, followed
by another row in page 1000. But by this time page 1000 might not be in
memory anymore, so the page must be physically read again (this could
possibly happen many times). So SQL Server attempts to keep statistics on
how many rows are in each range and will attempt to estimate what percentage
of the table your query will return. If it is a small percentage, it will
use your index on dt, go to the first entry in the index with a date >=
'2/1/2008', start there and scan the index until it reaches a row with a dt
> '2/3'2008' and then stop. For each row it finds in the index on dt, it
will then use the clustered index to find the actual row and return your
values. But if it is a large percentage, then SQL Server will just scan the
entire clustered index (that is, it won't use the nonclustered index on dt
at all), because it knows that way it only has to read each physical page in
the clustered index once.
As Uri points out, your nonclustered index can be what is known as a
"covering index" for your query. This occurs when every column you need for
your query is in the index (either explicitly as part of the key or in the
INCLUDED columns or implicitly because the column is in the key of the
clustered index). When that happens, SQL Server knows it can satisify the
query requirements without ever reading from the table, so it will use the
index and only scan the part of the index it needs for the range of data you
want.
Tom
"John Doe" <Johndoe@.jd.com> wrote in message
news:uT7oOcscIHA.748@.TK2MSFTNGP04.phx.gbl...
> Uri,
> In my case, its doing a clustered index scan and wanted to know if as a
> result, its touching all the pages that may have dates prior to '2/1' and
> after '2/3' as my query is only seeking to obtain data between '2/1/2008
> and '2/3/2008' and as you mentioned that data in the CI is ordered. Let me
> know how the storage engine fetches the pages.
> Also if it was a non clustered index on date1 instead of a clustered
> index, i take it that the leaf level of the non clustered index is also
> ordered..right ?
>
> "Uri Dimant" <urid@.iscar.co.il> wrote in message
> news:OBuZpSscIHA.3932@.TK2MSFTNGP05.phx.gbl...
>

No comments:

Post a Comment