Sunday, February 19, 2012

Index and Data Pages

I am trying to understand how the data in sql server is stored and
also regarding fill factor and page splitting.

1) My first question what is the difference between Index pages and
Data pages.
and how are they different for clustered and non clustered indexes and
heap tables.

2) What is the relation between index and data pages.

2) BOL says that fill factor is used to define the amount of free
space on each page of index. I am confused here what does index pages
and data pages contain
for clustered/non clustered/heap tables.

3) Why does page splits occur and do they occur due to the lack of
space in index or data pages?

Thanks

skura"skura" <thotakura1@.comcast.net> wrote in message
news:ccbd7dbf.0311252154.32a46cd5@.posting.google.c om...
> I am trying to understand how the data in sql server is stored and
> also regarding fill factor and page splitting.

I recommend you get a copy of Inside SQL Server 2000. It contains a lot of
detailed information on the low-level implementation of data and index
pages. I've given some short answers below (which will certainly be
oversimplified, but hopefully accurate enough), but there's enough
information in Books Online to get a good understanding.

> 1) My first question what is the difference between Index pages and
> Data pages.
> and how are they different for clustered and non clustered indexes and
> heap tables.

Data pages have data only; clustered index pages have index information and
data (because the leaf level is made up of data pages); non-clustered index
pages have index information only.

> 2) What is the relation between index and data pages.

If the index is clustered, the leaf nodes are data pages. If the index is
non-clustered, but there is already another clustered index on the table,
the leaf nodes point to keys in the clustered index. If the table is a heap,
the index leaf nodes point to rows in data pages.

> 2) BOL says that fill factor is used to define the amount of free
> space on each page of index. I am confused here what does index pages
> and data pages contain
> for clustered/non clustered/heap tables.

As above, with a clustered index, the index includes data pages; with a heap
table, the indexes have only index pages.

> 3) Why does page splits occur and do they occur due to the lack of
> space in index or data pages?

If the table has a clustered index, and the data pages in the leaf node fill
up with data, a split occurs because SQL Server has to 'make room' for the
new data. The same applies to non-clustered indexes - as data is added,
sooner or later the index pages will get full, whatever other indexes may be
on the table.

Think of putting new books in the middle of a shelf which already has lots
of books, sorted in alphabetical order. If all the old books are side to
side with no gaps, you'll have to move some of them every time you add a new
book to the shelf, in order to keep them all in alphabetical order. But if
you leave a number of spaces between the books, then you'll usually be able
to add many more new books before you have to move any of the old ones.

> Thanks
> skura

I hope that helps.

Simon|||Simon, That helps and thanks for the info. I will get the book and read it.

"Simon Hayes" <sql@.hayes.ch> wrote in message news:<3fc51cae$1_3@.news.bluewin.ch>...
> "skura" <thotakura1@.comcast.net> wrote in message
> news:ccbd7dbf.0311252154.32a46cd5@.posting.google.c om...
> > I am trying to understand how the data in sql server is stored and
> > also regarding fill factor and page splitting.
> I recommend you get a copy of Inside SQL Server 2000. It contains a lot of
> detailed information on the low-level implementation of data and index
> pages. I've given some short answers below (which will certainly be
> oversimplified, but hopefully accurate enough), but there's enough
> information in Books Online to get a good understanding.
> > 1) My first question what is the difference between Index pages and
> > Data pages.
> > and how are they different for clustered and non clustered indexes and
> > heap tables.
> Data pages have data only; clustered index pages have index information and
> data (because the leaf level is made up of data pages); non-clustered index
> pages have index information only.
> > 2) What is the relation between index and data pages.
> If the index is clustered, the leaf nodes are data pages. If the index is
> non-clustered, but there is already another clustered index on the table,
> the leaf nodes point to keys in the clustered index. If the table is a heap,
> the index leaf nodes point to rows in data pages.
> > 2) BOL says that fill factor is used to define the amount of free
> > space on each page of index. I am confused here what does index pages
> > and data pages contain
> > for clustered/non clustered/heap tables.
> As above, with a clustered index, the index includes data pages; with a heap
> table, the indexes have only index pages.
> > 3) Why does page splits occur and do they occur due to the lack of
> > space in index or data pages?
> If the table has a clustered index, and the data pages in the leaf node fill
> up with data, a split occurs because SQL Server has to 'make room' for the
> new data. The same applies to non-clustered indexes - as data is added,
> sooner or later the index pages will get full, whatever other indexes may be
> on the table.
> Think of putting new books in the middle of a shelf which already has lots
> of books, sorted in alphabetical order. If all the old books are side to
> side with no gaps, you'll have to move some of them every time you add a new
> book to the shelf, in order to keep them all in alphabetical order. But if
> you leave a number of spaces between the books, then you'll usually be able
> to add many more new books before you have to move any of the old ones.
> > Thanks
> > skura
> I hope that helps.
> Simon

No comments:

Post a Comment