Friday, March 30, 2012

Index tree

I can understand the number of table records will affect the width of the
tree. But why it also affect the depth of the tree ? Or how the key size
will afftect depth and width of the index tree ?
Alan,
The b-tree will increase its depth to improve the query time by reducing
the number of I/Os. It does this by shortening the path to the data, or
leaf node.
Let's say you have an index on surname. The more data in the table, the
greater the depth of the b-tree will result in fewer I/Os. I'm finding
this topic difficult to describe in a newsgroup posting, so hopefully
the following articles on B-Trees will help you:
B-tree algorithms
http://www.semaphorecorp.com/btp/algo.html
Binary tree
http://en.wikipedia.org/wiki/Binary_tree
binary tree
http://planetmath.org/encyclopedia/BinaryTree.html
I think B-Tree actually stands for Balanced Tree, not Binary Tree; but
the two terms are often used synonymously.
Mark Allison, SQL Server MVP
http://www.markallison.co.uk
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602m.html
Alan wrote:
> I can understand the number of table records will affect the width of the
> tree. But why it also affect the depth of the tree ? Or how the key size
> will afftect depth and width of the index tree ?
>
|||On Tue, 7 Dec 2004 15:03:09 +1100, Alan wrote:

>I can understand the number of table records will affect the width of the
>tree. But why it also affect the depth of the tree ? Or how the key size
>will afftect depth and width of the index tree ?
Hi Alan,
I'll use an example to explain. Let's assume you have a nonclustered index
with 800 bytes in the indexed columns and another 800 bytes in the
clustered key.
The leaf pages store both the indexed values and the corresponding values
in the clustered index (as locator to the actual row). Other pages (root,
intermediate) store only the indexed values. Since each page is about 8K,
a leaf page holds values for 5 rows; other pages have pointers for 10
rows.
If the number of rows in the table is 10, we need two leaf pages (assuming
no empty space, which will not always be the case in practice) to hold all
these rows. The root page will have the indexed values of the first row on
leaf page 1 and the first row on page 2; the rest of the root page remains
empty. This B-tree has depth 2 (root is level 1; leaf at level 2).
If we add another 40 rows for a total of 50, we need 10 leaf pages. The
root page will have the indexed values of the first row on each of these
10 leaf pages and no room to spare.
One extra row, bringing the total to 51, means we now need 11 leaf pages.
Since the root page can only point to max 10 pages, we need to add a
level. The new B-tree will have one root, two intermediate (level-2) and
11 data pages. One intermediate page will have the indexed values of the
first row on the first 5 or 6 leaf pages; the other intermediate page has
the indexed values of the first row on the remaining leaf pages; the root
page will only hold the indexed values of the first row of the two
intermediate pages. Note that we now have depth 3: root at level 1,
intermediate at level 2 and leaf at level 3.
We can now continue to add rows. When there are 500 rows, there are 100
leaf pages, 10 intermediate pages (each pointing to 10 leaf pages) and 1
root page (pointing to the 10 intermediate pages). All these pages are
completely full: when the 501st row is added, another level has to be
added to the index and it is now at depth 4.
The above shows how number of rows affects the B-tree depth. To see how
key size affects B-tree depth as well, imagine what happens if the indexed
columns are not 800 but 80 bytes: now you can store the indexed values of
not 10 but 100 rows in non-leaf pages. For the leaf pages, the capacity
would increase to (8K / (80 + 800)) = 9 rows. If the size of the clustered
index would decrease as well, this number would rise even further. This of
course means that you can add more rows until all leaf, intermediate and
root pages are full and another level has to be added.
Best, Hugo
(Remove _NO_ and _SPAM_ to get my e-mail address)
sql

No comments:

Post a Comment