Wednesday, March 28, 2012

Index size question (DDL)

This should be the DDL:
CREATE TABLE [dbo].[Transactions] (
[ID] [int] IDENTITY (1000, 1) NOT NULL ,
[SSN_TIN] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
,
[ACCT_NUMBER] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS
NOT NULL ,
[Settle_Date] [smalldatetime] NOT NULL ,
[SEQ_NUM] [int] NOT NULL ,
[FUND_ID] [varchar] (22) COLLATE SQL_Latin1_General_CP1_CI_AS NOT
NULL ,
[TRANS_CODE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS
NOT NULL ,
[INT_TYPE] [smallint] NOT NULL ,
[Trade_Date] [smalldatetime] NULL ,
[QUANTITY] [decimal](16, 6) NOT NULL ,
[PRICE] [decimal](21, 8) NOT NULL ,
[PROCEEDS] [money] NULL ,
[FIN_INST_ID] [smallint] NOT NULL ,
[FUND_CODE] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS
NULL ,
[TradeMonth] [int] NULL ,
[Acct_Pre] [varchar] (3) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Transactions-Old] (
[SSN_TIN] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
,
[ACCT_NUMBER] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS
NOT NULL ,
[Settle_Date] [smalldatetime] NOT NULL ,
[SEQ_NUM] [int] NOT NULL ,
[FUND_ID] [varchar] (22) COLLATE SQL_Latin1_General_CP1_CI_AS NOT
NULL ,
[TRANS_CODE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS
NOT NULL ,
[INT_TYPE] [smallint] NOT NULL ,
[Trade_Date] [smalldatetime] NULL ,
[QUANTITY] [decimal](16, 6) NOT NULL ,
[PRICE] [decimal](21, 8) NOT NULL ,
[PROCEEDS] [money] NULL ,
[FIN_INST_ID] [smallint] NOT NULL ,
[FUND_CODE] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS
NULL ,
[TradeMonth] AS (datepart(year,[Trade_Date]) * 100 + datepart
(month,[Trade_Date])) ,
[Acct_Pre] AS (left([Acct_Number],3))
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Transactions-Old] WITH NOCHECK ADD
CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
(
[SSN_TIN],
[ACCT_NUMBER],
[Settle_Date],
[SEQ_NUM]
) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [IX_CL_Transactions_TradeDate] ON [dbo].
[Transactions]([ID]) WITH FILLFACTOR = 80 ON [PRIMARY]
GO
ALTER TABLE [dbo].[Transactions] ADD
CONSTRAINT [PK_Transactions_ID] PRIMARY KEY NONCLUSTERED
(
[ID]
) ON [PRIMARY]
GO
DW <None> wrote in news:OupT2CjAEHA.3348@.TK2MSFTNGP11.phx.gbl:

> I have a SQL 2000 table with 16 million rows. I made a copy of it.
> The two tables have the same number of rows, 16,152,139 to be exact.
> The old table had a clustered, composite primary key across the first
> four columns -- ssn (char(9)), Acct Number (varchar 20), sequence
> number (varchar(20)), and transaction date (smalldatetime). The
> database size was 1,708,032 KB and the index was 8,520 KB.
> To the new table, I added an ID field of type Int, and made it the
> primary key nonclustered, also an identity field. The only other
> index is a different date field in the table that's a clustered index
> (smalldatetime). I also set the index fill factor to 80% from 90% in
> the old one.
> The new table takes 2,406,592 KB; it's bigger because of the extra
> field. BUT the index (as shown in the Task Pad summary) is 163,656
> KB. *How could two single-column indexes take 19 times the storage
> space as one 4-column composite index?* I have run dbcc dbreindex on
> the table.
> I also ran dbcc updateusage on the new table and got trivial
> differences.
> Here is what SHOWCONTIG gives, if that helps. Anything else I can
> look at?
> DBCC SHOWCONTIG scanning 'Transactions-Old' table...
> Table: 'Transactions-Old' (87671360); index ID: 1, database ID: 7
> TABLE level scan performed.
> - Pages Scanned........................: 212443
> - Extents Scanned.......................: 26688
> - Extent Switches.......................: 26687
> - Avg. Pages per Extent..................: 8.0
> - Scan Density [Best Count:Actual Count]......: 99.51% [26556:266
88]
> - Logical Scan Fragmentation ..............: 0.00%
> - Extent Scan Fragmentation ...............: 0.50%
> - Avg. Bytes Free per Page................: 766.4
> - Avg. Page Density (full)................: 90.53%
> DBCC SHOWCONTIG scanning 'Transactions' table...
> Table: 'Transactions' (711673583); index ID: 1, database ID: 7
> TABLE level scan performed.
> - Pages Scanned........................: 280366
> - Extents Scanned.......................: 35103
> - Extent Switches.......................: 35102
> - Avg. Pages per Extent..................: 8.0
> - Scan Density [Best Count:Actual Count]......: 99.84% [35046:351
03]
> - Logical Scan Fragmentation ..............: 0.00%
> - Extent Scan Fragmentation ...............: 0.11%
> - Avg. Bytes Free per Page................: 1562.7
> - Avg. Page Density (full)................: 80.69%
> DBCC SHOWCONTIG scanning 'Transactions' table...
> Table: 'Transactions' (711673583); index ID: 2, database ID: 7
> LEAF level scan performed.
> - Pages Scanned........................: 19966
> - Extents Scanned.......................: 2500
> - Extent Switches.......................: 2499
> - Avg. Pages per Extent..................: 8.0
> - Scan Density [Best Count:Actual Count]......: 99.84% [2496:2500
]
> - Logical Scan Fragmentation ..............: 0.01%
> - Extent Scan Fragmentation ...............: 0.60%
> - Avg. Bytes Free per Page................: 6.2
> - Avg. Page Density (full)................: 99.92%
> I would post the DDL but I can't find the link to get the format...
> Thanks.
> David Walker
>Oops -- the new Transactions table actually has the ID (int) field
indexed twice, both clustered and non-clustered, with 2 different
indexes. Now how did that happen? :-)
I'll fix the indexes and check again.
David Walker
DW <None> wrote in news:ebCGMOjAEHA.3828@.TK2MSFTNGP10.phx.gbl:

> This should be the DDL:
> CREATE TABLE [dbo].[Transactions] (
> [ID] [int] IDENTITY (1000, 1) NOT NULL ,
> [SSN_TIN] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NOT
> NULL
> ,
> [ACCT_NUMBER] [varchar] (20) COLLATE SQL_Latin1_General_CP1_C
I_AS
> NOT NULL ,
> [Settle_Date] [smalldatetime] NOT NULL ,
> [SEQ_NUM] [int] NOT NULL ,
> [FUND_ID] [varchar] (22) COLLATE SQL_Latin1_General_CP1_CI_AS
NOT
> NULL ,
> [TRANS_CODE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI
_AS
> NOT NULL ,
> [INT_TYPE] [smallint] NOT NULL ,
> [Trade_Date] [smalldatetime] NULL ,
> [QUANTITY] [decimal](16, 6) NOT NULL ,
> [PRICE] [decimal](21, 8) NOT NULL ,
> [PROCEEDS] [money] NULL ,
> [FIN_INST_ID] [smallint] NOT NULL ,
> [FUND_CODE] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_
AS
> NULL ,
> [TradeMonth] [int] NULL ,
> [Acct_Pre] [varchar] (3) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL
> ) ON [PRIMARY]
> GO
> CREATE TABLE [dbo].[Transactions-Old] (
> [SSN_TIN] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NOT
> NULL
> ,
> [ACCT_NUMBER] [varchar] (20) COLLATE SQL_Latin1_General_CP1_C
I_AS
> NOT NULL ,
> [Settle_Date] [smalldatetime] NOT NULL ,
> [SEQ_NUM] [int] NOT NULL ,
> [FUND_ID] [varchar] (22) COLLATE SQL_Latin1_General_CP1_CI_AS
NOT
> NULL ,
> [TRANS_CODE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI
_AS
> NOT NULL ,
> [INT_TYPE] [smallint] NOT NULL ,
> [Trade_Date] [smalldatetime] NULL ,
> [QUANTITY] [decimal](16, 6) NOT NULL ,
> [PRICE] [decimal](21, 8) NOT NULL ,
> [PROCEEDS] [money] NULL ,
> [FIN_INST_ID] [smallint] NOT NULL ,
> [FUND_CODE] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_
AS
> NULL ,
> [TradeMonth] AS (datepart(year,[Trade_Date]) * 100 + datepart
> (month,[Trade_Date])) ,
> [Acct_Pre] AS (left([Acct_Number],3))
> ) ON [PRIMARY]
> GO
> ALTER TABLE [dbo].[Transactions-Old] WITH NOCHECK ADD
> CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
> (
> [SSN_TIN],
> [ACCT_NUMBER],
> [Settle_Date],
> [SEQ_NUM]
> ) WITH FILLFACTOR = 90 ON [PRIMARY]
> GO
> CREATE CLUSTERED INDEX [IX_CL_Transactions_TradeDate] ON [dbo].
> [Transactions]([ID]) WITH FILLFACTOR = 80 ON [PRIMARY]
> GO
> ALTER TABLE [dbo].[Transactions] ADD
> CONSTRAINT [PK_Transactions_ID] PRIMARY KEY NONCLUSTERED
> (
> [ID]
> ) ON [PRIMARY]
> GO
>
> DW <None> wrote in news:OupT2CjAEHA.3348@.TK2MSFTNGP11.phx.gbl:
>
>|||Um, I fixed the incorrect index to be a clustered index on the
Trade_Date column like it should have been, and the results are
essentially the same. I'm still confused.
Thanks for any insights.
David Walker
DW <None> wrote in news:ebCGMOjAEHA.3828@.TK2MSFTNGP10.phx.gbl:

> This should be the DDL:
> CREATE TABLE [dbo].[Transactions] (
> [ID] [int] IDENTITY (1000, 1) NOT NULL ,
> [SSN_TIN] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NOT
> NULL
> ,
> [ACCT_NUMBER] [varchar] (20) COLLATE SQL_Latin1_General_CP1_C
I_AS
> NOT NULL ,
> [Settle_Date] [smalldatetime] NOT NULL ,
> [SEQ_NUM] [int] NOT NULL ,
> [FUND_ID] [varchar] (22) COLLATE SQL_Latin1_General_CP1_CI_AS
NOT
> NULL ,
> [TRANS_CODE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI
_AS
> NOT NULL ,
> [INT_TYPE] [smallint] NOT NULL ,
> [Trade_Date] [smalldatetime] NULL ,
> [QUANTITY] [decimal](16, 6) NOT NULL ,
> [PRICE] [decimal](21, 8) NOT NULL ,
> [PROCEEDS] [money] NULL ,
> [FIN_INST_ID] [smallint] NOT NULL ,
> [FUND_CODE] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_
AS
> NULL ,
> [TradeMonth] [int] NULL ,
> [Acct_Pre] [varchar] (3) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL
> ) ON [PRIMARY]
> GO
> CREATE TABLE [dbo].[Transactions-Old] (
> [SSN_TIN] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NOT
> NULL
> ,
> [ACCT_NUMBER] [varchar] (20) COLLATE SQL_Latin1_General_CP1_C
I_AS
> NOT NULL ,
> [Settle_Date] [smalldatetime] NOT NULL ,
> [SEQ_NUM] [int] NOT NULL ,
> [FUND_ID] [varchar] (22) COLLATE SQL_Latin1_General_CP1_CI_AS
NOT
> NULL ,
> [TRANS_CODE] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI
_AS
> NOT NULL ,
> [INT_TYPE] [smallint] NOT NULL ,
> [Trade_Date] [smalldatetime] NULL ,
> [QUANTITY] [decimal](16, 6) NOT NULL ,
> [PRICE] [decimal](21, 8) NOT NULL ,
> [PROCEEDS] [money] NULL ,
> [FIN_INST_ID] [smallint] NOT NULL ,
> [FUND_CODE] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_
AS
> NULL ,
> [TradeMonth] AS (datepart(year,[Trade_Date]) * 100 + datepart
> (month,[Trade_Date])) ,
> [Acct_Pre] AS (left([Acct_Number],3))
> ) ON [PRIMARY]
> GO
> ALTER TABLE [dbo].[Transactions-Old] WITH NOCHECK ADD
> CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
> (
> [SSN_TIN],
> [ACCT_NUMBER],
> [Settle_Date],
> [SEQ_NUM]
> ) WITH FILLFACTOR = 90 ON [PRIMARY]
> GO
> CREATE CLUSTERED INDEX [IX_CL_Transactions_TradeDate] ON [dbo].
> [Transactions]([ID]) WITH FILLFACTOR = 80 ON [PRIMARY]
> GO
> ALTER TABLE [dbo].[Transactions] ADD
> CONSTRAINT [PK_Transactions_ID] PRIMARY KEY NONCLUSTERED
> (
> [ID]
> ) ON [PRIMARY]
> GO
>
> DW <None> wrote in news:OupT2CjAEHA.3348@.TK2MSFTNGP11.phx.gbl:
>
>|||Hi David,
Thank you for using the newsgroup and it is my pleasure to help you with
you issue.
From you informaiton provided, you generate the SQL script from one table.
You noticed that the ID column in the following part:
CREATE CLUSTERED INDEX [IX_CL_Transactions_TradeDate] ON [dbo].
[Transactions]([ID]) WITH FILLFACTOR = 80 ON [PRIMARY]
GO
ALTER TABLE [dbo].[Transactions] ADD
CONSTRAINT [PK_Transactions_ID] PRIMARY KEY NONCLUSTERED
(
[ID]
) ON [PRIMARY]
GO
There is an clustered index and nonclustered index bulid on the table. You
wonder how it come, right?
To get the information on the table, you could run this statement in you
Query Analyzer:
exec sp_help transactions
OR
sp_helpindex transactions
From my experience, in the Enterprise Manger you have first create a
clustered index 'IX_CL_Transactions_TradeDate' on the column ID, then you
create a PRIMARY KEY constraint on this same column. So, finaly, you will
found that the 'sp_help transactions' or 'sp_helpindex transactions' will
show that the index on the column ID is non-clustered.
For maintenance purpose, you could run the DBCC INDEXDEFRAG
Hope this helps and if you still have questions, please feel free to post
your message here and I am glad to help.
Thanks.
Sincerely Yours
Baisong Wei
Microsoft Online Support
----
Get Secure! - www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
Please reply to newsgroups only. Thanks.|||Hello David,
Thanks to Gert for pointing you in the right direction. For
additional information, I would recommend you to read
"Estimating the Size of a Table with a Clustered Index" topic in the
Books On line which give some formula of calculating the
size of Clustered/Non-Clustered indexs in the Table. As per this
info for a Non-Clustered index
Total leaf index row size (Index_Row_Size) = CIndex_Row_Size +
Fixed_Key_Size + Variable_Key_Size + Index_Null_Bitmap + 1
The final value of 1 represents the index row header.
"CIndex_Row_Size is the total index row size for the clustered index
key".
That means NonClustered Index size would contain Clustered index row
size which is why you see larger size for non-clustered index.
Does that help answer your question ?
Thanks for using MSDN Newsgroup.
Vikrant Dalwale
Microsoft SQL Server Support Professional
Microsoft highly recommends to all of our customers that they visit
the http://www.microsoft.com/protect site and perform the three
straightforward steps listed to improve your computers security.
This posting is provided "AS IS" with no warranties, and confers no
rights.
--
>Subject: Re: Index size question (DDL)
>From: DW <None>
>References: <OupT2CjAEHA.3348@.TK2MSFTNGP11.phx.gbl>
<ebCGMOjAEHA.3828@.TK2MSFTNGP10.phx.gbl>
>User-Agent: Xnews/06.08.25
>Message-ID: <O6osh$jAEHA.1212@.TK2MSFTNGP12.phx.gbl>
>Newsgroups: microsoft.public.sqlserver.server
>Date: Thu, 04 Mar 2004 15:24:44 -0800
>NNTP-Posting-Host: nensdsllascruces195.hyperspeeddsl.com
209.136.33.195
>Lines: 1
>Path:
cpmsftngxa06.phx.gbl!TK2MSFTNGXA06.phx.gbl!TK2MSFTNGXA05.phx.gbl!TK2MS
FTNGP08.phx.gbl!TK2MSFTNGP12.phx.gbl
>Xref: cpmsftngxa06.phx.gbl microsoft.public.sqlserver.server:332420
>X-Tomcat-NG: microsoft.public.sqlserver.server
>Um, I fixed the incorrect index to be a clustered index on the
>Trade_Date column like it should have been, and the results are
>essentially the same. I'm still confused.
>Thanks for any insights.
>David Walker
>
>DW <None> wrote in news:ebCGMOjAEHA.3828@.TK2MSFTNGP10.phx.gbl:
>
SQL_Latin1_General_CP1_CI_AS[color=darkr
ed]
NOT
SQL_Latin1_General_CP1_CI_AS[color=darkr
ed]
SQL_Latin1_General_CP1_CI_AS[color=darkr
ed]
SQL_Latin1_General_CP1_CI_AS[color=darkr
ed]
NOT
SQL_Latin1_General_CP1_CI_AS[color=darkr
ed]
SQL_Latin1_General_CP1_CI_AS[color=darkr
ed]
it.
exact.
first
index
90% in
163,656
storage
dbreindex on
[26556:26688]
[35046:35103]
[2496:2500]
format...
>|||> info for a Non-Clustered index
> Total leaf index row size (Index_Row_Size) = CIndex_Row_Size +
> Fixed_Key_Size + Variable_Key_Size + Index_Null_Bitmap + 1

> That means NonClustered Index size would contain Clustered index row
> size which is why you see larger size for non-clustered index.
What exactly is "clustered index row size" and "index row size"? I
can't quite parse that... The clustered index is on a column, and all
rows have the same size ;-)
And larger, yes, but this much larger' hmmmmm...
Both tables, where one is a copy of the other, have clustered indexes.
The size of the first table's clustered index (on the first 4 columns)
is 8 meg. The size of the second tables two indexes, one clustered on a
smalldatetime field and the other non-clustered on Account Number and
SSN, is 137 meg. So the non-clustered index takes 129 meg.
Any more help would be great. Mr Wei, I didn't "wonder how it come", I
wondered why the one that should be smaller was so much bigger than the
other. I ran dbreindex, but I'll run indexdefrag as you suggested, and
see what that does.
Thanks.
David Walker
vikrantd@.online.microsoft.com (Vikrant V Dalwale [MSFT]) wrote in news:
#q#3CcTBEHA.4044@.cpmsftngxa06.phx.gbl:

>
> Hello David,
> Thanks to Gert for pointing you in the right direction. For
> additional information, I would recommend you to read
> "Estimating the Size of a Table with a Clustered Index" topic in the
> Books On line which give some formula of calculating the
> size of Clustered/Non-Clustered indexs in the Table. As per this
> info for a Non-Clustered index
> Total leaf index row size (Index_Row_Size) = CIndex_Row_Size +
> Fixed_Key_Size + Variable_Key_Size + Index_Null_Bitmap + 1
> The final value of 1 represents the index row header.
> "CIndex_Row_Size is the total index row size for the clustered index
> key".
> That means NonClustered Index size would contain Clustered index row
> size which is why you see larger size for non-clustered index.
> Does that help answer your question ?
> Thanks for using MSDN Newsgroup.
> Vikrant Dalwale
> Microsoft SQL Server Support Professional
>
> Microsoft highly recommends to all of our customers that they visit
> the http://www.microsoft.com/protect site and perform the three
> straightforward steps listed to improve your computers security.
> This posting is provided "AS IS" with no warranties, and confers no
> rights.
>
> --
> <ebCGMOjAEHA.3828@.TK2MSFTNGP10.phx.gbl>
> 209.136.33.195
> cpmsftngxa06.phx.gbl!TK2MSFTNGXA06.phx.gbl!TK2MSFTNGXA05.phx.gbl!TK2MS
> FTNGP08.phx.gbl!TK2MSFTNGP12.phx.gbl
> SQL_Latin1_General_CP1_CI_AS
> NOT
> SQL_Latin1_General_CP1_CI_AS
> SQL_Latin1_General_CP1_CI_AS
> SQL_Latin1_General_CP1_CI_AS
> NOT
> SQL_Latin1_General_CP1_CI_AS
> SQL_Latin1_General_CP1_CI_AS
> it.
> exact.
> first
> index
> 90% in
> 163,656
> storage
> dbreindex on
> [26556:26688]
> [35046:35103]
> [2496:2500]
> format...
>|||I posted before reading that BOL topic; sorry -- I'll check it out.
In addition to the formulas, what is the logical reason, that I can wrap my
head around, why a nonclustered index would be 19 times as large as a
clustered index?
Thanks.
David Walker|||You may want to read my reply again, and check out BOL on indexes.
Bottom line is, that your clustered index is not 8 MB, and so the NC
index is not 19 times larger. Only the branches (the pointers to the
lowest level index pages) are 8 MB in total. As mentioned before, no
system can possibly index a table with just 0.5 bytes per row (on
average). The index will need at least the same amount of space as the
(average) size of the indexed column.
Actually, your clustered index used 1716 MB (table data which includes
leafs of clustered index 1708MB + branches of clustered index 8MB). So
with 129 MB I would say your NC index is considerably smaller...
Gert-Jan
DW wrote:
> I posted before reading that BOL topic; sorry -- I'll check it out.
> In addition to the formulas, what is the logical reason, that I can wrap m
y
> head around, why a nonclustered index would be 19 times as large as a
> clustered index?
> Thanks.
> David Walker
(Please reply only to the newsgroup)|||OK, thanks for the info. I'm reading Inside SQL Server 2000, and when I'm
done, I should understand!
David
Gert-Jan Strik <sorry@.toomuchspamalready.nl> wrote in
news:4050DBB4.42F156BA@.toomuchspamalready.nl:

> You may want to read my reply again, and check out BOL on indexes.
> Bottom line is, that your clustered index is not 8 MB, and so the NC
> index is not 19 times larger. Only the branches (the pointers to the
> lowest level index pages) are 8 MB in total. As mentioned before, no
> system can possibly index a table with just 0.5 bytes per row (on
> average). The index will need at least the same amount of space as the
> (average) size of the indexed column.
> Actually, your clustered index used 1716 MB (table data which includes
> leafs of clustered index 1708MB + branches of clustered index 8MB). So
> with 129 MB I would say your NC index is considerably smaller...
> Gert-Jan
>
> DW wrote:
>

No comments:

Post a Comment