hide random home http://www.microsoft.com/TechNet/boes/bo/sql/technote/admn_b.htm (PC Press Internet CD, 03/1996)

Updated: March 13,1996 TechNet Logo Go To TechNet Home Page

Estimating Database Size-Contents

This paper contains formulas and examples that show how to estimate the size of tables and indexes in a database.

icobrnchCalculating the Size of a Table with a Clustered Index
icobrnchCalculating the Size of a Nonclustered Index


The following examples are used:

Since the amount of overhead in tables containing variable-length fields is greater, two sets of formulas are presented in each example -- one for fixed-length fields and one for variable-length fields.

To calculate the size of a database, add the number of bytes of data and associated overhead and divide that number by the number of bytes available on a data page. Each 2K data page uses 32 bytes of overhead, so there are 2016 (2048 -- 32) bytes available for data on each data page.

For best accuracy, round down divisions that calculate the number of rows per page, and round up divisions that calculate the number of pages.

If you are using FILL FACTOR in your CREATE INDEX statement, it will change some of the equations. For more information, see "Other Factors," later in this paper.

If a table includes text or image datatypes, use 16K (the size of the text pointer that is stored in the row) as indicated in the following examples, and see "Using Average Sizes for text/image Data Pages," later in this paper.

The following are the storage sizes for SQL Server datatypes:

Datatype

Size

char

Defined size

varchar

Data size

binary

Defined size

varbinary

Data size

int

4

smallint

2

tinyint

1

float

8

float(b)

4 (for precision of 1--7), 8 (for precision of 8--15)

double precision

8

real

4

money

8

smallmoney

4

datetime

8

smalldatetime

4

bit

1

decimal

2--17 bytes, depending on precision (see the following table)

numeric

2--17 bytes, depending on precision (see the following table)

text

16 bytes + 2K per initialized column

image

16 bytes + 2K per initialized column

timestamp

8


Note

Any columns defined to accept NULL values must be considered variable-length columns, since they involve the overhead associated with the variable-length columns.


The decimal and numeric datatypes have a maximum precision of 38. Based on the precision specified, a length is computed and used as the size of the array to store the datatype. The following table shows the mapping from precision to length.

Precision

Size (bytes)

0--2

2

3--4

3

5--7

4

8--9

5

10--12

6

13--14

7

15--16

8

17--19

9

20--21

10

22--24

11

25--26

12

27--28

13

29--31

14

32--33

15

34--36

16

37--38

17

All calculations in the following examples are based on the maximum size for varchar and varbinary data -- the defined size of the columns. They also assume that columns were defined as NOT NULL. If you want to use average values for variable-length columns, see "Using Average Sizes for Variable Fields," later in this paper.

Document Contents


Calculating the Size of a Table with a Clustered Index

The steps in this section show how to calculate the size of a table with a clustered index.

For an example of how to use the formulas presented in this section, see "Calculating the Size of a 9,000,000-Row Table with a Clustered Index," later in this paper.

Step 1: Calculate the Data Row Size

Calculation of data row size is slightly different depending on whether the row stores any variable-length columns. Use the first formula if all the columns are fixed-length and defined as NOT NULL. Use the second formula if the row contains variable-length columns or if the columns are defined as NULL.

For fixed-length columns, calculate:

2 + (Sum of bytes in all fixed-length columns) = Data Row Size

For fixed- and variable-length columns, calculate:

2 + (Sum of bytes in all fixed-length columns) + (Sum of bytes in all variable- length columns) = Subtotal

Subtotal + ((Subtotal / 256) + 1) + (Number of variable-length columns + 1) + 2 = Data Row Size

The values 1 and 2 in the previous equations are overhead that SQL Server uses for internal storage.

Step 2: Calculate the Number of Data Pages

Calculate the number of data pages:

2016 / (Data row size) = Number of data rows per page

(Number of rows) / (Number of data rows per page) = Number of data pages required

Step 3: Calculate the Size of Clustered Index Rows

Calculating the size of clustered index rows depends on whether the keys are fixed- or variable-length. Use the first formula if all keys are fixed-length. Use the second formula if keys include variable-length columns.

For fixed-length columns, calculate:

5 + (Sum of bytes in the fixed-length index keys) = Clustered row size

For fixed- and variable-length columns, calculate:

5 + (Sum of bytes in the fixed-length index keys) + (Sum of bytes in variable-length index keys) = Subtotal

(Subtotal) + ((Subtotal / 256) + 1) + (Number of variable-length columns + 1) + 2 = Clustered index row size

The values 5 and 2 in the previous equations are overhead that SQL Server uses for internal storage.

Step 4: Calculate the Number of Clustered Index Pages

After you have calculated the row size of the clustered index, you can calculate the number of pages for the clustered index.

(2016 / Clustered index row size) -- 2 = Number of clustered index rows per page

(Number of data pages) / (Number of clustered index rows per page) = Number of index pages at level 0

If the result is greater than 1, repeat the following division step, using the quotient as the next dividend, until the quotient equals 1. When the quotient equals 1, you have reached the root level of the index.

(Number of index pages at the last level) / (Number of clustered index rows per page) = Number of index pages at next level

Step 5: Calculate the Total Number of Pages

Calculate the total number of data pages:

(Number of data pages) + (Number of index pages at level 0) + (Number of index pages at next level) + (Number of index pages at next level) [...] = Total number of data pages

Calculating the Size of a 9,000,000-Row Table with a Clustered Index

The following example calculates the size (in 2K pages) of the data and clustered index for a table containing:

Follow these steps:

  1. Calculate the data row size:

    4

    (Overhead)

    + 100

    Sum of bytes in all fixed-length columns

    + 50

    Sum of bytes in all variable-length columns

    ---

    154

    Subtotal

    154

    Subtotal

    + 1

    (Subtotal / 256) + 1 (Overhead)

    + 3

    Number of variable-length columns + 1

    + 2

    (Overhead)

    ---

    160

    Data row size

  2. Calculate the number of data pages:
    2016/160 = 12 Data rows per page

    9,000,000/12 = 750,000 Data pages

  3. Calculate the size of clustered index rows:

    5

    Overhead

    + 4

    Sum of bytes in the fixed-length index keys

    ----

    9

    Clustered index row size

  4. Calculate the number of clustered index pages:
    (2016 / 9) -- 2 = 222 Clustered index rows per page

    750,000 / 222 = 3378 Index pages (Level 0)

    3378 / 222 = 15 Index pages (Level 1)

    15 / 222 = 1 Index page (Level 2)

  5. Calculate the total number of pages:

Totals:

Pages

Rows

Level 2 (root)

1

15

Level 1

15

3,378

Level 0

3,378

750,000

Data

750,000

9,000,000

------------

Total number of 2K pages

753,394

Document Contents


Calculating the Size of a Nonclustered Index

The steps in this section show how to calculate the size of a nonclustered index.

Step 1: Calculate the Size of the Leaf Index Row

Calculating of the size of nonclustered index rows depends on whether the keys are fixed- or variable-length. Use the first formula if all of the keys are fixed-length. Use the second formula if keys include variable-length columns.

Calculate the size of nonclustered leaf rows with fixed-length keys only:

7 + (Sum of fixed-length keys) = Size of leaf index row

Calculate the size of nonclustered leaf rows with variable-length keys:

9 + (Sum of length of fixed-length keys) + (Sum of length of variable-length keys) + (Number of variable-length keys) + 1 = Subtotal

(Subtotal) + ((Subtotal / 256) + 1) = (Size of leaf index row)

The values 7 and 9 in the previous equations are overhead that SQL Server uses for internal storage.

Step 2: Calculate the Number of Leaf Pages in the Index

Calculate the number of leaf pages:

2016 / (Size of leaf index row) = Number of leaf rows per page

(Number of rows in table) / (Number of leaf rows per page) = Number of leaf pages

Step 3: Calculate the Size of the Nonleaf Rows

Calculate the size of nonleaf rows:

(Size of leaf index row) + 4 = Size of nonleaf row

The value 4 in the previous equation is overhead that SQL Server uses for internal storage.

Step 4: Calculate the Number of Nonleaf Pages

Calculate the number of nonleaf pages:

(2016 / Size of nonleaf row) -- 2 = Number of nonleaf index rows per page

(Number of leaf pages / Number of nonleaf index rows per page) = Number of index pages at Level 1

If the result is greater than 1, repeat the following division step, using the quotient as the next dividend, until the quotient is equal to 1. When the quotient is equal to 1, you have reached the root level of the index.

(Number of index pages at last level / Number of nonleaf index rows per page) = Number of index pages at next level

Step 5: Calculate the Total Number of Nonleaf Index Pages

(Number of index pages at Level 0) + (Number of index pages at next level) + (Number of index pages at next level) + (Number of index pages at next level) + (Number of index pages at next level) [...] = (Total number of 2K data pages used)

Calculating the Size of a 9,000,000-Row Table with a Nonclustered Index

The following example calculates the size of a nonclustered index on the 9,000,000-row table used in "Calculating the Size of a 9,000,000-Row Table with a Clustered Index," earlier in this paper. There are two keys, one fixed- and one variable-length. The table contains the following:

Follow these steps:

  1. Calculate the size of the leaf index row:

    9

    Overhead

    + 4

    Sum of length of fixed-length keys

    + 20

    Sum of length of variable-length keys

    + 2

    Number of variable-length keys + 1

    --------

    35

    Subtotal

    --------

    --------------------------------
    35

    Subtotal

    + 1

    (Subtotal/256) + 1

    --------

    36

    Size of Leaf Index Row

  2. Calculate the number of leaf pages in the index:
    2016/36 = 56 Nonclustered leaf rows per page

    9,000,000/56 = 160,715 Leaf pages

  3. Calculate the size of the nonleaf rows:
    36 + 4 = 40 Size of nonleaf index row
  4. Calculate the number of nonleaf pages:
    (2016/40) -- 2 = 48 Nonleaf index rows per page

    160,715/48 = 3348 Index pages, Level 1
    3348/48 = 69 Index pages, Level 2
    69/48 = 2 Index pages, Level 3
    2/48 = 1 Index page, Level 4

  5. Calculate the total number of nonleaf index pages:

Totals:

Pages

Rows

Level 4 (root)

1

2

Level 3

2

69

Level 2

69

3,348

Level 1

3,348

60,715

Level 0 (Leaf)

160,715

9,000,000

-------------

Total number of2K pages

164,135

Other Factors

There are other factors to take into account when you calculate the database size. The FILL FACTOR has an effect on the calculations for clustered index pages, distribution pages, and data pages, and using an average size for variable fields affects the calculations presented in the preceding examples. The following sections provide more information about these topics and calculations for using average sizes for variable fields.

Using a FILL FACTOR Value of 100 Percent

Normally, the index management process leaves room for two additional rows on each index page. When you set FILL FACTOR to 100 percent, it no longer leaves room for these rows. The only effect on these calculations is on calculating the number of clustered index pages (Step 4) and calculating the number of nonleaf pages (Step 4). Both of these calculations subtract 2 from the number of rows per page. When using a FILL FACTOR of 100 percent, do not subtract 2 from the number of rows per page.

Using Different FILL FACTOR Values

Other values for FILL FACTOR reduce the number of rows per page on data pages and leaf index pages. To compute the correct values when using FILL FACTOR, multiply the size of the available data page (2016) by the FILL FACTOR. For example, if your FILL FACTOR is 75 percent, your data page would hold 1512 bytes. Use this value in place of 2016 in Step 2 for both clustered and nonclustered index calculations.

Using Distribution Pages

Distribution pages are created when you create an index on existing data and when you run UPDATE STATISTICS. A distribution page occupies one full data page.

The UPDATE STATISTICS statement adds one distribution page per index on which statistics have been created. If you UPDATE STATISTICS for an entire table, add one page for each index on the table. If you UPDATE STATISTICS for only one index, add 1.

Using Average Sizes for Variable Fields

These calculations use the maximum size of the variable-length fields. If you know the average size of the fields, you can use this value in Steps 1 and 4 for calculating table size, and the value in Step 1 for calculating the nonclustered index size. You'll need slightly different formulas for clustered and nonclustered indexes.

For Clustered Indexes

In "Step 1: Calculate the Data Row Size," use the sum of the average length of the variable-length columns instead of the sum of the defined length of the variable-length columns to determine the average data row size:

4 + (Sum of bytes in all fixed-length columns) + (Sum of bytes of the average length of variable-length columns) = Subtotal

In "Step 2: Calculate the Number of Data Pages," use the average data row size from the first formula:

2016 / (Average data row size) = Number of data rows per page

In "Step 3: Calculate the Size of Clustered Index Rows," you must perform the addition twice. The first time, calculate the maximum index row size, using the given formula. The second time, calculate the average index row size, substituting the sum of the average number of bytes in the variable-length index keys for the sum of the defined number of bytes in the variable-length index keys:

5 + (Sum of bytes in the fixed-length index keys) + (Sum of bytes in variablelength keys) = Subtotal

Subtotal + ((Subtotal / 256) + 1) + 2 = Maximum index row size

In "Step 4: Calculate the Number of Clustered Index Pages," substitute the following formula for the first formula, using the two different length values you calculated in the previous equation:

(2016 -- (2 * Maximum index row size)) / Average index row size = Number of clustered index rows per page

For Nonclustered Indexes

In "Step 1: Calculate the Size of the Leaf Index Row," you must perform the addition twice. The first time, calculate the maximum leaf index row size, using the given formula. The second time, calculate the average leaf index row size, substituting the average number of bytes in the variable-length index keys for the sum of bytes in the variable-length index keys:

5 + (Sum of bytes in the fixed-length index keys) + (Sum of the average number of bytes in variable-length index keys) = Subtotal

Subtotal + ((Subtotal / 256) + 1) + 2 = Average index row size

In "Step 2: Calculate the Number of Leaf Pages in the Index," use the average leaf index row size in the first division procedure:

2016 / (Average leaf index row size) = Number of leaf rows per page

In "Step 3: Calculate the Size of the Nonleaf Rows," use the average leaf index row size:

(Average leaf index row) + 4 = Size of nonleaf row

In "Step 4: Calculate the Number of Nonleaf Pages," substitute the following formula for the first formula, using the maximum and average row sizes calculated in Step 1:

(2016 -- (2 * Maximum index row size)) / Average index row size = Number of nonleaf index rows per page

Using Average Sizes for Very Small Rows

SQL Server can store as many as 256 data or index rows on a page. If your rows are extremely short, the minimum number of data pages are:

Number of rows / 256 = Number of data pages required

Using Average Sizes for text/image Data Pages

Each text or image column stores a 16-byte pointer in the data row, with the datatype varchar(16). Each text or image column that is initialized requires at least 2K bytes (1 data page) of storage space.

Text and image columns are designed to store implicit null values, meaning that the text pointer in the data row remains null and there is no text page initialized for the value, saving 2K of storage space.

If a text or image column is defined to allow null values and the row is created with an INSERT statement that includes NULL for the text or image column, the column is not initialized, and the storage is not allocated.

If a text or image column is changed in any way with UPDATE, then the text page is allocated. Of course, inserts or updates that place actual data in a column initialize the page.

The text chains that store text and image data have 112 bytes of overhead per page. Use the following formula to calculate the number of text chain pages that a particular entry will use:

Data length / 1800 = Number of 2K pages

The result should be rounded up in all cases; for example, a data length of 1800 bytes requires two 2K pages.

Document Contents


search icon Click Here to Search TechNet Web Contents TechNet CD Overview TechNet logo Microsoft TechNet Credit Card Order Form
At this time we can only support electronic orders in the US and Canada. International ordering information.


TechNet logo Go To TechNet Home Page ©1996 Microsoft Corporation Microsoft homepage Go To Microsoft Home Page