Database+Disk+Performance™

Tuesday, June 20, 2006

Disk Performance and BACKUP to NUL: Command

Introduction

Disk drive performance and database performance are tied directly to each other, and backups make that link visible. A full backup, by definition, starts at the beginning of a database and reads all the way through to the end. In the process it can generate a large amount of disk IO, and a slow disk subsystem will show up here.

 

As you will see in the following examples, slow backups can be an indicator of overall system health and of poor database architecture.

 

The following table shows the different IO patterns SQL Server uses when accessing data. It is possible to get good performance in an OLTP environment, where SQL Server issues many small 8 KB IOs, and still see poor performance when running backups, which use 1 MB IOs.

 

 

Operation      Pattern      R/W     IO Size
OLTP Log       Sequential   Write   512 bytes - 64 KB
OLTP Data      Random       R/W     8 KB
Bulk Insert    Sequential   Write   8 KB multiples up to 128 KB
Read Ahead     Sequential   Read    8 KB multiples up to 256 KB
Backup         Sequential   R/W     1 MB
Restore        Sequential   R/W     64 KB
CHECKDB        Sequential   Read    8 KB - 64 KB

Table: SQL Server Disk IO Access Patterns

 

Source: http://blogs.msdn.com/sqlcat/

 

One of the best reasons to use the backup command is its uniformity and simplicity. Every SQL Server has it, which makes it easy to compare different platforms, and at the end of the backup it reports the actual throughput, so you don't have to calculate anything yourself.

 

BACKUP to a Null Device

 

The main purpose of a backup is to create an offline copy of the database for possible recovery. Typically backups are written to disk or tape. In some cases, though, it's useful to run a backup without storing the results anywhere, for example to practice an operational process… or, in our case, to stress the disk subsystem.

 

When a backup runs, it reads from the database and then writes to the backup media. As a result, access to the database itself follows a read, pause, read, pause pattern; the pauses occur while the backup writes to the media. That is not ideal for stress testing, because it does not put a constant load on the system.

 

The only way to put a constant, full load on the database with the BACKUP command is to run it against the NUL: device.

 

Syntax:

backup database <db_name> to disk = 'NUL:' with stats = <progress interval in percent>

Example of Usage:

backup database siebelprd to disk = 'NUL:' with stats = 1

Sample Output:

Processed 1416 pages for database 'newmsdb', file 'MSDBData' on file 1.

100 percent backed up.

Processed 1 pages for database 'newmsdb', file 'MSDBLog' on file 1.

BACKUP DATABASE successfully processed 1417 pages in 0.071 seconds (163.392 MB/sec).

 

Backing up to the NUL: device causes a continuous read of the database, with no pauses to write to the backup media.

 

Another issue when testing databases is caching. If the same test is run repeatedly, the data will already be in RAM, which distorts the results. A backup usually touches more data than the server has cache, which helps eliminate this distortion.
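One rough way to tell whether caching could still be skewing a test is to compare the size of the database with how much of it is currently sitting in the buffer pool. The queries below are my own sketch, not part of the original procedure; they assume SQL Server 2005 or later (for the sys.dm_os_buffer_descriptors and sys.master_files views) and reuse the siebelprd example database name.

-- How much of the database is currently cached in the buffer pool?
select count(*) * 8 / 1024.0 as cached_mb
from sys.dm_os_buffer_descriptors
where database_id = db_id('siebelprd');

-- Total size of the database's data files, for comparison
select sum(size) * 8 / 1024.0 as data_file_mb
from sys.master_files
where database_id = db_id('siebelprd')
  and type_desc = 'ROWS';

If the data files are several times larger than what fits in cache, the backup read comes mostly from disk, and the throughput number reflects the disk subsystem rather than RAM.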

 

Judging the Results: Bad, Concern, Good, Great

Once the command has run, the DBA needs to know what the results mean. The following examples were taken from customers I have worked with over the last two years and illustrate what the throughput numbers look like in practice. The first two, from a laptop and a desktop, serve as baselines for comparison.

 

 

Laptop disk drive, 36 MB database:

 

99 percent backed up.

100 percent backed up.

BACKUP DATABASE successfully processed 4473 pages in 1.172 seconds (31.259 MB/sec).
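As a quick sanity check, the reported figure can be reproduced from the page count: throughput is roughly pages × 8 KB ÷ elapsed seconds. The one-liner below checks the laptop output above; treating a reported megabyte as 1,000,000 bytes is my assumption, but the result lines up closely.

-- 4473 pages x 8192 bytes, over 1.172 seconds, in millions of bytes per second
select 4473 * 8192.0 / 1.172 / 1000000.0 as approx_mb_per_sec;  -- ~31.27 vs. 31.259 reported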

 

 

Desktop internal disk drive, 3.4 GB database:

 

100 percent backed up.
BACKUP DATABASE successfully processed 425841 pages in 135.945 seconds (25.660 MB/sec).

 

Bad

 

Customer's 1.6 TB database with 12 CPUs and a high-end SAN, RAID 1:

 

100 percent backed up.

BACKUP DATABASE successfully processed 202986177 pages in 47070.773 seconds (35.326 MB/sec).

 

 

This is almost the same throughput as my laptop (31.259 MB/sec)?!

 

 

Concern

 

Customer's 87 GB database with 8 CPUs, 32 GB of RAM, and high-end RAID 5:

 

Processed 2670712 pages for database 'siebeldb', file 'siebeldb_prm1' on file 1.

Processed 2665816 pages for database 'siebeldb', file 'siebeldb_prm2' on file 1.

Processed 2662176 pages for database 'siebeldb', file 'siebeldb_prm3' on file 1.

Processed 2662064 pages for database 'siebeldb', file 'siebeldb_prm4' on file 1.

Processed 47 pages for database 'siebeldb', file 'siebeldb_log1' on file 1.

BACKUP DATABASE successfully processed 10660815 pages in 789.876 seconds (110.565 MB/sec).

 

 

Good

 

Customer's 575 GB database on a SAN:

 

100 percent backed up.

BACKUP DATABASE successfully processed 70246627 pages in 3540.611 seconds (162.531 MB/sec).

 

Microsoft dev database, 101 GB, RAID 10 on 146 GB 10K drives on a SAN:

 

100 percent processed.

BACKUP DATABASE successfully processed 12376579 pages in 707.413 seconds (143.323 MB/sec).

 

Microsoft DSS database, 28 GB, on a high-end SAN with 72 GB 15K drives:

 

100 percent backed up.

BACKUP DATABASE successfully processed 3468866 pages in 151.831 seconds (187.161 MB/sec).

 

Microsoft OLTP database, 2 TB, with 72 GB 15K drives, RAID 10 on a SAN:

 

100 percent backed up.

BACKUP DATABASE successfully processed 244138625 pages in 5561.102 seconds (359.638 MB/sec).

 

 

Great

 

Processed 25342792 pages for database 'TUC', file 'data1' on file 1.

Processed 25215088 pages for database 'TUC', file 'data3' on file 1.

Processed 25215168 pages for database 'TUC', file 'data5' on file 1.

Processed 25214320 pages for database 'TUC', file 'data2' on file 1.

Processed 25214344 pages for database 'TUC', file 'data4' on file 1.

Processed 25215168 pages for database 'TUC', file 'data6' on file 1.

Processed 25213704 pages for database 'TUC', file 'data7' on file 1.

Processed 25214552 pages for database 'TUC', file 'data8' on file 1.

Processed 25215200 pages for database 'TUC', file 'data9' on file 1.

Processed 25214280 pages for database 'TUC', file 'data10' on file 1.

Processed 25214920 pages for database 'TUC', file 'data11' on file 1.

Processed 25321384 pages for database 'TUC', file 'data12' on file 1.

Processed 3 pages for database 'TUC', file 'log1' on file 1.

BACKUP DATABASE successfully processed 302810923 pages in 6010.430 seconds (412.720 MB/sec).

DB size = 2,365,619 MB

Backup time to the NUL: device: 1 hr. 40 min.

 

What to Do?

 

If you see slow throughput (a low MB/sec figure), the fix is usually to add more spindles to the database and spread the load across them, and/or to increase bandwidth to the disk by adding faster or additional controllers.
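On the database side, "spreading the load out" usually means adding data files that sit on the new spindles. The statement below is only a sketch of that idea and is not from the original post; the logical name, path, and sizes are hypothetical placeholders.

-- Hypothetical example: add a data file on a newly added spindle/LUN so IO is
-- spread across more disks. Logical name, path, and sizes are placeholders.
alter database siebelprd
add file (
    name = siebelprd_data2,
    filename = 'G:\SQLData\siebelprd_data2.ndf',
    size = 10GB,
    filegrowth = 1GB
);

After data has been moved or rebuilt onto the new files, rerun the backup to NUL: and compare the MB/sec figure.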

 

 

Warning!

 

This command runs great, except under the following conditions:

  1. If you run differential backups, don't run this command! The reason is that a differential backup picks up pages that have changed, and running a full backup resets the flag on each page. Because a backup to the NUL: device reads each page but does not write it to media, the flag is reset as if the page HAD been backed up… so the next differential backup does NOT pick it up. If you have run a backup to NUL: and you rely on differential backups, run a FULL backup immediately and everything will be fine. (On SQL Server 2005 and later there is a way around this; see the sketch after this list.)
  2. If you already have heavy disk IO, a backup to the NUL: device will only add more stress and slow things down further.
  3. Because of #2, run this command during off hours.
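If you are on SQL Server 2005 or later, one way around warning #1 is the COPY_ONLY option, which reads the whole database without resetting the differential base, so existing differential backups are unaffected. This sketch is my addition rather than part of the original post; siebelprd is just the example database name reused from above.

-- copy_only (SQL Server 2005 and later) does not reset the differential base,
-- so the next differential backup still picks up all changed pages.
backup database siebelprd to disk = 'NUL:' with copy_only, stats = 1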

Summary

 

Backup to the NUL: device is a handy command. It lets you compare the disk throughput of a laptop with that of a high-end database server on a SAN. It's a poor man's stress test and doesn't take much to set up, run… or interpret. It will not fix any of the problems, but it will show whether the database as a whole can sustain massive IO within its architecture.

 

 

 

 
