Checkpoint causes need for better IO subsystem?

Discussion in 'microsoft.public.sqlserver.tools' started by donsql22222, Mar 8, 2006.

  1. donsql22222

    donsql22222 Guest

    Using Profiler and PerfMon, when there is a checkpoint, the durations of
    INSERTS and SELECTS increase to approx 5000ms ... up from 15-30M...which
    causes distress for clients and needs to be fixed.

    Note that not ALL of the INSERTS and SELECTS are afffected..perhaps during
    the 10 seconds that the checkpoint takes place..10% have a duration increase.

    It's during pereids of batch inserts that this happens which occurs many
    times during the day at odd intervals.. I've written about this before and
    someone suggested that the batch inserts take place off-peak. Can't be done.
    The nature of the business dictates otherwise.

    It's also been suggested that a better IO subsystem be installed. We're
    using a 168bit/sec controller card and using PerfMon and tracking data
    transfered over all of the hard drives, that during these batch inserts, the
    total IO bits/sec is not even half of the 168bit/sec capacity, i.e., the
    controller is able to handle the data.

    To answer your other question..The MDF, LDF and C: drive are all on their
    own physical separate disk drives and have been defragmented. These are huge
    130GB drives. There is 4GB of Ram on each server. Dual CPUs at 2396MHZ.

    Any Help appreciated.

    Don

    SQL 2000 SP4
     
    donsql22222, Mar 8, 2006
    #1
    1. Advertisements

  2. Checkpoints tend to be semi-random writes across the entire database file
    footprint. Batch inserts can be sequential or not, depending on whether
    your clustered index based on a monotonically increasing column. As such,
    the random write capability of the drives comes into play, not the data
    throughput limit. Besides, the theoretical limits stated by the
    manufacturers are under very narrowly defined conditions. If you believe
    manufacturer specs match up to SQL Server usage, I have a bridge I would
    like to offer for sale. :)

    Given that you are on a bus architecture disk subsystem, high write activity
    can block read activity, this causing your slow response. Five to ten
    seconds typically matches the duration of a normal checkpoint. A high-end
    disk subsystem with one or more gigabytes of cache and a full-duplex
    connection path can help. That translates to a Fibre-Channel connected SAN.

    I would also check on the Page Life Expectency performance counter. If it
    is low, you may benefit from more physical RAM in the server. This will
    allow more data to stay in cache longer, thus eliminating the need to
    constantly reload the data from the disks.

    Finally, you can change the clustered indexes to use a monotonically
    increasing key, thus making the data loads sequential and reducing the
    number of page splits, random IO operations, and overall server load during
    a data load.
     
    Geoff N. Hiten, Mar 8, 2006
    #2
    1. Advertisements

  3. donsql22222

    donsql22222 Guest

    The Page Life Expectency performance counter hoovers around 850..not sure if
    that's good or bad.
    There's still 700M of RAM available and SQL's set dynamically to use all 4GB
    of RAM if needed. I'm thinking if SQL needed more RAM, it's there for the
    taking.

    Not sure about how to setup a monotonically increasing key.

    Currently, the Clustered index is on multiple cols (2)... (Name, Date)
    Would a monotonically increasing key include a new column with an
    incrementing sequential value?
    such as this?

    (newvalue, Name, Date)

    don
     
    donsql22222, Mar 8, 2006
    #3
  4. 850 is a bit on the low side. 4000-10000 or higher is considered good. As
    it is, you are rewriting memory every 14 minutes. Not great.

    Identity columns provide monotonically increasing keys. SQL creates a
    clustered index out of your primary key by default, but that is not a
    requirement. You can separate the two.

    Narrow clustered indexes work better. Google the following string for some
    excellent articles on clustered index selection and its impact on
    performance:

    clustered index sql kimberly tripp
     
    Geoff N. Hiten, Mar 8, 2006
    #4
  5. Just a check: you have Enterprise Edition? Standard only handles 2GB of
    memory.
    Name does not sound like it would grow monotonically. :) Furthermore it
    sounds like something I would avoid in a clustred index. Since the
    clustered key is also the row-locator in a non-clustered index, a wide
    clustered index also make the NC indexes wide and less effecient.

    What about the date, is always today's date, or could it be far in
    the past? Dates are often good for monotonically clustered indexes.

    Of course, there may be other parts of the application that would
    perform less well, if there is no clustered index on name.

    One alternative is to create the clustered index with a low fill
    factor, say 50%. That would create gaps that newly inserted data
    can be filled into, and you would thus avoid page splits. This
    strategy would require you to routinely rebuild the index, to create
    new gaps. I learned this idea from SQL Server MVP Greg Linwood. He
    used GUIDs for this, and they are truely random. Nmaes may be less
    random and the strategy may work less well for names.
     
    Erland Sommarskog, Mar 8, 2006
    #5
  6. My strategy (stolen from Kimberly) is to create a clustered key from an
    identity column. Lest I provoke the Wrath of Celko(tm), I don't actually
    use that column anywhere in the application. It is simply to (a) force
    insert order at the end of the table, and (b) provide for a very narrow
    clustered key for index lookups and index intersection. It is a physical
    characteristic only and has no place in my logical data model. Thus, the
    Primary Key is materialized by a non-clustered index.

    --
    Geoff N. Hiten
    Senior Database Administrator
    Microsoft SQL Server MVP
     
    Geoff N. Hiten, Mar 8, 2006
    #6
  7. donsql22222

    donsql22222 Guest

    The Kimberly webcast of indexing was great. Thanks.
    Now, is this a monotonically increasing indexing scheme that I've created?

    Dropped all indexes.

    I added a new field of type Indentity, decimal.

    I then created a clustered unique index on this field.

    I then created a nonclustered index on Name, date.

    I'm still showing the problem indicated earlier...during checkpoints, some
    large increases in duration of some INSERTS and SELECTS.

    If this monotonically increasing that I've created looks correct, I might
    just leave it in as it sounds like it has some performance benefits.

    Thanks,
    Don
     
    donsql22222, Mar 14, 2006
    #7
  8. Kimberly is an excellent speaker. She is consistantly one of the top if not
    the top rated speaker at any conference where she presents.

    I usually use int or bigint for identity columns but decimal should be OK.
    I like int and bigint for index intersection tuning. The new index
    structure should help with caching and table fragmentation. You still may
    have an inadequate IO subsystem, but at least your load isn't artifically
    increased by a bad indexing scheme. I have had problems with checkpoints
    slowing down regular IO before on SCSI disk arrays. RAID level choice will
    have a drastic affect on how rapidly the subsystem can absorb data. See if
    you can estimate the size of the checkpoint using performance monitor. If
    it is over 300 MB or so, you probably will have to go to a SAN to completely
    remove the performance hit.

    --
    Geoff N. Hiten
    Senior Database Administrator
    Microsoft SQL Server MVP
     
    Geoff N. Hiten, Mar 14, 2006
    #8
  9. donsql22222

    donsql22222 Guest

    Kimberly really is an outstanding presenter! I can't say enough positive
    things about her indexing webcast. I'm a believer. I listened to it again,
    and will again this afternoon as there's things I pickup each time through.

    I'm feeling optimistic. I've put monotonically increasing indexes with
    bigint on the identify col on all the tables in the DB..even the small ones
    that were just heaps. I've got the LDF and MDF on their own defragged
    physical drives. And preliminary tests show that now the highest duration is
    approx 300ms during the checkpoint where it was 4000-5000ms for "some" of
    the INSERTS before this. So i'm hoping!

    There's only 9M records in the testDB so I'll not sure if the behavior will
    change with the production size of approx 1.5B records in each of 3 tables.
    Will be testing it in the next few days.
    btw, the size of the checkpoint is only around 150MG...that led me to think
    maybe it's not the IO and that maybe it's an indexing performance issue.
    Don
     
    donsql22222, Mar 16, 2006
    #9
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.