Friday, March 9, 2012

expectations for SUM query on 50+ million rows

what's a reasonable expectation for a query like this run against a
table with 50+ million rows with SQL 2005 Standard:
SELECT
NAME, SUM(Tickets) as NumTickets
FROM
MYTABLE
WHERE
TicketDate BETWEEN @.StartDate and @.EndDate
GROUP BY
NAME
ORDER BY
NumTickets DESC
let's assume my table is properly indexed, and the server machine is a
dual xeon 2.8 with 2GB of memory, and plain jane SATA drives in RAID-1.
also assume the result rows (between start and enddate) could be
anywhere from 5k to 500k rows, and there's minimal load on the server.
any takers? I'm looking for a ballpark of how well SQL should be able
to handle this...and things to consider to speed this query up.
thanks!do you need sum or count?
take a look at this count is 2 but sum is 3
select count(tickets),sum(tickets)
from (select 1 as tickets union all
select 2) x
http://sqlservercode.blogspot.com/|||"Jeff Turner" <zigjst@.gmail.com> wrote in message
news:1142527278.244991.109940@.e56g2000cwe.googlegroups.com...
> what's a reasonable expectation for a query like this run against a
> table with 50+ million rows with SQL 2005 Standard:
> SELECT
> NAME, SUM(Tickets) as NumTickets
> FROM
> MYTABLE
> WHERE
> TicketDate BETWEEN @.StartDate and @.EndDate
> GROUP BY
> NAME
> ORDER BY
> NumTickets DESC
> let's assume my table is properly indexed, and the server machine is a
> dual xeon 2.8 with 2GB of memory, and plain jane SATA drives in RAID-1.
> also assume the result rows (between start and enddate) could be
> anywhere from 5k to 500k rows, and there's minimal load on the server.
> any takers? I'm looking for a ballpark of how well SQL should be able
> to handle this...and things to consider to speed this query up.
>
Test it.
David|||how many dates are there?
is there an index on ticketdate?
is there a composite, non-clustered index on ticketdate,name,tickets ?
i'd try changing the where clause to
WHERE
TicketDate >= @.StartDate and
TicketDate <= @.EndDate
Just a personal deal - sometimes between isn't quite obnoxious enough
on the hint to the compiler.
how many bytes is the average row? if the average row is 1k, then
gonna take a long time. if average row is 50 bytes, then i'll say 1
second per 10k rows, given the perfect index.|||On 16 Mar 2006 08:41:18 -0800, Jeff Turner wrote:

>what's a reasonable expectation for a query like this run against a
>table with 50+ million rows with SQL 2005 Standard:
(snip query and hardware description)
Hi Jeff,
I have no idea - you'll have to test it.
So why do I bother to reply? Becuase you also ask for ...

>...and things to consider to speed this query up.
Try the following two indexes. Hard to predict which one of them will be
the best.
CREATE NONCLUSTERED INDEX SomeName
ON MyTable (Name, TicketDate, Tickets)
*** or ***
CREATE NONCLUSTERED INDEX SomeName
ON MyTable (TicketDate, Name, Tickets)
Hugo Kornelis, SQL Server MVP

No comments:

Post a Comment