Sunday, February 19, 2012

Execution Plans/Performance Issues: Variables vs. Literals

I have a problem w/ two very simple queries yielding largely varying
execution plans and performance. One uses a variable (the fast one), and
one uses a literal (the slow one).
Here's the info:
@.@.version:
Microsoft SQL Server 2000 - 8.00.2039 (Intel X86) May 3 2005 23:18:38
Copyright (c) 1988-2003 Microsoft Corporation Enterprise Edition on Windows
NT 5.0 (Build 2195: Service Pack 4)
--QUERY 1: USING VARIABLE (FAST)
declare @.UserID int
select @.UserID = 54462
select Thread.ThreadID
from Thread
join PrivateThreadUser
on PrivateThreadUser.ThreadID = Thread.ThreadID
where PrivateThreadUser.UserID = @.UserID
--QUERY 2: USING LITERAL VALUE (SLOW)
select Thread.ThreadID
from Thread
join PrivateThreadUser
on PrivateThreadUser.ThreadID = Thread.ThreadID
where PrivateThreadUser.UserID = 54462
And here are the query costs between the two:
Query 1: Query cost (relative to batch): 5.60%
Query 2: Query cost (relative to batch): 94.40%
To see the execution plans for the two, see these files:
Text file:
http://www.animalcrossingcommunity.com/ExecPlan.txt
Excel spreadsheet (full details):
http://www.animalcrossingcommunity.com/ExecPlan.xls
Why in the world does Query 2 yield a MUCH -- 17 times -- less efficient
query plan?
This originally started out as a stored proc issue, which I later found that
may be related to parameter sniffing. But now that I've narrowed it down to
two straight queries in QA -- one using a variable, one using a literal --
this doesn't seem to be an issue of parameter sniffing, as I've read this is
only related to stored proc parameters.
Any insight as to the logic SQL Server is using here would be appreciated.
Also, any insight as to what logic I should use when trying to
diagnose/address these issues would be appreciated as well.
Thanks in advance.
JeradJerad,
There could be several reasons for this behavior:
1) Maybe your statistics are not up to date. When in doubt, run UPDATE
STATISTICS, preferably WITH FULLSCAN
2) The distribution of the UserID and/or ThreadID column could be
skewed. If the query returns just a few rows for UserID=54462, but many
rows for a different UserID, then the optimizer will optimize for the
worst case scenario when using variables, because it will have no prior
knowledge of the actual @.UserID it will be running with.
HTH,
Gert-Jan
Jerad Rose wrote:
> I have a problem w/ two very simple queries yielding largely varying
> execution plans and performance. One uses a variable (the fast one), and
> one uses a literal (the slow one).
> Here's the info:
> @.@.version:
> Microsoft SQL Server 2000 - 8.00.2039 (Intel X86) May 3 2005 23:18:38
> Copyright (c) 1988-2003 Microsoft Corporation Enterprise Edition on Windo
ws
> NT 5.0 (Build 2195: Service Pack 4)
> --QUERY 1: USING VARIABLE (FAST)
> declare @.UserID int
> select @.UserID = 54462
> select Thread.ThreadID
> from Thread
> join PrivateThreadUser
> on PrivateThreadUser.ThreadID = Thread.ThreadID
> where PrivateThreadUser.UserID = @.UserID
> --QUERY 2: USING LITERAL VALUE (SLOW)
> select Thread.ThreadID
> from Thread
> join PrivateThreadUser
> on PrivateThreadUser.ThreadID = Thread.ThreadID
> where PrivateThreadUser.UserID = 54462
> And here are the query costs between the two:
> Query 1: Query cost (relative to batch): 5.60%
> Query 2: Query cost (relative to batch): 94.40%
> To see the execution plans for the two, see these files:
> Text file:
> http://www.animalcrossingcommunity.com/ExecPlan.txt
> Excel spreadsheet (full details):
> http://www.animalcrossingcommunity.com/ExecPlan.xls
> Why in the world does Query 2 yield a MUCH -- 17 times -- less efficient
> query plan?
> This originally started out as a stored proc issue, which I later found th
at
> may be related to parameter sniffing. But now that I've narrowed it down
to
> two straight queries in QA -- one using a variable, one using a literal --
> this doesn't seem to be an issue of parameter sniffing, as I've read this
is
> only related to stored proc parameters.
> Any insight as to the logic SQL Server is using here would be appreciated.
> Also, any insight as to what logic I should use when trying to
> diagnose/address these issues would be appreciated as well.
> Thanks in advance.
> Jerad|||1. Are you absolutely certain you don't have the fast and slow
reversed? Frequently we see the variable version slower than a
literal version.
2. What is the datatype of PrivateThreadUser.UserID? I don't see an
explicit CONVERT in your execution plans, but even slightly misaligned
datatypes can cause the plan to go wild.
J.
On Sat, 17 Sep 2005 21:39:29 -0400, "Jerad Rose" <no@.spam.com> wrote:
>I have a problem w/ two very simple queries yielding largely varying
>execution plans and performance. One uses a variable (the fast one), and
>one uses a literal (the slow one).
>Here's the info:
>@.@.version:
>Microsoft SQL Server 2000 - 8.00.2039 (Intel X86) May 3 2005 23:18:38
>Copyright (c) 1988-2003 Microsoft Corporation Enterprise Edition on Window
s
>NT 5.0 (Build 2195: Service Pack 4)
>--QUERY 1: USING VARIABLE (FAST)
>declare @.UserID int
>select @.UserID = 54462
>select Thread.ThreadID
>from Thread
>join PrivateThreadUser
>on PrivateThreadUser.ThreadID = Thread.ThreadID
>where PrivateThreadUser.UserID = @.UserID
>--QUERY 2: USING LITERAL VALUE (SLOW)
>select Thread.ThreadID
>from Thread
>join PrivateThreadUser
>on PrivateThreadUser.ThreadID = Thread.ThreadID
>where PrivateThreadUser.UserID = 54462
>And here are the query costs between the two:
>Query 1: Query cost (relative to batch): 5.60%
>Query 2: Query cost (relative to batch): 94.40%
>To see the execution plans for the two, see these files:
>Text file:
>http://www.animalcrossingcommunity.com/ExecPlan.txt
>Excel spreadsheet (full details):
>http://www.animalcrossingcommunity.com/ExecPlan.xls
>Why in the world does Query 2 yield a MUCH -- 17 times -- less efficient
>query plan?
>This originally started out as a stored proc issue, which I later found tha
t
>may be related to parameter sniffing. But now that I've narrowed it down t
o
>two straight queries in QA -- one using a variable, one using a literal --
>this doesn't seem to be an issue of parameter sniffing, as I've read this i
s
>only related to stored proc parameters.
>Any insight as to the logic SQL Server is using here would be appreciated.
>Also, any insight as to what logic I should use when trying to
>diagnose/address these issues would be appreciated as well.
>Thanks in advance.
>Jerad
>|||Hi Gert-Han.
1) I actually tried UPDATE STATISTICS already (meant to post that), but I
went ahead and tried it again WITH FULLSCAN, and still got the same results.
2) The highest saturation for one user is 913 of 26540 rows (3.44%
saturation), so I wouldn't think this would be the case.
Any other ideas?
Thanks for your help.
Jerad
"Gert-Jan Strik" <sorry@.toomuchspamalready.nl> wrote in message
news:432D51BC.E14A51BA@.toomuchspamalready.nl...[vbcol=seagreen]
> Jerad,
> There could be several reasons for this behavior:
> 1) Maybe your statistics are not up to date. When in doubt, run UPDATE
> STATISTICS, preferably WITH FULLSCAN
> 2) The distribution of the UserID and/or ThreadID column could be
> skewed. If the query returns just a few rows for UserID=54462, but many
> rows for a different UserID, then the optimizer will optimize for the
> worst case scenario when using variables, because it will have no prior
> knowledge of the actual @.UserID it will be running with.
> HTH,
> Gert-Jan
>
> Jerad Rose wrote:|||Hi J.
1) Positive. I thought this too, as I had read about parameter sniffing
causing problems, but in this case, the literal is the slow one. Also, if
you look at the execution plan (text file), you'll see that it's using an
Index Scan on the literal query.
2) It is an int (64-bit). And referencial integrity is enforced on these
tables, so it is required that the datatypes match between the FKs and PKs.
Any other ideas?
Thanks for your help.
Jerad
"JXStern" <JXSternChangeX2R@.gte.net> wrote in message
news:8rhri19u3e8e988cj2sih5qha8iltstjdj@.
4ax.com...
> 1. Are you absolutely certain you don't have the fast and slow
> reversed? Frequently we see the variable version slower than a
> literal version.
> 2. What is the datatype of PrivateThreadUser.UserID? I don't see an
> explicit CONVERT in your execution plans, but even slightly misaligned
> datatypes can cause the plan to go wild.
> J.
>
> On Sat, 17 Sep 2005 21:39:29 -0400, "Jerad Rose" <no@.spam.com> wrote:
>|||On Mon, 19 Sep 2005 01:11:50 -0400, "Jerad Rose" <no@.spam.com> wrote:
>Hi J.
>1) Positive. I thought this too, as I had read about parameter sniffing
>causing problems, but in this case, the literal is the slow one. Also, if
>you look at the execution plan (text file), you'll see that it's using an
>Index Scan on the literal query.
Hmmm.

>2) It is an int (64-bit). And referencial integrity is enforced on these
>tables, so it is required that the datatypes match between the FKs and PKs.
So, your variable is an int (32), and your column is an int (64)? But
that's the fast one. Hmm.
Well, heck, I guess you can just use the variable. Maybe I should try
that on some of my own code? Hmm, ...
J.|||That is remarkable. You must be having a hot cache for the Literal query
plan to actually run *slower* than the Variable query plan.
One possible reason could be that with a cold cache (no relevant data
pages in memory) the Literal query plan is in fact faster. If you have a
narrow index, a wide table and many rows, then 3.44% could be enough to
switch from Lookups to an index scan. Especially since the nonclustered
index is covering. If you add one other column of table Thread to the
query this might also change things.
You could test this by running the following lines before each
execution:
--clear proc cache
DBCC FREEPROCCACHE
--write dirty pages to disk
CHECKPOINT
--free clean buffers
DBCC DROPCLEANBUFFERS
If you still consider it to be a problem (for example because you know
you will always be running with a hot cache), then you could add a LOOP
join hint. If not, then I would simply accept the less-then-optimal
query plan.
HTH,
Gert-Jan
Jerad Rose wrote:[vbcol=seagreen]
> Hi Gert-Han.
> 1) I actually tried UPDATE STATISTICS already (meant to post that), but I
> went ahead and tried it again WITH FULLSCAN, and still got the same result
s.
> 2) The highest saturation for one user is 913 of 26540 rows (3.44%
> saturation), so I wouldn't think this would be the case.
> Any other ideas?
> Thanks for your help.
> Jerad
> "Gert-Jan Strik" <sorry@.toomuchspamalready.nl> wrote in message
> news:432D51BC.E14A51BA@.toomuchspamalready.nl...

No comments:

Post a Comment