Friday, February 17, 2012

Execution Plan Mystery

I was hoping someone could shed some light on wierd situation i'm experiencing. I have the following query:

select count(*) LeadCount
from auto_leads al
where received > dbo.GetDay(GetDate())

dbo.GetDay simply returns a smalldatetime value of today's date.

Now I recently got thrown into a data mess and for some reason this query takes 8 seconds to run. Now the first thing I did was update the stats on the Received column of this auto_leads table. I re-run the query and I'm still getting 8 seconds. I look at the execution plan I can make figure out why this is happening.

I then change the above query so the filter received > dbo.GetDay(GetDate()) is now just received > '5/31/2006' and the query comes back immediately. This doesn't make sense to me because the GetDay function is really simple and comes back immediately. I then try the following query to confirm it isn't a problem with the GetDay function:

declare @.Today DateTime

set @.Today = dbo.getday(GetDate())

select count(*) leads
from auto_leads al
join type_lead_status tls on (tls.type_lead_status_id = al.type_lead_status_id)
where received > @.Today

Sure enough, the query came back immediately. Next thing to go through my mind is that the query execution plan has been cached by SQL Server using the execution plan from before I updated the stats on the received column. So I executed sp_recompile 'auto_leads' and tryed the original query again. Still taking 8-10 seconds to come back.

So my question, is why when I remove the GetDay function call in my query filter is the query slow, as opposed to me just passing a variable into the query? Thanks!

- James

Look at the execution plans. The query optimizer handles literals (i.e. '5/31/2006') and variables different than calculated values (i.e. dbo.getday(GetDate()))|||

Ya the execution plans are different. How does it handle literal's and calculated values differently? What do I need to do to get calculated values to be handled the same as literal's in terms of performance? Thanks!

- James

|||

Is there a reason you can't go with your last attempt (putting the search value into a variable and then using that)?

That would ideally be the way to go. Otherwise, you can use the hints to "force" the usage of an index. That would look something like

from myTable WITH (INDEX(myindex))

(look up TABLE HINT in BOL for a full explanation)

If you're using 2005, take a look at USE PLAN

|||

If the value of a variable or value cannot be determined at compile time then the query optimizer takes a guess and generates a plan accordingly. In your case, you should first do away with the scalar UDF in the WHERE clause. Using scalar UDFs in SELECT statements have lot of performance implications. If the logic/expression in the scalar UDF is not complex then you should inline the expression in the WHERE clause or SELECT list to get the best performance possible. So rewrite the WHERE clause like:

where received > CAST(GetDate() as smalldatetime)

Please take a look at the white paper below for more discussion on how query plans are cached and reused. It also talks about parameter sniffing which helps in determining the value of a variable at compilation time so that the optimizer can generate a better plan based on the value/statistics.

http://www.microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx

|||

Thanks Umachandar! That's a great article.

- James

No comments:

Post a Comment