hello Guru's.
Environment:
SQL 7.0 SP4
TableA:
col1 int Primary Key NONCLUSTERED identity NOT NULL
col2 int NOT NULL
col2 nvarchar(255) NOT NULL
No other indexes
select distinct COL2 from tableA
The execution plan performs a hash/match aggregate operation
however
TableB:
col1 int Primary Key NONCLUSTERED identity NOT NULL
col2 nvarchar(255) NOT NULL
col3 nvarchar(255) NOT NULL
col4 nvarchar(255) NOT NULL
col5 nvarchar(4000) NULL
No other indexes
The execution plan performs a SORT/DISTINCT SORT operation
select distinct COL2 from tableA
Anyone know why this is happening? The reason why I am looking into this is
because development ran into a issue where the first query returned data in
sorted order, now occording to them, the data is not sorted. Of course I
explained that since there is no clustered index the data returned is
unpredictable. I have since added a clustered index to TableA to resolve
their issue but I need to know why the second query performs the sort with n
o
order by clause.
I have tested this on 2000 and both queries perform the SORT/DISTINCT SORT
operation.
Itzik if you are out there I appreciate your feedback too.Hi Fred
It doesn't matter whether a Clustered Index is present or not. If you don't
explicitly use an ORDER BY, you're implicitly stating that you don't care
about the order of output. If you really want sorted output, use ORDER BY.
As for the hash match aggregate qn, you have col2 listed twice in TableA, so
I'm a bit confused about what's really going on here. Can you provide some
more information about how much data is in these tables please?
Regards,
Greg Linwood
SQL Server MVP
http://blogs.sqlserver.org.au/blogs/greg_linwood
"FredG" <FredG@.discussions.microsoft.com> wrote in message
news:9E4B8155-C3FC-47BC-9664-2AA3C83E695E@.microsoft.com...
> hello Guru's.
> Environment:
> SQL 7.0 SP4
> TableA:
> col1 int Primary Key NONCLUSTERED identity NOT NULL
> col2 int NOT NULL
> col2 nvarchar(255) NOT NULL
> No other indexes
> select distinct COL2 from tableA
> The execution plan performs a hash/match aggregate operation
> however
> TableB:
> col1 int Primary Key NONCLUSTERED identity NOT NULL
> col2 nvarchar(255) NOT NULL
> col3 nvarchar(255) NOT NULL
> col4 nvarchar(255) NOT NULL
> col5 nvarchar(4000) NULL
> No other indexes
> The execution plan performs a SORT/DISTINCT SORT operation
> select distinct COL2 from tableA
> Anyone know why this is happening? The reason why I am looking into this
> is
> because development ran into a issue where the first query returned data
> in
> sorted order, now occording to them, the data is not sorted. Of course I
> explained that since there is no clustered index the data returned is
> unpredictable. I have since added a clustered index to TableA to resolve
> their issue but I need to know why the second query performs the sort with
> no
> order by clause.
> I have tested this on 2000 and both queries perform the SORT/DISTINCT SORT
> operation.
> Itzik if you are out there I appreciate your feedback too.
>|||Greg is 100% correct. The order in which the data is returned is only
guaranteed with an ORDER BY clause. No ifs ands or buts.
Andrew J. Kelly SQL MVP
"FredG" <FredG@.discussions.microsoft.com> wrote in message
news:9E4B8155-C3FC-47BC-9664-2AA3C83E695E@.microsoft.com...
> hello Guru's.
> Environment:
> SQL 7.0 SP4
> TableA:
> col1 int Primary Key NONCLUSTERED identity NOT NULL
> col2 int NOT NULL
> col2 nvarchar(255) NOT NULL
> No other indexes
> select distinct COL2 from tableA
> The execution plan performs a hash/match aggregate operation
> however
> TableB:
> col1 int Primary Key NONCLUSTERED identity NOT NULL
> col2 nvarchar(255) NOT NULL
> col3 nvarchar(255) NOT NULL
> col4 nvarchar(255) NOT NULL
> col5 nvarchar(4000) NULL
> No other indexes
> The execution plan performs a SORT/DISTINCT SORT operation
> select distinct COL2 from tableA
> Anyone know why this is happening? The reason why I am looking into this
> is
> because development ran into a issue where the first query returned data
> in
> sorted order, now occording to them, the data is not sorted. Of course I
> explained that since there is no clustered index the data returned is
> unpredictable. I have since added a clustered index to TableA to resolve
> their issue but I need to know why the second query performs the sort with
> no
> order by clause.
> I have tested this on 2000 and both queries perform the SORT/DISTINCT SORT
> operation.
> Itzik if you are out there I appreciate your feedback too.
>|||Sorry TableA
The second Col2 should be Col3
TableA has 3K rows
TableB has 1K rows
If that is the case then tell me why tableB without the clustered index is
returning data in sorted order every time the results are returned. Also,
tell me why when I added the clustered index the returned data was sorted
when I remove the clustered index the data was not sorted. I am sorry but
seeing is believing and there more to this than just "no if ands or buts"
So how you doing Andrew? Fred G. from SEI.
"Andrew J. Kelly" wrote:
> Greg is 100% correct. The order in which the data is returned is only
> guaranteed with an ORDER BY clause. No ifs ands or buts.
> --
> Andrew J. Kelly SQL MVP
> "FredG" <FredG@.discussions.microsoft.com> wrote in message
> news:9E4B8155-C3FC-47BC-9664-2AA3C83E695E@.microsoft.com...
>
>|||http://www.sqlmag.com/articles/inde...articleid=92887
"Andrew J. Kelly" wrote:
> Greg is 100% correct. The order in which the data is returned is only
> guaranteed with an ORDER BY clause. No ifs ands or buts.
> --
> Andrew J. Kelly SQL MVP
> "FredG" <FredG@.discussions.microsoft.com> wrote in message
> news:9E4B8155-C3FC-47BC-9664-2AA3C83E695E@.microsoft.com...
>
>|||The problem is you are only seeing a part of the picture
Whilst the execution plan SQL Server has chosen might be providing sorted
output presently, remember that SQL Server can arbitrarily chose to change
the plan at will & provide unsorted output if you haven't explicitly stated
you want sorted output.
One example of how this can happen is where you add more CPUs to your
machine & SQL Server decides to use parallelism to process the query - as
each thread scans data & writes output buffers, output will be interleaved
based on wherever each thread's scan is up to rather than written in an
overall sort order. At a later stage, SQL Server might then decide to stop
using parallelism & run the query serially on a single CPU (perhaps in
response to increased user activity) & you might see your output sorted
again.
Another example is that version upgrades & service packs change the way SQL
Server processes queries internally & your output might be affected as a
result - this is actually a very common problem that inexperienced
developers experience when they don't take good advice from experienced
database professionals - just search Google & you will find many sorry
stories in this area (c:
Regards,
Greg Linwood
SQL Server MVP
http://blogs.sqlserver.org.au/blogs/greg_linwood
"FredG" <FredG@.discussions.microsoft.com> wrote in message
news:A7FB8ECD-6209-4057-B3B1-C8EAA40027BE@.microsoft.com...[vbcol=seagreen]
> Sorry TableA
> The second Col2 should be Col3
> TableA has 3K rows
> TableB has 1K rows
> If that is the case then tell me why tableB without the clustered index is
> returning data in sorted order every time the results are returned. Also,
> tell me why when I added the clustered index the returned data was sorted
> when I remove the clustered index the data was not sorted. I am sorry but
> seeing is believing and there more to this than just "no if ands or buts"
> So how you doing Andrew? Fred G. from SEI.
> "Andrew J. Kelly" wrote:
>|||Here is the entire article by Itzik
http://www.sqlmag.com/articles/inde...articleid=92886
http://www.sqlmag.com/articles/inde...articleid=92887
http://www.sqlmag.com/articles/inde...articleid=92888
"Andrew J. Kelly" wrote:
> Greg is 100% correct. The order in which the data is returned is only
> guaranteed with an ORDER BY clause. No ifs ands or buts.
> --
> Andrew J. Kelly SQL MVP
> "FredG" <FredG@.discussions.microsoft.com> wrote in message
> news:9E4B8155-C3FC-47BC-9664-2AA3C83E695E@.microsoft.com...
>
>|||I have tried to explain to the developers that that a query without an order
by is unpredictable in regards to sorted data. However, its obvious that I
need to study more on the internals and perform tests. Moreover, once they
told me that this particular query has been running in production for a few
years without issues I could not explain to them why it was now broken. My
understanding of how data is stored became confused. I really appreciate you
r
response and Andrew's as well.
"Greg Linwood" wrote:
> The problem is you are only seeing a part of the picture
> Whilst the execution plan SQL Server has chosen might be providing sorted
> output presently, remember that SQL Server can arbitrarily chose to change
> the plan at will & provide unsorted output if you haven't explicitly state
d
> you want sorted output.
> One example of how this can happen is where you add more CPUs to your
> machine & SQL Server decides to use parallelism to process the query - as
> each thread scans data & writes output buffers, output will be interleaved
> based on wherever each thread's scan is up to rather than written in an
> overall sort order. At a later stage, SQL Server might then decide to stop
> using parallelism & run the query serially on a single CPU (perhaps in
> response to increased user activity) & you might see your output sorted
> again.
> Another example is that version upgrades & service packs change the way SQ
L
> Server processes queries internally & your output might be affected as a
> result - this is actually a very common problem that inexperienced
> developers experience when they don't take good advice from experienced
> database professionals - just search Google & you will find many sorry
> stories in this area (c:
> Regards,
> Greg Linwood
> SQL Server MVP
> http://blogs.sqlserver.org.au/blogs/greg_linwood
> "FredG" <FredG@.discussions.microsoft.com> wrote in message
> news:A7FB8ECD-6209-4057-B3B1-C8EAA40027BE@.microsoft.com...
>
>|||"FredG" <FredG@.discussions.microsoft.com> wrote in message
news:75ABF0D6-345B-4595-B94D-0A42D5B8DDE1@.microsoft.com...[vbcol=seagreen]
>I have tried to explain to the developers that that a query without an
>order
> by is unpredictable in regards to sorted data. However, its obvious that I
> need to study more on the internals and perform tests. Moreover, once they
> told me that this particular query has been running in production for a
> few
> years without issues I could not explain to them why it was now broken. My
> understanding of how data is stored became confused. I really appreciate
> your
> response and Andrew's as well.
> "Greg Linwood" wrote:
>|||Hey Fred. Sorry I didn't recognize you. I see you already found some links
so I won't go into too many details. But the bottom line is that while it
may return results (or appear to) this can not be trusted and can change
with service packs and versions. Let me know if the links you found don't
answer all the questions. Hope all is well.
Andrew J. Kelly SQL MVP
"FredG" <FredG@.discussions.microsoft.com> wrote in message
news:A7FB8ECD-6209-4057-B3B1-C8EAA40027BE@.microsoft.com...[vbcol=seagreen]
> Sorry TableA
> The second Col2 should be Col3
> TableA has 3K rows
> TableB has 1K rows
> If that is the case then tell me why tableB without the clustered index is
> returning data in sorted order every time the results are returned. Also,
> tell me why when I added the clustered index the returned data was sorted
> when I remove the clustered index the data was not sorted. I am sorry but
> seeing is believing and there more to this than just "no if ands or buts"
> So how you doing Andrew? Fred G. from SEI.
> "Andrew J. Kelly" wrote:
>
No comments:
Post a Comment