Wednesday, March 21, 2012

Exponential performance degradation with openxml the larger the message

Can anyone shed light on this?
I have been experiencing exponentially bad performance with OPENXML the
larger the xml message gets that it is processing. I have used SQL Profiler
to check what it going on - and it becomes obvious that SQL Server is stuck
at the OPENXML clause.
The table below shows this for 5000, 10000, and 20000 records. After the
table I have included the actual transact sql code that was in the profiler
at which it took that amount of time to process. The transact sql is loading
the XML in to a table variable.
The XML has a fairly flat structure. Nothing startling.
Please help. Its such a shame because OPENXML has been working so well for
us. I have also tried it on 100,000 records but the time it took to process
went through the roof!
records millis seconds minutes
-- -- -- --
20,000 122036 122 2.03
10,000 34169 34 0.57
5,000 10636 11 0.18
insert @.supplies
(
BranchNumber ,
RetailerCode ,
TitleCode ,
IssueYear ,
IssueNumber ,
InitialOrder ,
MainSupply ,
MainSupplyDate ,
TotalSupplyWithClaims ,
TotalReturns,
PmIndicator
)
select
BranchNumber,
RetailerCode,
TitleCode,
IssueYear,
IssueNumber,
InitialOrder,
MainSupply,
MainSupplyDate,
TotalSupplyWithClaims,
TotalReturns,
case lower ( PmIndicator ) when 'true' then 1 else 0 end
from
OPENXML ( @.iDoc, @.supplyXPath, 1 )
with
(
BranchNumber smallint '../@.Number',
RetailerCode smallint '@.RetailerCode',
TitleCode smallint '@.TitleCode',
IssueYear smallint '@.IssueYear',
IssueNumber smallint '@.IssueNumber',
InitialOrder integer '@.InitialOrder',
MainSupply integer '@.MainSupply',
MainSupplyDate datetime '@.MainSupplyDate',
TotalSupplyWithClaims integer '@.TotalSupplyWithClaims',
TotalReturns integer '@.TotalReturns',
PmIndicator varchar ( 5 ) '@.PmIndicator'
)
Hah! Found out the reason why...
its all because of this:../@.Number
Simply referencing the parent element made OPENXML run over 10 times slower!
As soon as I flattened the XML by one level, in effect negating the need to
get the Number value from the parent, OPENXML flew like a rocket!
Get a load of this: processing 20,000 records was taking 122 seconds and now
it takes 8.
A warning to everyone therefore: if you are working with large XML documents
then don't reference a parent element, make sure your xml is very flat!!!
"Xerox" <info@.thinkscape.com> wrote in message
news:#8TgOmoJFHA.2772@.TK2MSFTNGP14.phx.gbl...
> Can anyone shed light on this?
> I have been experiencing exponentially bad performance with OPENXML the
> larger the xml message gets that it is processing. I have used SQL
Profiler
> to check what it going on - and it becomes obvious that SQL Server is
stuck
> at the OPENXML clause.
> The table below shows this for 5000, 10000, and 20000 records. After the
> table I have included the actual transact sql code that was in the
profiler
> at which it took that amount of time to process. The transact sql is
loading
> the XML in to a table variable.
> The XML has a fairly flat structure. Nothing startling.
> Please help. Its such a shame because OPENXML has been working so well for
> us. I have also tried it on 100,000 records but the time it took to
process
> went through the roof!
>
> records millis seconds minutes
> -- -- -- --
> 20,000 122036 122 2.03
> 10,000 34169 34 0.57
> 5,000 10636 11 0.18
>
> insert @.supplies
> (
> BranchNumber ,
> RetailerCode ,
> TitleCode ,
> IssueYear ,
> IssueNumber ,
> InitialOrder ,
> MainSupply ,
> MainSupplyDate ,
> TotalSupplyWithClaims ,
> TotalReturns,
> PmIndicator
> )
> select
> BranchNumber,
> RetailerCode,
> TitleCode,
> IssueYear,
> IssueNumber,
> InitialOrder,
> MainSupply,
> MainSupplyDate,
> TotalSupplyWithClaims,
> TotalReturns,
> case lower ( PmIndicator ) when 'true' then 1 else 0 end
> from
> OPENXML ( @.iDoc, @.supplyXPath, 1 )
> with
> (
> BranchNumber smallint '../@.Number',
> RetailerCode smallint '@.RetailerCode',
> TitleCode smallint '@.TitleCode',
> IssueYear smallint '@.IssueYear',
> IssueNumber smallint '@.IssueNumber',
> InitialOrder integer '@.InitialOrder',
> MainSupply integer '@.MainSupply',
> MainSupplyDate datetime '@.MainSupplyDate',
> TotalSupplyWithClaims integer '@.TotalSupplyWithClaims',
> TotalReturns integer '@.TotalReturns',
> PmIndicator varchar ( 5 ) '@.PmIndicator'
> )
>
|||Thanks "Xerox"
This is indeed one of the caveats of using OpenXML that has to do with how
parent axes are implemented by the XPath engine.
If you cannot flatten, you could also use two OpenXML calls and use the
@.mp:id and @.mp:parentid metaproperties to then be able to join the data
together.
Best regards
Michael
"Xerox" <info@.thinkscape.com> wrote in message
news:OjWfW%23oJFHA.3500@.TK2MSFTNGP14.phx.gbl...
> Hah! Found out the reason why...
> its all because of this:../@.Number
> Simply referencing the parent element made OPENXML run over 10 times
> slower!
> As soon as I flattened the XML by one level, in effect negating the need
> to
> get the Number value from the parent, OPENXML flew like a rocket!
> Get a load of this: processing 20,000 records was taking 122 seconds and
> now
> it takes 8.
> A warning to everyone therefore: if you are working with large XML
> documents
> then don't reference a parent element, make sure your xml is very flat!!!
>
> "Xerox" <info@.thinkscape.com> wrote in message
> news:#8TgOmoJFHA.2772@.TK2MSFTNGP14.phx.gbl...
> Profiler
> stuck
> profiler
> loading
> process
>

No comments:

Post a Comment