Content Query Web Part (CQWP), Cross List Query Nightmare Part 3

Tuesday, 30 October 2007 01:24 by RanjanBanerji

Cross List Query Bug

 

In Part 2 I talked about how I had to create an extended version of the CQWP in order to overcome the fact that Microsoft neglected to provide its users with the ability to create queries that could use relative dates.  Just when I thought my days of fixing SharePoint 2007 issues were over I was notified of yet another odd behaviour.

It started with a report that said that queries created using the CQWP would not work if I created a filter using a Multiple Choice custom site column.  I tested the application in question and could repeatedly reproduce the errors.  In fact I noticed that MS Cross List Query functionality was generating some rather odd SQL.

However, when I created a new application I could not reproduce this error.  So I came to the conclusion that something was wrong with the Content Type in the application in question.  So I created a new Content Type with new custom site columns, of which one was a multiple choice check box column.  The error surfaced here too.  My final conclusion was that I am dealing with a bad DB.

As a temporary work around I suggested that we use a single line text site column in which the user will type the values that he/she would have otherwise selected by selecting check boxes.  Not as elegant, but one can capture the data they required with certain amounts of typos etc. 

To demonstrate this work around I created a custom site column of type single line text and then added it to my Content Type.  I then edited several pages in several sites in my site collection and added values for this new site column.  Following this I set up a CQWP with a filter that would query information based on what is in this new single line text column.  So for example the CQWP was set to retrieve all data from the site collection where the Content Type was X and field A had a value that contained "abc".

The results that I obtained were quite shocking and confusing.  What appeared to be a simple query and a simple case did not work.  In some cases I received data and some cases I did not.  The driving factor was whether the query was for the entire site collection or for a certain site and its children.

It took me quite a bit of research to find a way to reliably reproduce the error.  I have since then had many discussions with Microsoft's tech support and I believe this will soon be released as an official bug.

This bug is of particular importance because it is entirely possible that when you set up a CQWP for a query you are not getting all the data you should be, but because you are not intimately knowledgeable about the data you will be unaware that the query has not given you all that you seek.  Make sense?  No?  Instead of getting 10 pages you will get 6, but if you did not know you were supposed to get 10 you may believe that 6 is the right answer.

Dissecting SharePoint 2007 Content Type and Site Column Creation and the Cross List Query

When you create a new Content Type, lets call it, ContentTypeA, SharePoint 2007 puts it in the table ContentTypes and gives it an ID.  Now lets say you create site column 1 and 2 as two custom site columns that are assigned to this Content Type.  You can get the information on the Content Type by running a query like:

select tp_ContentType 
        ,tp_ContentTypeId 
        ,tp_ListId, nvarchar20, nvarchar24  
    from Alluserdata 
    where tp_ContentType like '%ContentTypeA%'

Now when you actually create a page using this content type for a certain site in your site collection SharePoint create a new ID for your content type for the site in which you created the page.  So if you create 2 pages in 5 sites each you will have a total of 10 pages and 5 + 1(original) IDs for the same Content Type (ContentTypeA) that you created.  Weird huh?  As though this is not bad enough, there is no guarantee that SharePoint will put data for Site Columns 1 and 2 in the same columns in table AllUserData (which is where SharePoint stores all your data).  What does this mean?

Imagine you create a Content Type called Person with two site Columns, FirstName and LastName.  Now you create Person Pages using a Person Page Layout in several sites in your site collection.  It is entirely possible that SharePoint will store FirstName in allUserData column nVarchar1 for sites 1, 2, and 3 and in nVarchar2 for the remaining.

This behaviour is easy to reproduce the following way:

  • Create a site collection with one root node with 10 children.  Give each child another 10 children each and give each of those 5 each.  Essentially giving you 500 plus sites in your site collection.  Why so many?  I find that the problem manifests itself faster in this case.

  • Create 5 site columns SC1, SC2, SC3, SC4, and SC5.

  • Create a Content Type TypeA by inheriting from Publishing Page.  Assign site columns SC1, SC2, SC3 in the order mentioned, i.e., SC1 first then 2 and then SC3.

  • Create pages using this content type randomly across your site collection.

  • Create a Content Type TypeB by inheriting from Publishing Page.  Assign it site columns SC3, Sc4, SC1, SC5, SC2.  Can you see how I am jumbling the order in which SC1, SC2, and SC3 are used in this Content Type vs the other?  You may say, how is this relevant?  I wondered too.  But stay with me  :-)

  • Now go ahead and create content using the new Content Type TypeB.  create content in various sites and keep a tab on what you have created and where.

  • Now create a CQWP querying for data from Content Type TypeB for the entire site collection.  There is an extremely high likelihood that you will not get all the data you expected.  based on what you set as the source of your CQWP (site collection or a specific site and its children) you may or may not get any data.

Why this strange behaviour?  Well its because:

  • SharePoint stores data for a site column in different columns in table allUserData for different sites.

  • SharePoint then generates queries that account for the fact that site columns are stored in different columns in table allUserData.

  • The queries SharePoint generates are faulty.

Here is an example.  Please note that I have modified the actual SQL generated by SharePoint a little to assist with my debugging.  But essentially the SQL is what SharePoint will generate.

SELECT 
    UserData.[tp_Created] AS QryCol0,UserData.[tp_ID] AS QryCol1,ListOrds.[WebId] AS QryCol2,ListOrds.[ListId] AS QryCol3,
    t1.[TimeCreated] AS QryCol4,
    CASE 
        WHEN DATALENGTH(t1.DirName) = 0 THEN t1.LeafName 
        WHEN DATALENGTH(t1.LeafName) = 0 THEN t1.DirName 
        ELSE t1.DirName + N'/' + t1.LeafName 
    END AS QryCol5,
    UserData.[tp_ModerationStatus] AS QryCol6,UserData.[tp_Level] AS QryCol7,UserData.[ntext1] AS QryCol8,
    UserData.[nvarchar7] AS QryCol9,UserData.[tp_Modified] AS QryCol10,UserData.[tp_Author] AS QryCol11,
    t2.[nvarchar1] AS QryCol12,t2.[tp_ID] AS QryCol13,t2.[nvarchar4] AS QryCol14,t2.[nvarchar5] AS QryCol15,
    t2.[tp_Created] AS QryCol16,UserData.[tp_Editor] AS QryCol17,t3.[nvarchar1] AS QryCol18,
    t3.[tp_ID] AS QryCol19,t3.[nvarchar4] AS QryCol20,t3.[nvarchar5] AS QryCol21,t3.[tp_Created] AS QryCol22,
    UserData.[ntext3] AS QryCol23,NULL AS QryCol24,UserData.[ntext2] AS QryCol25,
     -- NOTE hwo the case statement accounts for the fact that my data could be in any of these columns
     -- Now go down to the WHERE clause.
    CASE
        WHEN ListOrds.ColumnOrd27=2 THEN UserData.[nvarchar25] 
        WHEN ListOrds.ColumnOrd27=1 THEN UserData.[nvarchar27] 
        WHEN ListOrds.ColumnOrd27=3 THEN UserData.[nvarchar23] 
    END AS QryCol26 
FROM RBListOrd AS ListOrds 
-- RBListOrd is a table I created to store data that would otherwise go into a temp table.  This was done to make
-- debuggin easier.
    INNER LOOP JOIN UserData WITH(NOLOCK) 
    ON 
        ListOrds.ListId=UserData.tp_ListId AND 
        ListOrds.JoinOrd=0  
    INNER JOIN Docs AS t1 WITH(NOLOCK) 
    ON 
        ( 
            1 = 1  AND 
            UserData.[tp_RowOrdinal] = 0 AND 
            t1.SiteId = UserData.tp_SiteId AND 
            t1.SiteId = '82EA27AB-B99D-4767-88A3-E2433BA0D3DD' AND 
            t1.DirName = UserData.tp_DirName  AND 
            t1.LeafName = UserData.tp_LeafName  AND 
            t1.Level = UserData.tp_Level  AND  
            (
                UserData.tp_Level = 255 AND t1.LTCheckoutUserId =79 OR 
                (
                    UserData.tp_Level = 1 AND 
                    (
                        UserData.tp_DraftOwnerId IS NULL OR  
                        (
                            UserData.tp_DraftOwnerId <>79 AND 
                            t1.ScopeId<>ListOrds.ScopeId
                        )
                    ) OR 
                    UserData.tp_Level = 2 AND 
                    (
                        UserData.tp_DraftOwnerId = 79 OR 
                        t1.ScopeId=ListOrds.ScopeId
                    )
                ) AND 
                (
                    t1.LTCheckoutUserId IS NULL OR 
                    t1.LTCheckoutUserId <> 79 
                )
            ) AND 
            (1 = 1)
        ) 
    LEFT OUTER JOIN AllUserData AS t2 WITH(NOLOCK, INDEX=AllUserData_PK) 
        ON 
        (
            UserData.[tp_Author]=t2.[tp_ID] AND 
            UserData.[tp_RowOrdinal] = 0 AND 
            t2.[tp_RowOrdinal] = 0 AND 
            ( 
                (t2.tp_IsCurrent = 1) 
            )  AND 
            t2.[tp_CalculatedVersion] = 0  AND 
            t2.[tp_DeleteTransactionId] = 0x  AND 
            t2.tp_ListId = ListOrds.JoinKeyId1
        ) 
    LEFT OUTER JOIN AllUserData AS t3 WITH(NOLOCK, INDEX=AllUserData_PK) 
    ON 
        (
            UserData.[tp_Editor]=t3.[tp_ID] AND 
            UserData.[tp_RowOrdinal] = 0 AND 
            t3.[tp_RowOrdinal] = 0 AND 
            ( 
                (t3.tp_IsCurrent = 1) 
            )  AND 
            t3.[tp_CalculatedVersion] = 0  AND 
            t3.[tp_DeleteTransactionId] = 0x  AND 
            t3.tp_ListId = ListOrds.JoinKeyId1) 
WHERE
    (
        (
            UserData.[tp_ContentTypeId]  >= 0x010100C568DB52D9D0A14D9B2FDCC96666E9F2007948130EC3DB064584E219954237AF3900596EA17300E2D7449B3B3E2EB2092471 AND 
            UserData.[tp_ContentTypeId] <= 0x010100C568DB52D9D0A14D9B2FDCC96666E9F2007948130EC3DB064584E219954237AF3900596EA17300E2D7449B3B3E2EB2092471 + 0xff
        ) AND 
        (
        -- NOTE that the WHERE clause does not have the case statement or an equivalent logic as the select statement
        -- thereby getting back incorrect data
            UserData.[nvarchar27] LIKE '%Some Search Text%'
        )
    ) 
ORDER BY QryCol0 Desc

So you can see that the mal formed WHERE clause in the SQL above will result in queries that may or may not return all data as expected.  So be careful when you use custom site columns and CQWP queries based on them.  Your users may be seeing incorrect results.

I am waiting for a final response from Microsoft on this issue and hopefully a hot fix.

Another Curious Observation

Notice how in the where clause SharePoint tries to get data where tp_ContentTypeID >= some value and tp_ContentTypeID <= that value + 255?  What's with that?  I mentioned earlier that SharePoint create a new ID for each Content Type, Site combination.  Apparently the IDs it creates are in the range of the original ID of the Content Type as created in table ContentTypes + 255.  but this would imply that any site collection that has more than 255 sites will fail to correctly respond to a query.  So even if Microsoft fixes the problem with the where clause (where they are not querying all possible columns in which data can be found), there exists a second problem that will restrict queries to 255 sites.  Now I thought the default limit for a Cross List Query is a 1000 sites. Hmmmmmmm.

 

Tags:   ,
Categories:   SharePoint | SharePoint
Actions:   E-mail | Permalink | Comments (13) | Comment RSSRSS comment feed

Comments

February 18. 2008 17:57

Sherman Woo

Wow, great detective work. I may be encountering a related problem. Judging by what you are saying, then the bug resides in the CQWP, or by extension, the SPSiteDataQuery class (upon which, presumably, the actual CQWP query execution is based). I am testing my query using SPSiteDataQuery, and my initial tests are indicating the same result set as the CQWP.

I was coming to the same conclusion as you based on this article: http://support.microsoft.com/kb/946484. However, you proved it with your T-SQL (how the heck did you figure that out, by the way?).

One thing I will try next is to see if I can be explicit about which internal column names I use for each field I create. I had hoped to not have to do this, since that would require us to track which column names have already been used. If this helps address (er, workaround) the query issue, then I suppose we hae no choice.

FYI, I also found this (www.sharepointblogs.com/.../...ies-or-lists.aspx), which pretty much confirms my next move (aside from opening a support call with MS, which we did today).

Thanks for the information. I will try to remember to post back here if I find anything new/useful.

Sherman Woo

June 9. 2008 01:27

Me

You're quite the digital sleuth. This series really helped out with my research into using the CQWP in business practices.

Me

July 3. 2008 03:25

Servé Hermans

I have experienced different results of the CBQWP by just refreshing the page a few times. I had also created a custom content type (two actually, one inherits the other one), added the fields to the commonviewfields property and just like you said, you are not aware that a few results are missing from the list.
What I discovered however was that results always show correctly and consistent when the page is in edit mode. I also tried to disable page output cache but that has no effect. As soon as I check in the draft version and publish the page, the results either change to less or stay the same. Very inconsistent. I will now try to put an out of  the box CBQWP without any commonviewfield or custom xsl to see if that one shows inconsistent results as well.

Servé Hermans

July 3. 2008 08:35

RanjanBanerji

Serve,

I believe this issue has now been fixed by Microsoft but teh official patch/hotfix is yet to be released.  I find that the best way to avoid this problem is to very carefully create the Content Types and possibly avoid inheritance and modification of Content Types across sites.  Beats the purpose of the flexibility they are trying to offer but better than getting a bug.  Worst part is that creating a CQWP of your own does nto help as long as you use CAML and Cross List Queries.  The SQL generation is deep insde some COM object.  Using reflector I determined that all the way up to the last .Net call into teh COM object the CAML is correct.  But the resulting SQL is wrong.

Good luck.

Ranjan

RanjanBanerji

July 3. 2008 16:09

pingback

Pingback from feeds.feedburner.com

Serve's Sharepoint Blog: Inconsistent results with Content by Query web part

feeds.feedburner.com

July 3. 2008 16:20

Servé Hermans

Thanx Ranjan for the response. I have posted my experiences on my blog: http://hermansberghem.blogspot.com and referenced your post. Keep up the good work!

Servé Hermans

July 4. 2008 00:00

RanjanBanerji

Serve,

Here is a section of text from an email I got from Microsoft on this issue:

"The fix for Item 7 has already been validated, but the integrated rollup package is still being tested and is expected to be available by the end of the summer; there are a couple of updates ahead of it in the product group's test queue, so the potential exists for some slip in the delivery schedule if show-stopping problems are identified."

Item 7 being this particular issue.  As you can see I am dealing with quite a few issues Smile.  So I am hoping that this issue will soon be brought to an end.

Thanks,

Ranjan

RanjanBanerji

August 5. 2008 14:29

Jan

Ranjan,

Great article! This issue has been driving mee crazy for the better part of a day and I was about to give up when I found this. It still seems to be a little known issue around the web.

Thanks,

Jan

October 2. 2008 14:57

Frank

Great work about the mysterious CQWP. As ISV we had the same problems with it, in the end it was possible to usw the CQWP as filter for centrally managed, cross-site, tree-style categories. If you add the SharePartXXL Taxonomy Extension feature to your SharePoint for content categorization, then you can use these categories to filter for in the CQWP. Or you can use cross-type, meta-data based result lists, related item links etc.

See how it works:
http://www.sharepartxxl.com/products/taxonomy/

Frank

November 3. 2008 15:31

Matt

Any idea if the hotfix / patch has been released? Is there a known bug associated with this ("Item 7")?

We don't have many different content types, we are just doing a content query on a single type on a custom field that sometimes returns 9 results, sometimes returns all the results (30+). I poked around in the allUserData table and it seemed pretty clean- all the data for this custom column (a datetime) was being stored in a single column.

Matt

November 3. 2008 15:42

RanjanBanerji

Matt,

If the data is in the same column in AllUserData then you are facing a different problem.  One way to check is to turn SQL Profiler on and run the CQWP when there are no other users (this way you will have lesser information to sift through).  See the SQL that gets generated in your small number of results vs large and see what may be causing the difference.

As for a patch.  I received an email from MS Tech Support long ago saying they were not going to fix this.  It has now been a while since I have touched SharePoint so I really don't know if MS lived up to its word or they actually woke up and fixed the problem.

Thanks,

Ranjan

RanjanBanerji

July 15. 2009 09:00

Christian

Thanks for this great post - I will be sure to check out your blog more often

Christian

August 12. 2010 22:48

trackback

Content Query Web Part (CQWP), Cross List Query Nightmare Part 2

Content Query Web Part (CQWP), Cross List Query Nightmare Part 2

Ranjan Banerji

Add comment


(Will show your Gravatar icon)

  Country flag

biuquote
  • Comment
  • Preview
Loading