Content Query Web Part (CQWP), Cross List Query Nightmare Part 3

Tuesday, 30 October 2007 01:24 by RanjanBanerji

Cross List Query Bug

 

In Part 2 I talked about how I had to create an extended version of the CQWP in order to overcome the fact that Microsoft neglected to provide its users with the ability to create queries that could use relative dates.  Just when I thought my days of fixing SharePoint 2007 issues were over I was notified of yet another odd behaviour.

It started with a report that said that queries created using the CQWP would not work if I created a filter using a Multiple Choice custom site column.  I tested the application in question and could repeatedly reproduce the errors.  In fact I noticed that MS Cross List Query functionality was generating some rather odd SQL.

However, when I created a new application I could not reproduce this error.  So I came to the conclusion that something was wrong with the Content Type in the application in question.  So I created a new Content Type with new custom site columns, of which one was a multiple choice check box column.  The error surfaced here too.  My final conclusion was that I am dealing with a bad DB.

As a temporary work around I suggested that we use a single line text site column in which the user will type the values that he/she would have otherwise selected by selecting check boxes.  Not as elegant, but one can capture the data they required with certain amounts of typos etc. 

To demonstrate this work around I created a custom site column of type single line text and then added it to my Content Type.  I then edited several pages in several sites in my site collection and added values for this new site column.  Following this I set up a CQWP with a filter that would query information based on what is in this new single line text column.  So for example the CQWP was set to retrieve all data from the site collection where the Content Type was X and field A had a value that contained "abc".

The results that I obtained were quite shocking and confusing.  What appeared to be a simple query and a simple case did not work.  In some cases I received data and some cases I did not.  The driving factor was whether the query was for the entire site collection or for a certain site and its children.

It took me quite a bit of research to find a way to reliably reproduce the error.  I have since then had many discussions with Microsoft's tech support and I believe this will soon be released as an official bug.

This bug is of particular importance because it is entirely possible that when you set up a CQWP for a query you are not getting all the data you should be, but because you are not intimately knowledgeable about the data you will be unaware that the query has not given you all that you seek.  Make sense?  No?  Instead of getting 10 pages you will get 6, but if you did not know you were supposed to get 10 you may believe that 6 is the right answer.

Dissecting SharePoint 2007 Content Type and Site Column Creation and the Cross List Query

When you create a new Content Type, lets call it, ContentTypeA, SharePoint 2007 puts it in the table ContentTypes and gives it an ID.  Now lets say you create site column 1 and 2 as two custom site columns that are assigned to this Content Type.  You can get the information on the Content Type by running a query like:

select tp_ContentType 
        ,tp_ContentTypeId 
        ,tp_ListId, nvarchar20, nvarchar24  
    from Alluserdata 
    where tp_ContentType like '%ContentTypeA%'

Now when you actually create a page using this content type for a certain site in your site collection SharePoint create a new ID for your content type for the site in which you created the page.  So if you create 2 pages in 5 sites each you will have a total of 10 pages and 5 + 1(original) IDs for the same Content Type (ContentTypeA) that you created.  Weird huh?  As though this is not bad enough, there is no guarantee that SharePoint will put data for Site Columns 1 and 2 in the same columns in table AllUserData (which is where SharePoint stores all your data).  What does this mean?

Imagine you create a Content Type called Person with two site Columns, FirstName and LastName.  Now you create Person Pages using a Person Page Layout in several sites in your site collection.  It is entirely possible that SharePoint will store FirstName in allUserData column nVarchar1 for sites 1, 2, and 3 and in nVarchar2 for the remaining.

This behaviour is easy to reproduce the following way:

  • Create a site collection with one root node with 10 children.  Give each child another 10 children each and give each of those 5 each.  Essentially giving you 500 plus sites in your site collection.  Why so many?  I find that the problem manifests itself faster in this case.

  • Create 5 site columns SC1, SC2, SC3, SC4, and SC5.

  • Create a Content Type TypeA by inheriting from Publishing Page.  Assign site columns SC1, SC2, SC3 in the order mentioned, i.e., SC1 first then 2 and then SC3.

  • Create pages using this content type randomly across your site collection.

  • Create a Content Type TypeB by inheriting from Publishing Page.  Assign it site columns SC3, Sc4, SC1, SC5, SC2.  Can you see how I am jumbling the order in which SC1, SC2, and SC3 are used in this Content Type vs the other?  You may say, how is this relevant?  I wondered too.  But stay with me  :-)

  • Now go ahead and create content using the new Content Type TypeB.  create content in various sites and keep a tab on what you have created and where.

  • Now create a CQWP querying for data from Content Type TypeB for the entire site collection.  There is an extremely high likelihood that you will not get all the data you expected.  based on what you set as the source of your CQWP (site collection or a specific site and its children) you may or may not get any data.

Why this strange behaviour?  Well its because:

  • SharePoint stores data for a site column in different columns in table allUserData for different sites.

  • SharePoint then generates queries that account for the fact that site columns are stored in different columns in table allUserData.

  • The queries SharePoint generates are faulty.

Here is an example.  Please note that I have modified the actual SQL generated by SharePoint a little to assist with my debugging.  But essentially the SQL is what SharePoint will generate.

SELECT 
    UserData.[tp_Created] AS QryCol0,UserData.[tp_ID] AS QryCol1,ListOrds.[WebId] AS QryCol2,ListOrds.[ListId] AS QryCol3,
    t1.[TimeCreated] AS QryCol4,
    CASE 
        WHEN DATALENGTH(t1.DirName) = 0 THEN t1.LeafName 
        WHEN DATALENGTH(t1.LeafName) = 0 THEN t1.DirName 
        ELSE t1.DirName + N'/' + t1.LeafName 
    END AS QryCol5,
    UserData.[tp_ModerationStatus] AS QryCol6,UserData.[tp_Level] AS QryCol7,UserData.[ntext1] AS QryCol8,
    UserData.[nvarchar7] AS QryCol9,UserData.[tp_Modified] AS QryCol10,UserData.[tp_Author] AS QryCol11,
    t2.[nvarchar1] AS QryCol12,t2.[tp_ID] AS QryCol13,t2.[nvarchar4] AS QryCol14,t2.[nvarchar5] AS QryCol15,
    t2.[tp_Created] AS QryCol16,UserData.[tp_Editor] AS QryCol17,t3.[nvarchar1] AS QryCol18,
    t3.[tp_ID] AS QryCol19,t3.[nvarchar4] AS QryCol20,t3.[nvarchar5] AS QryCol21,t3.[tp_Created] AS QryCol22,
    UserData.[ntext3] AS QryCol23,NULL AS QryCol24,UserData.[ntext2] AS QryCol25,
     -- NOTE hwo the case statement accounts for the fact that my data could be in any of these columns
     -- Now go down to the WHERE clause.
    CASE
        WHEN ListOrds.ColumnOrd27=2 THEN UserData.[nvarchar25] 
        WHEN ListOrds.ColumnOrd27=1 THEN UserData.[nvarchar27] 
        WHEN ListOrds.ColumnOrd27=3 THEN UserData.[nvarchar23] 
    END AS QryCol26 
FROM RBListOrd AS ListOrds 
-- RBListOrd is a table I created to store data that would otherwise go into a temp table.  This was done to make
-- debuggin easier.
    INNER LOOP JOIN UserData WITH(NOLOCK) 
    ON 
        ListOrds.ListId=UserData.tp_ListId AND 
        ListOrds.JoinOrd=0  
    INNER JOIN Docs AS t1 WITH(NOLOCK) 
    ON 
        ( 
            1 = 1  AND 
            UserData.[tp_RowOrdinal] = 0 AND 
            t1.SiteId = UserData.tp_SiteId AND 
            t1.SiteId = '82EA27AB-B99D-4767-88A3-E2433BA0D3DD' AND 
            t1.DirName = UserData.tp_DirName  AND 
            t1.LeafName = UserData.tp_LeafName  AND 
            t1.Level = UserData.tp_Level  AND  
            (
                UserData.tp_Level = 255 AND t1.LTCheckoutUserId =79 OR 
                (
                    UserData.tp_Level = 1 AND 
                    (
                        UserData.tp_DraftOwnerId IS NULL OR  
                        (
                            UserData.tp_DraftOwnerId <>79 AND 
                            t1.ScopeId<>ListOrds.ScopeId
                        )
                    ) OR 
                    UserData.tp_Level = 2 AND 
                    (
                        UserData.tp_DraftOwnerId = 79 OR 
                        t1.ScopeId=ListOrds.ScopeId
                    )
                ) AND 
                (
                    t1.LTCheckoutUserId IS NULL OR 
                    t1.LTCheckoutUserId <> 79 
                )
            ) AND 
            (1 = 1)
        ) 
    LEFT OUTER JOIN AllUserData AS t2 WITH(NOLOCK, INDEX=AllUserData_PK) 
        ON 
        (
            UserData.[tp_Author]=t2.[tp_ID] AND 
            UserData.[tp_RowOrdinal] = 0 AND 
            t2.[tp_RowOrdinal] = 0 AND 
            ( 
                (t2.tp_IsCurrent = 1) 
            )  AND 
            t2.[tp_CalculatedVersion] = 0  AND 
            t2.[tp_DeleteTransactionId] = 0x  AND 
            t2.tp_ListId = ListOrds.JoinKeyId1
        ) 
    LEFT OUTER JOIN AllUserData AS t3 WITH(NOLOCK, INDEX=AllUserData_PK) 
    ON 
        (
            UserData.[tp_Editor]=t3.[tp_ID] AND 
            UserData.[tp_RowOrdinal] = 0 AND 
            t3.[tp_RowOrdinal] = 0 AND 
            ( 
                (t3.tp_IsCurrent = 1) 
            )  AND 
            t3.[tp_CalculatedVersion] = 0  AND 
            t3.[tp_DeleteTransactionId] = 0x  AND 
            t3.tp_ListId = ListOrds.JoinKeyId1) 
WHERE
    (
        (
            UserData.[tp_ContentTypeId]  >= 0x010100C568DB52D9D0A14D9B2FDCC96666E9F2007948130EC3DB064584E219954237AF3900596EA17300E2D7449B3B3E2EB2092471 AND 
            UserData.[tp_ContentTypeId] <= 0x010100C568DB52D9D0A14D9B2FDCC96666E9F2007948130EC3DB064584E219954237AF3900596EA17300E2D7449B3B3E2EB2092471 + 0xff
        ) AND 
        (
        -- NOTE that the WHERE clause does not have the case statement or an equivalent logic as the select statement
        -- thereby getting back incorrect data
            UserData.[nvarchar27] LIKE '%Some Search Text%'
        )
    ) 
ORDER BY QryCol0 Desc

So you can see that the mal formed WHERE clause in the SQL above will result in queries that may or may not return all data as expected.  So be careful when you use custom site columns and CQWP queries based on them.  Your users may be seeing incorrect results.

I am waiting for a final response from Microsoft on this issue and hopefully a hot fix.

Another Curious Observation

Notice how in the where clause SharePoint tries to get data where tp_ContentTypeID >= some value and tp_ContentTypeID <= that value + 255?  What's with that?  I mentioned earlier that SharePoint create a new ID for each Content Type, Site combination.  Apparently the IDs it creates are in the range of the original ID of the Content Type as created in table ContentTypes + 255.  but this would imply that any site collection that has more than 255 sites will fail to correctly respond to a query.  So even if Microsoft fixes the problem with the where clause (where they are not querying all possible columns in which data can be found), there exists a second problem that will restrict queries to 255 sites.  Now I thought the default limit for a Cross List Query is a 1000 sites. Hmmmmmmm.

 

Tags:   ,
Categories:   SharePoint | SharePoint
Actions:   E-mail | Permalink | Comments (13) | Comment RSSRSS comment feed

Content Query Web Part (CQWP), Cross List Query Nightmare Part 2

Wednesday, 24 October 2007 14:52 by RanjanBanerji

Extending the Content Query Web Part

In Part 1 I talked about how the CQWP creators overlooked what I felt is a critical feature, i.e., the ability to create a query using relative dates as a filter criteria.  Microsoft proposes some workarounds for this but they all come up short.

In Part 2 I will talk about modifying the CQWP.

I have seen several blog posts on the CQWP and several that create extended versions of it.  However, I could not find one that handled relative dates.  This was a bit of a challenge for me as I had never built a web part before and of course I barely had any experience with SharePoint and its object model.

None the less the process started.  It did not take long to determine that the ContentByQueryWebPart resides in the Microsoft.SharePoint.Publishing namesapce and is a public class.  This was good news.  This implied I could inherit from this class, make some modifications to cater for relative dates and move on.

Not so simple.  What I did not know is that a web part is configured via another object, i.e., a ToolPart.  So if I wanted to modify the UI of the CQWP to enter a offset value on a date query I needed to work on it ToolPart which is the ContentByQueryToolPart.  But here starts my problem.  You see, the ContentByQueryToolPart is a sealed class.  So while I can create a class by inheriting from the ContentByQueryWebPart I cannot easily create a few overrides and be done by inherting from ContentByQueryToolPart.

The ContentByQueryWebPart is open but the ContentByQueryToolPart is sealed

Why did Microsoft choose to seal the ContentByQueryToolPart?  I have no clue.  But it sure created some significant inconvenience.  I was now left with two options:

1.  Create a web part by inheriting from the ContentByQueryWebPart and create a tool part.

//Override the following method to now use your new ToolPart.
public override ToolPart[] GetToolParts() {
    return new ToolPart[] { new MyToolPart() /*ContentByQueryToolPart()*/, new WebPartToolPart() };
}

 

But this option is not easy.  Using reflector one can see that the ContentByQueryToolPart is a large complex class that makes many calls to internal methods.  So copying its code to create your own or to re-invent the wheel could be time consuming.

2.  Find a way to hack the code.  LOL.  Yes, this is the option I went with.  A bad, dangerous option.  More on that later.  The hack is to find a way to modify the ContentByQueryToolPart UI without the need to create a new ToolPart. The idea is to get the control tree of the ToolPart and then inject new UI elements into it.  Then find a way to load and extract data from the injected UI elements and send the data to the appropriate CQWP code that will then do what is needed.  Cross List queries use CAML as input to execute the queries.  CAML and the Cross List Query technology is fully capable of handling relative dates, i.e., an expression like [Today]-7.

[Guid( "50cc2520-8afc-47dd-w20e-d4567f89j7fm" )]
public class MyExtendedCQWP : ContentByQueryWebPart {
    #region Data
    /// 
    /// Reference to a TextBox that we will inject as Offset days for filter 1
    /// 
    TextBox _textBoxOffset1;
    /// 
    /// Reference to a TextBox that we will inject as Offset days for filter 1
    /// 
    TextBox _textBoxOffset2;
    /// 
    /// Reference to a TextBox that we will inject as Offset days for filter 1
    /// 
    TextBox _textBoxOffset3;
    /// 
    /// Reference to the sealed ContentByQueryToolPart used by the ContentByQueryWebPart
    /// We get a reference to it in the override GetToolParts below.
    /// 
    ContentByQueryToolPart _contentByQueryToolPart = null;
    private string _dateOffset1 = string.Empty;
    private string _dateOffset2 = string.Empty;
    private string _dateOffset3 = string.Empty;
    #endregion Data

    /// 
    /// Constructor
    /// 
    public MyExtendedCQWP() {
        this.ExportMode = WebPartExportMode.All;
        this.Title = "My Extended Content Query Web Part";
    }


    #region Overrides
    /// 
    /// Override of WebPart GetToolParts method
    /// 
    /// 
    public override ToolPart[] GetToolParts() {
        ToolPart[] retVal = base.GetToolParts();
        _contentByQueryToolPart = ( ContentByQueryToolPart )retVal[ 0 ];
        _contentByQueryToolPart.Init += new EventHandler( ContentByQueryToolPart_Init );
        _contentByQueryToolPart.Load += new EventHandler( ContentByQueryToolPart_Load );

        return retVal;
    }

    /// 
    /// Override of the ContentByQuery GetXPathNavigator method.  This is where
    /// modify the FilterValue to account for the offset
    /// 
    /// 
    /// 
    protected override XPathNavigator GetXPathNavigator( string viewPath ) {
        if( FilterValue1 == "[Today]" ) {
            FilterValue1 = "[Today]-" + _dateOffset1;
        }
        if( FilterValue2 == "[Today]" ) {
            FilterValue2 = "[Today]-" + _dateOffset2;
        }
        if( FilterValue3 == "[Today]" ) {
            FilterValue3 = "[Today]-" + _dateOffset3;
        }

        XPathNavigator x = base.GetXPathNavigator( viewPath );
        return x;
    }

    protected override void Render( HtmlTextWriter writer ) {
        base.Render( writer );
        // TODO: add custom rendering code here.
        // writer.Write("Output HTML");
    }

    #endregion Overrides

    #region Events
    /// 
    /// Handler for when the ContentByQueryToolPart is loaded.  This way
    /// we can data back from the injected UI to local fields.
    /// 
    /// 
    /// 
    void ContentByQueryToolPart_Load( object sender, EventArgs e ) {
        if( ( ( ContentByQueryToolPart )sender ).Page.IsPostBack ) {
            _dateOffset1 = _textBoxOffset1.Text;
            _dateOffset2 = _textBoxOffset2.Text;
            _dateOffset3 = _textBoxOffset3.Text;
        }
        else {

        }
    }

    /// 
    /// The OnInit event for the ToolPart.  Checks to see if a Today Radio button is on the form
    /// If so It creates a corresponding Offset TextBox.  There can be upto 3 Today Radio buttons.
    /// Then we inject the offset TextBox and a label into the ToolParts control tree.
    /// 
    /// 
    /// 
    void ContentByQueryToolPart_Init( object sender, EventArgs e ) {
        if( _contentByQueryToolPart != null ) {
            Control radio = FindControlRecursive( _contentByQueryToolPart, "CBQToolPartfilter1DateTodayRadioButton" );
            if( radio != null ) {
                radio.Parent.Parent.Controls[ 1 ].Controls.Add( new LiteralControl( " - " ) );
                _textBoxOffset1 = new TextBox();
                _textBoxOffset1.Width = new Unit( 20 );
                _textBoxOffset1.ID = "textBoxOffset1";
                _textBoxOffset1.Text = _dateOffset1;
                radio.Parent.Parent.Controls[ 1 ].Controls.Add( _textBoxOffset1 );
                radio.Parent.Parent.Controls[ 1 ].Controls.Add( new LiteralControl( " Offset Days" ) );
            }
            radio = FindControlRecursive( _contentByQueryToolPart, "CBQToolPartfilter2DateTodayRadioButton" );
            if( radio != null ) {
                radio.Parent.Parent.Controls[ 1 ].Controls.Add( new LiteralControl( " - " ) );
                _textBoxOffset2 = new TextBox();
                _textBoxOffset2.Width = new Unit( 20 );
                _textBoxOffset2.ID = "textBoxOffset2";
                _textBoxOffset2.Text = _dateOffset2;
                radio.Parent.Parent.Controls[ 1 ].Controls.Add( _textBoxOffset2 );
                radio.Parent.Parent.Controls[ 1 ].Controls.Add( new LiteralControl( " Offset Days" ) );
            }
            radio = FindControlRecursive( _contentByQueryToolPart, "CBQToolPartfilter3DateTodayRadioButton" );
            if( radio != null ) {
                radio.Parent.Parent.Controls[ 1 ].Controls.Add( new LiteralControl( " - " ) );
                _textBoxOffset3 = new TextBox();
                _textBoxOffset3.Width = new Unit( 20 );
                _textBoxOffset3.ID = "textBoxOffset3";
                _textBoxOffset3.Text = _dateOffset3;
                radio.Parent.Parent.Controls[ 1 ].Controls.Add( _textBoxOffset3 );
                radio.Parent.Parent.Controls[ 1 ].Controls.Add( new LiteralControl( " Offset Days" ) );
            }
        }
    }

    #endregion Events

    #region Properties
    /// 
    /// Gets or sets the offset date for a Date filter if [Today] was chosen
    /// 
    public string DateOffset1 {
        get { return _dateOffset1; }
        set { _dateOffset1 = value; }
    }

    #endregion Properties

    #region Private Methods
    private Control FindControlRecursive( Control control, string ID ) {
        Control retVal = null;

        if( control.ID == ID ) {
            retVal = control;
        }
        else {
            foreach( Control childControl in control.Controls ) {
                Control targetChild = FindControlRecursive( childControl, ID );
                if( targetChild != null ) {
                    retVal = targetChild;
                    break;
                }
            }
        }

        return retVal;
    }


    #endregion Private Methods
}

 

The code above is what your web part code should look like.  This is the result when you selct a site columm of type Date in any of the three filters:

WARNING!!!!  DO NOT DO THIS OR USE EXTREME CAUTION

So now that I have shown you how to inject UI controls into the ContentByQueryToolPart and hijack its functionality why am I throwing this warning?  Quite simple.  If Microsoft wakes up one day and realizes that perhaps a richly featured querying tool such as the Content Query Web Part should have the ability to create filters based on relative dates, the code above will get hosed.  Because the code above is not overriding any of the features of the ContentByQueryToolPart.  We are forcibly injecting UI elements.  What happens when Microsoft places their own elements in those positions?  What happens if Microsoft changes their code?  Since the ContentByQueryToolPart is a sealed class I guess they don't expect people doing stuff with it.

Why did I do the above?  I really needed a short term solution until such time Microsoft wakes up and makes a better Content Query Web Part.

For now my performance issues are being taken care of by filtering down the data I get by setting a filter to get me data only from the last 7 days.  Just as I thought I am done with SharePoint and I can go back to good old coding I was assigned to yet another SharePoint 2007 mystery.  Oh! and it was once again related to the ContentQuery Web Part.  You can just imagine my joy.  More on that in Part 3.......

Tags:  
Categories:   SharePoint
Actions:   E-mail | Permalink | Comments (3) | Comment RSSRSS comment feed