
SSAS 2008R2: Dynamic partitioning via SSIS - is this logic correct?


Hi all,

I'm setting up dynamic partitioning on a large financial cube, implementing a 36-month sliding window over the data. I just want to make sure my logic is correct.

Basically: after I've run the XMLA to add/remove partitions, is a ProcessUpdate of all the dimensions followed by a ProcessDefault of my facts enough to leave the cube fully processed, accurate, and performant (i.e. with aggregations built)?

Assume I have a fact table with a 'reporting month', a 'location key', and then numerous measures and dimension keys. It holds the revenue for that location for the reporting month.

The reporting month can never be backdated: subsequent runs can only overwrite the current reporting month or add the next month.

Assume the data warehouse has been loaded successfully. The warehouse holds a 72-month rolling history.

Now, to the dynamic partitioning. The fact is partitioned by reporting month and has aggregation designs.

My SSIS package initially does a ProcessUpdate on all the dimensions. My understanding is that this 'flags' which existing measure group partitions need their indexes and aggregations rebuilt.
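
(For reference, that step reduces to something like this AMO sketch; it's untested, and the connection string and database name are placeholders, not my real ones.)

    // Sketch: ProcessUpdate every dimension in the database via AMO.
    // Assumes "using AMO = Microsoft.AnalysisServices;".
    using (AMO.Server server = new AMO.Server())
    {
        server.Connect("Data Source=localhost");                      // placeholder
        AMO.Database db = server.Databases.FindByName("MyFinanceDB"); // placeholder
        foreach (AMO.Dimension dim in db.Dimensions)
        {
            // Re-reads dimension members; flags dependent partitions
            // whose indexes/aggregations now need rebuilding.
            dim.Process(AMO.ProcessType.ProcessUpdate);
        }
    }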

Then in my data flow:

I run a simple query over my fact to get a list of all the partitions that exist in the data warehouse and should therefore be in the cube ('36 months ago' is shorthand for the actual date arithmetic):

    select 'my partition ' + str(billmonth, 6) as PartitionName, count(*) as EstCount
    from myFact
    where billmonth > <36 months ago>
    group by billmonth
    order by PartitionName

I do a full outer merge on the partition name against the equivalent list from the cube, which I get from a script component source with the following code:

{
    // Assumes "using AMO = Microsoft.AnalysisServices;" at the top of the
    // script, and a reference to the AMO assembly in the script project.
    AMO.Server amoServer;
    AMO.MeasureGroup amoMeasureGroup;

    public override void PreExecute()
    {
        base.PreExecute();

        amoServer = new AMO.Server();
        amoServer.Connect(Connections.Cube.ConnectionString);

        // Walk database -> cube -> measure group by name.
        amoMeasureGroup = amoServer
            .Databases.FindByName(amoServer.ConnectionInfo.Catalog)
            .Cubes.FindByName(Variables.CubeName.ToString())
            .MeasureGroups.FindByName(Variables.MeasureGroupName.ToString());

        // Capture generated XMLA rather than executing it immediately
        // (not strictly needed just to enumerate partitions).
        amoServer.CaptureXml = true;
    }

    public override void PostExecute()
    {
        base.PostExecute();

        amoServer.Dispose();
    }

    public override void CreateNewOutputRows()
    {
        try
        {
            // Emit one row per partition currently in the measure group.
            foreach (AMO.Partition olapPartition in amoMeasureGroup.Partitions)
            {
                Output0Buffer.AddRow();
                Output0Buffer.PartitionName = olapPartition.Name;
            }
        }
        catch (Exception e)
        {
            bool cancel;
            this.ComponentMetaData.FireError(-1, this.ComponentMetaData.Name,
                String.Format("The measure group {0} could not be found. {1}",
                    Variables.MeasureGroupName.ToString(), e),
                String.Empty, 0, out cancel);
            throw;
        }
    }
}

(I'm not a C# coder; the above was lifted and butchered from elsewhere, but it seems to work.)

I use a conditional split to separate the rows where datawarehouse.PartitionName is null (generate XMLA to delete the partition from the cube) and where cube.PartitionName is null (generate XMLA to add it to the cube). I don't do anything with partitions that exist in both the cube and the data warehouse.
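
(The XMLA itself is built from templates, but for reference the equivalent AMO calls would look roughly like the sketch below. It assumes the measure group's first partition is query-bound and fully configured, so it can be cloned as a template; AddPartition/DropPartition and the names are mine, for illustration only.)

    // Sketch: create/drop cube partitions via AMO rather than hand-built XMLA.
    static void AddPartition(AMO.MeasureGroup mg, string partitionName, string sourceQuery)
    {
        AMO.Partition template = mg.Partitions[0];     // carries storage mode + agg design
        AMO.Partition newPartition = template.Clone();
        newPartition.ID = partitionName;
        newPartition.Name = partitionName;
        // Rebind to the slice of the fact table this partition should cover
        // (assumes the template partition is query-bound).
        string dataSourceId = ((AMO.QueryBinding)template.Source).DataSourceID;
        newPartition.Source = new AMO.QueryBinding(dataSourceId, sourceQuery);
        mg.Partitions.Add(newPartition);
        newPartition.Update(AMO.UpdateOptions.ExpandFull); // create-or-alter on the server
    }

    static void DropPartition(AMO.MeasureGroup mg, string partitionName)
    {
        AMO.Partition partition = mg.Partitions.FindByName(partitionName);
        if (partition != null)
            partition.Drop(); // removes the partition (and its data) from the cube
    }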

I then perform a ProcessDefault on the measure group.
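
(In AMO terms, using the same connection as the script above, that's just:)

    // Sketch: ProcessDefault on the measure group. My understanding is that
    // the server fully processes any unprocessed (new) partitions and rebuilds
    // only the indexes/aggregations invalidated by the dimensions' ProcessUpdate.
    amoMeasureGroup.Process(AMO.ProcessType.ProcessDefault);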

I'm assuming this will do a ProcessFull of the new, unprocessed partitions, and a ProcessData + ProcessIndexes of any partitions that were invalidated by the dimensions' ProcessUpdate. Is this correct? Or do I need some other explicit processing of my measure groups to make sure my facts are 100% accurate?

Thanks.


Jakub @ Adelaide, Australia


