Aller directement à la fin des métadonnées
Aller au début des métadonnées

 

In the section Using PDI (Data Integration) to extract, transform and load data I explained how the datawarehouse database was populated from the PDI transformation. That was nice, and the data is sitting there waiting to be reported on. And with a tool like Report Designer it is no problem at all to extract data from an MySQL database using a query. But at PUM Senior Experts we want to use Saiku Analytics using data cubes. And that means we have to create a Modrian Schema which creates a kind of mask for the database.

A Mondrian Scheme is published from an XML file, and the Schema Workbench is a tool set that allows you to manipulate that XML file and publish it on the server. Personally I prefer to use my PhpStorm IDE to modify the XML file and then check the result with Schema Workbench and publish it to the Pentaho server. This is the route I will explain here.

Creating the XML file

You can find the full XML schema for the Mondrian Schema here: http://mondrian.pentaho.com/documentation/xml_schema.php, with more detailed explanation here: http://mondrian.pentaho.com/documentation/schema.php

I have not studied the schema in full detail, I have checked an example from Owen and changed it. The XML file for the data warehouse looks like this:

As you can see there are some patterns that are fairly obvious:

  • the dimensions are created outside the actual data cube so they can be referenced inside cubes (with the DimensionUsage elemenet) but are defined only once
  • each dimension has a hierarchy and a level for each field in the dimension
  • you can see each dim_date table is linked to a date field I want to compare with (check Adding Date and Time tables)

Although there is fair amount of data I find it reasonably quick to manually edit. I do recommend you use an editor that will do some syntax checking as my PhpStorm does.

Checking the XML file in Schema Workbench

Once I have edited the XML file I will test the result in Schema Workbench.

Using Schema Workbench

To install Pentaho Schema Workbench on Ubuntu Desktop check Installation of Pentaho components and setting up connections#InstallPentahoSchemaWorkbench%28PSW%29onUbuntuDesktop. To open Schema Workbench from your relevant folder:

When you open Schema Workbench you will always need to do a quick check if you have the right connection. You can do this with Options/Connection and you should then see something like this:

If you open the XML file you created or edited in the previous step in Schema Workbench you will get a representation like below. I have left an error so you can see that there is an error symbol and an error message:

You can expand each element in the tree in the left, showing you the data you have entered in your XML. Obviously you can use Schema Editor to update the XML file, and add Dimensions, Hierarchies, Levels, Cubes etc. It is probably worth a play to see if this works better for you than editing the XML directly, which is what I have done.

When I expand the CaseEndDate in error:

Although I prefer editing the XML file in PhpStorm, this is quite a good way to check if all my column and tables names are correct, and I have not misplaced a '<' or '>' somewhere.

Publish the XML file to the server and refresh the Mondrian Schema

Once all is well I want to publish the Mondrian Schema to my server.  I can do this in Schema Workbench with a valid connection. If you then select File/Publish you will get something like this:

If your connection is valid the Mondrian Schema will be published on your server. All you need to do now is go to your Pentaho server, click on Tools/Refresh/Mondrian Schema Cache and you can use your new definition.

Use the right name!

In the publish window at Pentaho or JNDI Data Source I have entered the name of the JNDI on my Pentaho server!

 

 

Étiquette
  • Aucun