
Process chain (flows) optimization Part 3

25 June 2018

This will be the last article about process chain optimization. In the previous article we covered the optimization of InfoPackages and DTPs. Now we will close out the topic with DSOs and cubes.

Let's start with DSOs. The only DSO setting I found that has an impact on performance is 'SID Generation'. If you aren't going to use the DSO for reporting, you should set it to never create SIDs.

For cubes, the most important thing that slows down process chains is the dimensions. Dimensions directly influence the fact table: you want to keep the number of entries in each dimension table as small as possible compared to the fact table. If you load data through a 1:1 transformation and the load is still slow, it almost certainly has to do with this. You can easily check it by going to transaction SE38 and executing the program SAP_INFOCUBE_DESIGNS. Once it has run, search for the cube you want to optimize.
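If you'd rather compute those ratios yourself, the sketch below does the same arithmetic SAP_INFOCUBE_DESIGNS reports. It is a hypothetical helper (the report name Z_DIM_RATIO_CHECK is mine, not a standard SAP program) and it assumes the usual /BIC/F<cube> and /BIC/D<cube><n> naming convention for the fact and dimension tables of a custom cube.

    REPORT z_dim_ratio_check.

    " Hypothetical sketch: prints the dimension-to-fact-table ratio for
    " one cube, the same numbers SAP_INFOCUBE_DESIGNS shows.
    PARAMETERS p_cube TYPE c LENGTH 30.   " cube name, e.g. ZSALES

    DATA: lv_tab      TYPE c LENGTH 30,
          lv_facts    TYPE i,
          lv_rows     TYPE i,
          lv_off      TYPE i,
          lv_ratio    TYPE p DECIMALS 2,
          lv_suffixes TYPE c LENGTH 13 VALUE '123456789ABCD'.

    " Fact table of a custom cube follows the /BIC/F<cube> convention.
    lv_tab = |/BIC/F{ p_cube }|.
    SELECT COUNT(*) FROM (lv_tab) INTO lv_facts.

    IF lv_facts = 0.
      WRITE / 'Fact table is empty - nothing to compare.'.
      RETURN.
    ENDIF.

    " Up to 13 user dimensions: /BIC/D<cube>1..9 and A..D.
    " (P, T and U are the technical Package, Time and Unit dimensions.)
    DO 13 TIMES.
      lv_off = sy-index - 1.
      lv_tab = |/BIC/D{ p_cube }{ lv_suffixes+lv_off(1) }|.
      TRY.
          SELECT COUNT(*) FROM (lv_tab) INTO lv_rows.
        CATCH cx_sy_dynamic_osql_semantics.
          CONTINUE. " this dimension does not exist for the cube
      ENDTRY.
      lv_ratio = lv_rows * 100 / lv_facts.
      WRITE: / lv_tab, lv_rows, 'rows =', lv_ratio, '% of fact table'.
    ENDDO.

Any dimension whose row count gets anywhere near the fact table's is a candidate for the treatment described below.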

Example of a cube that can be optimized (the names of the dimensions and the fact table normally appear on the left, but I cropped them out).

You want to keep the ratios of the dimensions as close to 0% as possible. I did this the following way. First take a look at the cube and its dimensions, and count how many characteristics you have in total, excluding those in the technical dimensions 'Data Package', 'Time' and 'Unit'. (A cube supports at most 16 dimensions, and those three are reserved, which leaves 13 for your own characteristics.) If you have 13 or fewer characteristics, create as many dimensions as you have characteristics and give each characteristic its own dimension. Next, go to the properties of the dimensions and make them Line Item Dimensions; in order to do this you will have to delete the data in the cube. Also, when you make your dimensions line items, remember to go to transaction SE38 and execute RSDG_TRFN_ACTIVATE to reactivate all the transformations that load data into the cube, otherwise the next load will produce a lot of errors. The sketch below shows one way to find out which transformations that concerns.
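A hedged sketch, assuming the standard BW metadata: transformation headers live in table RSTRAN, but the field names and the 'CUBE' target type I use here are assumptions, so verify them in SE11 on your system before relying on this.

    REPORT z_list_cube_trfn.

    " Hedged sketch: list the active transformations that load into a
    " cube, i.e. the ones RSDG_TRFN_ACTIVATE has to reactivate after
    " you change the cube's dimensions. Field names and the 'CUBE'
    " target type are assumptions - check them in SE11 first.
    PARAMETERS p_cube TYPE c LENGTH 30.

    DATA: lv_tranid TYPE rstran-tranid,
          lv_source TYPE rstran-sourcename.

    SELECT tranid sourcename
      FROM rstran
      INTO (lv_tranid, lv_source)
      WHERE targetname = p_cube
        AND targettype = 'CUBE'
        AND objvers    = 'A'.    " active version only
      WRITE: / lv_tranid, 'from', lv_source.
    ENDSELECT.

    IF sy-subrc <> 0.
      WRITE / 'No transformations found for this cube.'.
    ENDIF.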

If you have more than 13 characteristics in total, check which dimension has a high ratio in the cube. Think of a dimension as a grouping of characteristics that are related to each other, and try to find the characteristic that is out of place, the one that seems very different from all the others. Move it into its own dimension the same way as before and make that a 'Line Item Dimension' as well. This will reduce the loading time of the cube.

Only dimensions with exactly one characteristic can be made 'Line Item Dimensions'. I highly recommend always doing this when possible because it has clear performance advantages:

  • When loading transaction data, no IDs are generated for the entries in the dimension table. This number range operation can compromise performance, precisely in the case where a degenerated dimension is involved.
  • A table having a very large cardinality is removed from the star schema. As a result, the SQL-based queries are simpler. In many cases, the database optimizer can choose better execution plans.

The only disadvantage is, as mentioned, that such a dimension cannot hold more than one characteristic. This is why I said that if you have 13 or fewer characteristics you should just split them: the advantages far outweigh this limitation.

Here's an extra trick I learned by experimenting with DTPs. By default your DTP will probably have a package size of 50,000 and 3 parallel processes. Take a look at the InfoProvider the data is loaded into and check how many records are being processed. For example, I had a DSO that loaded 3,000 records, which means the load consists of a single data package of 3,000 records while two of the three parallel processes sit idle. Change the package size to 1,000 and there will be three data packages being processed at the same time thanks to parallel processing. In my case this cut the runtime to a third of the previous load.
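To see why this works, here's a back-of-the-envelope sketch with the numbers from my example (the report name Z_DTP_PACKAGE_MATH is hypothetical). It computes how many data packages a load produces and how many sequential 'waves' the parallel processes need to get through them.

    REPORT z_dtp_package_math.

    " Hypothetical numbers from the example above: 3,000 records with a
    " package size of 50,000 give 1 package, so 2 of the 3 parallel
    " processes sit idle. A size of 1,000 gives 3 packages in 1 wave.
    PARAMETERS: p_recs TYPE i DEFAULT 3000,   " records to load
                p_size TYPE i DEFAULT 1000,   " DTP package size
                p_proc TYPE i DEFAULT 3.      " parallel processes

    DATA: lv_packages TYPE i,
          lv_waves    TYPE i.

    " Number of data packages, rounded up.
    lv_packages = ( p_recs + p_size - 1 ) DIV p_size.
    " Sequential waves: rounds the parallel processes need to run.
    lv_waves = ( lv_packages + p_proc - 1 ) DIV p_proc.

    WRITE: / 'Data packages   :', lv_packages,
           / 'Sequential waves:', lv_waves.

With 3,000 records and a package size of 1,000 this gives three packages in a single wave, which matches the runtime dropping to roughly a third.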


That wraps up my blog about how I'm optimizing my process chains (flows) at the moment. I'm still working on these kinds of things, so if I come across something new that might help performance, I will include it later on.

Here you can find my original blog in one piece: https://blogs.sap.com/2018/05/08/process-chain-flows-optimization/.

If you have any other or better ways to improve the performance of process chains, feel free to let me know.

Sven Swennen
