
Process chain (flows) optimization Part 3

25 June 2018

This will be the last article about process chain optimization. In the previous article we covered the optimization of InfoPackages and DTPs. Now we will close out the topic with DSOs and cubes.

Let's start with DSOs. The only DSO setting I found that has an impact on performance is 'SID Generation'. If you aren't going to use the DSO for reporting, set it to never create SIDs.

For cubes, the dimensions are the most common cause of slow process chains. Dimensions influence the fact table: you want to keep the number of entries in a dimension table as small as possible relative to the fact table. If you load data through a 1:1 transformation and the load is still slow, it almost certainly has to do with this. You can easily check it by going to transaction SE38 and executing the program SAP_INFOCUBE_DESIGNS. Once the report has run, search for the cube you want to optimize.
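To make the ratio concrete: SAP_INFOCUBE_DESIGNS essentially reports each dimension table's row count as a percentage of the fact table's row count. Here is a minimal sketch of that calculation in plain Python, with made-up row counts and a hypothetical 20% warning threshold (this is an illustration of the arithmetic, not a BW API):

```python
# Sketch of the ratio reported by SAP_INFOCUBE_DESIGNS:
# dimension table rows as a percentage of fact table rows.
# All row counts below are made up for illustration.
fact_table_rows = 1_200_000

dimension_rows = {
    "Customer": 950_000,   # almost one entry per fact row: degenerated dimension
    "Material": 12_000,
    "Sales Org": 40,
}

for name, rows in dimension_rows.items():
    ratio = rows / fact_table_rows * 100
    # 20% is a hypothetical threshold for flagging a candidate
    flag = "  <- candidate for a line item dimension" if ratio > 20 else ""
    print(f"{name}: {ratio:.1f}%{flag}")
```

The closer each percentage is to 0%, the better the cube's star schema behaves during loads.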

Example of a cube that can be optimized (the names of the dimensions and the fact table normally appear on the left, but I cropped them out).

 

You want to keep the ratios of the dimensions as close to 0% as possible. I did this the following way. First, take a look at the cube and its dimensions, and count how many characteristics you have in total, excluding the ones in the 'Data Package', 'Time' and 'Unit' dimensions. If you have 13 or fewer characteristics, create as many dimensions as you have characteristics and put each characteristic in its own dimension. Next, go to the properties of each dimension and mark it as a line item dimension. To do this you will have to delete the data in the cube first. Also, after making your dimensions line items, remember to go to transaction SE38 and execute RSDG_TRFN_ACTIVATE to reactivate all the transformations that load data into the cube; otherwise the next load will produce a lot of errors.

If you have more than 13 characteristics in total, check which dimension has a high ratio in the cube. Think of a dimension as a grouping of characteristics that are related to each other, and try to find the characteristic that is out of place, the one that seems very different from all the others. Move it into its own dimension the same way as before and make it a 'Line Item Dimension'. This will reduce the loading time of the cube.

Only dimensions with exactly 1 characteristic can be made 'Line Item Dimensions'. I highly recommend doing this whenever possible because of the performance advantages:

  • When loading transaction data, no IDs are generated for the entries in the dimension table. This number range operation can compromise performance precisely in the case where a degenerated dimension is involved.
  • A table with a very large cardinality is removed from the star schema. As a result, the SQL-based queries are simpler. In many cases, the database optimizer can choose better execution plans.

The only disadvantage is that, as mentioned, you cannot have more than 1 characteristic in such a dimension. This is why I said that if you have 13 or fewer characteristics, just split them all: the advantages far outweigh this limitation.
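The rule of thumb above can be summarized in a tiny helper. This is plain Python capturing the article's decision logic, not a BW API; the limit of 13 user-defined dimensions comes from the text:

```python
def plan_dimensions(num_characteristics: int, max_user_dims: int = 13) -> str:
    """Rule of thumb from the article: besides the fixed Data Package, Time
    and Unit dimensions, a cube has at most 13 user-defined dimensions."""
    if num_characteristics <= max_user_dims:
        return (f"Create {num_characteristics} dimensions, one characteristic "
                "each, and mark every one as a line item dimension.")
    return ("More characteristics than free dimensions: group related "
            "characteristics, and move only the out-of-place, high-ratio "
            "ones into their own line item dimensions.")

print(plan_dimensions(10))
print(plan_dimensions(25))
```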

Here's an extra trick I learned while experimenting with DTPs. By default, your DTP will probably have a package size of 50,000 and 3 parallel processes. Take a look at the InfoProvider the data is loaded into and check how many records are being processed. For example, I had a DSO that loaded 3,000 records, which means the load consists of a single data package of 3,000 records. Try changing the package size to 1,000: now there will be 3 data packages running and being processed at the same time thanks to parallel processing. In my case this cut the runtime to one third of the previous load.
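The arithmetic behind this trick can be sketched as a back-of-the-envelope model: packages are processed in waves of up to `parallel` packages at a time, and each wave takes roughly as long as one package's record count. A hedged sketch in plain Python (the per-record time is an assumed constant; real loads have per-package overhead that this ignores):

```python
import math


def estimated_runtime(total_records: int, package_size: int,
                      parallel: int, secs_per_record: float = 0.01) -> float:
    """Simplified DTP runtime model: packages run in waves of up to
    `parallel` at a time; a wave lasts as long as one package's records.
    secs_per_record is an assumed constant, not a measured value."""
    packages = math.ceil(total_records / package_size)
    waves = math.ceil(packages / parallel)
    records_per_package = min(package_size, total_records)
    return waves * records_per_package * secs_per_record


# The example from the article: 3,000 records, 3 parallel processes.
print(estimated_runtime(3_000, 50_000, 3))  # one package of 3,000 records -> 30.0
print(estimated_runtime(3_000, 1_000, 3))   # three packages, all in one wave -> 10.0
```

Under this model, shrinking the package size from 50,000 to 1,000 lets all three packages run in a single parallel wave, which matches the observed one-third runtime.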

 

This concludes my blog series about how I'm optimizing my process chains (flows) at the moment. As of today I'm still working on these kinds of things, so if I come across something new that might help performance, I will include it later on.

You can find my original blog in one piece here: https://blogs.sap.com/2018/05/08/process-chain-flows-optimization/.

If you have any other or better ways to improve the performance on process chains, feel free to let me know.

Sven Swennen
