Automatic Lineage Reporting
Automatic Lineage Reporting
Do you get these types of questions a lot?
- We’re changing something in ECC/APO/…, what’s the impact in BI?
- How many reports do we have right now?
- Are we already reporting on source/dataset X?
- Where does the data in this set of reports come from?
We also come across such questions very regularly at our clients. Especially when you’re in a BI team lead role, or when big projects/migrations are on the horizon, answering such questions can start to take up a decent amount of your time.
So we decided to create an automatic lineage reporting, to be able to answer those questions easily. In this blog I’ll share the basics of how we set this up with you.

How?
At my current client we have a BW-on-HANA as DWH, and both Analysis for Excel (via BIP) and Power BI (via HANA views) as reporting tools.

In this first phase, we wanted to get lineage information from the entire DWH part. So, both BW and HANA flows, starting from the BW data sources, right up to the BW query & HANA reporting views. (We plan to add information from the reporting tools in a later phase.) We decided to build the lineage reporting with HANA views as data model and Power BI as reporting tool. We explored several BW/HANA tables and found that already a lot of information was available and it’s just about joining them together in an efficient and correct way.
Used HANA/BW tables
The following tables were used in our solution.
Table name |
Description |
Filter |
RSDS |
Table containing all the datasources |
OBJVERS = ‘A’ |
RSDST |
Table containing the descriptions of the datasources |
LANGU = ‘E’ AND OBJVERS = ‘A’ |
RSDODSO |
Table containing all the (O)DSO’s |
OBJVERS = ‘A’ |
RSDODSOT |
Table containing the descriptions of the (O)DSO’s |
LANGU = ‘E’ AND OBJVERS = ‘A’ |
RSDCUBE |
Table containing all the Cubes |
OBJVERS = ‘A’ |
RSDCUBET |
Table containing the descriptions of the Cubes |
LANGU = ‘E’ AND OBJVERS = ‘A’ |
RSDADSO |
Table containing all the ADSO’s |
OBJVERS = ‘A’ |
RSDADSOT |
Table containing the descriptions of the ADSO’s |
LANGU = ‘E’ AND OBJVERS = ‘A’ AND TTYPE = ‘EUSR’ |
RSOHCPR |
Table containing all the Composite Providers |
OBJVERS = ‘A’ |
RSOHCPRT |
Table containing the descriptions of the Composite Providers |
LANGU = ‘E’ AND OBJVERS = ‘A’ AND COLNAME = ‘’ |
RSDIOBJ |
Table containing all the InfoObjects |
OBJVERS = ‘A’ |
RSDIOBJT |
Table containing the descriptions of the InfoObjects |
LANGU = ‘E’ AND OBJVERS = ‘A’ |
RSRREPDIR |
Table containing all the Queries |
OBJVERS = ‘A’ AND COMPTYPE = ‘REP’ |
BIMC_REPORTABLE VIEWS |
Table containing all HANA views |
/ |
BIMC_ALL_DIMENSIONS |
Table containing all fields per HANA view |
/ |
OBJECT_DEPENDENCIES |
Table containing the HANA view’s dependencies |
BASE_SCHEMA_NAME = ‘_SYS_BIC’ |
RVITRFN |
Table containing all the transformations |
/ |
RSDICMULTIIOB |
Table containing all the MultiProviders |
OBJVERS = ‘A’ |
RSOOBJXREF |
Table containing all component references (e.g. DSO’s in composite) |
TLOGO = ‘HCPR’ AND OBJVERS = ‘A’ AND ASC_TYPE = ‘005’ |
Hana views
We chose a modular/layered solution, so we can still use the different components for other use cases in the future. The following schema gives an indication of how everything was build up with different layers of calculation views.

TECH_BW_INFOPROV
- The 1st step is to join all relevant info per info provider type
- The 2nd step is to union everything together

TECH_BW_FLOWS
The flows view is the most complex one, as we trace back from the top level BW objects down to the bottom objects, with the help of the transformations.


We join the info provider view with the transformation table’s target field (TGT). This enables us to get the objects from the below level (SRC). We do this as many times as we have levels in our DWH flows. (+some extra levels to take into account possible longer flows in the future)

Note: An alternative to foreseeing extra dummy levels, would be writing recursive code in an sql script HANA view.
We also join in the multi provider and composite provider data.


We also filter here to make sure we don’t create a loop. (imagine the object being loaded to itself!)

In the end we have for each info provider a line with the entire flow downwards.

Then, as a last part, we add two new calculated fields, to derive the lowest level object without needing to add all the fields.


TECH_HANA_VIEWS
The HANA view is similar to the flows view, in that it has to trace back from top-level objects towards bottom level objects. However, our HANA setup has only 2 levels.

First, we define in which “subcatalog” the view is located. This subcatalog actually defines the layer of our layered HANA model. It can be either “Generated”, “Modeling” or “Reporting”.

We also join in information about the fields in the view.

Then, we join with the dependencies table to get the lower-level objects. We do this join on a calculated column which contains a concatenation of catalog & view name.

In the next step we’ll create two calculated columns to deduct the actual info provider name from the base object name

TECH_BW_DATASRC
The data source view is rather easy. We join two tables together, format the timestamps, and clean the semantics.

TECH_BW_QUERIES
The query view is also quite easy. We take data from the source table, format the timestamps, and clean the semantics.

TECH_FRONTEND
The frontend view is a full outer join of the query and the hana view.

TECH_BACKEND
The backend view is a left join of the flows and the data source view.

98_AUTO_LINEAGE
This view brings al our efforts together, and it is the one that we’ll report upon. It is a left join of the frontend and the backend view.

Power BI report
On top of the auto lineage view, we can now easily create a Power BI report that allows us to answer those typical lineage questions:
- Starting from a BW query or HANA view, we can check which info providers and data sources are involved
- Starting from a data source, we can check which info providers, HANA views and queries are impacted




Gunther Van Eyck
BI consultant