
Getting started with SAP HANA SDI in less than a day

09 November 2018

This is part 2 of the series. Here we’ll cover creating a dataflow in SDI.

(Check out part 1 to find out how to set up SDI.)

 

FYI: I’m going to assume that you have the necessary rights on your HANA box to perform all described steps. For more info on the needed privileges, check out section 2.1 of the official SAP HANA EIM Administration Guide.

Twitter Setup:

I won’t go over all the details here, but you basically go through the following process:

You’ll end up with something like this:

Bring in the Twitter data:

Let’s start with an overview of the dataflow…

There are two objects that we have to create for our initial test case:

  1. A “remote source”, which defines the connection to our Twitter App
  2. A “replication task”, which will take care of the data retrieval and persistence in HANA.

 

FYI: It’s time to actually log on to your HANA box now 😊. I’ll use the “Web-based Development Workbench” interface. It runs entirely in the browser, so you don’t need to install HANA Studio. In fact, some parts (replication task, flowgraph) are only available in this web interface.

 

Typical URL for the Web-based Development Workbench: https://<hanaserver>:43<hanainstance>/sap/hana/ide/ (for example, port 4300 for instance number 00)

 

We’ll use 2 parts of the workbench:

  • Catalog (to create the remote source and execute SQL commands)
  • Editor (to create the replication task)

1.    Creating the remote source

 

  • GOTO “Catalog”
  • GOTO Provisioning → Remote Sources

  • Name your remote source
  • Choose adapter “TwitterAdapter” and your agent
  • Choose Credentials Mode “Technical User” and then fill in the credentials from your Twitter App (API Key, API Secret, Access Token, Access Token Secret)
  • After saving you can test the connection
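
If you’d rather double-check from the SQL console (also under “Catalog”), the system views below list the registered agents, adapters and remote sources. This is only a minimal sketch: it assumes your user is allowed to read these views, and because the available columns can differ per HANA revision it simply uses SELECT *. “<your-source>” stands for whatever name you gave your remote source.

```sql
-- Data provisioning agents registered on this HANA system
SELECT * FROM SYS.AGENTS;

-- Adapters known to the system; "TwitterAdapter" should be listed here
SELECT * FROM SYS.ADAPTERS WHERE ADAPTER_NAME = 'TwitterAdapter';

-- Remote sources; yours should show up once it has been saved
SELECT * FROM SYS.REMOTE_SOURCES WHERE REMOTE_SOURCE_NAME = '<your-source>';
```

If the adapter or agent is missing from these views, the setup steps from part 1 are the place to look.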

 

2.    Replicating the data

 

GRANT rights

You have now created your own ‘remote source’ object, which makes you its owner with all privileges on it. Next, we’ll create a replication task. Although you are the one creating the task, in the background it’s the system user (_SYS_REPO) that creates the underlying objects. As some of those objects depend on the remote source, we need to grant this system user some rights on it.

 

  • Open an SQL console and grant user “_SYS_REPO” the rights it needs to perform its background tasks when a replication task is activated

CODE

GRANT CREATE VIRTUAL TABLE ON REMOTE SOURCE "<your-source>" TO _SYS_REPO WITH GRANT OPTION;
GRANT CREATE REMOTE SUBSCRIPTION ON REMOTE SOURCE "<your-source>" TO _SYS_REPO WITH GRANT OPTION;
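
To confirm that both grants went through, you can look them up in the GRANTED_PRIVILEGES system view. A small sketch, assuming your user may read that view; replace “<your-source>” with the name of your remote source (use the exact upper/lower case you created it with).

```sql
-- Privileges that _SYS_REPO holds on the remote source
SELECT GRANTEE, OBJECT_NAME, PRIVILEGE, IS_GRANTABLE
  FROM SYS.GRANTED_PRIVILEGES
 WHERE GRANTEE = '_SYS_REPO'
   AND OBJECT_NAME = '<your-source>';
```

You would expect one row per granted privilege (CREATE VIRTUAL TABLE and CREATE REMOTE SUBSCRIPTION), each marked as grantable because of the WITH GRANT OPTION clause.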

 

Create Replication Task

  • GOTO “Editor”
  • Create a package if needed
  • Create a Replication Task

  • Name your task
  • Choose your own remote source
  • Use prefix “TB_VR_” for the generated virtual table (best practice)

  • Add Objects > Public_Stream
    • Use prefix “TB_” for the generated target table (best practice)
    • Use Realtime only for the behavior

 

FYI: We’ve chosen the ‘realtime’ option above, which has some implications:

o   Our source pushes new data as it becomes available. Depending on how much your subject is tweeted about, it might take a while before you see any data. Of course, you can always post your own tweet about the subject to test the functionality. (I’ve chosen ‘bitcoin’ as the subject here, which typically gives results straight away 😊)

o   We’ll have to use the CDC (Change Data Capture) tab to provide filters. It’s just the ‘realtime version of filtering’.

 

  • Fill in the CDC (Change Data Capture) parameter “Phrases to track”

 

  • Save
  • Right-click on the Replication Task to execute it.
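
Once the task has been activated and executed, the objects that _SYS_REPO generated in the background can be inspected from the SQL console. The queries below are a sketch under the assumption that your user may read these system views; “<your-source>” is again a placeholder for your remote source name.

```sql
-- Virtual tables generated on top of the remote source
SELECT * FROM SYS.VIRTUAL_TABLES WHERE REMOTE_SOURCE_NAME = '<your-source>';

-- Remote subscriptions created for the realtime replication
SELECT * FROM SYS.REMOTE_SUBSCRIPTIONS WHERE REMOTE_SOURCE_NAME = '<your-source>';

-- Errors raised while replicating, if any
SELECT * FROM SYS.REMOTE_SUBSCRIPTION_EXCEPTIONS;
```

An empty exceptions view combined with an existing subscription is a good sign that the task is simply waiting for matching tweets.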

Check the results:

  • GOTO “Catalog”
  • Refresh the Tables directory under your schema
  • Open Content of the newly created target table

  • As said above, you can tweet about your chosen subject (here ‘bitcoin’) and you will see your own tweet pop up here when you refresh.
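
The same check also works from the SQL console. In this sketch, “<your-schema>” and “<your-target-table>” are placeholders (not names from the steps above) for the schema and the “TB_”-prefixed target table that show up in your Catalog.

```sql
-- Number of tweets replicated so far
SELECT COUNT(*) AS TWEET_COUNT
  FROM "<your-schema>"."<your-target-table>";

-- Peek at a handful of replicated rows
SELECT TOP 10 *
  FROM "<your-schema>"."<your-target-table>";
```

Running the count a few times in a row is a quick way to see the realtime replication at work.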

That’s it! You now know how easy it is to start with SAP HANA SDI, and you can start exploring Twitter as a new source of data for your company/clients. Have fun and stay tuned for more…
