Transformation Flow in SAP Datasphere


Written By: Dimitrios - 25 July 2024
(updated on: 12 August 2024)

Transformation flows are used in SAP Datasphere to load data from one or more sources and save the transformed result in a target table. You can also load only the changed records instead of repeatedly extracting the full data set. This article explains in detail how transformation flows work and which restrictions you need to be aware of.

Transformation flows explained

In contrast to views, which allow you to combine different tables on the fly and adapt the output, transformation flows are used when the result must be persisted in the database and a delta mechanism is required. They work in a similar way to transformations in SAP Business Warehouse, hence the name.

To implement the transformation logic, you can use the user-friendly graphical view editor. For more advanced requirements, transformation flows only support SQL or SQLScript. If you want to use Python, you must use data flows instead. In addition to local tables and views that are available in the space, Open SQL schemas and remote tables located in BW Bridge spaces can also be used as a source. These must be integrated into the respective space beforehand.


A particularly notable capability is loading data in delta mode. This requires delta capture to be enabled in both the source and the target table. You must decide whether to activate delta capture before deploying the table. Once the table has been deployed, this setting can no longer be changed.


When delta capture is active, two new columns are added to the table: Change_Type of type String and Change_Date of type Timestamp. These columns cannot be changed, and you cannot define them as keys. SAP also advises against using them in transformation logic: the delta entry table is an internal table whose structure can change at any time, so it is not permitted for external data access.


Internally, the columns Change_Type and Change_Date are not part of the active table, but of the table for delta entry, which is created with the suffix _Delta. Technically speaking, the active table is a view that excludes the deleted values.
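The relationship between the delta entry table and the active table can be illustrated with a small sketch. It uses Python's built-in sqlite3 module rather than SAP Datasphere itself, and the table Orders_Delta, its business columns and the change-type codes I, U and D are assumptions made purely for illustration. The real internal structure may look different.

```python
import sqlite3

# Hypothetical change-type codes; the real values in SAP Datasphere may differ.
INSERTED, UPDATED, DELETED = "I", "U", "D"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Delta entry table: business columns plus the two delta capture columns.
    CREATE TABLE Orders_Delta (
        order_id    INTEGER PRIMARY KEY,
        amount      REAL,
        Change_Type TEXT NOT NULL,
        Change_Date TEXT NOT NULL
    );
    -- Active table modelled as a view: it hides the delta columns
    -- and excludes records marked as deleted (code D, see DELETED above).
    CREATE VIEW Orders AS
        SELECT order_id, amount
        FROM Orders_Delta
        WHERE Change_Type <> 'D';
""")

conn.executemany(
    "INSERT INTO Orders_Delta VALUES (?, ?, ?, ?)",
    [
        (1, 100.0, INSERTED, "2024-07-01 08:00:00"),
        (2, 250.0, UPDATED, "2024-07-02 09:30:00"),
        (3, 75.0, DELETED, "2024-07-03 10:15:00"),
    ],
)

# The view only returns the records that are not marked as deleted.
active_rows = conn.execute(
    "SELECT order_id, amount FROM Orders ORDER BY order_id"
).fetchall()
# The delta entry table still holds all three records.
delta_rows = conn.execute("SELECT COUNT(*) FROM Orders_Delta").fetchone()[0]

print(active_rows)
print(delta_rows)
```

In this sketch, the deleted order is still physically stored in the delta entry table, but it is no longer visible through the view.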




A table with activated delta capture offers several advantages compared to normal tables. For example, loading processes and transformations no longer have to process the full data set every time, which reduces the amount of data processed.

 


In the transformation flow settings, you can choose between the load types Initial Only and Initial and Delta. With the first option, the entire data set is loaded into the target table. With the Initial and Delta option, the entire data set is transferred to the target table during the first execution. In subsequent executions, only the delta changes are loaded.
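The difference between the two load types can be sketched in a few lines of Python. This is a simplified model, not the actual implementation: the names FlowState and run_flow, the load type constants, the change-type code D and the idea of remembering the latest Change_Date as a watermark are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FlowState:
    """State of a hypothetical flow: target rows and last processed Change_Date."""
    target: dict = field(default_factory=dict)
    watermark: str = ""  # empty string means the flow has not run yet


def run_flow(source_delta, state, load_type):
    """Run the flow once and return the number of source records processed."""
    is_full_load = load_type == "INITIAL_ONLY" or state.watermark == ""
    if is_full_load:
        # Assumption: a full load rebuilds the target from scratch.
        state.target.clear()
        batch = list(source_delta)
    else:
        # Delta run: only records changed since the last run are processed.
        batch = [r for r in source_delta if r["Change_Date"] > state.watermark]
    for rec in batch:
        if rec["Change_Type"] == "D":  # hypothetical code for deleted records
            state.target.pop(rec["id"], None)
        else:
            state.target[rec["id"]] = rec["value"]
    if source_delta:
        # ISO timestamps compare correctly as plain strings.
        state.watermark = max(r["Change_Date"] for r in source_delta)
    return len(batch)


source = [
    {"id": 1, "value": "A", "Change_Type": "I", "Change_Date": "2024-07-01T08:00:00"},
    {"id": 2, "value": "B", "Change_Type": "I", "Change_Date": "2024-07-01T08:05:00"},
]

delta_state = FlowState()
first_run = run_flow(source, delta_state, "INITIAL_AND_DELTA")
source.append(
    {"id": 3, "value": "C", "Change_Type": "I", "Change_Date": "2024-07-02T09:00:00"}
)
second_run = run_flow(source, delta_state, "INITIAL_AND_DELTA")

full_state = FlowState()
run_flow(source, full_state, "INITIAL_ONLY")
repeated_full_run = run_flow(source, full_state, "INITIAL_ONLY")

print(first_run, second_run, repeated_full_run)
```

With Initial and Delta, the second run only touches the one new record, whereas every Initial Only run processes all records again.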

In addition to reducing the volume, delta capture also allows deleted data records to be identified. If a data record is loaded from a non-delta-capable source table into the target table and then deleted in the source, it remains in the target table by default. Thanks to the delta mechanism, data records deleted in the source are also deleted in the target.
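The following Python sketch contrasts the two behaviours described above. It is only a model under stated assumptions: the function names, the dictionary-based tables and the change-type codes I and D are hypothetical and do not reflect how SAP Datasphere stores or applies changes internally.

```python
def load_without_delta(source_rows, target):
    """Upsert every row currently in the source; nothing signals a deletion."""
    for key, value in source_rows.items():
        target[key] = value


def load_with_delta(change_records, target):
    """Apply change records in order; a 'D' record removes the row from the target."""
    for key, value, change_type in change_records:
        if change_type == "D":  # hypothetical code for deleted records
            target.pop(key, None)
        else:
            target[key] = value


# Non-delta source: record 2 disappears from the source between the two loads.
plain_target = {}
load_without_delta({1: "A", 2: "B"}, plain_target)
load_without_delta({1: "A"}, plain_target)

# Delta-capable source: the deletion of record 2 arrives as a change record.
delta_target = {}
load_with_delta([(1, "A", "I"), (2, "B", "I")], delta_target)
load_with_delta([(2, "B", "D")], delta_target)

print(plain_target)  # record 2 is still present
print(delta_target)  # record 2 has been removed
```

The non-delta load has no way of knowing that record 2 is gone, so the record stays in the target. The delta load receives an explicit deletion marker and can remove it.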

Restrictions

However, there are some restrictions that you need to be aware of. For example, only one delta-capable table can currently be used as a source. It is not possible to link multiple delta-capable tables in the source view, even though this is a common requirement in more complex data warehousing scenarios.


Moreover, the data preview is only available for graphical view and SQL transformations. If you use SQLScript for your transformation logic, the data preview is not available.

Furthermore, views cannot be used for the delta mechanism. This is because the Change_Type and Change_Date columns are not available in the view. However, a view can be used in combination with a delta-capable table. Views can also be used as a source for a load process without delta.

In addition, if you use graphical views, the Change_Date column cannot be used in calculated columns as part of the transformation flow. This limits the modeling options. As a workaround, you can leverage SQL views, e.g. to implement a change log functionality. However, please keep in mind that SAP can change the internal table structure at any time.


In addition to the graphical views, SQL views can also be used for transformations. However, the delta mechanism is switched off if the calculations are too complex and local tables are used.

Transformation Flow - Our Summary

Transformation flows cover the most common requirements and allow you to use delta capabilities. You can use SQL logic for transformations and work with delta tables.

However, we miss the option of using Python for more sophisticated transformations. In addition, only one delta-capable table can currently be used as a source. We would like to be able to use several tables with a delta mechanism in the future.

We would also like to have the option of using our own CDC (Change Data Capture) mechanism. For example, custom delta objects can be created in CDS views, which can then be used as a source in SAP Datasphere. This option is currently missing in SAP Datasphere.

Do you have questions about SAP Datasphere? Are you trying to build up the necessary know-how in your department or do you need support with a specific issue? Please do not hesitate to contact us. We look forward to exchanging ideas with you! 


Topics: SAP Data Warehouse, Datasphere
