Data Ingestion from SAP ERP via Azure Data Factory

1. Scaling data ingestion to hundreds or thousands of SAP data sources

Scaling an Azure Data Factory SAP setup to a large number of tables is challenging. For each data source, the following settings must be defined manually:

all primary keys of each incrementally updated ingested table
SAP ODP source name (for CDS Views)
ingest mode (full or incremental - based on the data source type)

Our Accelerator fully automates all of these steps from a simple configuration.
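
To illustrate what such a configuration has to capture, here is a minimal Python sketch. The `SapSource` class, its field names, and the example object names are hypothetical and do not reflect the Accelerator's actual configuration format. The sketch only encodes the three settings listed above and checks that every incrementally loaded source declares its primary keys.

```python
from dataclasses import dataclass, field
from typing import Dict, List

VALID_MODES = {"full", "incremental"}


@dataclass
class SapSource:
    """One SAP data source to ingest (hypothetical structure)."""
    odp_name: str                      # SAP ODP source name
    mode: str = "full"                 # "full" or "incremental"
    primary_keys: List[str] = field(default_factory=list)

    def validate(self) -> None:
        if self.mode not in VALID_MODES:
            raise ValueError(f"{self.odp_name}: unknown ingest mode {self.mode!r}")
        # An incremental load can only merge changes if the keys are known.
        if self.mode == "incremental" and not self.primary_keys:
            raise ValueError(f"{self.odp_name}: incremental load requires primary keys")


def load_sources(raw: List[Dict]) -> List[SapSource]:
    """Build and validate source definitions from plain dictionaries."""
    sources = [SapSource(**entry) for entry in raw]
    for source in sources:
        source.validate()
    return sources


# Hypothetical example entries; the names are placeholders, not real SAP objects.
EXAMPLE_CONFIG = [
    {"odp_name": "ZEXAMPLE_CDS_VIEW", "mode": "incremental",
     "primary_keys": ["MANDT", "DOC_ID"]},
    {"odp_name": "ZEXAMPLE_EXTRACTOR", "mode": "full"},
]

sources = load_sources(EXAMPLE_CONFIG)
```

Keeping the definitions as plain data means that adding another source is a one-line change rather than another manually configured pipeline.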

2. Automate incremental data load

We’ve prepared a Python package that merges the CDC events into a Delta table, applies SCD2 transformations to create flat tables registered in the Databricks Unity Catalog, and more.
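
The logic behind these two steps can be sketched in plain Python. This is not the package's API: the function names, the `op` and `seq` event fields, and the `valid_from` / `valid_to` columns are assumptions made for illustration, and a real implementation would run as a merge on Delta tables rather than on in-memory dictionaries.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[object, ...]
Row = Dict[str, object]

META_FIELDS = ("op", "seq")  # hypothetical CDC metadata fields


def _payload(event: Row) -> Row:
    """Strip CDC metadata, keeping only the business columns."""
    return {c: v for c, v in event.items() if c not in META_FIELDS}


def merge_cdc(current: Dict[Key, Row], events: Iterable[Row], keys: List[str]) -> Dict[Key, Row]:
    """Apply CDC events in sequence order to a keyed snapshot (upsert or delete)."""
    result = dict(current)
    for ev in sorted(events, key=lambda e: e["seq"]):
        k = tuple(ev[c] for c in keys)
        if ev["op"] == "delete":
            result.pop(k, None)
        else:  # "insert" or "update"
            result[k] = _payload(ev)
    return result


def apply_scd2(history: List[Row], events: Iterable[Row], keys: List[str]) -> List[Row]:
    """Close the open version of each changed key, then open a new version."""
    out = [dict(r) for r in history]
    open_rows = {tuple(r[c] for c in keys): r for r in out if r["valid_to"] is None}
    for ev in sorted(events, key=lambda e: e["seq"]):
        k = tuple(ev[c] for c in keys)
        previous = open_rows.pop(k, None)
        if previous is not None:
            previous["valid_to"] = ev["seq"]
        if ev["op"] != "delete":
            row = _payload(ev)
            row["valid_from"] = ev["seq"]
            row["valid_to"] = None
            out.append(row)
            open_rows[k] = row
    return out


events = [
    {"seq": 1, "op": "insert", "id": 1, "amount": 10},
    {"seq": 2, "op": "update", "id": 1, "amount": 15},
    {"seq": 3, "op": "insert", "id": 2, "amount": 7},
    {"seq": 4, "op": "delete", "id": 2, "amount": 7},
]
snapshot = merge_cdc({}, events, ["id"])
history = apply_scd2([], events, ["id"])
```

The snapshot keeps only the latest state per key, while the SCD2 history keeps one row per version, which is why both steps need the primary keys of the source table.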

3. Enrich Databricks schema with SAP metadata, fix types

Supports SAP objects ingested using either CDS Views or extractors.

ADF-ingested table schema (TCURR table)

Table schema fixed by the Eviden SAP Accelerator

Automatically applied schema changes

Primary key definition
Missing nullability constraints (NULL or NOT NULL) applied
Column comments transfer
Data type fixes:

Most of the DATE fields are ingested as strings (in multiple formats)

DATS date (YYYYMMDD)
DATUM_INV integer-based date values

DECIMAL length fixes
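
The date conversions listed above can be sketched as follows. The function names are hypothetical, and the sketch assumes that inverted dates use the nines' complement encoding (99999999 minus the YYYYMMDD value) and that the SAP initial value `00000000` should map to NULL. It is an illustration, not the Accelerator's code.

```python
from datetime import date, datetime
from typing import Optional, Union


def parse_dats(value: Optional[str]) -> Optional[date]:
    """Convert an SAP DATS string (YYYYMMDD) to a date.

    Blank and initial ("00000000") values are returned as None.
    """
    if value is None:
        return None
    value = value.strip()
    if value in ("", "00000000"):
        return None
    return datetime.strptime(value, "%Y%m%d").date()


def parse_inverted_date(value: Union[int, str, None]) -> Optional[date]:
    """Convert an inverted date to a date.

    Assumption: the stored value is 99999999 - YYYYMMDD (nines' complement).
    """
    if value is None:
        return None
    plain = 99999999 - int(value)
    return parse_dats(f"{plain:08d}")
```

For example, under this assumption `parse_inverted_date(79759868)` and `parse_dats("20240131")` both yield 31 January 2024.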

4. Enable hierarchies extraction

Extracting SAP hierarchies for Cost Centers, Profit Centers, and others is not supported (see issue). Neither the SAPI extractors (0COSTCENTER_0101_HIER, 0GL_ACCOUNT_T011_HIER, …) nor the extraction CDS Views (I_CostCenterHierarchyNode) work.

Solution: Create a custom CDS View that allows ODP-based extraction.
