Data Ingestion from SAP ERP via Azure Data Factory
1. Scaling data ingestion to hundreds or thousands of SAP data sources
Scaling an Azure Data Factory SAP setup to a large number of tables is challenging. For each data source, the following settings must be defined manually:
- all primary keys of each incrementally updated ingested table
- SAP ODP source name (for CDS Views)
- ingest mode (full or incremental - based on the data source type)
Our Accelerator helps you to automate all these steps fully using a simple configuration.
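As an illustration only, the per-source settings listed above could be captured in a small declarative structure. The sketch below is not the Accelerator's actual configuration format: the class `SapSourceConfig`, its field names, and the view name `Z_EXAMPLE_CDS_VIEW` are hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical configuration record; the real Accelerator format may differ.
@dataclass(frozen=True)
class SapSourceConfig:
    name: str                                # table or view to ingest
    ingest_mode: str                         # "full" or "incremental"
    primary_keys: Tuple[str, ...] = ()       # required for incremental loads
    odp_source_name: Optional[str] = None    # only relevant for CDS Views

    def __post_init__(self) -> None:
        if self.ingest_mode not in ("full", "incremental"):
            raise ValueError(f"{self.name}: unknown ingest mode {self.ingest_mode!r}")
        if self.ingest_mode == "incremental" and not self.primary_keys:
            raise ValueError(f"{self.name}: incremental ingest needs primary keys")


# Example entries: one incrementally loaded table and one fully loaded CDS View.
SOURCES = [
    SapSourceConfig(
        name="TCURR",
        ingest_mode="incremental",
        primary_keys=("MANDT", "KURST", "FCURR", "TCURR", "GDATU"),
    ),
    SapSourceConfig(
        name="Z_EXAMPLE_CDS_VIEW",            # hypothetical view name
        ingest_mode="full",
        odp_source_name="Z_EXAMPLE_CDS_VIEW",
    ),
]
```

Validating each entry when it is constructed means a missing key definition surfaces before any pipeline is generated, rather than during a load.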
2. Automate incremental data load
We’ve prepared a Python package that merges the CDC events into a Delta table, applies SCD2 transformations to create flat tables registered in the Databricks Unity Catalog, and more.
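To make the SCD2 idea concrete, here is a minimal in-memory sketch of how a stream of CDC events can be turned into versioned history rows. It is not the package's API and does not use Spark or Delta: the function `apply_cdc_scd2`, the event fields `op`, `ts`, and `data`, and the history columns `valid_from`, `valid_to`, and `is_current` are all assumptions made for this example.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]

def apply_cdc_scd2(history: List[Row], events: List[Row], key_columns: Sequence[str]) -> List[Row]:
    """Apply CDC events (insert/update/delete) to an SCD2 history, returning a new list."""
    result = [dict(row) for row in history]  # copy so the input is not mutated
    # Index the currently open version of each business key.
    current = {
        tuple(row[k] for k in key_columns): row
        for row in result
        if row["is_current"]
    }
    for event in sorted(events, key=lambda e: e["ts"]):
        if event["op"] not in ("insert", "update", "delete"):
            raise ValueError(f"unknown operation: {event['op']!r}")
        key = tuple(event["data"][k] for k in key_columns)
        open_row = current.pop(key, None)
        if open_row is not None:
            # Close the previous version at the time of the change.
            open_row["valid_to"] = event["ts"]
            open_row["is_current"] = False
        if event["op"] in ("insert", "update"):
            new_row = {**event["data"], "valid_from": event["ts"],
                       "valid_to": None, "is_current": True}
            result.append(new_row)
            current[key] = new_row
    return result


events = [
    {"op": "insert", "ts": 1, "data": {"id": "K1", "rate": 1.0}},
    {"op": "update", "ts": 2, "data": {"id": "K1", "rate": 1.1}},
    {"op": "delete", "ts": 3, "data": {"id": "K1"}},
]
scd2_rows = apply_cdc_scd2([], events, key_columns=["id"])
```

In this sketch a delete only closes the open version and adds no new row, so the full history of the key stays queryable. On Databricks the same logic would typically be expressed as a merge against the Delta table rather than in Python lists.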
3. Enrich Databricks schema with SAP metadata, fix types
Supports SAP objects ingested using either CDS Views or extractors.
ADF-ingested table schema (TCURR table)
Table schema fixed by the Eviden SAP Accelerator
Automatically applied schema changes
- Primary key definition
- Missing nullability constraints (NULL or NOT NULL) applied
- Column comments transfer
- Data type fixes:
  - Most of the DATE fields are ingested as strings (in multiple formats):
    - DATS: date (YYYYMMDD)
    - DATUM_INV: integer-based date values
  - DECIMAL: length fixes
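The date fixes listed above can be sketched as two small conversion helpers. This is an illustration, not the Accelerator's code: the function names are hypothetical, and the DATUM_INV handling assumes the inverted-date convention in which the stored number equals 99999999 minus the YYYYMMDD value (as used, for example, by the GDATU field of TCURR). It also assumes that an all-zero DATS value means "no date".

```python
from datetime import date, datetime
from typing import Optional, Union

def parse_dats(value: Optional[str]) -> Optional[date]:
    """Convert a DATS string (YYYYMMDD) to a date; empty or all-zero values become None."""
    if value is None:
        return None
    text = value.strip()
    if text in ("", "00000000"):
        return None
    return datetime.strptime(text, "%Y%m%d").date()

def parse_datum_inv(value: Union[str, int, None]) -> Optional[date]:
    """Convert an inverted date (assumed to be 99999999 - YYYYMMDD) to a date."""
    if value is None:
        return None
    text = str(value).strip()
    if text == "":
        return None
    restored = 99999999 - int(text)
    return parse_dats(f"{restored:08d}")


valid_from = parse_dats("20240131")
rate_date = parse_datum_inv("79759868")  # 99999999 - 20240131
```

Returning `None` for initial values lets the target column be a nullable DATE instead of a string with sentinel values.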
4. Enable hierarchies extraction
Extracting SAP hierarchies for Cost Centers, Profit Centers and others is not supported (see issue). Neither SAPI extractors (0COSTCENTER_0101_HIER, 0GL_ACCOUNT_T011_HIER, …) nor extraction CDS Views (I_CostCenterHierarchyNode) work.
Solution: Create a custom CDS View that allows ODP-based extraction.