SAP Accelerator for DataFactory
= Eviden SAP data ingestion accelerator for Azure DataFactory
1. Add new SAP data sources to Databricks in minutes
Adding new SAP data sources to Databricks using the standard Azure DataFactory workflow is complicated and time-consuming for people with limited SAP knowledge.
For each data source, the following settings must be defined manually:
- SAP ODP source name (for CDS Views)
- all primary keys of each incrementally updated ingested table
- ingest mode (full or incremental - based on the data source type)
Use our Python package to simplify and automate SAP data ingestion.
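To make the per-source settings listed above concrete, here is a minimal sketch of how they could be captured and validated in Python. It does not show the accelerator's actual API: the class name `SapSourceConfig`, its fields, and the placeholder source name are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

VALID_INGEST_MODES = ("full", "incremental")


@dataclass(frozen=True)
class SapSourceConfig:
    """Hypothetical container for the settings needed per SAP data source."""

    odp_name: str                 # SAP ODP source name (for CDS Views)
    primary_keys: Tuple[str, ...]  # key columns of the ingested table
    ingest_mode: str              # "full" or "incremental"

    def __post_init__(self) -> None:
        if self.ingest_mode not in VALID_INGEST_MODES:
            raise ValueError(f"Unknown ingest mode: {self.ingest_mode!r}")
        # Incremental loads need keys so that changes can be matched to rows.
        if self.ingest_mode == "incremental" and not self.primary_keys:
            raise ValueError("Incremental sources require primary keys")


# Placeholder values, not a real ODP source.
example = SapSourceConfig(
    odp_name="Z_EXAMPLE_ODP_NAME",
    primary_keys=("MANDT", "ID"),
    ingest_mode="incremental",
)
```

Validating the settings up front keeps a missing key definition from surfacing only later, when the first incremental load is merged.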
2. Automate incremental data load
We’ve prepared a Python package that merges the CDC events into a Delta table, applies SCD2 transformations to create flat tables registered in the Databricks Unity Catalog, and more.
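The core idea of merging CDC events can be sketched in plain Python. This is only an illustration of the logic, not the package's implementation: the event layout (an `op` field plus a `row` dictionary) and the function name `merge_cdc_events` are assumptions, and events are assumed to arrive ordered from oldest to newest. On Databricks the same logic would typically be expressed as a Delta Lake `MERGE` keyed on the primary keys.

```python
from typing import Any, Dict, Iterable, Sequence, Tuple

Row = Dict[str, Any]
Key = Tuple[Any, ...]


def merge_cdc_events(
    current: Dict[Key, Row],
    events: Iterable[Dict[str, Any]],
    primary_keys: Sequence[str],
) -> Dict[Key, Row]:
    """Apply insert/update/delete events to a keyed snapshot of a table."""
    merged = dict(current)  # leave the input snapshot untouched
    for event in events:    # assumed to be ordered oldest first
        row = event["row"]
        key = tuple(row[name] for name in primary_keys)
        if event["op"] == "delete":
            merged.pop(key, None)  # deleting a missing row is a no-op
        else:
            merged[key] = row      # insert and update are both upserts
    return merged


snapshot = {("100", "A"): {"MANDT": "100", "ID": "A", "VALUE": 1}}
changes = [
    {"op": "update", "row": {"MANDT": "100", "ID": "A", "VALUE": 2}},
    {"op": "insert", "row": {"MANDT": "100", "ID": "B", "VALUE": 5}},
    {"op": "delete", "row": {"MANDT": "100", "ID": "B", "VALUE": 5}},
]
result = merge_cdc_events(snapshot, changes, ["MANDT", "ID"])
```

Because later events overwrite earlier ones for the same key, the result reflects the latest state of each row, which is why the primary keys have to be known for every incrementally updated table.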
3. Enrich Databricks schema with SAP metadata, fix types
Supports SAP objects ingested using either CDS Views or extractors.
ADF-ingested table schema (TCURR table)
Table schema fixed by the Eviden SAP Accelerator
Automatically applied schema changes
- Primary key definition
- Missing nullability constraints (NULL or NOT NULL) applied
- Column comments transfer
- Data type fixes:
  - Most of the DATE fields are ingested as strings (in multiple formats):
    - DATS: date (YYYYMMDD)
    - DATUM_INV: integer-based date values
  - DECIMAL length fixes
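The date fixes listed above can be illustrated with a short sketch. The function names are hypothetical and this is not the accelerator's code. Two assumptions are made: that `00000000` and empty strings represent an unset DATS value, and that DATUM_INV fields use the inverted-date encoding found in fields such as `GDATU` in TCURR, where the stored number is 99999999 minus the date written as YYYYMMDD.

```python
from datetime import date, datetime
from typing import Optional, Union


def parse_dats(value: str) -> Optional[date]:
    """Convert a DATS string (YYYYMMDD) into a date, or None if unset."""
    value = value.strip()
    if value in ("", "00000000"):  # assumed to mean "no date"
        return None
    return datetime.strptime(value, "%Y%m%d").date()


def parse_inverted_date(value: Union[str, int]) -> Optional[date]:
    """Convert an inverted date (99999999 - YYYYMMDD) into a date."""
    plain = 99999999 - int(value)
    # Re-use the DATS parser on the zero-padded, de-inverted value.
    return parse_dats(f"{plain:08d}")


posting_date = parse_dats("20240131")
rate_valid_from = parse_inverted_date("79759868")
```

Both calls yield the same calendar date, 31 January 2024, since 99999999 - 79759868 = 20240131.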
4. Enable hierarchies extraction
Extracting SAP hierarchies for Cost Centers, Profit Centers and others is not supported (see issue). Neither SAPI Extractors (0COSTCENTER_0101_HIER, 0GL_ACCOUNT_T011_HIER, …) nor extraction CDS Views (I_CostCenterHierarchyNode) work.
Solution: Create a custom CDS View that allows ODP-based extraction.