SAP ECC or S/4 data integration to Databricks
Motivation
- make RAW SAP data available in the data lake for advanced analytics and ML purposes
- SAP extracts and the SAP BW data model contain only a limited set of tables and columns/fields - nothing more than what is needed for reporting
- an analytics team with SQL/Python skills can build the analytical data model end-to-end, without depending on the SAP team's CDS view, ABAP extractor, … skills
- ELT rather than ETL → a true data lake approach
- combine SAP and non-SAP data easily
- (near) real-time data availability if necessary
- incremental data loads are always available (CDC - change data capture); see the sketch after this list
    - not always possible with SAP extractors or CDS views
- reporting can be done via Power BI/Tableau
    - more stable performance on larger data volumes or complex reports
- selected BW components can be moved/re-engineered to Databricks
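
A minimal sketch of what the CDC-based incremental load can look like on the Databricks side, assuming change records for one SAP table (here MARA, the material master) have already been landed by a replication tool (e.g. SLT or a third-party CDC connector) into a Delta table with a change-type and commit-timestamp column; all schema, table and column names are illustrative assumptions.

```python
# Hedged sketch: incremental (CDC) load of one SAP table into Delta Lake.
# Assumptions: change records for MARA were landed by a replication tool
# into `raw.mara_changes` with `_change_type` ('I'/'U'/'D') and `_commit_ts`
# columns, and the bronze target `bronze.mara` has the same schema.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

# Keep only the latest change per primary key (MANDT, MATNR) in this batch
latest = Window.partitionBy("MANDT", "MATNR").orderBy(F.col("_commit_ts").desc())
changes = (
    spark.table("raw.mara_changes")
    .withColumn("_rn", F.row_number().over(latest))
    .filter("_rn = 1")
    .drop("_rn")
)

# Merge the net changes into the bronze Delta table
target = DeltaTable.forName(spark, "bronze.mara")
(
    target.alias("t")
    .merge(changes.alias("s"), "t.MANDT = s.MANDT AND t.MATNR = s.MATNR")
    .whenMatchedDelete(condition="s._change_type = 'D'")
    .whenMatchedUpdateAll(condition="s._change_type <> 'D'")
    .whenNotMatchedInsertAll(condition="s._change_type <> 'D'")
    .execute()
)
```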
Approach
- Ingesting RAW SAP S/4 tables (see the ingestion sketch below)
- Metadata extraction (data types, table & column descriptions, associations, …); see the data dictionary sketch below
- Data model building and report/dashboard creation
    - By using pre-built CDS views in S/4 and converting them into Databricks views (see the view sketch below)
    - By using our pre-built SAP data model
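
For the ingestion step, a minimal sketch of landing one raw S/4 table as a Delta table, assuming direct JDBC access to the underlying SAP HANA database is available and permitted; in many landscapes the landing is done instead by SLT, SAP Datasphere, or a third-party replication tool, and the host, credentials, source schema and target names below are purely illustrative.

```python
# Hedged sketch: land one raw SAP S/4 table (MARA) as a Delta table.
# Assumptions: direct JDBC access to the underlying HANA database is
# allowed; host, credentials, source schema (SAPHANADB) and the target
# `bronze` schema are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

mara = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sap://sap-hana-host:30015")   # placeholder host/port
    .option("driver", "com.sap.db.jdbc.Driver")        # SAP HANA JDBC driver (ngdbc)
    .option("dbtable", "SAPHANADB.MARA")               # raw material master table
    .option("user", "TECH_USER")
    .option("password", dbutils.secrets.get("sap", "jdbc-password"))  # Databricks secret (assumed scope/key)
    .option("fetchsize", "10000")
    # for large tables add partitionColumn/lowerBound/upperBound/numPartitions
    .load()
)

# Write the table unchanged into the RAW (bronze) layer of the lakehouse
mara.write.format("delta").mode("overwrite").saveAsTable("bronze.mara")
```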
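
For the metadata step, a sketch of pulling table and column descriptions from the SAP data dictionary tables (DD02T for table texts, DD03L for the field list, DD04T for data element texts) and attaching them as comments to the landed Delta table; it assumes those dictionary tables have been ingested into the bronze schema like any other raw table, and the schema and table names are assumptions.

```python
# Hedged sketch: copy table/column descriptions from the SAP data
# dictionary onto the landed Delta table as comments.
# Assumptions: DD02T (table texts), DD03L (field list) and DD04T
# (data element texts) have been landed into `bronze` like any other
# raw table; schema and table names are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

table = "MARA"

def quote(text: str) -> str:
    """Escape single quotes for use inside a SQL string literal."""
    return text.replace("'", "''")

# Table description (English texts only, language key 'E')
tab_txt = (
    spark.table("bronze.dd02t")
    .filter((F.col("TABNAME") == table) & (F.col("DDLANGUAGE") == "E"))
    .select("DDTEXT")
    .first()
)
if tab_txt and tab_txt.DDTEXT:
    spark.sql(f"COMMENT ON TABLE bronze.{table.lower()} IS '{quote(tab_txt.DDTEXT)}'")

# Column descriptions: field list joined to data element texts
field_texts = (
    spark.table("bronze.dd03l").alias("f")
    .filter((F.col("f.TABNAME") == table) & ~F.col("f.FIELDNAME").startswith("."))
    .join(
        spark.table("bronze.dd04t").alias("t"),
        (F.col("f.ROLLNAME") == F.col("t.ROLLNAME")) & (F.col("t.DDLANGUAGE") == "E"),
        "left",
    )
    .select("f.FIELDNAME", "t.DDTEXT")
    .collect()
)
for row in field_texts:
    if row.DDTEXT:
        spark.sql(
            f"ALTER TABLE bronze.{table.lower()} "
            f"ALTER COLUMN {row.FIELDNAME} COMMENT '{quote(row.DDTEXT)}'"
        )
```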
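
For the CDS view route, a sketch of re-creating the logic of a simple standard CDS view (roughly the I_Material pattern: material master joined to its description) as a Databricks view on top of the raw tables; the standard views are far richer than this, and the silver schema name and field selection are illustrative assumptions.

```python
# Hedged sketch: reproduce a simple CDS-view style join as a Databricks view.
# Assumptions: bronze.mara and bronze.makt have been landed as above; the
# `silver` schema and the field selection are illustrative, not a full
# reproduction of the standard I_Material CDS view.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE OR REPLACE VIEW silver.v_material AS
    SELECT
        m.MATNR AS Material,
        m.MTART AS MaterialType,
        m.MATKL AS MaterialGroup,
        m.MEINS AS MaterialBaseUnit,
        t.MAKTX AS MaterialName
    FROM bronze.mara AS m
    LEFT JOIN bronze.makt AS t
        ON  t.MATNR = m.MATNR
        AND t.SPRAS = 'E'   -- keep only the English description
""")
```

Analysts can then query silver.v_material from SQL, Power BI or Tableau like any other table.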