Data Ingestion Tools Comparison
There are two main ways to get data out of an SAP ERP system reliably: through the SAP application layer or directly from the underlying database layer. The tools compared below use one or both of these approaches, and they are evaluated across five areas:
1. Environments, setup
| Name | Environment | Run options | Installation needed | Independent of SAP SLT |
|---|---|---|---|---|
|  | Azure only | SaaS + agent | Integration runtime needs to be installed on an on-premises computer or on a virtual machine (VM) | ❌ (SLT is recommended and needed for SAP tables extraction) |
| SNP Glue |  | on-prem | Installed as an ABAP add-on on the ERP server (no additional hardware required) | ✅ (but can work with SLT if already in place) |
| Qlik Replicate |  | on-prem / SaaS + agent | Full app installation (on-prem version) | ✅ (see release log) |
|  | Azure, AWS, GCP | on-prem / SaaS | Full app installation (on-prem version, Kubernetes) |  |
|  | AWS only | SaaS | ❌ (fully SaaS) | ❌ |
|  | GCP only | SaaS | ❌ (fully SaaS) | ❌ |
| Asapio | Azure, AWS | on-prem | ✅ | ✅ |
| SAP Datasphere | Azure, AWS, GCP | SaaS | ❌ | ✅ |
2. Costs
| Name | Pricing / 100 sources | Free of additional costs | Independent of SAP SLT |
|---|---|---|---|
|  |  | ❌ (integration runtime needs additional HW) | ❌ (SLT is recommended and needed for SAP tables extraction) |
| SNP Glue | # of ingestion pipelines / ingested tables (tiered) | ✅ (installed as an ABAP add-on onto an existing SAP NetWeaver machine) | ✅ (but can work with SLT if already in place) |
| Qlik Replicate |  | ❌ (installation needs additional HW) | ✅ (see release log) |
|  |  | ❌ (installation needs additional HW) |  |
|  |  | ✅ (it's SaaS) | ❌ |
|  |  | ✅ (it's SaaS) | ❌ |
| SAP Datasphere | ~$60k¹ + $5k / 20 GB of outbound data transfer | ✅ | ✅ |
| Asapio | # of ingestion pipelines / ingested tables (tiered) | ✅ | ✅ |
3. Historical SAP data extraction
| Name | Historical data load (primary system) | Filtered historical load |
|---|---|---|
|  | Via the SAP application layer | ✅ (source) |
| SNP Glue | Utilizes standard SAP select-options |  |
| Qlik Replicate | ✅ | ✅ |
|  | ODP via OData only (slower) |  |
| SAP Datasphere | ✅ | ✅ |
| Asapio | ✅ | ✅ |
4. Continuous SAP data extraction
| Name | Real-time support | Extractors support | BW data extraction support | New records (deltas) processing for CDS Views / HANA calculation views | Application layer extraction | Database layer extraction |
|---|---|---|---|---|---|---|
|  | ❌ (batches only, every 5 minutes) |  | ✅ (via ODP) |  |  | ✅ (via SLT and ODP) |
| SNP Glue | ✅ (near-real-time) |  |  | ✅ (via ODP) | ✅ | ✅ (via triggers) |
| Qlik Replicate | ✅ (log-based + trigger-based) |  | ✅ (via ODP) | ✅ (for the ODP connector only?) |  | ✅ (via CDC, or via triggers for HANA) |
|  |  |  |  |  |  |  |
|  | ❌ (micro-batches only) |  | ✅ (via ODP) |  | ✅ (ODP via OData) | ❌ (ODP only) |
|  | ✅ (via SLT) |  | ✅ (via ODP) |  |  | ✅ (via SLT) |
| SAP Datasphere |  |  |  |  |  |  |
| Asapio |  |  |  |  |  | ✅ (via triggers) |
5. Writing data to target
How does the incoming data processing work?
The data ingestion tool generates smaller (real-time/micro-batching) or larger (hourly/daily processing) batches of data, writes them out as CSV, Parquet, Delta, or another format, and lands the files in cloud storage (AWS S3, Azure Blob Storage, Google Cloud Storage).
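To make the landing step concrete, here is a minimal sketch of one extracted batch being written as a Parquet file into the landing zone. The bucket, path, and column names are made up for the example, and pandas with an fsspec-compatible filesystem (e.g. s3fs) is assumed.

```python
import pandas as pd

# One small batch of records, as an ingestion tool might emit it
# (columns are illustrative, not a real SAP table layout).
batch = pd.DataFrame(
    {
        "company_code": ["1000", "1000", "2000"],
        "fiscal_year": [2024, 2024, 2024],
        "amount": [1250.00, -340.50, 980.75],
    }
)

# Write the batch as a Parquet file into the bronze/landing zone of the
# data lake (an s3:// URI here; abfss:// or gs:// work the same way).
batch.to_parquet(
    "s3://my-data-lake/bronze/pnl_planning/batch_20240601T1005.parquet",
    index=False,
)
```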
As soon as a file (the P&L planning data in the diagram) lands in the bronze/landing zone of the data lake, Databricks can pick it up and append it to the overall P&L planning table stored in the silver layer.
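Below is a minimal PySpark sketch of that bronze-to-silver step, assuming Databricks Auto Loader is used to discover new files; the paths and the silver table name are placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

landing_path = "s3://my-data-lake/bronze/pnl_planning/"            # placeholder
checkpoint_path = "s3://my-data-lake/_checkpoints/pnl_planning/"   # placeholder
silver_table = "silver.pnl_planning"                               # placeholder

# Auto Loader incrementally discovers the Parquet files dropped by the
# ingestion tool in the landing zone.
incoming = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", checkpoint_path)
    .load(landing_path)
)

# Append each newly picked-up file to the overall P&L planning table
# in the silver layer, then stop once the current backlog is processed.
(
    incoming.writeStream
    .option("checkpointLocation", checkpoint_path)
    .outputMode("append")
    .trigger(availableNow=True)
    .toTable(silver_table)
)
```

The checkpoint keeps track of which files have already been processed, so reruns of the job only append files that landed since the previous run.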