SAP to Databricks ingestion tools comparison

Data Ingestion Tools Comparison

Feel free to contact us so we can help you select the best tool for your use cases.

There are two main ways to get data out of an SAP ERP system reliably: application-level and database-level SAP data replication (both are covered on the pages linked at the end of this comparison). The tools below are compared across both approaches in the following sections.

1. Environments, setup

| Name | Environment | Run options | Installation needed | Independent of SAP SLT |
|---|---|---|---|---|
| Fivetran HVA | Azure, AWS, GCP | SaaS + agent | High Volume Agent | ☑️ (but for low-level tables only) |
| — | Azure, AWS, GCP | on-prem | full app installation | — |
| — | Azure only | SaaS + agent | integration runtime needs to be installed on an on-premises computer or on a virtual machine (VM) | ☑️ (SLT is recommended and needed for SAP tables extraction) |
| SNP Glue | Azure, AWS, GCP | on-prem | installed as ABAP add-on on the ERP server (no additional hardware required) | ✅ (but can work with SLT if already in place) |
| Qlik Replicate | Azure, AWS, GCP | on-prem, SaaS + agent | full app installation (on-prem version) | ✅ (see release log) |
| — | Azure, AWS, GCP | on-prem, SaaS | full app installation (on-prem version, Kubernetes) | — |
| — | AWS only | SaaS | ❌ (fully SaaS) | ☑️ |
| — | GCP only | SaaS | ❌ (fully SaaS) | ☑️ |
| Asapio | Azure, AWS | on-prem | — | — |
| SAP Datasphere | Azure, AWS, GCP | SaaS | — | — |

2. Costs

| Name | Pricing / 100 sources | Free of additional costs | Independent of SAP SLT |
|---|---|---|---|
| Fivetran HVA | consumption based (# of rows processed) | ❌ (agent needs additional HW) | ☑️ (but for low-level tables only) |
| — | consumption based (# of rows processed) or subscription based | ❌ (installation needs additional HW) | — |
| — | # of runs, per hour of run | ❌ (integration runtime needs additional HW) | ☑️ (SLT is recommended and needed for SAP tables extraction) |
| SNP Glue | # of ingestion pipelines/ingested tables (tiered) | ✅ (installed as ABAP add-on onto an existing SAP NetWeaver machine) | ✅ (but can work with SLT if already in place) |
| Qlik Replicate | — | ❌ (installation needs additional HW) | ✅ (see release log) |
| — | # of capacity units / month | ❌ (installation needs additional HW) | — |
| — | # of successful runs + amount of data processed | ✅ (it's SaaS) | ☑️ |
| — | paying for pipeline development and execution | ✅ (it's SaaS) | ☑️ |
| SAP Datasphere | ~$60k + $5k per 20 GB of outbound data transfer | — | — |
| Asapio | # of ingestion pipelines/ingested tables (tiered) | — | — |

3. Historical SAP data extraction

| Name | Historical data load (primary system) | Filtered historical load |
|---|---|---|
| Fivetran HVA | via the SAP Application Layer, High Volume Agent | — |
| — | via the SAP Application Layer and LDP Agent | — |
| — | via the SAP Application Layer | ❌ (source) |
| SNP Glue | — | utilize standard SAP select options |
| Qlik Replicate | ODP via OData only (slower) | — |
| SAP Datasphere | — | — |
| Asapio | — | — |

4. Continuous SAP data extraction

| Name | Real-time support | Extractors support | BW data extraction support | New records (deltas) processing for CDS Views/HANA Calculation views | Application layer extraction | Database layer extraction |
|---|---|---|---|---|---|---|
| Fivetran HVA | ✅ (via CDC) | ❌ (low-level tables only) | ❌ (low-level tables only) | ✅ (via CDC) | ✅ (via CDC) | ❌ (low-level tables only) |
| — | ✅ (via CDC) | — | — | — | — | — |
| — | ❌ (batches only, every 5 minutes) | ✅ (via ODP) | ✅ (via SLT and ODP) | — | — | — |
| SNP Glue | ✅ (near-real time) | ✅ (via ODP) | ✅ (via triggers) | — | — | — |
| Qlik Replicate | ✅ (log-based + trigger based) | ✅ (via ODP) | ✅ (for the ODP connector only?) | ✅ (via CDC, or via triggers for HANA) | — | — |
| — | ❌ (micro-batches only) | ✅ (via ODP) | ✅ (ODP via OData) | — | — | — |
| — | ❌ (ODP only) | ✅ (via SLT) | — | — | — | — |
| — | ✅ (via ODP) | ✅ (via SLT) | — | — | — | — |
| SAP Datasphere | — | — | — | — | — | — |
| Asapio | ✅ (via triggers) | — | — | — | — | — |

5. Writing data to target

| Name | Schema conversion | Delta Lake support |
|---|---|---|
| Fivetran HVA | ✅ (source) | ✅ (source) |
| — | ✅ (source) | ✅ (source) |
| SNP Glue | — | — |
| Qlik Replicate | ✅ (source) | ✅ (source) |
| — | — | ❌ (custom connector needed) |
| — | — | ❌ (parquet or CSV only) |
| SAP Datasphere | — | — |
| Asapio | — | ❌ (CSV only) |

How does the incoming data processing work?

The data ingestion tool produces smaller (real-time/micro-batching) or larger (hourly/daily processing) batches of data, writes them in CSV, Parquet, Delta, or another format, and lands the files in cloud storage (AWS S3, Azure Blob Storage, GCP Cloud Storage).

As soon as a file (the P&L planning data in the diagram below) lands in the bronze/landing zone of the data lake, Databricks can pick it up and append it to the overall P&L planning table stored in the silver layer.

[Diagram: a P&L planning file lands in the bronze/landing zone and is appended to the P&L planning table in the silver layer]
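For illustration, here is a minimal sketch of this pick-up step on the Databricks side. It assumes the ingestion tool drops Parquet files and that Databricks Auto Loader is used to discover them; the storage path and the target table name (`silver.pl_planning`) are placeholders, not part of any specific tool's setup.

```python
# Minimal sketch (PySpark on Databricks): pick up new files from the bronze/landing zone
# and append them to the silver-layer table. Paths and table names are illustrative.
from pyspark.sql import functions as F

landing_path = "abfss://landing@yourstorageaccount.dfs.core.windows.net/sap/pl_planning/"

# Auto Loader ("cloudFiles") incrementally discovers new files dropped by the ingestion tool.
new_files = (
    spark.readStream.format("cloudFiles")          # `spark` is the session provided by the Databricks notebook
    .option("cloudFiles.format", "parquet")        # or "csv"/"json", depending on the tool's output format
    .option("cloudFiles.schemaLocation", landing_path + "_schema/")
    .load(landing_path)
    .withColumn("ingested_at", F.current_timestamp())  # simple lineage/audit column
)

# Append each newly arrived batch to the overall P&L planning table in the silver layer.
(
    new_files.writeStream
    .option("checkpointLocation", landing_path + "_checkpoint/")
    .trigger(availableNow=True)                    # process all new files, then stop (batch-style incremental run)
    .toTable("silver.pl_planning")
)
```

The same pattern covers both small real-time micro-batches and larger hourly/daily drops; only the trigger setting changes how often the new files are picked up.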
Feel free to contact us so we can help you select the best tool for your use cases.

Return to Data Ingestion

Application-level SAP data replication
Database-level SAP data replication