    SAP to Databricks ingestion tools comparison
    Feel free to contact us so we can help you select the best tool for your use-cases.

    There are two main ways to get data out of an SAP ERP system reliably:

    • Database-level data replication
    • Application-level data replication

1. Environments, setup

| Name | Environment | Run options | Installation needed | Independent of SAP SLT |
| --- | --- | --- | --- | --- |
| Azure DataFactory | Azure only | SaaS + agent | integration runtime needs to be installed on an on-premises computer or on a virtual machine (VM) | ☑️ (SLT is recommended and needed for SAP tables extraction) |
| SNP Glue | Azure, AWS, GCP | on-prem | installed as ABAP add-on on ERP server (no additional hardware required) | ✅ (but can work with SLT if already in place) |
| Qlik Replicate | Azure, AWS, GCP | on-prem, SaaS + agent | full app installation (on-prem version) | ✅ (see release log) |
| SAP Data Intelligence | Azure, AWS, GCP | on-prem, SaaS | full app installation (on-prem version, Kubernetes) | |
| AWS AppFlow | AWS only | SaaS | ❌ (fully SaaS) | ☑️ |
| Cloud Data Fusion | GCP only | SaaS | ❌ (fully SaaS) | ☑️ |
| Asapio | Azure, AWS | on-prem | ✅ | ✅ |
| SAP Datasphere | Azure, AWS, GCP | SaaS | ✅ | ✅ |

2. Costs

| Name | Pricing / 100 sources | Free of additional costs | Independent of SAP SLT |
| --- | --- | --- | --- |
| Azure DataFactory | # of runs, per hour of run | ❌ (integration runtime needs additional HW) | ☑️ (SLT is recommended and needed for SAP tables extraction) |
| SNP Glue | # of ingestion pipelines / ingested tables (tiered) | ✅ (installed as ABAP add-on onto an existing SAP NetWeaver machine) | ✅ (but can work with SLT if already in place) |
| Qlik Replicate | | ❌ (installation needs additional HW) | ✅ (see release log) |
| SAP Data Intelligence | # of capacity units / month | ❌ (installation needs additional HW) | |
| AWS AppFlow | # of successful runs + amount of data processed | ✅ (it’s SaaS) | ☑️ |
| Cloud Data Fusion | paying for pipeline development and execution | ✅ (it’s SaaS) | ☑️ |
| SAP Datasphere | ~$60k¹ + $5k / 20 GB outbound data transfer | ✅ | ✅ |
| Asapio | # of ingestion pipelines / ingested tables (tiered) | ✅ | ✅ |

3. Historical SAP data extraction

| Name | Historical data load (primary system) | Filtered historical load |
| --- | --- | --- |
| Azure DataFactory | via the SAP Application Layer | ❌ (source) |
| SNP Glue | utilizes standard SAP select options | |
| Qlik Replicate | ✅ | ✅ |
| SAP Data Intelligence | | |
| AWS AppFlow | ODP via OData only (slower) | |
| Cloud Data Fusion | | |
| SAP Datasphere | ✅ | ✅ |
| Asapio | ✅ | ✅ |

4. Continuous SAP data extraction

| Name | Real-time support | SAP Extractors / CDS Views support | BW data extraction support | New records (deltas) processing for CDS Views / HANA Calculation views | Application layer extraction | Database layer extraction |
| --- | --- | --- | --- | --- | --- | --- |
| Azure DataFactory | ❌ (batches only, every 5 minutes) | ✅ | ✅ (via ODP) | ✅ | ✅ | ✅ (via SLT and ODP) |
| SNP Glue | ✅ (near-real-time) | ✅ | ✅ | ✅ (via ODP) | ✅ | ✅ (via triggers) |
| Qlik Replicate | ✅ (log-based + trigger-based) | ✅ | ✅ (via ODP) | ✅ (for the ODP connector only?) | ✅ | ✅ (via CDC, or via triggers for HANA) |
| SAP Data Intelligence | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| AWS AppFlow | ❌ (micro-batches only) | ✅ | ✅ (via ODP) | ✅ (ODP via OData) | | ❌ (ODP only) |
| Cloud Data Fusion | ✅ (via SLT) | ✅ | ✅ (via ODP) | ✅ | | ✅ (via SLT) |
| SAP Datasphere | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Asapio | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ (via triggers) |

5. Writing data to target

| Name | Schema conversion | DeltaLake support |
| --- | --- | --- |
| Azure DataFactory | ✅ | ✅ (source) |
| SNP Glue | ✅ | ✅ |
| Qlik Replicate | ✅ (source) | ✅ (source) |
| SAP Data Intelligence | ❌ | ❌ (custom connector needed) |
| AWS AppFlow | | ❌ (parquet or CSV only) |
| Cloud Data Fusion | | |
| SAP Datasphere | ✅ | ✅ |
| Asapio | ✅ | |

    How does the incoming data processing work?

    The data ingestion tool produces smaller (real-time/micro-batching) or larger (hourly/daily) batches of data, writes them out as CSV, Parquet, Delta, or another format, and lands the files in cloud storage (AWS S3, Azure Blob Storage, or GCP Cloud Storage).

    As soon as a file (the P&L planning data in the diagram) lands in the bronze/landing zone of the data lake, Databricks can pick it up and append it to the overall P&L planning table stored in the silver layer.

    [Diagram: P&L planning file landing in the bronze/landing zone and being appended to the silver-layer table]


    Related: Application-level SAP data replication · Database-level SAP data replication