Data Ingestion Project Schedule
This page describes the expected schedule of a full SAP → Databricks data ingestion project. You may also prefer our alternative use-case-driven approach.
1. PoC Phase
- Initial scope and data set definition, plus other requirements
- Verify on-premises → cloud connectivity, bandwidth, and Azure resource availability
- Check available versions of the SAP tools and estimate the expected data volume
- Outcome: working SAP → data lake connection, with a first sample table transferred (see the sketch after this list)
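A minimal sketch of that first sample transfer, assuming JDBC access to the SAP system's underlying database (e.g. SAP HANA). The host, schema, table, and credential names below are hypothetical placeholders; the actual mechanism depends on the SAP tool chosen in this phase:

```python
# Hypothetical PoC transfer: read one sample SAP table over JDBC and land
# it as a Delta table in the bronze layer. Host, schema, table name, and
# credentials are placeholders, not values from this page.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

sample_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sap://sap-host.example.com:30015")  # hypothetical host
    .option("driver", "com.sap.db.jdbc.Driver")              # SAP HANA JDBC driver
    .option("dbtable", "SAPABAP1.KNA1")                      # hypothetical sample table
    .option("user", "SVC_DATALAKE")                          # service account placeholder
    .option("password", "***")                               # use a secret store in practice
    .load()
)

# Writing to Delta proves the full SAP -> data lake path end to end.
sample_df.write.format("delta").mode("overwrite").saveAsTable("bronze.kna1_sample")
```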
2. Initial Load Phase
- Set up and fine-tune the remaining pieces: firewalls, permissions, connectivity, bandwidth
- Full performance testing (see the partitioned-read sketch after this list)
- Outcome: the required data set ingested from SAP into the data lake bronze layer
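One way to exercise the available bandwidth during performance testing is a range-partitioned JDBC read, so Spark pulls a large table in parallel. The connection details and the numeric key column below are hypothetical; tune `numPartitions` against the measured on-premises → cloud throughput:

```python
# Sketch of a partitioned full load for throughput testing. Uses the same
# hypothetical JDBC connection as the PoC sketch above; table and column
# names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

full_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sap://sap-host.example.com:30015")  # hypothetical host
    .option("driver", "com.sap.db.jdbc.Driver")
    .option("dbtable", "SAPABAP1.VBAK")        # hypothetical large table
    .option("user", "SVC_DATALAKE")
    .option("password", "***")                 # use a secret store in practice
    # Parallelize the read: Spark issues one range query per partition.
    .option("partitionColumn", "DOC_NUM")      # hypothetical numeric key column
    .option("lowerBound", "0")
    .option("upperBound", "100000000")
    .option("numPartitions", "16")             # start small, scale while measuring
    .load()
)

full_df.write.format("delta").mode("overwrite").saveAsTable("bronze.vbak")
```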
3. Go-To-Production Phase
- Incremental load setup (see the merge sketch after this list)
- Access control, governance, and compliance setup
- Monitoring, alerting, and operations setup (see the freshness-check sketch after this list)
- Definition of standard procedures ("How to add a new data set", "How to change an existing one", …)
- Outcome: new data is ingested into the data lake reliably, and everything is properly governed and compliant
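For the incremental load, a common pattern on Databricks is to upsert change extracts into the bronze Delta table with a merge. A sketch, assuming a staging table `bronze.vbak_staging` populated by the chosen SAP change-capture tool and a business key column `VBELN` (both hypothetical):

```python
# Sketch of an incremental load: upsert new/changed rows from a staging
# extract into the bronze Delta table. Table and key names are placeholders.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

changes_df = spark.table("bronze.vbak_staging")       # change-capture extract
target = DeltaTable.forName(spark, "bronze.vbak")     # bronze target table

(
    target.alias("t")
    .merge(changes_df.alias("s"), "t.VBELN = s.VBELN")
    .whenMatchedUpdateAll()      # apply changed rows
    .whenNotMatchedInsertAll()   # add new rows
    .execute()
)
```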
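For monitoring and alerting, a simple starting point is a freshness check per ingested table. The `_ingested_at` audit column is a naming convention assumed here, not something mandated by this page:

```python
# Hypothetical freshness check: fail (and thereby alert, e.g. via a job
# failure notification) when the bronze table has not been loaded within
# the allowed lag. Assumes an _ingested_at audit column on the table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

row = spark.sql("""
    SELECT max(_ingested_at) AS last_load,
           max(_ingested_at) < current_timestamp() - INTERVAL 24 HOURS AS is_stale
    FROM bronze.vbak
""").first()

if row["last_load"] is None or row["is_stale"]:
    raise RuntimeError(f"bronze.vbak is stale: last load at {row['last_load']}")
```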
4. Business-as-Usual Phase
- Running the standard procedures
- Changing and improving existing standard procedures, and creating new ones
- Outcome: ongoing change management and iterative improvement