Integration with Databricks
Data from the Databricks Lakehouse can be harmonized with SAP and non-SAP data through SAP Datasphere's unified data models, enabling richer analytics and other downstream use cases.
Architecture
1. Integration with Databricks Delta Lake
Mode(s) of Integration: Federating data live into SAP Datasphere.
Delta Lake is an optimized storage layer that provides the foundation for tables in a lakehouse architecture on Databricks. It brings reliability to data lakes by providing ACID (Atomicity, Consistency, Isolation, Durability) transactions, scalable metadata handling, and unified streaming and batch data processing.
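The ACID guarantees mentioned above are the same transactional semantics familiar from relational databases; Delta Lake's contribution is bringing them to files in a data lake. As a minimal, Delta-free illustration of what atomicity buys you, the sketch below uses only Python's standard-library sqlite3 as a stand-in transactional store (the table and values are hypothetical, not part of any Databricks API):

```python
import sqlite3

# Stand-in transactional store; Delta Lake provides the same atomicity
# guarantee over files in object storage rather than a database file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 100.0)")
conn.commit()

# A multi-statement write that fails midway is rolled back as a unit:
try:
    with conn:  # the context manager commits on success, rolls back on error
        conn.execute("INSERT INTO orders VALUES (2, 50.0)")
        conn.execute("INSERT INTO orders VALUES (1, 75.0)")  # violates PK -> error
except sqlite3.IntegrityError:
    pass

# Neither row from the failed transaction is visible: readers see all or nothing.
count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(count)  # 1
```

Without this all-or-nothing behavior, a failed batch job against plain files in a data lake can leave half-written data behind, which is exactly the problem Delta Lake's transaction log addresses.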
Data in Databricks Delta Lake tables can be federated live into SAP Datasphere virtual remote models using SAP Datasphere's data federation architecture. This integration allows Databricks data to be augmented with SAP business data in real time, without replicating it. The federated data can then be incorporated into unified semantic models, enabling efficient, real-time analytics through SAP Analytics Cloud dashboards.
The integration process involves:
- Connection Setup: Establishing a secure connection between SAP Datasphere and Databricks Delta Lake using supported connectors and authentication mechanisms.
- Data Federation: Configuring virtual tables in SAP Datasphere that reference the live data in Databricks Delta Lake without physically moving the data.
- Model Augmentation: Enhancing the federated data with SAP business data to create comprehensive and unified semantic models.
- Real-time Analytics: Utilizing SAP Analytics Cloud to build dashboards and reports that leverage the real-time, federated data for actionable insights.
This approach keeps data consistent and up to date at the source, providing a robust foundation for advanced analytics and decision-making.