
MAINGAU Energie needed to build a data platform that supports modern data workflows, integrates diverse existing data stores, and meets the security and compliance requirements of critical infrastructure. Security, scalability, and clear separation of environments were as central as seamless integration into existing data workflows.
The platform was to support flexible data transformation using modern cloud, DevOps, and DataOps practices, without unnecessary operational complexity or vendor lock-in.
The focus was therefore not just on building a classic data lake but on developing a future-proof data platform in which infrastructure and data pipelines are treated as maintainable, versioned software.
Based on these requirements, a fully automated data lake architecture was implemented on AWS. A landing zone set up with AWS Control Tower and a newly established multi-account structure ensure clear separation of environments and robust security through well-defined guardrails. Access management and security checks were also integrated with existing systems (e.g. Microsoft Defender for Cloud). All resources are provisioned as Infrastructure as Code with AWS CDK, enabling consistent and reproducible deployments across all environments.
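The idea behind reproducible deployments across separate accounts can be illustrated with a small sketch. It deliberately does not use the AWS CDK API; it only shows the underlying pattern of deriving every environment from one definition. The environment names, account IDs, and bucket names are hypothetical placeholders, not details of the actual platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    name: str        # hypothetical environment name, e.g. "test" or "prod"
    account_id: str  # placeholder account ID; one account per environment


def data_lake_resources(env: Environment) -> dict:
    """Derive the resource definitions for one environment from a single template."""
    def bucket(layer: str) -> dict:
        return {
            "name": f"datalake-{layer}-{env.name}-{env.account_id}",
            "versioned": True,
            "block_public_access": True,
        }

    # Every environment gets the same set of resources with the same settings;
    # only the environment-specific parts of the names differ.
    return {"raw_bucket": bucket("raw"), "curated_bucket": bucket("curated")}


# Assumed environments for illustration only.
ENVIRONMENTS = [
    Environment("test", "111111111111"),
    Environment("prod", "222222222222"),
]

plans = {env.name: data_lake_resources(env) for env in ENVIRONMENTS}

for env_name, plan in plans.items():
    print(env_name, [resource["name"] for resource in plan.values()])
```

Because the definition is a pure function of the environment, running it again yields the same plan, and test and production cannot drift apart in their settings. An Infrastructure as Code tool applies the same principle to real cloud resources.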
Building on this, multiple Databricks workspaces were integrated in a bring-your-own-VPC model. This setup ensures clean network isolation and meets high requirements for security and governance.
For data processing, a modern ELT approach was chosen: data is first loaded centrally in raw form and only then transformed on the platform. Data ingestion is handled by a vendor-neutral solution, which preserves flexibility and keeps the platform future-proof.
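The load-then-transform order of ELT can be shown in a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the platform's storage and compute; the table names, column names, and sample meter readings are invented for illustration and do not come from the project.

```python
import sqlite3


def run_elt(rows):
    """Load raw rows unchanged, then transform them inside the database."""
    conn = sqlite3.connect(":memory:")  # stand-in for the central data store

    # Load: raw values land as-is, including untrimmed IDs and empty readings.
    conn.execute("CREATE TABLE raw_readings (meter_id TEXT, reading_kwh TEXT)")
    conn.executemany("INSERT INTO raw_readings VALUES (?, ?)", rows)

    # Transform: cleaning and aggregation happen after loading, in SQL.
    conn.execute(
        """
        CREATE TABLE consumption AS
        SELECT TRIM(meter_id) AS meter_id,
               SUM(CAST(reading_kwh AS REAL)) AS total_kwh
        FROM raw_readings
        WHERE reading_kwh IS NOT NULL AND TRIM(reading_kwh) <> ''
        GROUP BY TRIM(meter_id)
        """
    )
    result = conn.execute(
        "SELECT meter_id, total_kwh FROM consumption ORDER BY meter_id"
    ).fetchall()
    conn.close()
    return result


# Hypothetical raw input with typical quality issues.
raw = [(" M-001 ", "12.5"), ("M-001", "7.5"), ("M-002", "3"), ("M-002", "")]
print(run_elt(raw))
```

Keeping the raw table untouched means a transformation can be corrected and rerun later without ingesting the source data again, which is one of the main reasons to prefer ELT over transforming before loading.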
External data is integrated via Delta Sharing and can be used securely and at scale without redundant data storage or additional operational overhead.
Data transformations are implemented using two established approaches:
Both variants are maintained in separate repositories and deployed to the test and production environments via automated CI/CD pipelines.
The result is a stable, scalable, and highly automated data platform that simplifies operations while creating room for data-driven development.
This has created a solid foundation on which data-based decisions can be made more quickly and new use cases can be implemented sustainably.

Solution Engineer
MAINGAU Energie