Let me tell you a story...
A few months ago, I walked into a room filled with frustrated faces. The Data team couldn’t trust the reports they were getting. The Network Analytics team was using data from tools that didn’t talk to each other. Operations? They had their own spreadsheets—most of which were outdated.
“Vish, can you fix this mess?” they asked.
I smiled. This wasn’t just a mess; it was an opportunity to create something transformative. I knew exactly what we needed: a centralised data hub. And to get it right, I turned to my trusted playbook—TOGAF—and a powerful tool called Databricks.
Here’s how it all unfolded.
Step 1: Setting the Vision
I started by asking a simple but crucial question: Why do we need this data hub?
The answer was clear:
- To unify data scattered across networks and teams.
- To enable reliable, real-time insights.
- To make better decisions faster.
I gathered stakeholders together and painted a picture of success:
"Imagine one place where all your data lives, ready to answer your questions at the speed of thought."
Eyes lit up around the room. The mission was clear.
Step 2: Understanding the Business
TOGAF taught me to start with the business. I took the time to understand each department’s processes:
- Data team needed better forecasting tools to predict network usage.
- Network Analytics wanted personalised segmentation for grid health and power quality monitoring.
- Operations craved real-time insights into grid loads, outage management, and inventory tracking.
I mapped the data flows, pinpointed the bottlenecks, and identified where we needed improvements. This formed the foundation of our design.
Step 3: Designing the Architecture with DataMesh and Databricks
Here’s where DataMesh and Databricks took centre stage.
Based on my previous experience at a health service in Australia, I introduced the team to DataMesh, an architectural concept that enables decentralised data ownership while maintaining central governance. I also introduced the Lakehouse architecture to manage the mix of raw and structured data on one seamless platform.
- DataMesh allowed us to treat data as a product, with each data domain owned by the team closest to it.
- Lakehouse architecture provided the best of both worlds: a data lake for raw data and a data warehouse for structured datasets that could be queried easily.
The architecture came to life in layers:
- Data Ingestion: Streamed data from multiple sources like iTron (SIQ/UIQ), SAP HANA, ODW, and GIS (SDW).
- Middleware: We used Zepben EnergyWorkBench (EWB) for transforming and syncing data across platforms.
- Connectors: Databricks Jobs and Confluent Kafka enabled seamless data movement across systems and ensured we could handle both batch and real-time data processing.
- Processing: Using Delta Lake for data transformation, we cleaned raw, unstructured data into structured, query-ready tables (a minimal sketch of this flow appears below).
- Analytics: LVA Dashboards, Strategic Network Apps, and Databricks SQL powered the queries, while Posit Shiny visualised the insights.
I mapped all these layers using TOGAF’s Architecture Development Method (ADM) to ensure the design was scalable, flexible, and aligned with Jemena's long-term goals.
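To make the ingestion and processing layers a little more concrete, here is a minimal PySpark sketch of the pattern: land raw events from a Kafka topic in a bronze Delta table, then clean them into a structured silver table. The broker, topic name, schema fields, and paths are placeholders for illustration, not the actual configuration.

```python
# Minimal sketch of the ingestion/processing pattern (illustrative names only).
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("hub-ingest-sketch").getOrCreate()

# Hypothetical schema for a power-quality reading arriving on Kafka.
reading_schema = (
    StructType()
    .add("meter_id", StringType())
    .add("voltage", DoubleType())
    .add("measured_at", TimestampType())
)

# Bronze: land the raw Kafka payload as-is in a Delta table.
bronze = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "power-quality-readings")       # placeholder topic
    .load()
)
(
    bronze.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/bronze")
    .start("/tmp/delta/bronze_power_quality")
)

# Silver: parse the JSON payload into typed, query-ready columns.
silver = (
    spark.readStream.format("delta")
    .load("/tmp/delta/bronze_power_quality")
    .select(from_json(col("value").cast("string"), reading_schema).alias("r"))
    .select("r.*")
    .where(col("meter_id").isNotNull())
)
(
    silver.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/silver")
    .start("/tmp/delta/silver_power_quality")
)
```

The same pattern scales from this toy example to many topics and tables, because each layer is just another Delta table with its own checkpoint.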
Step 4: Building the Technology Stack
Our stack was cutting-edge:
- Compute: Databricks clusters auto-scaled to handle massive workloads with ease.
- Storage: AWS S3 served as our central, resilient data repository.
- Security: We used AWS Identity and Access Management (IAM) and Databricks Unity Catalog for access control, ensuring the right stakeholders had the right data at the right time.
Gone were the days of mystery spreadsheets—Unity Catalog tracked every piece of data, ensuring compliance with Australian data protection laws and giving full transparency into data lineage.
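As a flavour of how that access control looks in practice, the snippet below issues Unity Catalog style grants from a notebook. The catalog, schema, table, and group names are placeholders, not our real objects.

```python
# Illustrative Unity Catalog grants (placeholder names throughout).
# Assumes a Databricks workspace with Unity Catalog enabled.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

statements = [
    # Let the analytics group discover and read the curated power-quality table.
    "GRANT USE CATALOG ON CATALOG data_hub TO `network-analytics`",
    "GRANT USE SCHEMA ON SCHEMA data_hub.gold TO `network-analytics`",
    "GRANT SELECT ON TABLE data_hub.gold.power_quality_daily TO `network-analytics`",
    # Operations gets read access to the outage summary only.
    "GRANT SELECT ON TABLE data_hub.gold.outage_summary TO `operations`",
]

for stmt in statements:
    spark.sql(stmt)

# Lineage and audit history for these objects are then visible in Unity Catalog,
# which is what replaced the old mystery spreadsheets.
```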
Step 5: Execution
With the architecture in place, I crafted a detailed phased roadmap to bring the vision to life. The Work Packages (WP) were structured to deliver value quickly and iteratively, ensuring continuous progress:
WP1: Discovery Phase
In this foundational phase, we focused on understanding the current data landscape. We assessed the existing systems, identified the gaps, and planned how to integrate disparate data sources. This gave us the basis for a single source of truth and laid the groundwork for the integration process.
WP2: Key Components Delivered
This work package saw the integration of SAP HANA and the creation of data products via Confluent Platform. Key components included:
- SAP HANA Integration: Daily extracts and uploads of Meter-CT-ratio and Usage Point NMI data to an S3 bucket (see the sketch after this list).
- Confluent Platform Data Products: Stream processing capabilities were added to enrich meter-power-quality data with usage-point-info, which was then ingested into R-Server’s SQL Server DB for deeper analysis.
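To picture the SAP HANA side, a daily extract job along these lines can read the relevant tables over JDBC and land them in S3 as Parquet. The JDBC URL, table names, credentials, and bucket path below are assumptions for illustration only.

```python
# Sketch of a daily SAP HANA extract to S3 (all connection details are placeholders).
from datetime import date
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hana-extract-sketch").getOrCreate()

jdbc_url = "jdbc:sap://hana-host:30015"               # placeholder host and port
run_date = date.today().isoformat()

for table in ["METER_CT_RATIO", "USAGE_POINT_NMI"]:   # hypothetical table names
    df = (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("driver", "com.sap.db.jdbc.Driver")
        .option("dbtable", table)
        .option("user", "extract_user")               # use a secret scope in practice
        .option("password", "***")
        .load()
    )
    # Partition each daily extract by run date so downstream loads stay idempotent.
    df.write.mode("overwrite").parquet(
        f"s3://example-data-hub-landing/sap_hana/{table.lower()}/run_date={run_date}"
    )
```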
Key Outcomes:
- Prepared R-Server for high-volume data through the VVC rollout.
- Retired inefficient file-based integration for power quality data from SIQ.
WP3: Zepben and ODW Integration
Key integrations were set up in this phase:
- ODW Integration: Captured and streamed switch state changes to a Kafka topic (a minimal producer sketch follows this list).
- Zepben Network Model: We managed dynamic network models and integrated GIS data for the static model.
- Switch State Micro Service: This service ingested switch state events into the Energy Workbench Network Model.
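For a feel of the switch-state stream itself, here is a minimal producer sketch using the confluent-kafka Python client. The broker address, topic name, and event fields are hypothetical; the real events originate in ODW and are consumed by the Energy Workbench microservice.

```python
# Minimal sketch of publishing a switch state change event (placeholder broker, topic, fields).
import json
from datetime import datetime, timezone

from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "broker:9092"})  # placeholder broker

event = {
    "switch_id": "SW-1234",                     # hypothetical identifiers
    "state": "OPEN",
    "changed_at": datetime.now(timezone.utc).isoformat(),
    "source": "ODW",
}

def on_delivery(err, msg):
    # Surface delivery failures so missed state changes are visible.
    if err is not None:
        print(f"delivery failed: {err}")

producer.produce(
    "switch-state-changes",                     # placeholder topic name
    key=event["switch_id"],
    value=json.dumps(event),
    on_delivery=on_delivery,
)
producer.flush()
```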
Key Outcomes:
- A CIM-compliant, reusable Electricity Network Model was validated and made ready for the Grid Stability Program and DERMS use cases.
WP4: Databricks and Network Model Enhancements
In this phase, the Databricks Platform was operationalised, and significant integrations took place:
- Zepben Network Model was enhanced with circuit information.
- Time-Series database was introduced to handle Power Quality Data.
- Confluent Platform was integrated with Databricks to enrich Usage Point Info (illustrated in the sketch below).
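The Usage Point Info enrichment can be pictured as a stream-static join, sketched below in PySpark. The table paths and join key are illustrative assumptions rather than the actual product definitions.

```python
# Illustrative stream-static enrichment join (placeholder tables, paths, and keys).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("enrichment-sketch").getOrCreate()

# Static reference data: usage point info kept as a Delta table.
usage_points = spark.read.format("delta").load("/tmp/delta/silver_usage_point_info")

# Streaming power-quality readings arriving from the silver layer.
readings = spark.readStream.format("delta").load("/tmp/delta/silver_power_quality")

# Enrich each reading with its usage point attributes (e.g. NMI, feeder).
enriched = readings.join(usage_points, on="meter_id", how="left")

(
    enriched.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/enriched_power_quality")
    .start("/tmp/delta/gold_power_quality_enriched")
)
```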
Key Outcomes:
- Bronze, Silver, and Gold layer data products were defined and implemented.
- SAP/HANA batch data was successfully ingested into Databricks.
- Operationalised the Databricks platform for data engineering, data science, and analytics.
WP5: LVA Dashboard MVP
In this work package, we delivered initial MVP use cases:
- LVA Dashboard MVP: A new dashboard leveraging Databricks visualisation tools (a sample backing query follows this list).
- Additional Data Products: Enabling further use cases in the following work packages.
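To give a flavour of the queries behind the dashboard, the sketch below aggregates voltage excursions per zone substation from a hypothetical gold table. The table name, columns, and voltage thresholds are illustrative only.

```python
# Hypothetical query behind an LVA dashboard tile (placeholder table and columns).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

lva_summary = spark.sql("""
    SELECT
        zone_substation,
        DATE(measured_at)                                  AS reading_date,
        COUNT(*)                                           AS readings,
        SUM(CASE WHEN voltage > 253 THEN 1 ELSE 0 END)     AS over_voltage_events,
        SUM(CASE WHEN voltage < 216 THEN 1 ELSE 0 END)     AS under_voltage_events
    FROM data_hub.gold.power_quality_enriched
    GROUP BY zone_substation, DATE(measured_at)
    ORDER BY reading_date DESC, zone_substation
""")

# In a Databricks notebook, display(lva_summary) renders the chart behind the tile.
lva_summary.show(20, truncate=False)
```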
Key Outcomes:
- MVPs for key use cases like Dynamic Network Model and Power Quality Data.
- All platform components (Zepben EWB, Databricks, Time-Series DB) were built, tested, and deployed.
WP6: Expansion of LVA Dashboard and Strategic Analytics
This phase focused on expanding the LVA Dashboard to support additional use cases and replacing legacy R-Server + Shiny solutions with Databricks-based alternatives.
Key Outcomes:
- Migration of the R-Server algorithms to the Databricks platform was completed.
- LVA Dashboard was expanded to support new insights, enabling operational decision-making.
WP7: Full Integration and Grid Stability
The final work package focused on integrating additional data sources and creating advanced data products for Grid Stability and related analytics.
Key Outcomes:
- Integrated external data sources like Weatherzone, BOM, and Solcast.
- Built Grid Stability Solution using the network model and Databricks platform.
- Created a LIDAR Image Processor using Databricks’ image processing capabilities (a minimal file-ingestion sketch follows this list).
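For the LIDAR work, Spark’s binary file reader offers a simple way to land raw capture files on the platform before any image processing runs over them. The bucket path and file extension below are assumptions for illustration.

```python
# Sketch of landing raw LIDAR captures as binary files in Delta (placeholder path/extension).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lidar-ingest-sketch").getOrCreate()

lidar_files = (
    spark.read.format("binaryFile")
    .option("pathGlobFilter", "*.laz")          # hypothetical LIDAR file extension
    .option("recursiveFileLookup", "true")
    .load("s3://example-data-hub-landing/lidar/")
)

# Keep the file metadata and raw bytes; downstream jobs do the actual processing.
(
    lidar_files.select("path", "modificationTime", "length", "content")
    .write.mode("append")
    .format("delta")
    .save("/tmp/delta/bronze_lidar_files")
)
```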
Step 6: Launch Day
The launch day was unforgettable.
I watched as the Network Analytics team pulled up their first real-time LVA (Low Voltage Analytics) Dashboard. “This is magic,” one of the analysts whispered, seeing the live data on grid performance and power quality. The Operations team was already diving into the new insights, using the Dynamic Network Model and power quality data from the Time-Series database to optimise network management.
In Operations, we saw the immediate impact: outage management was now more efficient, and the team could finally retire their outdated spreadsheets, knowing they had real-time access to critical data. The engineering teams also embraced the Databricks-powered insights, allowing them to perform deeper analysis on grid health and even plan better for future energy demand.
Behind it all was the seamless integration of TOGAF’s structured governance and Databricks’ powerful tools, turning chaos into clarity.
Step 7: Future-Proofing with TOGAF
Architecture isn’t a one-time job. TOGAF’s Architecture Change Management reminded me to always plan for the future.
We expanded the hub with:
- New data sources like power quality data, IoT sensors, and energy monitoring tools.
- Machine learning models with Databricks MLflow for predictive analytics (a minimal training sketch follows this list).
- Sustainability metrics to track and optimise operations, in line with Jemena’s goals for carbon reduction.
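On the predictive analytics side, the working pattern is simple: train a forecasting model and track it with MLflow so runs can be compared and promoted later. The sketch below uses synthetic load data and a scikit-learn model purely as an illustration.

```python
# Minimal MLflow tracking sketch with synthetic data (illustrative only).
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic "network load vs. hour of day and temperature" data standing in for real features.
rng = np.random.default_rng(42)
X = rng.uniform([0, 10], [23, 40], size=(1_000, 2))
y = 50 + 2.5 * X[:, 1] + 5 * np.sin(X[:, 0] / 24 * 2 * np.pi) + rng.normal(0, 3, 1_000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="load-forecast-sketch"):
    model = RandomForestRegressor(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)

    mae = mean_absolute_error(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("mae", mae)
    mlflow.sklearn.log_model(model, "load_forecast_model")
```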
This wasn’t just a data hub—it became a platform for endless possibilities.
Looking Back
Creating a centralised data hub wasn’t just about the technology; it was about solving real problems for real people. TOGAF gave me the structure, and Databricks provided the tools.
And the best part? Watching those frustrated faces light up when they realised what was possible.
That’s why I do what I do.