
A White Paper on Platform Strategy, Data Flow, and Workforce Design for the Next Five Years
- Date: July / August 2025
- Prepared for: Leaders driving the company’s clean energy transition, grid reliability, wildfire resilience, regulatory compliance, and customer experience objectives.
- Primary Contributor: Based on a conversation between Michael Lewis and Ted Tschopp.
Executive summary
Our utility was architected during the late stages of the Industrial Revolution to ship electrons everywhere and keep them on. That mission endures. What must change is how we accomplish it. The next five years will be defined by moving from paper‑era workflows and fragmented platforms to opinionated, standardized digital platforms; edge‑to‑cloud data flows with preserved context; and an operating model that distinguishes specialist “mechanics” from business “drivers.” At the center of all of this will be AI.
Core theses
- Data, not paper, must carry the work. Treat “edge → decision” latency as a first‑class reliability metric. If data can travel from Mars to JPL in roughly 3–22 minutes (one way, depending on orbital positions), there is no reason for multi‑day internal queues once we remove digital roadblocks.
- Standardization beats sprawl. Hyperscaler ecosystems now make platform operations standardizable. AI lowers the cost of language and platform switching, enabling us to become strongly opinionated about our build/run platforms while keeping vendor exit options as negotiating leverage.
- Consolidation is the macro trend. As with automobiles and smartphones, the enterprise stack is consolidating to a handful of ecosystems and a “giant data blob in the cloud” with bespoke applications on top. Core databases are largely feature‑complete; selection should optimize for operational depth, not novelty.
- Two‑speed vendor strategy. Maintain two strategic relationships (A/B) not for feature comparison but for executive optionality. A target “time‑to‑exit” (TTE) of ≤ 18 months for major platforms should be a contractual and architectural requirement.
- Mechanics vs. drivers. Over time, more technology work becomes a business “driver” skill (e.g., software creation with AI co‑pilots, testing, portfolio/backlog management), while a smaller, deeper cadre of specialist “mechanics” focuses on infrastructure, operating systems, and high‑end data engineering.
Outcome for the mission
- Clean energy transition: Faster interconnection and DER orchestration via near‑real‑time telemetry and standardized data products.
- Grid reliability & wildfire resilience: Lower edge‑to‑decision latency for situational awareness; repeatable, automated runbooks.
- Regulatory compliance: Better data lineage and context preservation throughout the flow.
- Customer experience: Integrated, consistent digital processes that eliminate internal handoffs and wait states.
Context: From paper expectations to digital performance
For more than a century, we optimized for physical assets and reliability, supplementing with manual, paper‑anchored processes. In the information age, value creation increasingly depends on our ability to collect, contextualize, and move data from the edge through the enterprise without losing meaning or time. A key challenge is that we have computerized many processes simply by turning the paper process into digital paper. For example, a PDF can trap text inside an image, making it opaque to downstream team members and to AI.
Key shift: “Reset expectations not around a piece of (digital) paper translating data from the edge, but around electronics, computers, and digital technologies moving that data end‑to‑end.”
Practical implication: treat data logistics (ingest → transform → govern → serve) like a grid asset. Every queue, rekey, hand‑off, or incompatible platform is a reliability risk and a cost center. See these as opportunities for Continuous Technology Modernization and Simplification.
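The “data logistics” framing above can be made concrete in code. The sketch below is a minimal, hypothetical Python pipeline (all names are invented for illustration) in which each stage stamps the record it touches, so context and lineage survive the ingest → transform → govern → serve path instead of being lost at a hand‑off.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Record:
    """A telemetry record that carries its own context end to end."""
    value: float
    source: str                      # e.g. a field sensor ID (hypothetical scheme)
    captured_at: datetime
    lineage: list = field(default_factory=list)

    def stamp(self, stage: str):
        """Append a lineage entry each time a stage touches the record."""
        self.lineage.append((stage, datetime.now(timezone.utc)))

def ingest(raw: dict) -> Record:
    rec = Record(value=raw["value"], source=raw["source"],
                 captured_at=raw["captured_at"])
    rec.stamp("ingest")
    return rec

def transform(rec: Record) -> Record:
    rec.value = round(rec.value, 2)  # placeholder normalization step
    rec.stamp("transform")
    return rec

def govern(rec: Record) -> Record:
    assert rec.source, "records without a source fail governance"
    rec.stamp("govern")
    return rec

def serve(rec: Record) -> Record:
    rec.stamp("serve")
    return rec

raw = {"value": 12.3456, "source": "sensor-042",
       "captured_at": datetime.now(timezone.utc)}
rec = serve(govern(transform(ingest(raw))))
print([stage for stage, _ in rec.lineage])  # ['ingest', 'transform', 'govern', 'serve']
```

Because every record carries its own lineage timestamps, the edge→decision path becomes auditable, and the queue time at each stage can be measured directly rather than estimated.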
Today’s reality: Heterogeneity born of throughput and preference
Recent projects illustrate why platform sprawl happens:
- UI in Power Platform (drag‑and‑drop speed for non‑coders)
- AI in Google (team familiarity)
- Storage in Azure (fastest path to stand up storage)
Add language preferences (Java vs. .NET) and cloud preferences (Google vs. AWS), and heterogeneity becomes a rational local optimization for throughput—but it creates global complexity: harder support, shallower skill benches, inconsistent governance, and longer incident MTTR. Don’t optimize for local maxima; optimize for boardroom‑level metrics.
What’s changed: AI and hyperscalers enable an opinionated platform stance
Two structural changes now favor standardization:
- AI‑assisted code and platform translation. Modern AI can translate between languages and frameworks (“nibbling the elephant”), reducing switching costs and enabling rapid retraining of talent. This blunts the “we prefer X” argument.
- Hyperscaler standardization. Operating models, identity, observability, networking, and data services are now mature and repeatable enough to run disciplined, opinionated platforms at scale. The main differentiation among hyperscalers today is their AI offerings, not the rest of the stack.
Strategic conclusion: Consolidate onto one primary and one secondary ecosystem to deepen expertise, reduce variance, and preserve negotiating leverage.
Vendor strategy: Opinionated with optionality (the A/B model)
Principle: Keep two strategic relationships—not to chase features, but to avoid entrapment and maintain pressure for roadmap responsiveness (“what have you done for me lately?”).
- Target metric: Time‑to‑Exit (TTE) ≤ 18 months from any major platform for top 20 systems.
- Bake TTE into architecture, contracts, licensing terms, and runbooks.
- Use the B‑vendor as a relationship A/B test, not a feature bake‑off.
Procurement posture: Shift from buying tools “piece by piece” toward bundled “capability services” (collaboration, infrastructure, data/AI). Like transportation services, we don’t want to assemble the car—we want miles delivered with a clear exit ramp.
Target-state architecture: Edge‑to‑cloud data fabric with context preservation
Objective: Preserve context from sensor to decision with minimal human relays.
Key characteristics
- Edge ingestion: resilient collection from field sensors and operational systems.
- Context preservation: strong metadata, lineage, and time/space alignment so data remains meaningful as it moves.
- Common platforms: standardized CI/CD, identity, observability, and data services across the primary ecosystem; a minimal, mirrored pattern in the secondary.
- Enterprise “data blob”: a single, consolidated, governed data platform (lake/lakehouse) as the system of gravity; bespoke applications sit on top, not alongside.
- AI everywhere: Agents for development, translation, testing, and operations; policy‑bound, with auditable prompts and outputs.
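To illustrate what “context preservation” can mean in practice, here is a hypothetical telemetry envelope (field names, IDs, and values are all invented): the reading travels with its sensor identity, location, capture time, schema version, and lineage, so it stays meaningful after serialization and transport.

```python
import json
from datetime import datetime, timezone

# Hypothetical envelope: every field reading travels with the context
# needed to stay meaningful downstream (who, where, when, and lineage).
envelope = {
    "reading": {"metric": "line_temp_c", "value": 61.4},
    "context": {
        "sensor_id": "wx-1187",            # assumed ID scheme
        "location": {"lat": 34.05, "lon": -118.24},
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "schema_version": "1.0",
    },
    "lineage": ["edge-gateway-7", "ingest-stream"],
}

payload = json.dumps(envelope)          # what actually moves over the wire
restored = json.loads(payload)
print(restored["context"]["sensor_id"])  # wx-1187
```

The point of the sketch is that the envelope, not a side channel or a human relay, is the unit that moves: strip any of the `context` fields and the reading becomes the digital equivalent of a fax.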
Operating model: “Mechanics and Drivers”
Hypothesis: As AI matures (2–10 year window), more work becomes a driver skill expected of many. A smaller, deeper mechanic workforce will own the complex, safety‑critical, or performance‑critical layers.
- Driver (business‑embedded, democratized):
- Software creation (with AI co‑pilots and no/low‑code)
- Software testing & change approvals (automated pipelines, policy‑as‑code)
- Portfolio and backlog management
- Mechanic (specialist, centralized or federated COEs):
- Enterprise Architecture
- Platform engineering (cloud, networking, identity, observability)
- Operating systems and runtime hardening
- High‑end data engineering (streaming, low‑latency, geospatial/temporal alignment)
- Safety‑critical/reliability engineering
Talent implication: People with Computer Science backgrounds cluster in the mechanic domain; Information Systems/Business talent shifts toward driver roles, augmented by AI. Chip‑level work remains with specialized vendors (TSMC, etc.).
Mission linkage: How this supports core objectives
| Mission Objective | Digital Capability | Example Impact | Representative KPI |
|---|---|---|---|
| Clean energy transition | Edge→cloud data fabric; standardized platforms | Faster DER interconnect analysis and orchestration | Interconnect cycle time; % of DER telemetry integrated |
| Grid reliability | Context‑preserving telemetry; automated runbooks | Lower detection and response times | Edge→decision latency; MTTD/MTTR |
| Wildfire resilience | Real‑time sensing + analysis | Faster situational awareness and de‑energization decisions | Latency to wildfire risk signal; false‑positive rate |
| Regulatory compliance | Data lineage and policy‑as‑code | Traceable decisions & reproducible reports | Evidence generation time; % controls automated |
| Customer experience | Opinionated platforms and automation | Fewer handoffs; quicker, consistent outcomes | NPS/CSAT; “paper‑to‑digital” ratio |
Five‑year roadmap
Phase 0 (0–3 months): Commit and measure
- Publish Platform Tenets (see the “Platform tenets” section below).
- Baseline edge→decision latency, platform count, tooling variance, time‑to‑exit for top 20 systems.
- Stand up an Architecture Review Board (ARB) with authority to enforce opinionated patterns.
Phase 1 (3–12 months): Rationalize and pilot TTE
- Select Primary and Secondary ecosystems; freeze net‑new builds to approved stacks.
- Pilot TTE ≤ 18 months by moving one meaningful workload between ecosystems with AI‑assisted translation.
- Launch data fabric v1 (ingest, catalog, lineage, serving) with a wildfire or reliability use case.
Phase 2 (12–24 months): Scale platform engineering
- Consolidate CI/CD, identity, observability, and logging.
- Migrate 80% of apps in active development to the standard stack.
- Bring software testing and change approvals into automated pipelines company‑wide.
- Establish Mechanics & Drivers role taxonomy and training pathways with AI copilots.
Phase 3 (24–36 months): Automate and de‑paper
- Hit <5 minutes median edge→decision latency for priority signals.
- Convert priority paper workflows to digital with policy‑as‑code controls.
- Negotiate bundled capability contracts with clear exit ramps tied to TTE metrics.
Phase 4 (36–60 months): Optimize and hold the line
- Keep platform count flat or down year‑over‑year despite new demand.
- A/B‑test vendor relationship health (responsiveness, roadmap, support) annually.
- Sustain TTE ≤ 18 months readiness (test once per year on a live workload).
Platform tenets (opinionated and enforceable)
- Primary + Secondary only. All new workloads land on the primary platform unless the ARB grants a documented exception; the secondary is for portability/testing.
- Data first. All systems publish to the enterprise data platform with context (metadata, lineage) preserved by default.
- Automate the path to production. CI/CD, policy‑as‑code, reproducible environments, automated testing.
- Secure by default. Identity‑centric controls; least privilege; auditable AI usage.
- Observable by design. Telemetry, tracing, SLOs; edge→decision latency is a tracked SLO.
- TTE as a requirement. Architecture and contracts must demonstrate ≤ 18 months exit for major platforms.
- No novelty without necessity. New tech must displace at least one existing platform and show measurable improvement on defined KPIs.
- AI‑assisted development. Co‑pilots and translators are the default; training provided; guardrails enforced.
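Several of these tenets can be checked automatically rather than enforced by review alone. The sketch below is a hypothetical policy‑as‑code gate (field names and thresholds are assumptions, not an existing standard) that a CI pipeline could run against a workload’s declared metadata before it reaches production.

```python
# Hypothetical tenet checks expressed as code, so compliance runs in CI
# instead of in a review meeting. All keys and thresholds are illustrative.
TENETS = {
    "allowed_platforms": {"primary", "secondary"},
    "max_tte_months": 18,
}

def check_workload(workload: dict) -> list:
    """Return a list of tenet violations; an empty list means compliant."""
    violations = []
    if workload.get("platform") not in TENETS["allowed_platforms"]:
        violations.append("platform not on the approved stack")
    if not workload.get("publishes_to_data_platform", False):
        violations.append("does not publish to the enterprise data platform")
    if workload.get("tte_months", 999) > TENETS["max_tte_months"]:
        violations.append("no demonstrated exit path within 18 months")
    return violations

candidate = {"platform": "primary",
             "publishes_to_data_platform": True,
             "tte_months": 12}
print(check_workload(candidate))  # []
```

A gate like this makes the “exceptions that expire” pattern practical: an ARB waiver is just an entry with a sunset date, and the pipeline blocks the build the day it lapses.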
Governance, contracts, and economics
- Contracts: Negotiate capability bundles (collaboration, infra, data/AI) with measurable service outcomes and exit clauses aligned to TTE.
- ARB authority: Empower to stop non‑compliant builds, remove duplicative platforms, and enforce data publishing/lineage standards.
- FinOps: Establish showback/chargeback tied to platform adherence and data product reuse.
- Risk: Track lock‑in risk via TTE, feature parity, and portability tests (annual live switch‑overs on a selected workload).
Organization & talent
- Mechanics (COEs): Platform engineering, SRE, high‑end data engineering, OS hardening.
- Drivers (business‑embedded): Product owners, analysts, and domain experts using AI‑assisted dev/testing, with platform guardrails.
- Training: AI literacy for all; deep certification for mechanics; pathway programs for drivers to upskill safely.
- Role clarity: Define RACI spanning portfolio planning → development → testing → promotion → operations.
Risks and mitigations
| Risk | Description | Mitigation |
|---|---|---|
| Vendor lock‑in | Ecosystem gravity increases switching costs | Enforce TTE ≤ 18 months architecturally and contractually; annual portability tests |
| Security & compliance | Faster flows increase blast radius | Policy‑as‑code, zero trust, auditable AI use, end‑to‑end data lineage |
| Cultural resistance | Teams prefer their own tools | Opinionated standards + AI‑assisted retraining + exceptions that expire |
| Skill gaps | Business drivers under‑prepared | Copilots, curated templates, training, paired delivery with mechanics |
| Oversimplification | One‑size‑fits‑all platform misses edge cases | ARB exceptions with sunset dates; platform backlog for gaps |
KPIs and OKRs (illustrative)
- Edge→decision latency (p50/p95) for wildfire and reliability signals
- TTE readiness: # of top 20 systems with verified ≤ 18 months exit path
- Platform adherence: % of builds on standard stacks; platform count YoY
- Paper‑to‑digital conversion rate for priority workflows
- MTTD/MTTR for grid incidents influenced by digital telemetry
- Regulatory evidence cycle time and % of automated controls
- Customer NPS/CSAT for key journeys impacted by digitization
Practical examples (grounded in the conversation)
- Heterogeneous build: UI in Power Platform, AI in Google, storage in Azure—fast locally, expensive globally. Future state: land on the primary platform by default; use AI to translate and retrain; keep the secondary for portability.
- Negotiation leverage: Be able to credibly tell a primary vendor, “we can move in 18 months,” prompting responsiveness (e.g., modernization of tools like diagramming).
- Consolidation mindset: Don’t jump ecosystems for a single novel feature; assume parity will arrive; stick with the “Ford shop” unless mission‑critical impact is proven.
Timeline to the “driver era”
We should plan for a 2–10 year horizon where much of “IT” becomes a standard business skill, accelerated by AI. That does not eliminate IT—it raises the bar for the remaining specialist work and shifts more day‑to‑day creation and testing closer to the business.
Conclusion
To support the clean energy transition and a safer, more reliable, more customer‑centric grid, we must treat data like a grid asset, standardize on opinionated platforms, and design our workforce for the AI era. The strategic posture is consolidation with optionality: go deep on one ecosystem, keep a credible escape path with a second, and measure our success in latency, reliability, compliance velocity, and customer outcomes—not in the number of tools we own.
Appendix A — Selected verbatim excerpts (for traceability)
- “A lot of what we have here… is very traditional and was designed around optimizing for the Industrial Revolution… ship power everywhere and make sure that power never goes down.”
- “Data is collected at the edge… you’ve got to maintain that context as you’re pushing it through the organization… if we remove all the roadblocks… using digital and cloud… we end up with an organization that’s a lot more effective.”
- “We’ve seen the demos of AI taking .NET code and turning it into Java… it could nibble away at the elephant… that’s the argument for us becoming more strongly opinionated regarding the platforms we pick.”
- “We should have multiple relationships… not for the reasons we do it today… but to provide executives with optionality… ‘It’s only going to take us 18 months and we can get off you and move.’”
- “I’m personally of the opinion that we’re nearing the end of the information technology transformation… AI is kind of the final piece… It’s going to be one giant data blob in the cloud.”
- “We’re not an IT company… we’re in the business of shipping electrons for a fair and equitable price… that’s what we need to focus on.” “There are certain aspects of IT that are the mechanic… and others like the driver… at some point we’re just going to expect everybody in the company to know how to do it… software development and testing are probably in the business; infrastructure, operating systems, and high‑end data remain specialist.”