This is the Databricks entry in our Annie meets X series, next to Annie meets acli, Annie meets New Relic, Annie meets Elastic, and Annie meets gcx. The workflow uses Databricks' SQL Statement Execution API to write production-impact context into a Databricks table.

Summary

Databricks is becoming the governed place where enterprise teams build agents on data. Agent Bricks, Genie Code, Unity Catalog, AI Gateway, Lakeflow, and MLflow all point in the same direction: agents that can use enterprise data, follow governance, and do real work.

The missing context shows up when those agents touch production. A Databricks agent may know the table, notebook, pipeline, model trace, and permissions. It may not know which Kubernetes service consumes that table, which AWS resource backs the feature path, which team owns the service, which monitor is already firing, or what else could break if the remediation runs.

This workflow makes that boundary explicit. Anyshift's annie do workflow gathers production graph context, renders a reviewed handoff, and writes an impact report into Databricks through the SQL API. Databricks stays the governed surface. Anyshift supplies the live production truth inside it.

The Problem

In the demo, a Databricks workflow wants to remediate a failing Lakeflow pipeline:

checkout_features freshness failure
  Databricks surface: Genie Code / Lakeflow / Unity Catalog
  proposed action: patch and rerun the pipeline

Databricks has the right native context for the data system: the table, pipeline, SQL warehouse, governance, and query surface.

The production question is different:

What production services could this agent action affect?

That answer lives outside the lakehouse, across Kubernetes deployments, cloud resources, Terraform and Git changes, monitors, deploy history, and owners.

The Context Anyshift Adds

Schema showing Databricks data and agent context flowing through Anyshift's production graph into a Unity Catalog impact table with owners, dependencies, risk, and recommended action.

For the checkout feature pipeline, Anyshift maps the table and pipeline to the production systems that depend on them:

  • checkout-api, owned by payments-platform, consumes workspace.default.checkout_features for fraud scoring during checkout.
  • risk-worker, owned by risk, backfills the same feature path and replays scoring jobs after pipeline failures.
  • checkout-playground is skipped because it is a non-production namespace.

The useful part is not a bigger table. It is the relationship Databricks cannot derive from Unity Catalog alone:

Lakeflow pipeline
  -> Unity Catalog table
  -> fraud scoring features
  -> checkout-api Kubernetes deployment
  -> active checkout latency monitor
  -> payments-platform owner
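
One way to picture that relationship is as a small directed graph that is walked from the pipeline outward. The sketch below is illustrative only: the edges are hand-copied from the chain above, and `EDGES` and `downstream` are hypothetical names, not Anyshift's graph schema or API.

```python
from collections import deque

# Edges copied from the chain above; illustrative, not Anyshift's real schema.
EDGES = {
    "Lakeflow pipeline": ["Unity Catalog table"],
    "Unity Catalog table": ["fraud scoring features"],
    "fraud scoring features": ["checkout-api Kubernetes deployment"],
    "checkout-api Kubernetes deployment": ["active checkout latency monitor"],
    "active checkout latency monitor": ["payments-platform owner"],
}


def downstream(start, edges):
    """Return every node reachable from `start`, in breadth-first order."""
    seen = []
    queue = deque(edges.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # guard against cycles and duplicate paths
        seen.append(node)
        queue.extend(edges.get(node, []))
    return seen


if __name__ == "__main__":
    for node in downstream("Lakeflow pipeline", EDGES):
        print(node)
```

Starting the walk at the pipeline surfaces the service, the monitor, and the owner in one pass, which is the part Unity Catalog lineage alone would not return.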

How Anyshift Runs It

The engineer starts with a plain-English request:

ANNIE_EXPERIMENTAL=1 \
ANNIE_DO_DATABRICKS_IMPACT_JSON=/private/tmp/annie-databricks-demo-impact.json \
annie do --yes "write Databricks production impact before Genie Code remediates this Lakeflow failure"

The JSON fixture passed in through ANNIE_DO_DATABRICKS_IMPACT_JSON is only a public-demo shortcut that stands in for graph reasoning. The write path is real: annie do renders deterministic SQL and calls Databricks' SQL Statement Execution API against a live Databricks SQL warehouse.
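
For readers who want to see the shape of a SQL Statement Execution API call, here is a minimal sketch in Python. The demo itself issues these calls with curl; the Python is a swapped-in equivalent, and `build_statement_request` is a hypothetical helper, not part of annie. It only builds the request, so nothing is sent until the caller passes it to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_statement_request(host, token, warehouse_id, statement, parameters=None):
    """Build (but do not send) a POST to the SQL Statement Execution API."""
    base = host.rstrip("/")
    if not base.startswith(("http://", "https://")):
        base = "https://" + base  # accept a bare hostname as well as a full URL

    body = {
        "warehouse_id": warehouse_id,
        "statement": statement,
        "wait_timeout": "30s",  # wait synchronously for up to 30 seconds
    }
    if parameters:
        body["parameters"] = parameters  # named parameter values, if any

    return urllib.request.Request(
        url=base + "/api/2.0/sql/statements",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Separating request construction from sending keeps the payload reviewable, which matches the reviewed-handoff idea in this workflow.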

The plan is reviewable before execution:

Plan: Write 2 Anyshift production-impact row(s) to Databricks for checkout feature pipeline remediation.
  1. Check Databricks SQL API auth
  2. Write Databricks SQL payload files
  3. Create Databricks impact table
  4. Insert Anyshift production impact rows
  5. Verify Databricks impact rows
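
Steps 3 to 5 of the plan could be rendered along the following lines. This is a sketch under stated assumptions, not annie's actual output: it covers only the five columns that appear in the verification output, assumes every column is a STRING, and uses hypothetical helper names. Values are passed as named parameters rather than inlined into the SQL text.

```python
TABLE = "workspace.default.anyshift_production_impact_reports"

# Only the columns visible in the verification output; the real table has more.
COLUMNS = ["report_id", "service_name", "owner_team", "risk_level", "recommended_action"]


def create_statement():
    """Step 3: idempotent table creation (column types are an assumption)."""
    cols = ", ".join(f"{name} STRING" for name in COLUMNS)
    return f"CREATE TABLE IF NOT EXISTS {TABLE} ({cols})"


def insert_payload(row):
    """Step 4: one INSERT with named parameter markers such as :report_id."""
    markers = ", ".join(f":{name}" for name in COLUMNS)
    statement = f"INSERT INTO {TABLE} ({', '.join(COLUMNS)}) VALUES ({markers})"
    parameters = [{"name": name, "value": row[name], "type": "STRING"} for name in COLUMNS]
    return {"statement": statement, "parameters": parameters}


def verify_payload(report_id):
    """Step 5: read back the rows for one report in a stable order."""
    statement = (
        f"SELECT {', '.join(COLUMNS)} FROM {TABLE} "
        "WHERE report_id = :report_id ORDER BY service_name"
    )
    parameters = [{"name": "report_id", "value": report_id, "type": "STRING"}]
    return {"statement": statement, "parameters": parameters}
```

A fixed column order and an explicit ORDER BY are one simple way to keep the rendered SQL and the verification result deterministic across runs.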

The Review Boundary

The generated runbook starts with the production impact, not the API mechanics:

# Generated by annie do
# Databricks production impact: checkout feature pipeline remediation
# Databricks surface: Genie Code / Lakeflow / Unity Catalog
# Source PR: anyshift-backend#2044
# Investigation: checkout_features feeds a production fraud model used by checkout-api while checkout latency and feature freshness alerts are active.
# Context written to Databricks:
#   - checkout-api (payments-platform) -> workspace.default.checkout_features
#       evidence: feature_store.yaml maps checkout_features to checkout-api
#       dependency: Kubernetes deployment checkout-api consumes fraud-score features generated by the Lakeflow pipeline and backed by an S3 feature path
#   - risk-worker (risk) -> Lakeflow pipeline checkout_features
#       evidence: risk-worker env references CHECKOUT_FEATURE_TABLE

The Databricks credentials stay local. The runbook requires DATABRICKS_HOST, DATABRICKS_TOKEN, and DATABRICKS_SQL_WAREHOUSE_ID, then writes SQL through Databricks' own API.
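
A pre-flight check for those three variables might look like the sketch below. `load_databricks_env` is a hypothetical helper, not part of annie; it reads from the local environment and fails early when anything is missing, so no request is attempted with partial credentials.

```python
import os

REQUIRED = ("DATABRICKS_HOST", "DATABRICKS_TOKEN", "DATABRICKS_SQL_WAREHOUSE_ID")


def load_databricks_env(environ=None):
    """Return the required settings, or raise if any are missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED}
```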

Databricks SQL Warehouses page showing the Serverless Starter Warehouse used by the Anyshift workflow.

It Lands In Databricks

The live demo created this Databricks table:

workspace.default.anyshift_production_impact_reports

Catalog Explorer recognized the schema and generated a description for it: a table of production impact assessments with generated time, requested changes, associated services, teams, risk, and recommended actions.

Databricks Catalog Explorer showing the anyshift_production_impact_reports table, generated AI description, and columns written by the Anyshift workflow.

The verification query returned two rows:

report_id                  service_name    owner_team          risk_level  recommended_action
annie-dbx-checkout-2044    checkout-api    payments-platform   high        pause autonomous pipeline remediation, route to payments-platform, and require owner approval
annie-dbx-checkout-2044    risk-worker     risk                medium      notify risk owner before rerunning the backfill

Databricks Query History showing the CREATE TABLE, INSERT, and SELECT statements executed through curl against the SQL warehouse.

That is the handoff. A Databricks agent, SQL query, Genie workflow, or MLflow trace can now read the production impact before the remediation continues.
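
A consumer that reads the report through the same API gets column names and row values in separate parts of the response. The sketch below pairs them up, assuming the default inline JSON array result format, where names sit under `manifest.schema.columns` and values under `result.data_array`. `rows_from_response` is a hypothetical helper, and the sample response is trimmed to three columns from the verification output.

```python
def rows_from_response(response):
    """Zip column names from the manifest with each row of data_array."""
    names = [col["name"] for col in response["manifest"]["schema"]["columns"]]
    return [dict(zip(names, values)) for values in response["result"]["data_array"]]


if __name__ == "__main__":
    # Trimmed sample shaped like a successful statement response.
    sample = {
        "manifest": {"schema": {"columns": [
            {"name": "service_name"}, {"name": "owner_team"}, {"name": "risk_level"},
        ]}},
        "result": {"data_array": [
            ["checkout-api", "payments-platform", "high"],
            ["risk-worker", "risk", "medium"],
        ]},
    }
    for row in rows_from_response(sample):
        print(row["service_name"], row["owner_team"], row["risk_level"])
```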

Why Data Lineage Is Not Enough

Unity Catalog tells the agent what data it can use and how that data is governed. That is necessary.

It does not, by itself, tell the agent that a table feeds a Kubernetes checkout service with an active latency alert, that a Terraform change touched the feature bucket policy, or that the owner should approve before an autonomous rerun.

That is the difference:

Unity Catalog lineage:      table -> model -> notebook/job
Anyshift production graph:  table -> service -> runtime -> owner -> monitor -> blast radius

Databricks governs the agent and the data. Anyshift gives the agent production context before it acts.

The Handoff

The workflow keeps a clear boundary:

Anyshift graph
  production services, owners, monitors, dependencies, recent changes

annie do
  reviewed runbook + SQL payloads

Databricks SQL API
  workspace.default.anyshift_production_impact_reports

Databricks users and agents
  query the impact report before remediation

This makes the integration useful without pretending Anyshift is a data catalog, lineage product, or Databricks replacement.

What Teams Get

Data teams keep working in Databricks. Platform teams keep their production truth in Anyshift.

When a Databricks agent wants to fix a pipeline, rerun a backfill, tune a model endpoint, or change a workflow, it can first read:

  • which production services are affected
  • which owners should approve
  • which cloud and Kubernetes resources are in the path
  • which monitors are already unhealthy
  • whether the action should proceed, narrow scope, or stop for review

The agent gets more than permission. It gets production judgment.
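
As a rough illustration of how an agent could turn the report into a proceed, narrow, or stop decision, here is a minimal sketch. The mapping from risk level to decision is an assumption made for this example, not Anyshift's policy, and `decide` is a hypothetical name.

```python
def decide(rows):
    """Pick the most conservative decision implied by any row's risk_level.

    Assumed mapping (illustrative only): high -> stop for review,
    medium -> narrow scope, anything else -> proceed.
    """
    levels = {row["risk_level"] for row in rows}
    if "high" in levels:
        return "stop_for_review"
    if "medium" in levels:
        return "narrow_scope"
    return "proceed"


if __name__ == "__main__":
    report = [
        {"service_name": "checkout-api", "risk_level": "high"},
        {"service_name": "risk-worker", "risk_level": "medium"},
    ]
    print(decide(report))
```

Taking the worst case across rows means a single high-risk service is enough to route the remediation to an owner before anything reruns.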

Where This Goes Next

MLflow trace enrichment. Attach the same Anyshift impact report to an MLflow agent trace, so the decision and production context are evaluated together.

Unity AI Gateway guardrail. Make Anyshift a governed pre-action check before sensitive Databricks agent tool calls.

Agent Bricks tool. Expose Anyshift as the production-context tool a Databricks agent calls when it needs to know what a data or remediation action could affect.

For Databricks, the win is simple: governed agents that know the enterprise data, plus live production context before they act.