Observability in GenAI Agents: From MLflow Traces to Quality Dashboards in Databricks

Building Generative AI (GenAI) agents has shifted from a novelty to a business necessity. However, once the “Hello World” of your RAG (Retrieval-Augmented Generation) pipeline is working, you face the real production challenge: how do I know what is really happening inside my agent? Databricks has recently evolved Inference Tables to capture payloads and performance metrics directly from Model Serving. But when we need a deep analysis of the agent’s reasoning (its “thoughts,” document retrieval, and intermediate evaluations), MLflow Traces remain the richest source of truth for understanding the Chain of Thought and intermediate steps. ...

04-02-2026 · 6 min · Daniel Alcaide