
Open-source AI observability platform for tracing, evaluation, and prompt management
Arize Phoenix is an open-source AI observability platform designed for experimentation, evaluation, and troubleshooting of LLM applications. It provides OpenTelemetry-based tracing, LLM evaluation benchmarks, dataset versioning, experiment tracking, and prompt management with version control. Phoenix integrates with major AI frameworks including LangChain, LlamaIndex, and DSPy, and supports direct API calls to OpenAI, Anthropic, and other providers. The platform helps teams identify issues in their AI pipelines, compare model performance, and iterate on prompts, and its OpenTelemetry foundation avoids vendor lock-in. It is available as a self-hosted open-source deployment or as managed SaaS through Arize AX.
Standards-based distributed tracing for LLM applications, ensuring portability across observability platforms
Built-in evaluation tools to benchmark model performance, measure accuracy, and compare outputs across different models
Version-controlled prompt management with history tracking, comparison, and rollback capabilities
Track and version datasets used for evaluation, enabling reproducible experiments and systematic improvement
Track experiments across model versions, prompts, and configurations to identify optimal AI pipeline setups
Native integrations with LangChain, LlamaIndex, DSPy, OpenAI, Anthropic, and other major AI frameworks and model providers
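To make the tracing feature above more concrete, here is a minimal sketch of the data model that span-based tracing relies on: each pipeline step is recorded as a named, timed span with attributes and a parent. This is a toy, standard-library-only illustration and not Phoenix's or OpenTelemetry's API. The names ToyTracer, Span, run_pipeline, and "placeholder-model" are hypothetical, and the sleeps stand in for real retrieval and model calls.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class Span:
    """One timed step in a pipeline (hypothetical type, not a Phoenix class)."""
    name: str
    parent: Optional[str]
    attributes: Dict[str, object] = field(default_factory=dict)
    duration_s: float = 0.0


class ToyTracer:
    """Collects spans in memory; a real setup would export them via OpenTelemetry."""

    def __init__(self) -> None:
        self.spans: List[Span] = []   # finished spans, in completion order
        self._stack: List[Span] = []  # currently open spans, innermost last

    @contextmanager
    def span(self, name: str, **attributes: object) -> Iterator[Span]:
        parent = self._stack[-1].name if self._stack else None
        current = Span(name=name, parent=parent, attributes=dict(attributes))
        self._stack.append(current)
        start = time.perf_counter()
        try:
            yield current
        finally:
            # Record the duration even if the traced step raised.
            current.duration_s = time.perf_counter() - start
            self._stack.pop()
            self.spans.append(current)

    def slowest(self, parent: str) -> Span:
        """Return the slowest direct child of the named span."""
        children = [s for s in self.spans if s.parent == parent]
        return max(children, key=lambda s: s.duration_s)


def run_pipeline(tracer: ToyTracer) -> str:
    # Assumed two-step pipeline: retrieval followed by a model call.
    with tracer.span("rag_query", question="What is Phoenix?"):
        with tracer.span("retrieve", top_k=3):
            time.sleep(0.01)  # stand-in for a vector search
        with tracer.span("llm_call", model="placeholder-model") as llm:
            time.sleep(0.05)  # stand-in for a provider API call
            answer = "stubbed answer"
            llm.attributes["output_chars"] = len(answer)
    return answer


tracer = ToyTracer()
run_pipeline(tracer)
bottleneck = tracer.slowest("rag_query")
print(bottleneck.name, round(bottleneck.duration_s, 3))
```

With spans recorded this way, finding a latency bottleneck reduces to comparing the durations of sibling spans. In an actual deployment the same structure would be emitted through an OpenTelemetry SDK and viewed in the platform rather than kept in a Python list.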
Trace and debug complex LLM pipelines to identify latency bottlenecks, hallucinations, and prompt failures
Evaluate and compare outputs across different LLM providers and model versions to select the best fit for your use case
Iterate on prompts with version control, A/B testing, and systematic evaluation to improve AI application quality
Monitor production AI systems with real-time tracing, alerts on quality degradation, and usage-pattern tracking
Best for ML engineering teams who need to compare models within the context of their full AI pipeline, not just in isolation. Because traces capture each model's behavior inside the real pipeline, this observability-first approach can surface performance differences that standalone benchmarks miss.
Best for enterprises that want to integrate AI content quality into their existing observability and DevOps practices