Talk

Langfuse, OpenLIT, and Phoenix: Observability for the GenAI Era

Thursday, May 29

15:35 - 16:05
Room: Lasagna
Language: English
Audience level: Intermediate
Elevator pitch

This session explores Langfuse, OpenLIT, and Phoenix, highlighting their distinct strengths and practical applications in monitoring, debugging, and scaling AI-driven systems. By the end, you’ll know which tool fits your use case and how observability can help you understand and optimize your generative AI workflows.

Abstract

Large Language Models (LLMs) are becoming core components of modern digital products. However, their non-deterministic nature means that their behaviour cannot be fully predicted or tested before deployment. This makes observability an essential practice for building and maintaining applications with generative AI features.

This session focuses on observability in LLM-based systems.

We’ll start by explaining why monitoring and understanding your application are key to ensuring quality, reliability, and scalability. We’ll then analyze three leading tools for observability in this domain: Langfuse, OpenLIT, and Phoenix. Each has unique strengths and trade-offs that make it suitable for different use cases.

By navigating sample codebases and real-world apps, we’ll explore:

  • How Langfuse provides detailed tracing and quality monitoring through developer-friendly APIs. While it supports multi-step workflows effectively, it lacks support for the OpenTelemetry protocol and can be difficult to customize for non-standard use cases.
  • Why OpenLIT, built on OpenTelemetry, offers strong observability for distributed systems. Although it is the least mature of the three tools, it integrates well with established observability stacks and has promising potential for future growth.
  • Where Phoenix fits into the process by combining experimentation and debugging capabilities with evaluation pipelines. Its strength lies in development-focused observability, but it has limitations in handling real-time tracing once systems are in production.
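As a rough illustration of how these tools attach to application code, the sketch below assumes the current public Python SDKs; the module paths, the OTLP endpoint, and the answer function are illustrative placeholders rather than the exact snippets shown in the talk.

    # Illustrative sketch only: in practice you would adopt one tool per application,
    # and module paths / signatures may differ across SDK versions.
    import openlit                             # OpenLIT SDK
    import phoenix as px                       # Arize Phoenix
    from langfuse.decorators import observe    # Langfuse (v2-style decorator import)

    # Langfuse: wrap a function with @observe() so each call is captured as a trace.
    @observe()
    def answer(question: str) -> str:
        # An LLM call would go here; the decorator records inputs, outputs, and latency.
        return f"stub answer to: {question}"

    # OpenLIT: a single init call auto-instruments supported LLM libraries and exports
    # traces over OpenTelemetry (the local OTLP endpoint below is an assumption).
    openlit.init(otlp_endpoint="http://localhost:4318")

    # Phoenix: launch the local UI to inspect traces and evaluation runs during development.
    px.launch_app()

    answer("What does observability add to an LLM app?")

The snippet only hints at the integration styles compared in the talk: decorator-based tracing for Langfuse, OpenTelemetry-native auto-instrumentation for OpenLIT, and a development-time UI for Phoenix.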

This talk will provide a clear, straightforward comparison of these tools, helping you understand which option best fits your LLM applications.

You’ll leave with practical insights into how observability can enhance the reliability and performance of your generative AI systems.

Tags: Performance and scalability techniques, Debugging and troubleshooting, Applications
Participant

Emanuele Fabbiani

Emanuele is an engineer, researcher, and entrepreneur with a passion for artificial intelligence.

He earned his PhD by exploring time series forecasting in the energy sector and spent time as a guest researcher at EPFL in Lausanne. Today, he is co-founder and Head of AI at xtream, a boutique company that applies cutting-edge technology to solve complex business challenges.

Emanuele is also a contract professor in AI at the Catholic University of Milan. He has published eight papers in international journals and contributed to over 30 conferences worldwide. His engagements include AMLD Lausanne, ODSC London, WeAreDevelopers Berlin, PyData Berlin, PyData Paris, PyCon Florence, the Swiss Python Summit in Zurich, and Codemotion Milan.

Emanuele has been a guest lecturer at Italian, Swiss, and Polish universities.