🔭 DeepEval - Open-Source Evals with Tracing
What is DeepEval?
DeepEval is an open-source evaluation framework for LLMs (Github).
What is Confident AI?
Confident AI (the deepeval platfrom) offers an Observatory for teams to trace and monitor LLM applications. Think Datadog for LLM apps. The observatory allows you to:
- Detect and debug issues in your LLM applications in real-time
- Search and analyze historical generation data with powerful filters
- Collect human feedback on model responses
- Run evaluations to measure and improve performance
- Track costs and latency to optimize resource usage
Quickstart
import os
import time
import litellm
os.environ['OPENAI_API_KEY']='<your-openai-api-key>'
os.environ['CONFIDENT_API_KEY']='<your-confident-api-key>'
litellm.success_callback = ["deepeval"]
litellm.failure_callback = ["deepeval"]
try:
response = litellm.completion(
model="gpt-3.5-turbo",
messages=[
{"role": "user", "content": "What's the weather like in San Francisco?"}
],
)
except Exception as e:
print(e)
print(response)
info
You can obtain your CONFIDENT_API_KEY
by logging into Confident AI platform.