🤔 Let’s be honest: no one has the time or energy to sort through thousands of traces to debug why your AI app failed. We just shipped OpikAssist, an AI-powered trace analysis tool that answers the question "Why did my LLM app fail?" Ask questions like: → "Diagnose this trace failure" → "What caused the performance issue?" → "Identify anomalies in this workflow" And get LLM-powered insights plus actionable fixes, right in your Opik dashboard. We used Opik’s own core functionality to build OpikAssist, proving AI pilots CAN make it to production with the right observability approach. Check it out in the Opik UI → https://lnkd.in/gQ78AJ8v
About us
Comet is an end-to-end model evaluation platform built with developers in mind. Track and compare your training runs, log and evaluate your LLM responses, version your models and training data, and monitor your models in production — all in one platform. Backed by thousands of users and multiple Fortune 100 companies, Comet provides insights and data to build better, more accurate AI models while improving productivity, collaboration, and visibility across teams.
- Website: https://www.comet.com
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: New York, NY
- Type: Privately Held
- Founded: 2017
- Specialties: Machine Learning, Data Science, Developer Tools, and Software
Products
Comet
Data Science & Machine Learning Platforms
Comet provides an end-to-end model evaluation platform for AI developers, with best-in-class LLM evaluations, experiment tracking, and production monitoring.
- Debug and evaluate your LLM applications with Opik
- Track and visualize your training runs with Experiment Management
- Monitor ML model performance in production with Production Monitoring
- Store and manage your models with Model Registry
- Create and version datasets with Artifacts
The best part? Comet is free for individuals and academics!
Locations
- Primary: 100 6th Ave, New York, NY 10013, US
Updates
Comet reposted this
Most eval methods stop at single responses, but real reliability comes from evaluating whole conversations. That’s why "Building Conversational AI Agents with Thread-Level Eval Metrics" was one of the hands-on workshops selected by our volunteer Steering Committee for the 6th Annual MLOps World | GenAI Summit (Oct 8-9, Austin). Co-hosted by Tony Kipkemboi, Head of Developer Relations at CrewAI, and Claire Longo, Lead AI Researcher at Comet, this session will show you how to: ✅ Use CrewAI to define multi-agent workflows and tool integrations ✅ Apply Comet Opik to design thread-level eval metrics that capture full conversations ✅ Combine orchestration and evaluation into a repeatable workflow that improves agent quality If you're a serious AI technologist, this session will help you and your bot projects reach the next level. DYK? MLOps World is the only major AI tech event programmed by practitioners, for practitioners. You’ll find hands-on workshops, technical deep dives, and real-world case studies hosted by some of the world’s most esteemed AI teams, including JFrog, TikTok, DICK'S Sporting Goods, Google DeepMind, Outerbounds, and Fujitsu. Check out the full agenda and get tickets (a few are still available): https://lnkd.in/gVbbyAR2
Comet reposted this
In modern software development, we don't just guess whether our code works. We write unit tests, run integration tests, and build CI/CD pipelines. We replaced manual guesswork with rigorous, automated validation. So why are many of us still in the "guesswork" phase with LLM prompts? The common workflow is a manual loop: tweak a prompt, test it, eyeball the result, and tweak it again. This is artisanal, slow, and doesn't scale. A prompt that works today might break tomorrow with a slight model update. It’s not an engineering discipline. The paradigm shift we need is Systematic Prompt Optimization: the move from "prompt art" to "prompt science." It’s about treating a prompt not as a magic incantation, but as a key component of a system that can be algorithmically tested, measured, and improved. The framework for this is surprisingly simple and powerful: 1. Hypothesis (your base prompt): your initial, best-guess prompt. 2. Ground truth (an evaluation dataset): a set of inputs and ideal outputs that define success for your use case. 3. Objective function (an evaluator): a measurable score for success (e.g., accuracy, semantic similarity, factuality). 4. Optimizer: an algorithm that intelligently searches the vast space of possible prompt variations to find the one that maximizes your objective function. This approach is a repeatable, data-driven process. It lets you prove why one prompt is better than another and ensures your system is robust. I've been exploring frameworks that enable this, and Comet's Opik is a fascinating, concrete example of this principle in action. It provides the optimizer and structure to automate this entire loop. Check here: https://lnkd.in/dZEfCW6S By adopting this mindset, we're not just writing better prompts. We're building more reliable, maintainable, and predictable AI systems. What steps is your team taking to bring more engineering discipline to your work with LLMs? #llm #ai #optimization #agents
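The four components above can be sketched in a few lines of Python. To be clear, this is a toy illustration, not the Opik API: the model call is a stand-in function, and the "optimizer" is a naive search over hand-written candidate prompts rather than an intelligent search algorithm.

```python
# Toy sketch of the four-component prompt-optimization loop.
# fake_model is a hypothetical stand-in for an LLM call.

def fake_model(prompt: str, text: str) -> str:
    """Pretend LLM: uppercases the input only if the prompt asks for it."""
    return text.upper() if "uppercase" in prompt.lower() else text

# 2. Ground truth: inputs paired with ideal outputs.
dataset = [("hello", "HELLO"), ("world", "WORLD")]

# 3. Objective function: fraction of exact matches against the dataset.
def accuracy(prompt: str) -> float:
    hits = sum(fake_model(prompt, x) == y for x, y in dataset)
    return hits / len(dataset)

# 1. Hypothesis plus variations; 4. Optimizer: pick the highest scorer.
candidates = [
    "Rewrite the text.",               # base prompt (hypothesis)
    "Convert the text to uppercase.",  # variation
    "Shout the text back.",            # variation
]
best = max(candidates, key=accuracy)
print(best, accuracy(best))
```

A real optimizer would generate and mutate candidates automatically and use a richer evaluator (semantic similarity, factuality, an LLM judge), but the loop's shape (score every candidate against ground truth, keep the best) stays the same.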
This week, Ollie joined our product and engineering teams in Rome for their R&D offsite 🇮🇹 Nearly a year ago we launched Opik. Like Rome, great products aren’t built in a day; they take vision, persistence, and teamwork. We're grateful for this group’s dedication. Here are a few moments from their well-deserved week in the Eternal City.
Can’t make it to one of our in-person events? We’re kicking off a new virtual workshop series 🥳 The first one is on Sept 24 with Claire Longo, who’s leading a session on creating evals and feedback loops for conversational AI agents. Come learn with other builders and see how to: ✔️ Log traces that actually show what your app is doing ✔️ Design LLM-as-a-Judge metrics that mimic human reasoning RSVP 🔗 https://luma.com/uupy7jxr
Comet reposted this
🦾 Interested in learning how to evaluate conversational agents using Comet Opik? Comet is hosting a series of virtual workshops, and the first one is all about creating evals and feedback loops for conversational #AI agents. When: September 24th, 2025 Where: Virtual and free! In this workshop, we will demonstrate how to log traces, annotate sessions with expert insights, and design LLM-as-a-Judge metrics that mimic human reasoning, turning domain expertise into a repeatable feedback loop. Join us for a great conversation. My favorite part of these workshops is the live Q&A, where we can share ideas and learn from each other 👯 👉 RSVP: https://luma.com/uupy7jxr
Excited for this year's MLOps World Summit! See you in Austin in a little more than a month 🤠 🚀
Distracted Father, Founder of Toronto Machine Learning Series (TMLS) & MLOps World; Machine Learning in Production
One of the highest-scored reviewed submissions we received this year! ↓ Because everyone’s implementing LLMs, and few understand the math that should guide their applications toward real-world problems. A great opportunity for practitioners here. Claire Longo will introduce a scientifically rigorous approach to benchmarking and comparing models, based on statistical hypothesis testing. Learn a method to quantify and measure the impact of different models on your use cases, so you can better evaluate and evolve agentic design patterns. Whether you’re building RAG agents, real-time LLM apps, or reasoning pipelines, you’ll leave with a new lens for designing agents. Thank you, Claire, and Comet (creators of Opik)! We're excited for your session. Read the full abstract here and come join ✅ • mlopsworld.com/speakers/ Food, drinks, parties, workshops across 2 days! 6th Annual MLOps World | GenAI Summit 🗓️ Oct 8-9 🍾 Austin Renaissance Hotel
Comet reposted this
Congratulations to Medhaswi Paturu on winning Comet’s raffle at Tuesday night’s AI Tinkerers Demo Day in NYC! 🎉 We love celebrating curiosity and creativity in our community. Join us at an upcoming event for your chance to win a Star Wars LEGO set and connect with other builders pushing the boundaries of #AI. 🚀
Comet reposted this
🦉 Ollie is on his way to the Big Apple for Demo Night with AI Tinkerers! 🍎 If you're in the NYC area, come join us tonight for an evening of lightning demos, community Q&A, and networking (did I mention free pizza?) 👤 𝗪𝗵𝗼: Builders, tinkerers, and curious minds in AI ⚡𝗪𝗵𝗮𝘁: Lightning demos (15 mins each), followed by Q&A and community networking (pizza included 🍕) 📍𝗪𝗵𝗲𝗿𝗲: betaworks, 29 Little West 12th Street, NYC 📆 𝗪𝗵𝗲𝗻: Tuesday, August 26, 2025, 6–9 PM ET Space is limited, so RSVP at the link in the comments below. See you soon! Comet Auth0
This summer we welcomed 5 new members to the Comet team across the US, UK, and Greece ☀️ Each one brings impressive experience to their role and unique passions outside of work, from board games to golf, hiking, and even professional soccer. We’re so glad to have Peter, Steven, James, Stephen, and Nasos on the team. Please join us in welcoming them! 👋 Nasos Lamprokostopoulos James Gonzalez Peter Gray Steven Miller Stephen McGhie