Comet

Software Development

New York, NY 18,495 followers

Where AI Developers Build

About us

Comet is an end-to-end model evaluation platform built with developers in mind. Track and compare your training runs, log and evaluate your LLM responses, version your models and training data, and monitor your models in production — all in one platform. Backed by thousands of users and multiple Fortune 100 companies, Comet provides insights and data to build better, more accurate AI models while improving productivity, collaboration, and visibility across teams.

Website
https://www.comet.com
Industry
Software Development
Company size
51-200 employees
Headquarters
New York, NY
Type
Privately Held
Founded
2017
Specialties
Machine Learning, Data Science, Developer Tools, and Software

Products

Locations

Employees at Comet

Updates

  • Comet

    🤔 Let’s be honest: no one has the time or energy to sort through thousands of traces to debug why an AI app failed. We just shipped OpikAssist, an AI-powered trace analysis tool that answers the question "Why did my LLM app fail?"

    Ask questions like:
    → "Diagnose this trace failure"
    → "What caused the performance issue?"
    → "Identify anomalies in this workflow"

    …and get LLM-powered insights plus actionable fixes, right in your Opik dashboard. We used Opik’s own core functionality to build OpikAssist, proving AI pilots CAN make it to production with the right observability approach.

    Check it out within the Opik UI → https://lnkd.in/gQ78AJ8v

  • Comet reposted this

    Most eval methods stop at single responses. But real reliability comes from evaluating whole conversations. That’s why "Building Conversational AI Agents with Thread-Level Eval Metrics" was one of the hands-on workshops selected by our volunteer Steering Committee for the 6th Annual MLOps World | GenAI Summit (Oct 8-9, Austin).

    Co-hosted by Tony Kipkemboi, Head of Developer Relations at CrewAI, and Claire Longo, Lead AI Researcher at Comet, this session will show you how to:
    ✅ Use CrewAI to define multi-agent workflows and tool integrations
    ✅ Apply Comet Opik to design thread-level eval metrics that capture full conversations
    ✅ Combine orchestration and evaluation into a repeatable workflow that improves agent quality

    If you're a serious AI technologist, this session will help you and your bot projects reach the next level.

    DYK? MLOps World is the only major AI tech event programmed by practitioners, for practitioners. You’ll find hands-on workshops, technical deep dives, and real-world case studies hosted by some of the world’s most esteemed AI teams, including JFrog, TikTok, DICK'S Sporting Goods, Google DeepMind, Outerbounds, and Fujitsu.

    Check out the full agenda and get tickets (a few are still available): https://lnkd.in/gVbbyAR2
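To make the single-response vs. whole-conversation distinction concrete, here is a minimal sketch of a thread-level metric in plain Python. All names are hypothetical illustrations, not the Opik or CrewAI API; the toy string checks stand in for what would normally be an LLM-as-a-Judge call.

```python
# Hypothetical sketch of a thread-level eval metric: score the whole
# conversation, not one response. Not the Opik or CrewAI API.

def resolution_metric(thread):
    """Score a conversation thread in [0, 1]: did the assistant
    eventually resolve the user's request, and did it end well?"""
    assistant_turns = [t["content"] for t in thread if t["role"] == "assistant"]
    if not assistant_turns:
        return 0.0
    # Toy heuristics standing in for an LLM-as-a-Judge call:
    resolved = any("order #" in t.lower() for t in assistant_turns)  # task done?
    stuck_at_end = "sorry" in assistant_turns[-1].lower()            # ended stuck?
    return 1.0 if resolved and not stuck_at_end else 0.0

thread = [
    {"role": "user", "content": "Where is my package?"},
    {"role": "assistant", "content": "Can you share your order number?"},
    {"role": "user", "content": "It's 1234."},
    {"role": "assistant", "content": "Order #1234 is out for delivery today."},
]
score = resolution_metric(thread)
```

A per-response metric would penalize the first assistant turn for not answering the question, even though asking for the order number is exactly the right move within the full thread.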

  • Comet reposted this

    Mayank A.

    Follow for Your Daily Dose of AI, Software Development & System Design Tips | Exploring AI SaaS - Tinkering, Testing, Learning | Everything I write reflects my personal thoughts and has nothing to do with my employer. 👍

    In modern software development, we don't just guess if our code works. We write unit tests, run integration tests, and build CI/CD pipelines. We replaced manual guesswork with rigorous, automated validation. So why are many of us still in the "guesswork" phase with LLM prompts?

    The common workflow is a manual loop: tweak a prompt, test it, eyeball the result, and tweak it again. This is artisanal, slow, and doesn't scale. A prompt that works today might break tomorrow with a slight model update. It’s not an engineering discipline.

    The paradigm shift we need is Systematic Prompt Optimization. This is the move from "prompt art" to "prompt science." It’s about treating a prompt not as a magic incantation, but as a key component of a system that can be algorithmically tested, measured, and improved. The framework for this is surprisingly simple and powerful:

    1. Hypothesis (your base prompt): your initial, best-guess prompt.
    2. Ground truth (an evaluation dataset): a set of inputs and ideal outputs that define success for your use case.
    3. Objective function (an evaluator): a measurable score for success (e.g., accuracy, semantic similarity, factuality).
    4. Optimizer: an algorithm that intelligently searches the vast space of possible prompt variations to find the one that maximizes your objective function.

    This approach is a repeatable, data-driven process. It allows you to prove why one prompt is better than another and ensures your system is robust.

    I've been exploring frameworks that enable this, and Comet's Opik is a fascinating, concrete example of this principle in action. It provides the optimizer and structure to automate this entire loop. Check here: https://lnkd.in/dZEfCW6S

    By adopting this mindset, we're not just writing better prompts. We're building more reliable, maintainable, and predictable AI systems. What steps is your team taking to bring more engineering discipline to your work with LLMs?

    #llm #ai #optimization #agents
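The four-part framework above can be sketched in a few lines of plain Python. This is an illustration of the idea only, not Opik's optimizer: the `model` function is a stub standing in for an LLM call, the dataset is toy data, and the "optimizer" is an exhaustive search where a real one would sample or mutate prompt variants.

```python
# Sketch of systematic prompt optimization (illustrative, not the Opik API).

# Stand-in for an LLM call: "classifies" sentiment with a toy rule so the
# loop is runnable without a model.
def model(prompt, text):
    bias = "positive" if "optimistic" in prompt else "negative"
    return "positive" if ("great" in text or bias == "positive") else "negative"

# 2. Ground truth: inputs paired with ideal outputs.
dataset = [("great product", "positive"), ("broken on arrival", "negative")]

# 3. Objective function: accuracy against the ideal outputs.
def accuracy(prompt):
    return sum(model(prompt, x) == y for x, y in dataset) / len(dataset)

# 1. Hypothesis prompt plus candidate variations to search over.
candidates = [
    "Classify the sentiment.",
    "Classify the sentiment. Be optimistic.",
    "You are a careful sentiment classifier.",
]

# 4. Optimizer: exhaustive search here; real optimizers search smarter.
best = max(candidates, key=accuracy)
```

The payoff is exactly the post's point: `accuracy(best)` is a number you can put in a report, so "this prompt is better" becomes a measured claim rather than an eyeballed one.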

  • Comet

    This week, Ollie joined our product and engineering teams in Rome for their R&D offsite 🇮🇹 Nearly a year ago we launched Opik. Like Rome, great products aren’t built in a day -- they take vision, persistence, and teamwork. We're grateful for this group’s dedication. Here are a few moments from their well-deserved week in the Eternal City.

  • Comet

    Can’t make it to one of our in-person events? We’re kicking off a new virtual workshop series 🥳

    The first one is on Sept 24 with Claire Longo, who’s leading a session on creating evals and feedback loops for conversational AI agents. Come learn with other builders and see how to:
    ✔️ Log traces that actually show what your app is doing
    ✔️ Design LLM-as-a-Judge metrics that mimic human reasoning

    RSVP 🔗 https://luma.com/uupy7jxr
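The LLM-as-a-Judge pattern mentioned above can be sketched as: encode the rubric a human reviewer would apply into a judge prompt, call a judge model, and parse its score. Everything here is a hypothetical illustration, not the workshop's actual Opik metrics; `call_judge_llm` is a stub where a real model call would go.

```python
# Hypothetical LLM-as-a-Judge sketch (names illustrative, not the Opik API).

JUDGE_PROMPT = """You are grading a support-bot answer.
Question: {question}
Answer: {answer}
Score 1 if the answer is grounded and addresses the question, else 0.
Reply with just the number."""

def call_judge_llm(prompt):
    """Stub for a real judge-model call; returns the raw completion text."""
    return "1" if "refund policy" in prompt else "0"

def judge(question, answer):
    """Fill the rubric prompt, call the judge, parse the score."""
    raw = call_judge_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    return int(raw.strip())
```

Running `judge` over logged traces yields per-session scores; reviewing the cases it flags, then tightening the rubric, is the repeatable feedback loop the session describes.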

  • Comet reposted this

    Claire Longo

    AI Researcher at Comet | Mathematician | Startup Advisor | ex-Arize AI 📈 | ex-Twilio ☎️ | Advocate for women in tech 👯♀️

    🦾 Interested in learning how to evaluate conversational agents using Comet Opik? Comet is hosting a series of virtual workshops, and the first one is all about creating evals and feedback loops for conversational #AI agents.

    When: September 24th, 2025
    Where: Virtual and free!

    In this workshop, we will demonstrate how to log traces, annotate sessions with expert insights, and design LLM-as-a-Judge metrics that mimic human reasoning, turning domain expertise into a repeatable feedback loop. Join us for a great conversation. My favorite part of these workshops is the live Q&A, where we can share ideas and learn from each other 👯

    👉 RSVP: https://luma.com/uupy7jxr

  • Comet

    Excited for this year's MLOps World Summit! See you in Austin in a little more than a month 🤠 🚀 

    David Scharbach

    Distracted Father, Founder of Toronto Machine Learning Series (TMLS) & MLOps World; Machine Learning in Production

    One of the highest-scoring submissions we received this year! ↓

    Because everyone’s implementing LLMs, and few understand the math that should guide their applications toward real-world problems. A great opportunity for practitioners here.

    Claire Longo will introduce a scientifically rigorous approach to benchmarking and comparing models, based on statistical hypothesis testing. Learn a method to quantify and measure the impact of different models on your use cases, so you can better evaluate and evolve agentic design patterns. Whether you’re building RAG agents, real-time LLM apps, or reasoning pipelines, you’ll leave with a new lens for designing agents.

    Thank you Claire, and Comet (creators of Opik)! We're excited for your session.

    Read the full abstract and come join ✅ • mlopsworld.com/speakers/

    Food, drinks, parties, workshops across 2 days!
    6th Annual MLOps World | GenAI Summit
    🗓️ Oct 8-9th
    🍾 Austin Renaissance Hotel

  • Comet reposted this

    Abby Morgan

    AI Growth Engineer @ Comet Opik | Technical Writer | Community Organizer | Mentor

    🦉 Ollie is on his way to the Big Apple for Demo Night with AI Tinkerers! 🍎 If you're in the NYC area, come join us tonight for an evening of lightning demos, community Q&A, and networking (did I mention free pizza?)

    👤 𝗪𝗵𝗼: Builders, tinkerers, and curious minds in AI
    ⚡ 𝗪𝗵𝗮𝘁: Lightning demos (15 mins each), followed by Q&A and community networking (pizza included 🍕)
    📍 𝗪𝗵𝗲𝗿𝗲: betaworks, 29 Little West 12th Street, NYC
    📆 𝗪𝗵𝗲𝗻: Tuesday, August 26, 2025, 6–9 PM ET

    Space is limited, so RSVP at the link in the comments below. See you soon! Comet Auth0

  • Comet

    This summer we welcomed 5 new members to the Comet team across the US, UK, and Greece ☀️ Each one brings impressive experience to their role and unique passions outside of work, from board games to golf, hiking, and even professional soccer. We’re so glad to have Peter, Steven, James, Stephen, and Nasos on the team. Please join us in welcoming them! 👋 Nasos Lamprokostopoulos James Gonzalez Peter Gray Steven Miller Stephen McGhie


Funding

Comet: 5 total rounds

Last Round

Series B

US$ 50.0M

See more info on Crunchbase