Dr. Michael Fröhlich’s Post


Software Engineering at Langfuse

Now you know which scores represent "Quality" in your AI agent. But the next step: applying them to every AI call in production? That’s the real game. You can’t check thousands of user interactions manually.
You need automation. There are 3 fundamental ways:
1. Human Annotation → Ground truth. Small sample, deep accuracy.
2. Rule-based Checks → Black-and-white. Fast. Cheap. Every call.
3. LLM-as-a-Judge → Scales nuance (e.g. helpfulness, relevance).
Combine all 3 → Continuous, reliable, scalable evals. That’s how you stop hoping your AI works… and know it does.
Diving into AI Observability & Evals (5/6) #AIObservability #Tracing #LLM #AI
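A minimal sketch of what methods 2 and 3 could look like in code, assuming a hypothetical call_llm() helper for the judge model; the check names, prompt, and 1-5 scale are illustrative choices, not the Langfuse API.

```python
import json


def _is_json(text: str) -> bool:
    """Helper: does the output parse as JSON?"""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


def rule_based_check(output: str) -> dict:
    """Method 2: deterministic, cheap checks that can run on every call."""
    return {
        "non_empty": len(output.strip()) > 0,
        "no_refusal": "I can't help with that" not in output,
        "valid_json": _is_json(output),
    }


JUDGE_PROMPT = """Rate the assistant's answer for helpfulness on a 1-5 scale.
Question: {question}
Answer: {answer}
Respond with only the number."""


def llm_as_judge(question: str, answer: str, call_llm) -> int:
    """Method 3: a judge model scores nuanced criteria like helpfulness.

    `call_llm` is a hypothetical function that sends a prompt to the judge
    model and returns its text response; parsing assumes the judge replies
    with just a digit.
    """
    raw = call_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    return int(raw.strip())
```

In practice, these scores would be attached to each trace in an observability tool so they can be tracked continuously, with a small human-annotated sample (method 1) used to spot-check that the automated scores agree with ground truth.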

Alexis Gamboa

Co-Founder at loopid.com

6d

Dr. Michael Fröhlich If you had to choose: human annotations vs. LLM-as-a-Judge? We struggle with customers not being able to invest enough time curating the agent's behaviour, especially when scaling. How far do you think we can go with mostly LLM-as-a-Judge?

