Frame by Frame - July 2025
Welcome to Frame by Frame, where we decode the computer vision breakthroughs that are transforming industries. We cut through the hype to deliver insights that help you build better models, avoid costly mistakes, and stay ahead in the AI race.
VGGT: CVPR 2025 best paper winner reimagines 3D vision
Imagine reconstructing detailed 3D scenes from hundreds of images in under a second. A model from Meta AI stole the show at CVPR 2025 by promising exactly that, and it could transform how we build AR experiences, train robots, and create 3D content.
VGGT (Visual Geometry Grounded Transformer) reduces multi-view 3D reconstruction to a single feed-forward pass that outputs camera parameters, depth maps, dense point clouds, and point tracks. Unlike traditional geometry-heavy approaches, which can require minutes or hours, this neural network handles complex scenes in real time while outperforming existing methods.
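To make two of those outputs concrete, here is a minimal sketch of how a depth map and pinhole camera intrinsics relate to a point cloud. This is the textbook back-projection formula, not VGGT's code or API. The function name `unproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the toy depth values are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def unproject_depth(
    depth: List[List[float]], fx: float, fy: float, cx: float, cy: float
) -> List[Point]:
    """Back-project a depth map into camera-space 3D points.

    Assumes a simple pinhole camera: (fx, fy) are focal lengths in pixels
    and (cx, cy) is the principal point. Pixels with depth <= 0 are treated
    as invalid and skipped (a hypothetical convention for this sketch).
    """
    points: List[Point] = []
    for v, row in enumerate(depth):      # v is the pixel row
        for u, z in enumerate(row):      # u is the pixel column
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map; one pixel has no valid depth.
depth = [[2.0, 2.0],
         [0.0, 4.0]]
cloud = unproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(cloud)
```

A classical pipeline has to estimate the cameras and the depth in separate optimization stages before a step like this can run. The appeal of a feed-forward model is that all of these quantities come out of one pass.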
Upcoming computer vision events
Women in AI: July 24, 2025 | 9–11 AM PT - Save your spot!
Food Waste Estimation Hackathon - Computer Vision for Sustainability: August 1, 2025 | 10 AM – 8 PM CET - Save your spot!
Understanding Visual Agents: August 7, 2025 | 9 AM PT - Save your spot!
Outsourcing your data annotation is handing competitors the keys 🔑
When Meta’s stake in a labeling vendor sent Big Tech scrambling to protect their datasets, a hard truth hit home: the minute you ship raw data to an outsourcer, you surrender the moat that makes your models unique. Voxel51 Co-Founder and CEO Brian Moore lays out the hidden strategic costs of third-party annotation—and how in‑house, foundation‑model‑powered labeling flips the script, slashing spend while keeping your data sovereign.
Annotation is dead. Long live annotation.
Jason Corso, Co-Founder and Chief Scientist at Voxel51, joined the AI Automotive podcast to break down the critical role of annotation in AI applications, and how foundation models are dramatically reducing the cost of labeling visual datasets.
Part of a team building visual AI?
FiftyOne is the most powerful visual AI and computer vision data platform. Talk to an expert today to learn how you can supercharge your AI development workflows.