Robotics In Science Projects

Explore top LinkedIn content from expert professionals.

  • View profile for Gadi Singer

    Chief AI Scientist, Confidential Core AI | IEEE MICRO AI Columnist | Former VP & Director, Emergent AI Research, Intel Labs

    8,863 followers

    Drawing insights from biological signal processing, neuromorphic computing promises a substantially lower-power route to energy-efficient visual odometry (VO) in robotics. Published in Nature Machine Intelligence, this novel approach builds a VO algorithm from neuromorphic building blocks called resonator networks. Demonstrated on Intel’s Loihi neuromorphic chip, the network generates and stores a working memory of the visual environment while simultaneously estimating the changing location and orientation of the camera. The system outperforms deep learning approaches on standard VO benchmarks in both precision and efficiency, relying on fewer than 100,000 neurons without any training. This work is a key step toward using neuromorphic computing hardware for fast, power-efficient VO and the related task of simultaneous localization and mapping (SLAM), enabling robots to navigate reliably.

    A companion paper explores how the neuromorphic resonator network can be applied to visual scene understanding. By formulating a generative model based on vector symbolic architectures (VSA), a scene can be described as a sum of vector products, which a resonator network can then efficiently factorize to infer objects and their poses. The work demonstrates a new path for solving problems of perception, and many other complex inference problems, with energy-efficient neuromorphic algorithms and Intel hardware.

    Congratulations to researchers from the Institute of Neuroinformatics, University of Zurich and ETH Zurich, Accenture Labs, the Redwood Center for Theoretical Neuroscience at UC Berkeley, and Intel Labs.

    Learn more about neuromorphic VO: https://lnkd.in/gJCVVMCz

    Learn how the VSA framework was developed for neuromorphic visual scene understanding based on a generative model (companion paper): https://lnkd.in/gjAENfpp

    #iamintel #Neuromorphic #Robotics
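    For readers curious how the factorization step works in principle, here is a minimal NumPy sketch of a two-factor resonator network over bipolar vectors (an illustrative toy, not the paper's Loihi implementation; the dimension and codebook sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
D, Na, Nb = 2048, 30, 30                  # vector dimension and codebook sizes (arbitrary)

# Random bipolar codebooks: each column is a candidate factor vector.
A = rng.choice([-1, 1], size=(D, Na))
B = rng.choice([-1, 1], size=(D, Nb))

# "Scene" vector: element-wise binding of one entry from each codebook.
ia, ib = 7, 21
s = A[:, ia] * B[:, ib]

def binarize(v):
    v = np.sign(v)
    v[v == 0] = 1
    return v

# Initialize each estimate to the (binarized) superposition of its whole codebook.
a_hat = binarize(A.sum(axis=1))
b_hat = binarize(B.sum(axis=1))

# Resonator iterations: unbind with the other factor's estimate, project onto the
# codebook span, and re-binarize, until both estimates settle on codebook entries.
for _ in range(50):
    a_hat = binarize(A @ (A.T @ (s * b_hat)))
    b_hat = binarize(B @ (B.T @ (s * a_hat)))

print("recovered factor A:", int(np.argmax(A.T @ a_hat)), "(true:", ia, ")")
print("recovered factor B:", int(np.argmax(B.T @ b_hat)), "(true:", ib, ")")
```

    Each iteration unbinds the composite vector with the current estimate of one factor, projects the result onto the other codebook, and re-binarizes, so the search over all factor combinations happens in superposition rather than by enumeration.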

  • View profile for Aishwarya Srinivasan
    Aishwarya Srinivasan is an Influencer
    599,243 followers

    If you’re an AI engineer, or aspiring to be one, and looking to build strong technical portfolio projects, this one’s for you. I’ve pulled together 5 practical project ideas that go beyond toy examples and give you real exposure to open-weight models, multimodal inputs, long-context reasoning, tool use via MCP, and even on-device AI. Each project includes an open-source reference repo so you don’t have to start from scratch: fork, build, and iterate.

    1️⃣ Autonomous Browser Agent
    Turn any website into an API. Give the agent a natural-language goal; it plans, acts in a real browser, and returns structured output (a minimal sketch of the browser-control piece follows this post).
    → Model: DeepSeek V3 via Fireworks AI Inference
    → Planner: LangGraph
    → Browser control: Playwright MCP server or browser-use
    → Optional memory: mem0
    🔗 Repo: shubcodes/fireworksai-browseruse

    2️⃣ 1M-token Codebase Analyst
    Load massive repos like PyTorch into a single 1M-token window and answer deep questions about architecture and logic, with no brittle chunking.
    → Model: Llama 4 Maverick served via Fireworks AI (KV-cache paging)
    → Long-context tuning: EasyContext
    → Interface: Gradio or VS Code extension
    🔗 Repos: adobe-research/NoLiMa, jzhang38/EasyContext

    3️⃣ Multimodal Video-QA & Summariser
    Ingest long-form videos and output timeline-aligned summaries and Q&A. Combine visual frames with ASR transcripts for deep comprehension.
    → Model: MVU (ICLR ’25) or HunyuanVideo
    → Retrieval: LanceDB hybrid search
    → Serving: vLLM multimodal backend + FFmpeg
    🔗 Repo: kahnchana/mvu

    4️⃣ Alignment Lab (RLHF / DPO)
    Fine-tune a 7B open-weight model using preference data and evaluate its behavior with real alignment benchmarks.
    → Framework: OpenRLHF with Fireworks AI endpoints
    → Evaluation: RewardBench, trlX
    → Dataset: GPT-4o-generated preference pairs
    🔗 Repo: OpenRLHF/OpenRLHF

    5️⃣ Local-first Voice Assistant with Memory
    Build a privacy-first voice assistant that runs fully offline, remembers users, and syncs memory when online.
    → Model: Mobile-optimized Llama 3.2 with ExecuTorch or Ollama
    → ASR and TTS: Whisper.cpp + WhisperSpeech
    → Memory: mem0 via OpenMemory MCP
    🔗 Repo: mem0ai/mem0

    My two cents:
    → Don’t wait for the “perfect” starting point. Fork one of these repos, add a feature, refactor the flow, swap the model. That’s how you learn.
    → If you’re stuck starting from scratch, lean on these foundations and build iteratively.
    → You don’t need to be perfect at coding everything; pair up with tools like Cursor, or use coding copilots like Claude or GitHub Copilot to break through blockers.
    → Prefer working on visible, end-to-end workflows. Even better if you can ship a demo or write a detailed blog post about what you learned.
    → If you’re not ready to build a full product, even contributing to an existing open-source agent or LLM inference repo is a great start.

    Happy building 🚀
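    To make project 1 concrete, here is a hypothetical, minimal sketch of the browser-control "tool" such an agent could call, using Playwright's Python API directly (the referenced repo routes this through an MCP server and a LangGraph planner instead; the URL and selectors here are placeholders):

```python
# pip install playwright && playwright install chromium
from playwright.sync_api import sync_playwright

def fetch_page_text(url: str) -> dict:
    """One 'tool' an agent planner could call: open a page and return structured output."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="domcontentloaded")
        result = {
            "url": url,
            "title": page.title(),
            # Placeholder extraction step; a real agent would decide what to read next.
            "headings": page.locator("h1, h2").all_inner_texts(),
        }
        browser.close()
        return result

if __name__ == "__main__":
    print(fetch_page_text("https://example.com"))
```

    From here, the planner would inspect the structured output and decide which tool call (click, type, navigate, extract) to make next toward the natural-language goal.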

  • View profile for Manish Mazumder
    Manish Mazumder is an Influencer

    ML Research Engineer • IIT Kanpur CSE • LinkedIn Top Voice 2024 • NLP, LLMs, GenAI, Agentic AI, Machine Learning

    69,212 followers

    STOP making 10 random projects; instead, solve 1-2 high-impact business problems to learn end-to-end project work. Trust me, this will make your resume 20x more powerful. Here’s one idea:

    Consider this project: an end-to-end NLP pipeline for customer support ticket routing, which you train, deploy, and integrate into a live dashboard. Here’s what you get to do in ONE project:
    - Data cleaning from raw CSVs + feature engineering
    - Design a scalable ML pipeline (train/test split, evaluation, retraining logic)
    - Compare classical NLP (TF-IDF + Logistic Regression) with BERT (see the sketch after this post)
    - Containerize the model with Docker
    - Deploy via FastAPI
    - Create a Streamlit dashboard for stakeholders
    - Write documentation + present it to a non-technical mentor

    One project will teach you:
    - Data engineering
    - Software development
    - Product management
    - Good communication skills

    And that’s what real-world AI needs.

    If you're just getting started in AI / ML:
    - Solve a real problem (even a small one)
    - Use messy, real-world data
    - Involve modeling and deployment
    - Be able to explain the end-to-end project like a story

    Trust me: 1 great project >> 10 toy datasets.

    If you want more end-to-end project ideas, I have listed multiple projects you can choose from and build [Link in comment].
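    As a sketch of the classical baseline step in that pipeline (the CSV file and column names below are hypothetical), a TF-IDF + Logistic Regression model in scikit-learn might look like this:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Hypothetical ticket export: a free-text 'text' column and a 'queue' label column.
df = pd.read_csv("tickets.csv").dropna(subset=["text", "queue"])

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["queue"], test_size=0.2, stratify=df["queue"], random_state=42
)

# Classical baseline: sparse n-gram features + a linear classifier.
baseline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=2, sublinear_tf=True)),
    ("clf", LogisticRegression(max_iter=1000)),
])
baseline.fit(X_train, y_train)
print(classification_report(y_test, baseline.predict(X_test)))
```

    The same split and classification report then become the yardstick when you swap in a fine-tuned BERT model later in the project.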

  • View profile for Sahar Mor

    I help researchers and builders make sense of AI | ex-Stripe | aitidbits.ai | Angel Investor

    41,000 followers

    Relying on one LLM provider like OpenAI is risky and often leads to unnecessarily high costs and latency. But there's another critical challenge: ensuring LLM outputs align with specific guidelines and safety standards. What if you could address both issues with a single solution?

    This is the core promise behind Portkey's open-source AI Gateway. AI Gateway is an open-source package that seamlessly integrates with 200+ LLMs, including OpenAI, Google Gemini, Ollama, Mistral, and more. It not only solves the provider-dependency problem but now also tackles the crucial need for effective guardrails by partnering with providers such as Patronus AI and Aporia.

    Key features:
    (1) Effortless load balancing across models and providers
    (2) Integrated guardrails for precise control over LLM behavior
    (3) Resilient fallbacks and automatic retries so your application recovers from failed LLM API requests
    (4) Minimal added latency as a middleware (~10ms)
    (5) Supported SDKs include Python, Node.js, Rust, and more

    One of the main hurdles to enterprise AI adoption is ensuring LLM inputs and outputs are safe and adhere to your company’s policies. This is why projects like Portkey are so useful. Integrating guardrails into an AI gateway creates a powerful combination that orchestrates LLM requests based on predefined guardrails, providing precise control over LLM outputs.

    Switching to more affordable yet performant models is a useful technique to reduce cost and latency for your app. I covered this and eleven more techniques in my last AI Tidbits Deep Dive https://lnkd.in/gucUZzYn

    GitHub repo https://lnkd.in/g8pjgT9R
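    Gateway products aside, the core fallback-and-retry idea is easy to illustrate. Below is a generic sketch (not Portkey's SDK) using the OpenAI-compatible client interface that many providers expose; the endpoints, keys, and model names are made up:

```python
# pip install openai  (many providers expose an OpenAI-compatible API)
from openai import OpenAI

# Hypothetical endpoints in order of preference; real keys and models would come from config.
PROVIDERS = [
    {"base_url": "https://api.primary-provider.example/v1", "api_key": "PRIMARY_KEY", "model": "model-a"},
    {"base_url": "https://api.backup-provider.example/v1",  "api_key": "BACKUP_KEY",  "model": "model-b"},
]

def chat_with_fallback(messages, retries_per_provider=2):
    """Try each provider in turn, retrying transient failures, before giving up."""
    last_error = None
    for provider in PROVIDERS:
        client = OpenAI(base_url=provider["base_url"], api_key=provider["api_key"], timeout=30)
        for _ in range(retries_per_provider):
            try:
                resp = client.chat.completions.create(model=provider["model"], messages=messages)
                return resp.choices[0].message.content
            except Exception as err:   # retry this provider, then fall through to the next one
                last_error = err
    raise RuntimeError(f"All providers failed: {last_error}")

print(chat_with_fallback([{"role": "user", "content": "One-line summary of LLM guardrails?"}]))
```

    A gateway like Portkey packages this routing (plus guardrail checks) behind one endpoint so application code doesn't have to carry the logic itself.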

  • View profile for Akshet Patel 🤖

    Robotics Engineer | Creator

    45,446 followers

    Optimizing Visual Odometry: The Role of Field-of-View and Optics in Diverse Environments

    "Benefit of Large Field-of-View Cameras for Visual Odometry"

    This research evaluates the impact of camera field-of-view (FoV) and optics (e.g., fisheye and catadioptric) on the performance of visual odometry (VO) algorithms. Large-FoV cameras are found to improve VO performance in indoor and cluttered environments by capturing more visual information, while narrower FoVs are more effective in structured outdoor environments, such as urban canyons, because they minimize distortions and enhance feature accuracy.

    A state-of-the-art VO pipeline was developed to validate the performance of fisheye and catadioptric cameras in synthetic and real-world experiments. The experiments confirmed that the optimal camera design depends on the operational environment, with large FoVs better suited to enclosed spaces and smaller FoVs to outdoor scenarios. The findings highlight the importance of camera selection for robust and accurate motion estimation in mobile robotics.

    Video - https://lnkd.in/efdNM7c2
    Paper - https://lnkd.in/eeq5mzmi

    --------------------------------
    Join my WhatsApp Robotics Channel - https://lnkd.in/dYxB9iCh
    Join our Robotics Community - https://lnkd.in/e6twxYJF
    Opportunity_22: https://lnkd.in/eJwD3wN9
    --------------------------------
    #robotics
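    A quick way to see why the optics matter: under the textbook pinhole and equidistant fisheye projection models (standard models, not necessarily the exact ones used in the paper), a ray at angle theta from the optical axis lands at very different image radii, which is what caps a pinhole camera's usable FoV:

```python
import numpy as np

def pinhole_radius(theta, f=1.0):
    """Pinhole model: r = f * tan(theta). Diverges as theta approaches 90 deg, capping the FoV."""
    return f * np.tan(theta)

def fisheye_radius(theta, f=1.0):
    """Equidistant fisheye model: r = f * theta. Stays finite past 90 deg, allowing very wide FoV."""
    return f * theta

for deg in (10, 45, 80, 89):
    t = np.deg2rad(deg)
    print(f"theta = {deg:2d} deg   pinhole r = {pinhole_radius(t):7.2f}   fisheye r = {fisheye_radius(t):5.2f}")
```

    The trade-off the paper quantifies is that the wider view spreads the same pixel budget over more of the scene, which is why narrower optics can still win in feature-rich outdoor settings.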

  • View profile for Jacob Effron

    Managing Director at Redpoint Ventures

    15,244 followers

    In today's Unsupervised Learning I dive deep into self-driving cars and everything AI x hardware with Vincent Vanhoucke, Distinguished Engineer at Waymo and former Head of Robotics at DeepMind. Vincent has spent years at the intersection of AI and robotics, shaping how machines perceive, plan, and act in the physical world. From self-driving cars navigating complex cityscapes to the future of generalist robots, Vincent breaks down the real challenges and unexpected breakthroughs in bringing AI out of the cloud and onto the streets.

    Some highlights:

    1️⃣ The milestones that matter in AI x robotics
    In self-driving, the challenge has shifted from getting cars to drive autonomously to handling the long tail of rare, unpredictable edge cases that emerge over millions of miles driven. Vincent highlighted that the development of physically realistic world models, enabling robots and autonomous vehicles to simulate and train for countless real-world scenarios with high fidelity, would be game-changing. Ultimately, scaling and real-world deployment, rather than isolated lab successes, are the true markers of progress in AI-driven robotics.

    2️⃣ The impact of LLMs on robotics
    Vincent shared how LLMs and VLMs have had a transformative impact on robotics by introducing world knowledge into AI systems, significantly enhancing their perception and reasoning capabilities. Unlike traditional models that rely solely on sensor data from specific environments, LLMs provide contextual understanding, allowing robots to recognize objects or situations they've never directly encountered, like identifying unfamiliar police cars in a new city or recognizing rare accident scenarios. This semantic awareness helps self-driving cars and robots better interpret complex, real-world environments. By scaling up and leveraging LLMs, robotics can now bridge the gap between raw data perception and higher-level reasoning, pushing machines closer to human-like understanding.

    3️⃣ How Waymo enters new cities
    When Waymo enters a new city, the focus is on ensuring the system can handle local nuances. The core models are designed to be highly portable across different environments, but specific elements, like recognizing unique emergency vehicle designs or adapting to new traffic patterns, require validation to maintain safety and reliability. A significant part of the process involves extensive evaluation and testing, often using simulations to explore edge cases rather than simply gathering more real-world data. Additionally, Waymo works closely with regulators and local communities to ensure compliance and public trust. Vincent emphasizes that the biggest hurdle isn't always technical; it's about gaining social acceptance.

    A truly fascinating conversation on topics I've wanted to cover for a while. Check out the full discussion below:
    YouTube: https://lnkd.in/gEFNDDR6
    Spotify: https://bit.ly/4gXP8gK
    Apple: https://bit.ly/4gU2HNX

  • View profile for Moumita Paul

    Solving vision-driven, real-world autonomy.

    4,166 followers

    What if robots could react, not just plan?

    A good read: https://lnkd.in/gEGSp_5U

    This paper proposes Deep Reactive Policy (DRP), a visuo-motor neural motion policy designed to generate reactive motions in diverse dynamic environments, operating directly on point-cloud sensory input.

    Why does it matter? Most motion planners in robotics are either:
    - Global optimizers: great at finding the perfect path, but way too slow and brittle in dynamic settings.
    - Reactive controllers: quick on their feet, but they often get tunnel vision and crash in cluttered spaces.
    DRP claims to bridge the gap.

    And what makes it different?
    1. IMPACT (transformer core): pretrained on 10 million generated expert trajectories across diverse simulation scenarios.
    2. Student-teacher fine-tuning: fixes collision errors by distilling knowledge from a privileged controller (Geometric Fabrics) into a vision-based policy.
    3. DCP-RMP (reactive layer): basically a reflex system that adjusts goals on the fly when obstacles move unexpectedly.
    (A loose sketch of this "proposal + reflex" split follows the post.)

    Real-world evaluation results are interesting (success rates):
    - Static environments: DRP 90% | NeuralMP 30% | cuRobo-Voxels 60%
    - Goal Blocking: DRP 100% | NeuralMP 6.67% | cuRobo-Voxels 3.33%
    - Goal Blocking: DRP 92.86% | NeuralMP 0% | cuRobo-Voxels 0%
    - Dynamic Goal Blocking: DRP 93.33% | NeuralMP 0% | cuRobo-Voxels 0%
    - Floating Dynamic Obstacle: DRP 70% | NeuralMP 0% | cuRobo-Voxels 0%

    What stands out from the results is how well DRP handles dynamic uncertainty, the very scenarios where most planners collapse. NeuralMP, which relies on test-time optimization, simply can’t keep up with real-time changes, dropping to 0 in tasks like goal blocking and dynamic obstacles. Even cuRobo, despite being state-of-the-art in static planning, struggles once goals shift or obstacles move.

    DRP’s strength seems to come from its hybrid design: the transformer policy (IMPACT) gives it global context learned from millions of trajectories, while the reactive DCP-RMP layer gives it the kind of “reflexes” you normally don’t see in learned systems. The fact that it maintains 90% success even in cluttered or obstructed real-world environments suggests it isn’t just memorizing scenarios; it has genuinely learned a transferable strategy.

    That said, the dependence on high-quality point clouds is a bottleneck; in noisy or occluded sensing conditions, performance may degrade. Also, results are currently limited to a single robot platform (Franka Panda). So this paper is less about replacing classical planning and more about rethinking the balance between experience and reflex.
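    A loose sketch of that "global proposal + local reflex" split, for intuition only (the paper's IMPACT transformer and DCP-RMP layer are far more sophisticated; everything below is a hand-written stand-in):

```python
import numpy as np

def policy_step(q, goal):
    """Stand-in for the learned global policy: a proportional step toward the goal."""
    return 0.2 * (goal - q)

def reflex_step(q, obstacles, influence=0.3, gain=0.15):
    """Stand-in for the reactive layer: push away from the nearest obstacle when it gets close."""
    obstacles = np.atleast_2d(obstacles)
    diffs = obstacles - q
    dists = np.linalg.norm(diffs, axis=1)
    i = int(np.argmin(dists))
    if dists[i] > influence:
        return np.zeros_like(q)
    # Repulsion grows linearly as the nearest obstacle approaches.
    return -gain * (influence - dists[i]) / influence * diffs[i] / (dists[i] + 1e-8)

# Toy rollout: a point robot in the plane with one obstacle near its straight-line path.
q, goal = np.zeros(2), np.array([1.0, 0.5])
for _ in range(60):
    q = q + policy_step(q, goal) + reflex_step(q, [[0.5, 0.2]])
print(np.round(q, 3))   # should end near the goal after detouring around the obstacle
```

    The point of the hybrid is that the learned term supplies long-horizon intent while the reflex term reacts at control rate, which is roughly the balance the DRP results above reward.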

  • View profile for Nicholas Nouri

    Founder | APAC Entrepreneur of the year | Author | AI Global talent awardee | Data Science Wizard

    131,225 followers

    NVIDIA researchers are using the Apple Vision Pro headset to control humanoid robots in real time. Imagine putting on a headset and suddenly feeling as if you're inside a robot's body, controlling its movements with your own. According to the researchers, that's exactly the experience: they describe it as feeling "immersed" in another body, much like the movie Avatar.

    𝐒𝐨, 𝐇𝐨𝐰 𝐃𝐨𝐞𝐬 𝐓𝐡𝐢𝐬 𝐖𝐨𝐫𝐤? Let me break it down:
    - Human Demonstration with Apple Vision Pro: Operators wear the Apple Vision Pro headset to control humanoid robots. This provides initial demonstration data as they perform tasks the robot needs to learn.
    - RoboCasa Simulation Framework: This simulation tool takes the real-world data from the human demonstrations and multiplies it by generating a variety of virtual environments. Think of it as creating numerous practice scenarios without needing more human input.
    - MimicGen Data Augmentation: Building on that, MimicGen creates new robot motion paths based on the human demonstrations. It's like giving the robot creativity to try new ways of performing tasks.
    - Quality Filtering: The system automatically filters out any failed attempts, ensuring the robot learns only from successful actions.
    This process turns limited human input into a vast, high-quality dataset (a toy sketch of the augment-and-filter loop follows below).

    𝐖𝐡𝐲 𝐈𝐬 𝐓𝐡𝐢𝐬 𝐚 𝐁𝐢𝐠 𝐃𝐞𝐚𝐥? Traditionally, training robots requires a lot of human time and effort, which can be expensive and slow. NVIDIA's approach can multiply robot training data by 1,000 times or more using simulations. By leveraging powerful GPUs (graphics processing units), researchers can substitute computational power for costly human labor. Just as large language models (like those behind advanced chatbots) have rapidly improved by scaling up training data, this method could lead to advances in robot capabilities and adaptability. We're talking about robots that can learn and adapt much more quickly than before.

    The ability to efficiently scale training data means we could see rapid advancements in how robots perform complex tasks, interact with environments, and maybe even integrate into our daily lives sooner than we thought.

    Do you see this as a step forward in robotics and AI? How might this impact the future of work and technology?

    #innovation #technology #future #management #startups
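    The augment-then-filter loop is straightforward to caricature in code; in the sketch below the perturbation, simulator, and success check are all placeholders rather than NVIDIA's RoboCasa/MimicGen implementations:

```python
import numpy as np

def augment_and_filter(demos, variants_per_demo, simulate, success):
    """Turn a few teleoperated demos into many synthetic ones, keeping only the successes."""
    dataset = []
    for demo in demos:                                  # demo: (T, 3) array of waypoints
        for _ in range(variants_per_demo):
            # Placeholder perturbation: jitter the trajectory. (MimicGen instead adapts
            # demo segments to newly sampled object poses in simulation.)
            candidate = demo + np.random.normal(scale=0.01, size=demo.shape)
            outcome = simulate(candidate)               # placeholder physics rollout
            if success(outcome):                        # quality filter: drop failed attempts
                dataset.append(candidate)
    return dataset

# Toy usage with stand-in simulator and success check (a real pipeline would roll out physics).
demos = [np.linspace([0, 0, 0], [0.3, 0.2, 0.1], num=50)]
data = augment_and_filter(
    demos, variants_per_demo=100,
    simulate=lambda traj: traj[-1],                     # "outcome" = final end-effector position
    success=lambda pos: np.linalg.norm(pos - np.array([0.3, 0.2, 0.1])) < 0.02,
)
print(f"kept {len(data)} of 100 synthetic trajectories")
```

    The value comes from the ratio: one human demonstration in, hundreds of vetted synthetic trajectories out, with compute doing the vetting.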

  • View profile for Felix Fester

    Product X - Exploring the Future of AI Adoption | Robotics & Industrial Automation | Tech Innovation

    9,765 followers

    [Self-learning Robots Post]

    Robotics researchers are delving into the nuances of self-learning robots, and the R3M project stands at the forefront of this exciting frontier. R3M, short for Reusable Representation for Robotic Manipulation, is a visual representation model pre-trained on diverse human video datasets that significantly enhances the efficiency of robot learning. By focusing on the temporal dynamics of scenes, semantic relevance, and compactness, R3M serves as a robust perception module for robots performing various manipulation tasks.

    Three standout achievements from R3M's physical tests:
    1️⃣ Consistently outperforming other visual representations like CLIP and ImageNet across different viewpoints and dataset sizes, proving its robustness and versatility.
    2️⃣ In simulated environments, R3M successfully executed tasks such as assembling objects and operating kitchen appliances, demonstrating its adaptability to different domains.
    3️⃣ Real-world tests in a cluttered apartment setting showed that with just 20 demonstrations, R3M enabled a robot to perform complex tasks like folding towels and cooking prep with impressive success rates, showcasing its potential for practical home use.

    R3M is shaping up to be a universal model that could become the standard for robot manipulation tasks. It’s a good step towards robots that learn like us, through observation and imitation, but with the speed and precision that only a machine can achieve.

    #Robotics #MachineLearning #R3M #Innovation #SelfLearningRobots #AI #artificialintelligence #futureoftech
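    The pattern R3M enables looks roughly like this: freeze a pretrained visual encoder and train only a small policy head on its features from a handful of demonstrations. In the sketch below, torchvision's ImageNet ResNet-50 stands in for the actual R3M weights (the facebookresearch/r3m repo ships its own loader), and the policy head and action size are arbitrary:

```python
import torch
import torch.nn as nn
from torchvision import models

# Frozen pretrained encoder standing in for R3M's visual representation.
encoder = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
encoder.fc = nn.Identity()            # expose the 2048-d feature vector instead of class logits
encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False

# Small policy head trained from a few demonstrations (behavior cloning); 7-d action is arbitrary.
policy_head = nn.Sequential(nn.Linear(2048, 256), nn.ReLU(), nn.Linear(256, 7))

images = torch.rand(4, 3, 224, 224)   # placeholder camera frames
with torch.no_grad():
    features = encoder(images)        # shape (4, 2048)
actions = policy_head(features)       # shape (4, 7)
print(actions.shape)
```

    Because only the small head is trained, a couple dozen demonstrations can be enough, which is exactly the 20-demonstration result the post highlights.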

  • View profile for Robert 지영 Liebhart

    🇰🇷 Korea's #1 Robotics Voice | Your Partner for Robotics in Korea 🇰🇷 | 💡🤖 Join 75,000+ followers, 50M+ views | Contact for collaboration!

    74,824 followers

    🌐 "𝗘𝘅𝗽𝗹𝗼𝗿𝗶𝗻𝗴 𝘁𝗵𝗲 𝗦𝗸𝗶𝗲𝘀: 𝗜𝗻𝘀𝗶𝗴𝗵𝘁𝘀 𝗶𝗻𝘁𝗼 𝗠𝗶𝗱-𝗔𝗶𝗿 𝗥𝗼𝗯𝗼𝘁 𝗔𝘀𝘀𝗲𝗺𝗯𝗹𝘆" Dive into the fascinating world of aerial robotics with Penn Engineering's GRASP Lab and their project- the ModQuad Fleet. These robots demonstrate the potential of modular design in drone technology: 🔹 𝗠𝗶𝗱-𝗔𝗶𝗿 𝗔𝘀𝘀𝗲𝗺𝗯𝗹𝘆 𝗔𝗰𝗵𝗶𝗲𝘃𝗲𝗱: Discover the ModQuad robots, capable of connecting and assembling while flying, showcasing an innovative approach to aerial construction. 🔹 𝗖𝗼𝗹𝗹𝗮𝗯𝗼𝗿𝗮𝘁𝗶𝘃𝗲 𝗥𝗼𝗯𝗼𝘁𝗶𝗰𝘀 𝗶𝗻 𝗔𝗰𝘁𝗶𝗼𝗻: See how these drones work together in mid-air, reflecting collaborative strategies inspired by nature. 🔹 𝗩𝗲𝗿𝘀𝗮𝘁𝗶𝗹𝗲 𝗮𝗻𝗱 𝗗𝘆𝗻𝗮𝗺𝗶𝗰 𝗗𝗲𝘀𝗶𝗴𝗻: Appreciate the adaptability of these drones, designed to tackle a variety of tasks, from construction to exploration. 🔹 𝗔 𝗦𝘁𝗲𝗽 𝗙𝗼𝗿𝘄𝗮𝗿𝗱 𝗶𝗻 𝗥𝗼𝗯𝗼𝘁𝗶𝗰𝘀 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵: Recognizing the contributions of leading researchers in the field, the ModQuad project highlights the continuous evolution of robotics. 🔹 𝗣𝗼𝘁𝗲𝗻𝘁𝗶𝗮𝗹 𝗔𝗽𝗽𝗹𝗶𝗰𝗮𝘁𝗶𝗼𝗻𝘀 𝗶𝗻 𝗩𝗮𝗿𝗶𝗼𝘂𝘀 𝗙𝗶𝗲𝗹𝗱𝘀: Understand how these developments can be applied in scenarios ranging from engineering to emergency response.
