Vaibhav Srivastav’s Post


whatever needs doing @ Hugging Face

Google COOKED yet again - Multimodal Gemma 3n 4B and 2B now available in Transformers, vLLM, MLX AND Llama.cpp 🤯 The model can see, hear and type - all in 140 languages ⚡ Check out the models here: https://lnkd.in/gdnyse7q Best part: You can fine-tune it in a FREE Google Colab 🤗 Enjoy!

lalo morales

Technology & AI Enthusiast | Healthcare Specialist | Self Publisher | Full Stack Developer | Air Force Veteran

2mo

Gemini


A 4B-param multimodal model is insane!

Shreyans Bhansali

Building vertical AI agents | Freeing teams from busywork | Makersfuel x AskCodi

3mo

Gemma 3n sounds wild, but I’m wondering how its multimodal performance stacks up against open models like Yi or OpenMoE. Anyone tried pushing it beyond the demo scope yet? Curious how it handles noisy inputs.

Lyndon Chang ⚙️

Creating Automated Workflows for Smarter Businesses | Automa Solutions DM me “AUTOMA” if you are looking for help.

2mo

Google is 👑 data


Very impressive light model from Google: Gemma 3n 4B/2B! Short summary from the docs (https://ai.google.dev/gemma/docs/gemma-3n#parameters):
* Chatbot Arena Elo score: 1303 for Gemma 3n (4B) vs. 1223 for Phi 4 (14B), text only. It would be interesting to see the Elo score for a quantized version like https://huggingface.co/unsloth/gemma-3n-E4B-it-GGUF/blob/main/gemma-3n-E4B-it-Q6_K.gguf 😀
* Open weights, licensed for responsible commercial use
* Offline agentic use, built for privacy: no connection required
* Audio input: speech recognition, translation, and audio data analysis
* Visual and text input: multimodal capabilities let you handle vision, sound, and text
* PLE caching: Per-Layer Embedding (PLE) parameters can be cached to fast, local storage to reduce model memory costs
* MatFormer architecture: the Matryoshka Transformer architecture allows selective activation of the model's parameters per request
* Conditional parameter loading: skip loading the vision and audio parameters to save memory
* Wide language support: trained on over 140 languages
* 32K token context: substantial input context
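To make the MatFormer point above concrete, here is a toy Python sketch of the nesting idea (this is NOT the real Gemma 3n implementation, just an illustration under the assumption that a smaller sub-model reuses the leading slice of the full model's weights, so one parameter set can serve several model sizes):

```python
# Toy Matryoshka-style nested feed-forward layer: the first k hidden units
# of the full weight matrix double as a smaller model's weights, so the
# small model's outputs are a prefix of the large model's outputs.

def matmul(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

class NestedFFN:
    def __init__(self, W):
        self.W = W  # full hidden weight matrix, one row per hidden unit

    def forward(self, x, fraction=1.0):
        # Selectively activate only the leading `fraction` of hidden units,
        # analogous to serving a nested sub-model slice per request.
        k = max(1, int(len(self.W) * fraction))
        return matmul(self.W[:k], x)

W = [[1, 0], [0, 1], [1, 1], [2, -1]]
ffn = NestedFFN(W)
full = ffn.forward([3, 4])        # all 4 hidden units -> [3, 4, 7, 2]
half = ffn.forward([3, 4], 0.5)   # nested 2-unit sub-model -> [3, 4]
# half == full[:2]: the sub-model is literally a prefix of the full model
```

The real architecture is of course far more involved, but the prefix property is the intuition behind getting multiple effective model sizes out of one set of trained parameters.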

Chris Farish

Building High-Impact Agentic AI R&D Teams | Connecting Talent in AI Agents, RL, and LLMs (Pre/Post Training) | Generative AI Head-Hunter 🦥 🤗

3mo

🧑🍳 🦙


Very cool models for solo builders!

Florian Bansac

AI - Agents - FinTech

2mo

Google is coming back for the crown! Come share what you build and learn with 5,000+ of us in the AI Agents group on linkedin: https://www.linkedin.com/groups/6672014

Oriol Fernando Palacios Durand

Full Stack Developer | Web3, AI & Modern Web Architecture | Stacks, React, Astro, LLMs

3mo

I'm eager to use it to enhance all kinds of products and services, making them more human and moving past that computer-like feel most systems have nowadays. A new era of edge AI products is sure to arise!

Like
Reply