Vaibhav Srivastav’s Post


whatever needs doing @ Hugging Face

Google COOKED yet again - Multimodal Gemma 3n 4B and 2B now available in Transformers, vLLM, MLX AND Llama.cpp 🤯 The model can see, hear and type - all in 140 languages ⚡ Check out the models here: https://lnkd.in/gdnyse7q Best part: You can fine-tune it in a FREE Google Colab 🤗 Enjoy!

lalo morales

Technology & AI Enthusiast | Healthcare Specialist | Self Publisher | Full Stack Developer | Air Force Veteran

2mo

Gemini


A 4B-param multimodal model is insane!

Shreyans Bhansali

Building vertical AI agents | Freeing teams from busywork | Makersfuel x AskCodi

3mo

Gemma 3n sounds wild, but I’m wondering how its multimodal performance stacks up against open models like Yi or OpenMoE. Anyone tried pushing it beyond the demo scope yet? Curious how it handles noisy inputs.

Lyndon Chang ⚙️

Creating Automated Workflows for Smarter Businesses | Automa Solutions DM me “AUTOMA” if you are looking for help.

2mo

Google is 👑 data


Very impressive light model from Google: Gemma 3n 4B/2B! Short summary from the docs (https://ai.google.dev/gemma/docs/gemma-3n#parameters):
* Chatbot Arena Elo score: 1303 for Gemma 3n (4B) vs. 1223 for Phi 4 (14B), text only. It would be interesting to see the Elo score for a quantized version like https://huggingface.co/unsloth/gemma-3n-E4B-it-GGUF/blob/main/gemma-3n-E4B-it-Q6_K.gguf 😀
* Open weights, licensed for responsible commercial use
* Offline agentic use, built for privacy: no connection required
* Audio input: speech recognition, translation, and audio data analysis
* Visual and text input: multimodal capabilities let you handle vision, sound, and text
* PLE caching: Per-Layer Embedding (PLE) parameters can be cached to fast, local storage to reduce model memory costs
* MatFormer architecture: the Matryoshka Transformer architecture allows selective activation of the model's parameters per request
* Conditional parameter loading: skip loading the vision and audio parameters to save memory
* Wide language support: trained on over 140 languages
* 32K token context: substantial input context
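To make the MatFormer point above concrete, here is a toy Python sketch of the nesting idea (this is NOT the real Gemma 3n implementation, just an illustration under the assumption that a smaller sub-model reuses the leading slice of the full model's weights, so one parameter set can serve several model sizes):

```python
# Toy Matryoshka-style nested feed-forward layer: the first k hidden units
# of the full weight matrix double as a smaller model's weights, so the
# small model's outputs are a prefix of the large model's outputs.

def matmul(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

class NestedFFN:
    def __init__(self, W):
        self.W = W  # full hidden weight matrix, one row per hidden unit

    def forward(self, x, fraction=1.0):
        # Selectively activate only the leading `fraction` of hidden units,
        # analogous to serving a nested sub-model slice per request.
        k = max(1, int(len(self.W) * fraction))
        return matmul(self.W[:k], x)

W = [[1, 0], [0, 1], [1, 1], [2, -1]]
ffn = NestedFFN(W)
full = ffn.forward([3, 4])        # all 4 hidden units -> [3, 4, 7, 2]
half = ffn.forward([3, 4], 0.5)   # nested 2-unit sub-model -> [3, 4]
# half == full[:2]: the sub-model is literally a prefix of the full model
```

The real architecture is of course far more involved, but the prefix property is the intuition behind getting multiple effective model sizes out of one set of trained parameters.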

Chris Farish

Building High-Impact Agentic AI R&D Teams | Connecting Talent in AI Agents, RL, and LLMs (Pre/Post Training) | Generative AI Head-Hunter 🦥 🤗

3mo

🧑🍳 🦙


Very cool models for solo builders!

Florian Bansac

AI - Agents - FinTech

2mo

Google is coming back for the crown! Come share what you build and learn with 5,000+ of us in the AI Agents group on linkedin: https://www.linkedin.com/groups/6672014

Oriol Fernando Palacios Durand

Full Stack Developer | Web3, AI & Modern Web Architecture | Stacks, React, Astro, LLMs

3mo

I'm eager to use it to enhance all kinds of products and services, making them more human and moving past that computer-like feel most systems have nowadays. A new era of edge AI products is sure to arise!

Like
Reply