Just came across RAG-Anything, an open-source all-in-one multimodal RAG framework. It goes beyond traditional text-based RAG by supporting:

📄 PDFs, Office docs, images, tables, and equations
🔍 Multimodal queries (text + visuals + structured data)
⚡ MinerU-powered parsing for complex layouts
🔗 Knowledge graph construction with cross-modal relationships

If you're exploring multimodal retrieval for research papers, financial reports, or technical docs, this repo is worth checking out. Have you tried building multimodal RAG systems yet?
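To make the idea concrete, here's a minimal toy sketch of what "multimodal retrieval" means: chunks extracted from a document carry a modality tag (text, table, equation) and one query is scored against all of them. This is just an illustration of the concept with a bag-of-words stand-in for real embeddings; it is not RAG-Anything's actual API, and all names here are made up.

```python
# Toy sketch of cross-modal retrieval: one query searches text, table,
# and equation chunks alike. Not RAG-Anything's API; names are illustrative.
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical chunks a parser might extract, each with a modality label.
chunks = [
    {"modality": "text", "content": "revenue grew 12 percent year over year"},
    {"modality": "table", "content": "quarter revenue profit Q1 100 20 Q2 112 25"},
    {"modality": "equation", "content": "growth = (rev_now - rev_prev) / rev_prev"},
]

def retrieve(query, k=2):
    """Rank all chunks, regardless of modality, against one query."""
    q = embed(query)
    scored = sorted(chunks, key=lambda c: cosine(q, embed(c["content"])), reverse=True)
    return scored[:k]

top = retrieve("how did revenue grow")
```

The point of the sketch is that the table row and the prose sentence compete in the same ranked list, which is what separates multimodal RAG from text-only pipelines.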
nice find. this multimodal stuff is where the real fun is. looking forward to seeing what people build with it. have you tinkered with it yourself?
GitHub: https://github.com/HKUDS/RAG-Anything