These are the slides used in the following video on the YouTube nnabla channel.
【Deep Learning Training】Transformer Basics and Applications, Part 4: Extending to Multimodal
https://youtu.be/av1IAx0nzvc
【References】
・Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
https://arxiv.org/pdf/2312.17172
・A Generalist Agent
https://arxiv.org/pdf/2205.06175
・Flamingo: a Visual Language Model for Few-Shot Learning
https://arxiv.org/pdf/2204.14198
・NExT-GPT: Any-to-Any Multimodal LLM
https://arxiv.org/pdf/2309.05519
・MUTEX: Learning Unified Policies from Multimodal Task Specifications
https://arxiv.org/pdf/2309.14320
・On the Opportunities and Risks of Foundation Models
https://arxiv.org/pdf/2108.07258
・RT-1: Robotics Transformer for Real-World Control at Scale
https://arxiv.org/pdf/2212.06817
・ViNT: A Foundation Model for Visual Navigation
https://arxiv.org/pdf/2306.14846
・Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
https://arxiv.org/pdf/2204.01691
・RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
https://arxiv.org/pdf/2307.15818
・Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware
https://arxiv.org/pdf/2304.13705
・Open X-Embodiment: Robotic Learning Datasets and RT-X Models
https://arxiv.org/pdf/2310.08864
・【AI Technology Training】Introduction to Deep Reinforcement Learning with nnabla-rl, Part 1: "What Is Deep Reinforcement Learning?"
https://youtu.be/KZ0pwIIBKYU?si=AabrkXkCvNjJjR0R
・Mastering the game of Go with deep neural networks and tree search
https://doi.org/10.1038/nature16961
・Outracing champion Gran Turismo drivers with deep reinforcement learning
https://doi.org/10.1038/s41586-021-04357-7
・A Survey on Transformers in Reinforcement Learning
https://arxiv.org/pdf/2301.03044
・Decision Transformer: Reinforcement Learning via Sequence Modeling
https://arxiv.org/pdf/2106.01345
・Transformer-based World Models Are Happy With 100k Interactions
https://arxiv.org/pdf/2303.07109