ai llm transformer machine learning visiontransformer deep learning 人工知能 社内勉強会 機械学習 deeplearning 大規模言語モデル paper discussion aiエンジニア 生成ai llm agents large language model point clouds llms normalization エンジニア zero-shot xtts tts 自然言語処理 chatgpt hallucination ハルシネーション jit compilation programming cpython faster python project python open-source octo engineer robot policy anygpt self-attention mlp visual inspection biological vision image cnn pixel-focused attention vision transformers transnext multilingual 多言語 音声合成 spectrogram neural codec zeroshot text-to-speech model computervision technology ai development ai engineer 3dreconstruction sota 3dvision r&d cvpr 2025 vit depth estimation image recognition 画像認識 言語モデル 物体認識 object recognition robot ロボット imagegeneration selfattention pixellevelmodeling artificial intelligence 画像生成 深層学習 ディープラーニング architecture アーキテクチャ mamba rnn mvs multi-view stereo vggt π0.5 transfusion wavefit speech restoration data centric ai conversation low latency full duplex real-time dialogue speech-text foundation model text generation マルチモーダルai efficient data processing image-language integration blip-3 multimodal a usm deepmind miipher cot prompt プロンプト設計 chainofthought web design restful architecture api royfielding hypermedia rest vlm embodied ai vision-language-action model multi-modal rag reciperag #machinelearning #llm #ragsystem vision-language model inductivebias image-text processing llava multimodal model thechnology 最新技術 マルチモーダル gpt multimodal generative ai pre-trained pose estimation object tracking 3d reconstruction image matching diffusion models miipher-2 pruning model compression reinforcement learning dqn mixture of experts skywork moe pixeltransformer pointllm recipe generation camera pose estimation
See more