- GPT-3 is a large language model developed by OpenAI with 175 billion parameters, making it the largest dense language model at the time of its release.
- GPT-3 is trained auto-regressively on a massive dataset of unlabeled text, which allows it to perform new tasks without fine-tuning or gradient updates: in zero-, one-, or few-shot learning, task instructions and examples are supplied entirely within the input prompt (see the sketch after this list).
- Evaluation showed GPT-3 matching or exceeding fine-tuned state-of-the-art models on several benchmarks in zero-, one-, and few-shot settings, demonstrating strong generalization from its massive pre-training alone.
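The few-shot setup can be made concrete with a short sketch: the "training examples" are written directly into the prompt, and the model simply continues the text. Since GPT-3's weights are not public, the sketch below uses GPT-2 via the Hugging Face `transformers` pipeline as an illustrative stand-in; the prompting pattern itself is the one described in the paper.

```python
# Minimal sketch of few-shot prompting with an autoregressive LM.
# GPT-2 stands in for GPT-3 here, since GPT-3's weights are not public;
# the in-context learning pattern is identical.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Few-shot prompt: each line is a demonstration; the final line is the
# query the model is asked to complete. No gradient updates occur.
prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)

# The model continues the prompt auto-regressively, one token at a time.
output = generator(prompt, max_new_tokens=5, do_sample=False)
print(output[0]["generated_text"])
```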