This document introduces big data and provides an overview of key concepts. Big data refers to datasets so large and complex that traditional data-processing software cannot store or analyze them efficiently. Such data is commonly characterized by the four V's: volume, velocity, variety, and veracity. Hadoop is an open-source framework for distributed storage and processing of big data across clusters of commodity hardware, with MapReduce as its original processing engine. Hive provides data warehouse infrastructure on top of Hadoop for querying structured data with a SQL-like language (HiveQL), while MapReduce is a programming model for parallel processing of large datasets: a map phase transforms input records into key-value pairs, and a reduce phase aggregates the values for each key, with both phases running in parallel across the cluster.
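To make the MapReduce model concrete, here is a minimal word-count job in Java, closely following the classic example from the Hadoop MapReduce tutorial. It is a sketch rather than a production job; class names such as WordCount, TokenizerMapper, and IntSumReducer, and the input/output paths passed on the command line, are illustrative. The mapper emits each word with a count of 1, and the reducer sums the counts for each word.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: split each input line into words and emit (word, 1) pairs.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // pre-aggregate on each mapper
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory in HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory in HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Reusing the reducer as a combiner lets each mapper pre-aggregate its local counts, so less intermediate data is shuffled across the network between the map and reduce phases.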