This document provides an overview of big data and Hadoop. It traces the origins of big data to Google's search engine architecture, then introduces Hadoop, including HDFS and MapReduce, and describes the main components of the Hadoop ecosystem. It outlines Hadoop distributions such as Cloudera and gives examples of using Cloudera to work with file formats, apply compression, and read data as a database. Finally, it compares ETL with ELT and demonstrates Talend as an ETL/ELT tool with database, batch, and streaming jobs.