This document discusses using Hadoop on a Eucalyptus cloud to analyze big data. It begins by describing the big data problem and how cloud computing offers a solution. It then explains how to set up an infrastructure-as-a-service (IaaS) cloud using Eucalyptus and describes the benefits of virtualization. Hadoop is introduced as a platform for big data, along with its use of MapReduce and HDFS. Security threats to clouds, such as XML signature attacks and script injection attacks, are also outlined. Examples of applications that use Hadoop, such as social network analysis, are provided.