This paper surveys machine learning techniques for optimizing the performance of the Apache Hadoop framework, a key platform for big data processing whose performance is difficult to tune because of its complex configuration parameters. It reviews existing machine learning applications, identifies critical issues in the Hadoop system, and proposes a deep learning approach to improve performance. Several algorithms, including random forests and support vector regression, are analyzed for their effectiveness in self-tuning Hadoop configurations.