Installing Hortonworks for Windows
Intro
• I installed Hortonworks for Windows on my local Hyper-V machine.
• The following slides walk you through the steps for installing it on your machine.
• The entire content can also be found on my blog:
• http://www.bloomconsultingbi.com/2013/10/installationhortonworks-hadoop-13-part.html
• http://www.bloomconsultingbi.com/2013/10/installationhortonworks-hadoop-13-part_22.html
• Enjoy~!
Today we are going to install a Hadoop 1.3 single-node cluster on a Hyper-V system.

Download the files from the Hortonworks website:
http://hortonworks.com/products/hdp-windows/
Version 1.3
Download Install File
Click the link to begin the download. Unzipping the file creates a folder:
MSI File
See the text file "clusterproperties.txt"
Install and enable Hyper-V (Windows 8). Create a new VM and install Windows Server 2012.
Start the server. Be sure to create a network adapter; I created an "Internal" adapter:
Then set the network configuration (Internet Protocol Version 4):
Next, I copied the files up to the VM server and began the install, using the Hortonworks page as a reference:
Pre-requisites
• Next, open the Hortonworks pages to view the prerequisites for the install:
• http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-Win1.3.0/bk_installing_hdp_for_windows/content/win-chap2-singlenode.html
• http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-Win1.3.0/bk_installing_hdp_for_windows/content/win-getting-ready-2-3-1.html
• Download Python:
• http://www.python.org/download/
Python
Create a folder on the VM; I named it HWHadoop13:
Copy the Python installer to the VM as well, and update the Path variable...
Open PowerShell as Administrator. Rewrite the line of code in PowerShell, then execute it:
Python 2.7.5
*** MESSAGE TO READER ***
Be sure to add the Python executable path to the environment variable "PATH".
Use the following instructions to manually install Python in your local environment:

1. Download Python to the workspace directory.
2. Update the PATH environment variable. From the PowerShell window, execute the following commands as the Administrator user:
msiexec /qn /norestart /log %WORKSPACE%\python-2.7.5.log /i %WORKSPACE%\python-2.7.5.msi
setx PATH "$env:path;C:\Python27" /m

where
o %WORKSPACE% is the full workspace directory path.
o $env is the Environment setting for your cluster.

Important
Ensure the downloaded Python MSI name matches python-2.7.5.msi. If not, change the above command to match the MSI file name.
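Once the PATH has been updated, it can be worth double-checking that the Python folder really is one of its entries. The following is only a minimal sketch of such a check, not part of the Hortonworks instructions; the function name path_has_entry is made up for illustration, and C:\Python27 is assumed to be the install folder used above.

```python
import ntpath
import os


def path_has_entry(path_value, directory):
    """Return True if `directory` is one of the ';'-separated PATH entries.

    Windows path rules are used (case-insensitive, trailing backslash ignored).
    """
    wanted = ntpath.normcase(ntpath.normpath(directory))
    for entry in path_value.split(";"):
        entry = entry.strip().strip('"')
        if entry and ntpath.normcase(ntpath.normpath(entry)) == wanted:
            return True
    return False


if __name__ == "__main__":
    # C:\Python27 is the assumed install folder from the steps above.
    print(path_has_entry(os.environ.get("PATH", ""), r"C:\Python27"))
```

Open a new PowerShell window before running it, because setx only affects sessions started after the change.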
Next, download the C++ 2010 Redistributable Package...
Copy the file to the HWHadoop folder (your home directory for Hadoop)...
Type this at the PowerShell command line...
Microsoft Visual C++ 2010 Redistributable Package (64-bit)
1. Download the Microsoft Visual C++ 2010 Redistributable Package (64-bit) to the workspace directory.
2. Execute the following command from PowerShell with Administrator privileges:
%WORKSPACE%\vcredist_x64.exe /q /norestart

For example:
C:\prereqs\vcredist_x64.exe /q /norestart
Now, download the Microsoft .NET Framework...
Microsoft .NET Framework 4.0

*** MESSAGE TO READER ***
Be sure you are connected to the internet, because the installer has to pull some files off the web. If you're not connected, the install will fail.
1. Download Microsoft .NET Framework 4.0 to the workspace directory.
2. Execute the following command from PowerShell with Administrator privileges:
%WORKSPACE%\slavesetup\dotNetFx40_Full_setup.exe /q /norestart /log %WORKSPACE%/dotNetFx40_Full_setup.exe
.NET Framework
And now for the JDK:
• JDK 6 update 31 (6u31) or higher

• *** MESSAGE TO READER ***
• During the installation process, it threw an error. It turns out you cannot have spaces in the path for JAVA_HOME. So uninstall and reinstall to a new directory, e.g. C:\Java instead of C:\Program Files...
Use the instructions provided below to manually install the JDK to the workspace directory:
1. Check the version. From a command shell or PowerShell window, type:
java -version

2. (Optional) Uninstall the Java package if the JDK version is less than v1.6 update 31.
3. Go to the Oracle Java SE 6 Downloads page and accept the license. Download the JDK installer to the workspace directory.
Important
Ensure that no whitespace characters are present in the installation directory's path. For example, C:\Program Files is not allowed.
Next
From PowerShell with Administrator privileges, execute the following commands:

%WORKSPACE%\jdk-6u31-windows-x64.exe /qn /norestart /log %WORKSPACE%\jdk-6u31-windows-x64.log INSTALLDIR=C:\java\jdk1.6.0_31
setx JAVA_HOME "C:\java\jdk1.6.0_31" /m

where %WORKSPACE% is the full workspace directory path.
Important
Ensure the downloaded JDK .exe file's name matches jdk-6u31-windows-x64.exe. If not, change the above command to match the EXE file name.
For example:
C:\prereqs\jdk-6u31-windows-x64.exe /qn /norestart /log C:\prereqs\jdk-6u31-windows-x64.log INSTALLDIR=C:\java\jdk1.6.0_31
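Because a space in the JAVA_HOME path breaks the install, a quick sanity check before running the Hortonworks installer may save a reinstall. This is a small sketch under that assumption only; java_home_problems is a hypothetical helper name, not something the installer provides.

```python
import os


def java_home_problems(java_home):
    """Return a list of human-readable problems with a JAVA_HOME value."""
    problems = []
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif any(ch.isspace() for ch in java_home):
        # The install fails when the JDK path contains whitespace.
        problems.append("JAVA_HOME contains whitespace: %r" % java_home)
    return problems


if __name__ == "__main__":
    for problem in java_home_problems(os.environ.get("JAVA_HOME", "")):
        print(problem)
```

An empty list means the value passed both checks; it does not prove that a JDK is actually installed at that location.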
Note
Oracle
http://www.oracle.com/technetwork/java/javasebusiness/downloads/java-archive-downloads-javase6-419409.html#jdk-6u31-oth-JPR
The only catch is that you need an Oracle account, or you must create one.
Then execute the PowerShell command from the Hortonworks page:

http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-Win-1.3.0/bk_installing_hdp_for_windows/content/win-chap2-singlenode.html
Java_Home path
• After the prerequisites (Python, .NET, the C++ Redistributable and the Oracle JDK) are installed, you are ready to proceed. First, you'll want to set the JAVA_HOME path in the Environment Variables:
System Properties
Bug
• Please keep in mind there is a bug here: you may not have a space in your path, so you are advised to change the path to something like this after you reinstall the Java JDK.
Environment Variables
Next, set the PATH to include the Python executable:
You will also want to edit the HOSTS file so that the server name resolves to the VM's IP address:
From the command prompt, type hostname to obtain your host name:
Open the HOSTS file in Notepad and apply the necessary change:
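The change to the HOSTS file is a single line that pairs the VM's IP address with its host name. As a rough sketch of the edit, the helper below adds that line to the file's text only if the host name is not mapped yet. The name ensure_hosts_entry and the sample address and host name are made up for illustration; on Windows the file normally lives at C:\Windows\System32\drivers\etc\hosts.

```python
def ensure_hosts_entry(hosts_text, ip_address, hostname):
    """Return hosts_text, adding an 'ip<TAB>hostname' line if hostname is unmapped."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into the address and its names.
        fields = line.split("#", 1)[0].split()
        names = [name.lower() for name in fields[1:]]
        if hostname.lower() in names:
            return hosts_text  # already mapped, leave the file untouched
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + "%s\t%s\n" % (ip_address, hostname)


if __name__ == "__main__":
    # Hypothetical values; use the output of `hostname` and your VM's IPv4 address.
    sample = "# sample hosts file\n127.0.0.1 localhost\n"
    print(ensure_hosts_entry(sample, "192.168.0.10", "HDPNODE1"))
```

The function works on text only, so the actual file still has to be read and written from an editor or script running as Administrator.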
Now you'll want to open all ports:
Next
• Next, modify your clusterproperties.txt file, replacing the generic values with your actual values. I believe it worked better with the IP address than with the host name; however, the screen capture shows the host name.
View
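To see at a glance which values in the properties file still need replacing, the file can be read as simple KEY=value pairs. The sketch below assumes that layout (with # for comment lines) and assumes that host settings use keys ending in _HOST or _HOSTS; the key names in the sample are hypothetical and are not taken from the real file.

```python
def parse_properties(text):
    """Parse KEY=value lines into a dict, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def host_settings(props):
    """Return only the entries whose key looks like a host setting (assumed suffixes)."""
    return {k: v for k, v in props.items()
            if k.upper().endswith(("_HOST", "_HOSTS"))}


if __name__ == "__main__":
    # Hypothetical sample content, for illustration only.
    sample = "#Hosts\nEXAMPLE_MASTER_HOST=192.168.0.10\nEXAMPLE_LOG_DIR=c:\\hadoop\\logs\n"
    print(host_settings(parse_properties(sample)))
```

Listing the host entries this way makes it easy to confirm that every one of them holds the IP address (or host name) you intended.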
And finally, begin the install of Hortonworks
Hadoop 1.3 for Windows:
Folders
• You will need to add some folders to your C: drive as you progress. I experienced many errors and had to add a folder each time. Here's a view of some of the folder structure (not complete):
Folders
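Rather than adding folders one error at a time, the missing ones can be created in a single pass once you know which paths the installer complains about. This is a minimal sketch; ensure_folders is a hypothetical helper, and the folder names in the example are placeholders, not the actual list the installer needs.

```python
import os


def ensure_folders(root, relative_paths):
    """Create each missing folder under root and return the ones that were created."""
    created = []
    for rel in relative_paths:
        full = os.path.join(root, rel)
        if not os.path.isdir(full):
            os.makedirs(full)  # also creates intermediate folders
            created.append(full)
    return created


if __name__ == "__main__":
    # Placeholder names; replace them with the folders from your own error messages.
    for path in ensure_folders("C:\\", ["example_folder", os.path.join("example_parent", "logs")]):
        print("created", path)
```

Running it a second time creates nothing and returns an empty list, so it is safe to rerun after each failed attempt.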
After some trial and error, we have
successfully loaded the application:
Start the services:
You can run the smoke test:
Workaround
• Mine failed here; it turns out HDFS was never formatted. To help you out, here's the article that explains how to format HDFS:
• http://hortonworks.com/community/forums/topic/namenodecannot-be-started-after-successful-hdp-1-3-installation/
• WORKAROUND:
1. Open the “Hadoop Command Line” Command Prompt shortcut.
2. Run the following command, which sets up the NameNode directories: “hadoop namenode -format”
As you can see in this list of services, you may have to manually start the ones that did not start automatically:
Here's another view of the C: folder structure:
And here's the Task/Job tracker web page:
Here's the Log web page:
And lastly, the working file system web page:
And here are the shortcuts on the desktop:
Finished
• And that concludes this presentation.
• Happy Hadooping~!
Jonathan Bloom
Current Position:
Senior BI Consultant
• Twitter:
• @SQLJon
• Linked-in:
• http://www.linkedin.com/BloomConsultintBI
• Email:
• JBloom@agilebay.com
