Failed to locate the winutils binary in the hadoop binary path. Could not locate executable null\bin\winutils.exe in the Hadoop binaries (at org.apache...). Download Hadoop 2.7's winutils.exe and place it in a directory such as C:\Installations\Hadoop\bin. Now set HADOOP_HOME = C:\Installations\Hadoop in the environment variables.
Whether you want to unit test your Spark Scala application using ScalaTest or run a Spark application on Windows, you need to perform a few basic settings and configurations first. In this post, I will explain the configurations that will help you start your journey to run your Spark applications seamlessly on your Windows machine. Let’s get started –
First, note that you don’t need a Hadoop installation on your Windows machine to run Spark. You only need a way to use POSIX-like file access operations on Windows, which winutils.exe implements using Windows APIs.
Step 1. Download the winutils.exe binary from this link – https://github.com/steveloughran/winutils – and place it in a folder such as C:/hadoop/bin. Make sure you download the same Hadoop version as the one your Spark version was compiled against. You can check which Hadoop version your Spark release was compiled with in the pom of the Spark binary you are using – https://search.maven.org/artifact/org.apache.spark/spark-parent_2.11/2.4.4/pom
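If you'd rather confirm the Hadoop version programmatically than read the pom by hand, a small sketch like the following works on any Maven pom that declares a `hadoop.version` property (the pom fragment below is a made-up example for illustration, not the real Spark pom):

```python
import xml.etree.ElementTree as ET

def hadoop_version_from_pom(pom_xml):
    """Pull the hadoop.version property out of a Maven pom string."""
    root = ET.fromstring(pom_xml)
    # Maven poms use a default XML namespace; match the tag suffix
    # so the lookup works with or without it.
    for elem in root.iter():
        if elem.tag.endswith("hadoop.version"):
            return elem.text.strip()
    return None

# A toy pom fragment for illustration only.
sample_pom = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <properties>
    <hadoop.version>2.7.3</hadoop.version>
  </properties>
</project>"""

print(hadoop_version_from_pom(sample_pom))  # prints 2.7.3
```

Paste the real pom content from the Maven Central link above into the same function to see the actual version.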
Step 2. Set HADOOP_HOME and PATH in your environment variables, either via the Control Panel (available to all apps – the recommended option) or on the command prompt (for the current session only). Set HADOOP_HOME to C:/hadoop, i.e. the path containing the bin directory where winutils.exe is present.
Next, add %HADOOP_HOME%/bin to the PATH.
That’s all !!
Now you can run any Spark app on your local Windows machine in IntelliJ, Eclipse, or spark-shell. Please comment below if you face any issues!
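For IDE runs and unit tests it can be handy to set HADOOP_HOME from the code itself instead of relying on a system-wide variable. A minimal sketch (`ensure_hadoop_home` and the C:\hadoop default are my own illustration, not a Spark API):

```python
import os

def ensure_hadoop_home(default=r"C:\hadoop"):
    """Set HADOOP_HOME for the current process if it is missing,
    so Spark can locate bin\\winutils.exe under it."""
    os.environ.setdefault("HADOOP_HOME", default)
    return os.environ["HADOOP_HOME"]

# Call this before creating a SparkSession / SparkContext, e.g. in a
# test setup hook or at the top of your main method.
ensure_hadoop_home()
```

An existing HADOOP_HOME always wins over the default, so this is safe to leave in code that also runs on configured machines.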
Apache Spark is a lightning-fast cluster computing engine conducive to big data processing. I was trying to get hands-on with Spark, but I could not find any installers to use on Windows 7. That was disappointing, as all the packages were for Mac or Linux. This gave me even more interest in finding out what Spark can offer. I spent around two days searching the internet for options to install and configure it in a Windows-based environment. Finally, I was able to install and configure Spark on Windows 7. I want to share the steps for a working installation of Apache Spark.
To install Spark in a Windows-based environment, the following prerequisites should be fulfilled first.
- Download and install Scala version 2.10.4 from here, but only if you are a Scala user; otherwise this step is not required. If you are not a Scala user, you also do not need to set up the Scala path as an environment variable.
- Download a pre-built Spark binary for Hadoop. I chose Spark release 1.2.1, package type “Pre-built for Hadoop 2.3 or later”, from here.
- Download winutils.exe and place it in any location on the D drive. The official release of Hadoop 2.6 does not include the required binaries (like winutils.exe) that are needed to run Hadoop on Windows. Remember, Spark is an engine built over Hadoop.
- Un-zip the Spark package and place winutils.exe in the “C” drive.
- The first part of the work is done. Here comes the connectivity part.
- This is the most important step. If the Path variable is not set up properly, you will not be able to start the Spark shell. Now, how do you access the Path variable?
- Right-click on Computer, then left-click on Properties.
- Click on Advanced System Settings.
- Below Startup & Recovery, click the button labelled “Environment Variables”.
- You will see the window divided into two parts: the upper part reads “User variables for <username>” and the lower part reads “System variables”. We will create three new system variables, so click the “New” button under “System variables”.
- Set the variable name as ‘JAVA_HOME’ and the value as ‘C:\Program Files\Java\jdk1.7.0_79’ (I recommend checking your Java version and adjusting the path accordingly).
- Similarly, create a new system variable, name it ‘HADOOP_HOME’, and set the value to ‘C:\winutils’.
- Create a new system variable, name it ‘SPARK_HOME’, and set the value to ‘C:\SPARK\BIN’.
- Now, all you have to do is append these three system variables, namely JAVA_HOME, HADOOP_HOME & SPARK_HOME, to your Path variable, which can be done as follows: ‘%JAVA_HOME%\BIN;%HADOOP_HOME%;%SPARK_HOME%;’. Click OK to close the Environment Variables window, and then similarly the System Properties window.
To run Spark in a Windows environment:
- Open the command prompt, change directory to the location of the Spark directory, and navigate into the bin directory (cd bin).
- Run the command spark-shell and you should see the Spark logo with the Scala prompt.
- Spark should start.
- Once you see that Spark has started, open a web browser and type localhost:4040 in the address bar, and you shall see the Spark shell application UI.
- To quit Spark, at the command prompt type ‘exit’.
That is all it takes to install and run a standalone Spark cluster in a Windows-based environment. Hope this helps!