Even though I'm running Windows 10, I successfully followed the Windows 7 directions at http://nishutayaltech.blogspot.in/2015/04/how-to-run-apache-spark-on-windows7-in.html, with the following notes:
- I used the latest versions of Scala and Spark, and I already had Java 8 and Python 3.4 installed.
- I used a Spark prebuilt package for Hadoop; initially the instructions about environment variables confused me, but I realized they meant:
- Create a new environment variable SPARK_HOME and set it to where I unzipped Spark (in this case, C:\spark-2.0.1-bin-hadoop2.7)
- Edit the PATH environment variable to add %SPARK_HOME%\bin
- Since I have Spark prebuilt for Hadoop 2.7, I downloaded the winutils.exe for hadoop-2.7.1 at https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin
- I edited log4j.properties to show only WARN and above, per http://stackoverflow.com/questions/28189408/how-to-reduce-the-verbosity-of-sparks-runtime-output
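For reference, here is the logging change sketched out (assuming the default template that ships with Spark 2.x; your template's exact contents may differ):

```properties
# In %SPARK_HOME%\conf, copy log4j.properties.template to log4j.properties,
# then change the root logger level from INFO to WARN:
log4j.rootCategory=WARN, console
```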
And, success! I was able to run spark-shell, open the Spark UI, and run the SparkPi example.
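For anyone curious what SparkPi actually computes: it's a Monte Carlo estimate of pi (sample random points in the unit square and count how many land inside the quarter circle), with the sampling distributed across Spark tasks. Here's a plain-Python sketch of the same computation, no Spark required; the function name and parameters are my own:

```python
import random


def estimate_pi(num_samples, seed=42):
    """Monte Carlo estimate of pi, mirroring the idea behind the
    SparkPi example (which spreads this sampling across the cluster)."""
    random.seed(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = random.random(), random.random()  # random point in the unit square
        if x * x + y * y < 1.0:                  # inside the quarter circle?
            inside += 1
    # Area of quarter circle / area of square = pi/4, so scale up by 4.
    return 4.0 * inside / num_samples


print(estimate_pi(100000))
```

With 100,000 samples the estimate typically lands within a couple of hundredths of pi, which is close enough to confirm the cluster is doing real work.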
Up next: choosing and setting up a Scala IDE