Running Apache Spark Standalone

Fire can be run on a Spark Standalone cluster. In this case, Hadoop does not need to be installed.

Installing Spark Standalone

First, we need to install Scala.
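
A minimal install sketch is shown below, assuming a Debian/Ubuntu host where Scala is available through apt; on other systems, download a Scala 2.11.x release from scala-lang.org instead.

    # Install Scala from the distribution repositories (assumes Debian/Ubuntu)
    sudo apt-get update
    sudo apt-get install -y scala

    # Verify the installation
    scala -version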

Install Apache Spark

  • Download Spark

  • Extract it, create a new directory called spark under /usr/local, and copy the extracted contents into it:

    • tar xf spark-2.1.0-bin-hadoop2.7.tgz
    • mkdir /usr/local/spark
    • cp -r spark-2.1.0-bin-hadoop2.7/* /usr/local/spark
  • Set up some environment variables in .bash_profile before you start spark-shell (source the file or open a new shell afterwards):

    • export SPARK_EXAMPLES_JAR=/usr/local/spark/examples/jars/spark-examples_2.11-2.1.0.jar
    • export PATH=$PATH:$HOME/bin:/usr/local/spark/bin
  • Start your Scala shell and run Spark:

    • Go to the Spark bin directory
    • cd /usr/local/spark/bin
    • ./spark-shell
  • You can start a standalone master server by executing:

    • ./sbin/start-master.sh (from the Spark home directory)
  • Once started, the master will print out a spark://HOST:PORT URL

  • You can also find this URL on the master’s web UI (http://MASTER_HOST_IP:8080 by default); a quick command-line check is sketched after this list.

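Once the master is running, a quick way to grab its spark://HOST:PORT URL from the command line is sketched below; 8080 is the standalone master's default web UI port and MASTER_HOST_IP is a placeholder for the master's address.

    # Extract the spark://HOST:PORT URL from the master's web UI page
    curl -s http://MASTER_HOST_IP:8080 | grep -o 'spark://[^<" ]*'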

Set up the Spark Slave (Worker) Node

  • Go to the SPARK_HOME/conf/ directory.
  • Edit the file spark-env.sh and set SPARK_MASTER_HOST.
    • If spark-env.sh is not present, spark-env.sh.template will be. Make a copy of spark-env.sh.template named spark-env.sh and add/edit the SPARK_MASTER_HOST field:
    • cp ./conf/spark-env.sh.template ./conf/spark-env.sh
  • Add a line to spark-env.sh (a scripted version of these steps is sketched after this list):
    • SPARK_MASTER_HOST='MASTER_HOST_IP'
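
A scripted version of the worker configuration above, assuming Spark was installed under /usr/local/spark as in the previous section; MASTER_HOST_IP is a placeholder to replace with the master's actual address.

    # Copy the template and point the worker at the master
    cd /usr/local/spark
    cp ./conf/spark-env.sh.template ./conf/spark-env.sh
    # MASTER_HOST_IP is a placeholder for the real master address
    echo "SPARK_MASTER_HOST='MASTER_HOST_IP'" >> ./conf/spark-env.sh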

Start Spark as a slave

  • Go to SPARK_HOME/sbin and execute the following command (a quick verification is sketched after this list):
    • ./start-slave.sh spark://MASTER_HOST_IP:7077
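
A fuller sketch of starting the worker and checking that it registered is given below; the grep is only a rough check, and the authoritative view is the Workers table on the master's web UI.

    # Start a worker against the master (run from the Spark home directory)
    cd /usr/local/spark
    ./sbin/start-slave.sh spark://MASTER_HOST_IP:7077

    # Rough check: registered worker IDs appear on the master web UI page
    curl -s http://MASTER_HOST_IP:8080 | grep -o 'worker-[^<" ]*'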

Installing Fire

We install Fire on the master node.

  • Download the Fire jar from the website

  • Go to the directory below:

    • cd fire-x.y.z
    • Update the ports for fire-ui and fire to 8090 and 8082, since the default ports 8080 and 8081 are already used by standalone Spark; any other free ports can be chosen as well.
    • From the fire-x.y.z directory, open conf/application.properties and update the port numbers.
  • Create the database and start the fire and fire-ui servers (the full sequence is sketched after this list):

    • ./create-h2-db.sh
    • ./run-fire.sh start
    • ./run-fire-server.sh start
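
Putting the steps above together, the install sequence looks roughly as follows; the exact property names inside conf/application.properties vary by Fire release, so inspect the file rather than relying on specific keys.

    cd fire-x.y.z

    # Change the fire-ui and fire ports to 8090 and 8082 in conf/application.properties
    # (8080 and 8081 are already taken by the standalone Spark master and worker web UIs)
    vi conf/application.properties

    # Create the H2 database and start the fire and fire-ui servers
    ./create-h2-db.sh
    ./run-fire.sh start
    ./run-fire-server.sh start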

Configuring Fire

Below are the configurations for Fire to submit jobs to the Spark Standalone cluster.

  • Once the fire and fire-ui servers have started, log in to the Fire UI.

Configurations in Spark

The following configurations have to be set appropriately:

  • Go to the Administration section and open the Spark configuration. Add the details below:
    • spark.master: spark://Master_host_ip:7077
    • spark.deploy-mode: client
    • spark.sql-context: SQLContext
    • After the above updates, save the configurations. (An equivalent manual spark-submit against the standalone master is sketched after this list for reference.)
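
For reference, the settings above correspond roughly to the manual submission below; SparkPi and the examples jar come from the Spark 2.1.0 distribution installed earlier and serve only as a smoke test of the standalone cluster.

    # Client-mode submission to the standalone master, analogous to what Fire is configured to do
    /usr/local/spark/bin/spark-submit \
      --master spark://MASTER_HOST_IP:7077 \
      --deploy-mode client \
      --class org.apache.spark.examples.SparkPi \
      /usr/local/spark/examples/jars/spark-examples_2.11-2.1.0.jar 100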

Now go to the application and try running any workflow.

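To confirm the Fire UI itself is reachable on the port configured earlier (8090 in this walkthrough), a minimal probe is:

    # Expect an HTTP status code such as 200 if fire-ui is listening on port 8090
    curl -s -o /dev/null -w "%{http_code}\n" http://MASTER_HOST_IP:8090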