Databricks Integration Steps

Fire Insights integrates with Databricks. It submits jobs to Databricks clusters through the Databricks REST API and displays the results back in Fire Insights.

Fire Insights also fetches the list of databases and tables from Databricks, which makes it easier for users to build and execute their workflows. In addition, Fire Insights displays the list of Databricks clusters running for the user.

Databricks can be running on Azure or on AWS.

Below are the steps for integrating Fire Insights with your Databricks clusters.

Install Fire Insights

Install Fire Insights on any machine. The machine has to be reachable from the Databricks cluster.

Upload Fire Core Jar to Databricks

The Fire Insights core jar has to be uploaded to Databricks. Fire Insights jobs running on Databricks use this jar file.

Upload the core jar from fire-x.y.z/fire-core-lib/, for example fire-spark_2_4-core-3.1.0-jar-with-dependencies.jar, to Databricks. Upload it as a Library under Workspace.

1. Log in to the Databricks cluster

2. Click Workspace in the left pane

3. Create a new Library

4. Upload fire-spark_2_4-core-3.1.0-jar-with-dependencies.jar from your machine by clicking Drop JAR here

5. Once fire-spark_2_4-core-3.1.0-jar-with-dependencies.jar is uploaded, click on Create

  • Check the Install automatically on all clusters box so that the jar does not have to be uploaded manually to every cluster.

Configure the Uploaded Library in Fire Insights

In Fire Insights, configure the path of the Fire core jar library that was uploaded to Databricks.

This is done under Administration/Configuration.

Configure app.postMessageURL in Fire Insights

Configure app.postMessageURL to use the IP address of the machine running Fire Insights, so that it is reachable from jobs running on the Databricks cluster.
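A common mistake with a callback address of this kind is to leave it pointing at localhost, which a job on the Databricks cluster cannot reach. The sketch below is only an illustrative sanity check and not part of Fire Insights: the function name and the sample URLs are hypothetical, and it assumes the configured value is a full URL with a scheme.

```python
import ipaddress
from urllib.parse import urlparse


def host_is_loopback(url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"no host found in URL: {url!r}")
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (a DNS name); it cannot be judged without resolving it.
        return False
```

For example, host_is_loopback("http://127.0.0.1:8080") is True, which signals that the value would not be usable from the cluster, while a routable address such as the hypothetical "http://10.0.0.12:8080" returns False. Passing this check does not prove reachability; firewalls and network routing still need to allow the connection.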

Install Databricks JDBC Driver

Fire Insights needs the Databricks JDBC Driver to be installed. Install it in both the fire-user-lib and fire-server-lib folders of the Fire Insights installation.

You can download the Databricks JDBC Driver from the Databricks site.

The driver is available as a zip file, e.g. SimbaSparkJDBC-2.6.3.1003.zip

  • Unzip the downloaded file. This creates a directory such as SimbaSparkJDBC-2.6.3.1003
  • Copy the JDBC jar file named SparkJDBC4.jar into both fire-x.y.z/fire-user-lib and fire-x.y.z/fire-server-lib
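The copy step above can also be scripted. The following is a minimal sketch, assuming the driver has already been unzipped; the function name install_jdbc_driver is hypothetical, and the paths passed to it are whatever applies to your installation.

```python
import shutil
from pathlib import Path
from typing import List


def install_jdbc_driver(driver_jar: Path, fire_home: Path) -> List[Path]:
    """Copy the JDBC driver jar into both library folders of a Fire installation."""
    copied = []
    for lib_dir in ("fire-user-lib", "fire-server-lib"):
        dest_dir = fire_home / lib_dir
        if not dest_dir.is_dir():
            # Fail early rather than silently creating an unexpected folder.
            raise FileNotFoundError(f"missing library folder: {dest_dir}")
        copied.append(Path(shutil.copy2(driver_jar, dest_dir)))
    return copied
```

A call would look like install_jdbc_driver(Path("SimbaSparkJDBC-2.6.3.1003/SparkJDBC4.jar"), Path("fire-x.y.z")), with fire-x.y.z replaced by the actual installation directory. The function returns the two destination paths so that the result can be logged or verified.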

Create your REST API token in Databricks

Create your token in Databricks. It is used to make REST API calls to Databricks from Fire Insights.

1. Log in to your Databricks account

2. Click the Account icon in the top right corner

3. Click on User Settings

4. Click on Generate New Token

5. Add a comment and a Lifetime (days) value for token expiry, then click Generate

6. Copy the generated token, then click DONE
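Before entering the token in Fire Insights, it can be useful to confirm that it works. The sketch below is one way to do that and is not how Fire Insights itself makes its calls: it uses the token as a Bearer token against the Databricks clusters list endpoint (/api/2.0/clusters/list). The function names are hypothetical, and the workspace URL and token are values you supply.

```python
import json
import urllib.request


def build_clusters_list_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the Databricks clusters list endpoint."""
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def list_cluster_names(workspace_url: str, token: str) -> list:
    """Call the endpoint and return the names of the clusters visible to the token."""
    request = build_clusters_list_request(workspace_url, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # The "clusters" key may be absent when the workspace has no clusters.
    return [cluster.get("cluster_name") for cluster in payload.get("clusters", [])]
```

Calling list_cluster_names with your workspace URL and the copied token should return the cluster names; an HTTP 401 or 403 error usually means the token is wrong or has expired.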

Create Databricks Connection in Fire Insights

Create a connection in Fire Insights to Databricks.

It can be created by the Administrator under Administration/Global Connections. These connections are available for everyone to use.

It can also be created by any user within their Application. In this case, it is available only to that Application and its users.

  • Specify your Databricks Token.
  • Specify the Databricks JDBC URL of your cluster in Databricks.
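A mistyped JDBC URL is easy to miss, so it can help to break the URL into its parts before saving the connection. The sketch below assumes the URL form used by the Simba Spark driver, that is jdbc:spark://host:port/database followed by semicolon-separated key=value properties such as httpPath. The function name and the sample URL are hypothetical, and other driver versions may use a different prefix.

```python
def parse_databricks_jdbc_url(url: str) -> dict:
    """Split a jdbc:spark:// URL into host, port, database and driver properties."""
    prefix = "jdbc:spark://"
    if not url.startswith(prefix):
        raise ValueError("expected a URL starting with 'jdbc:spark://'")
    # The first segment is host:port/database; the rest are key=value properties.
    location, *properties = url[len(prefix):].split(";")
    host_port, _, database = location.partition("/")
    host, _, port = host_port.partition(":")
    parsed = {"host": host, "port": port, "database": database}
    for prop in properties:
        if not prop:
            continue  # tolerate a trailing semicolon
        key, _, value = prop.partition("=")
        parsed[key] = value
    return parsed
```

For a URL such as jdbc:spark://example.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/0000-000000-abc123 the result contains the host, the port 443, the database default, and the httpPath value, which can then be compared against the values shown for the cluster in Databricks.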

You are now ready to use the Databricks connection in Fire Insights to:

  • Browse DBFS
  • View your Databricks Clusters
  • Browse your Databricks Databases & Tables
  • Create Workflows which Read from and Write to Databricks