Fire Architecture

Sparkflows Fire Architecture

Fire consists of three core components:

  • Web Browser for defining end-to-end workflows for building data products and applications
    • Users interact with the web-based, drag-and-drop user interface to create Datasets and Workflows.
    • Workflows draw on the extensive set of functional and operational nodes displayed in the user interface, such as Data Profiling, Data Cleaning, ETL, NLP, OCR, and Machine Learning.
  • Web Server running on an edge node in an Apache Spark cluster
    • A workflow is run by submitting it to the web server, which in turn submits it to the Apache Spark cluster as a Spark job using spark-submit. The results of the workflow execution are streamed back and displayed in the browser.
    • The web server also provides a host of other features, such as interactive execution, schema inference and propagation, user permissions and roles, and LDAP integration.
  • Apache Spark cluster on which the workflows are executed as Spark jobs
    • Workflows are saved as JSON strings.
    • Workflows can also be submitted to the Spark cluster from the command line using spark-submit.
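
The two points above (a workflow stored as a JSON string, and submission through spark-submit) can be illustrated with a minimal Python sketch. It is not Fire's actual format or command line: the workflow field names, the jar name, the main class, and the workflow-file argument are all hypothetical placeholders. Only `--class`, `--master`, and `--deploy-mode` are standard spark-submit options.

```python
import json
import shlex

# Hypothetical workflow layout: the real Fire JSON schema is not shown here,
# so every field name below is an illustrative assumption.
workflow = {
    "name": "example-workflow",
    "nodes": [
        {"id": "1", "type": "ReadCSV", "path": "data/input.csv"},
        {"id": "2", "type": "DataCleaning"},
    ],
    "edges": [{"from": "1", "to": "2"}],
}

# A workflow is persisted as a JSON string and parses back unchanged.
workflow_json = json.dumps(workflow, indent=2)
assert json.loads(workflow_json) == workflow


def build_spark_submit(jar, main_class, workflow_file, master="yarn"):
    """Build a spark-submit argument list for running a saved workflow.

    --class, --master and --deploy-mode are standard spark-submit options;
    the jar, main class and workflow-file argument are placeholders.
    """
    return [
        "spark-submit",
        "--class", main_class,
        "--master", master,
        "--deploy-mode", "client",
        jar,
        workflow_file,
    ]


cmd = build_spark_submit(
    jar="workflow-runner.jar",                # hypothetical jar name
    main_class="com.example.WorkflowRunner",  # hypothetical main class
    workflow_file="example-workflow.json",    # hypothetical argument
)

# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so it could be passed to `subprocess.run` without shell quoting problems once the placeholders are replaced with the values from an actual installation.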