Datasets REST API

Overview

The Dataset REST APIs allow you to manage Datasets.

Below are the Dataset APIs available in Fire Insights. They should be executed after you have logged into Fire Insights.

GET List of Datasets by Application

Returns the list of Datasets for the logged-in user in the given application (passed as the projectId query parameter):

curl -X GET --header 'Accept: application/json' --header 'api_key: cookies' 'http://localhost:8080/api/v1/datasets?sortPara=dsc&projectId=1'
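
The same request can be assembled with Python's standard library. This is a minimal sketch, not an official client: the helper name list_datasets_request is hypothetical, the base URL and the 'api_key: cookies' header are copied from the curl example above, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # assumption: local server, as in the curl example


def list_datasets_request(project_id, sort_para="dsc"):
    """Build (but do not send) the GET request that lists datasets for a project."""
    query = urlencode({"sortPara": sort_para, "projectId": project_id})
    return Request(
        f"{BASE_URL}/api/v1/datasets?{query}",
        headers={"Accept": "application/json", "api_key": "cookies"},
        method="GET",
    )


req = list_datasets_request(1)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it; the session established at login still has to be supplied for the call to be authorized.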

Create / Update Dataset

If no id value is passed, a new dataset is created; if an id is passed, the dataset with that id is updated:

JSON

{
  "id": 13,
  "version": 0,
  "name": "spam",
  "header": true,
  "path": "data\/spam.csv",
  "delimiter": ",",
  "schemaModel": {
    "schemaColList": [
      {
        "colName": "label",
        "colType": "DOUBLE",
        "colFormat": "",
        "colMLType": "NUMERIC"
      },
      {
        "colName": "message",
        "colType": "STRING",
        "colFormat": "",
        "colMLType": "TEXT"
      },
      {
        "colName": "id",
        "colType": "DOUBLE",
        "colFormat": "",
        "colMLType": "NUMERIC"
      }
    ]
  }
}

Curl

curl -X POST --header 'Content-Type: application/json' --header 'Accept: */*' -d '{"id":13,"version":0,"name":"spam","header":true,"path":"data/spam.csv","delimiter":",","schemaModel":{"schemaColList":[{"colName":"label","colType":"DOUBLE","colFormat":"","colMLType":"NUMERIC"},{"colName":"message","colType":"STRING","colFormat":"","colMLType":"TEXT"},{"colName":"id","colType":"DOUBLE","colFormat":"","colMLType":"NUMERIC"}]}}' localhost:8080/dataset/save -b /tmp/cookies.txt
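
The create-versus-update rule can be illustrated by building the request body in Python. This is a sketch under assumptions: dataset_payload is a hypothetical helper, the field names are taken from the JSON example above, and keeping "version": 0 in a create request is a guess carried over from that example.

```python
import json


def dataset_payload(name, path, columns, dataset_id=None, delimiter=",", header=True):
    """Build the JSON body for the save call.

    columns is a list of (colName, colType, colMLType) tuples.
    Leaving dataset_id as None omits "id", which requests a new dataset.
    """
    body = {
        "version": 0,  # assumption: same value as in the documented example
        "name": name,
        "header": header,
        "path": path,
        "delimiter": delimiter,
        "schemaModel": {
            "schemaColList": [
                {"colName": n, "colType": t, "colFormat": "", "colMLType": m}
                for n, t, m in columns
            ]
        },
    }
    if dataset_id is not None:
        body["id"] = dataset_id  # present only when updating an existing dataset
    return body


columns = [
    ("label", "DOUBLE", "NUMERIC"),
    ("message", "STRING", "TEXT"),
    ("id", "DOUBLE", "NUMERIC"),
]
create_body = dataset_payload("spam", "data/spam.csv", columns)
update_body = dataset_payload("spam", "data/spam.csv", columns, dataset_id=13)
print(json.dumps(update_body))
```

The string printed for update_body can be passed to curl's -d option in place of the inline JSON.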

Delete Dataset

  • “datasetId”: “98”
  • “projectId”: “33”

An example request for deleting a dataset:

curl -X DELETE --header 'Accept: text/plain' 'http://localhost:8080/api/v1/datasets/98?projectId=33'

An example response:

Dataset with id 98 deleted successfully
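
As a sketch of how the two parameters map onto the URL, the request can be built in Python as follows. The helper name delete_dataset_request is hypothetical and the base URL is assumed from the curl example; the request is constructed but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # assumption: local server, as in the curl example


def delete_dataset_request(dataset_id, project_id):
    """Build (but do not send) the DELETE request for one dataset."""
    query = urlencode({"projectId": project_id})
    # datasetId goes into the path, projectId into the query string
    url = f"{BASE_URL}/api/v1/datasets/{dataset_id}?{query}"
    return Request(url, headers={"Accept": "text/plain"}, method="DELETE")


req = delete_dataset_request(98, 33)
print(req.get_method(), req.full_url)
```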

Get Dataset by Id

  • “datasetId”: “65”
  • “projectId”: “33”

An example request for getting a dataset by id:

curl -X GET --header 'Accept: application/json' 'http://localhost:8080/api/v1/datasets/65?projectId=33'

An example response:

{
   "id": 65,
   "userId": 33,
   "uuid": "1e13ec2a-4094-405e-a6e7-ffed3bd027f7",
   "version": 0,
   "name": "Test-dataset",
   "category": null,
   "description": "Test",
   "header": true,
   "readOptions": null,
   "path": "/user/sparkflows/Clickthru.csv",
   "delimiter": ",",
   "datasetType": "CSV",
   "filterLinesContaining": null,
   "datasetSchema": "{colNames:[\"Timestamp\",\"UserId\",\"IP Address\",\"Product Id\"],colTypes:[\"STRING\",\"INTEGER\",\"STRING\",\"INTEGER\"],colFormats:[\"\",\"\",\"\",\"\"],colMLTypes:[\"TEXT\",\"NUMERIC\",\"TEXT\",\"NUMERIC\"]}",
   "dateCreated": 1566880637842,
   "dateLastUpdated": 1566880637846,
   "permission": null,
   "readOptionsModel": null,
   "schemaModel": {
    "schemaColList": [
    {
      "colName": "Timestamp",
      "colType": "STRING",
      "colFormat": "",
      "colMLType": "TEXT"
    },
    {
      "colName": "UserId",
      "colType": "INTEGER",
      "colFormat": "",
      "colMLType": "NUMERIC"
    },
    {
      "colName": "IP Address",
      "colType": "STRING",
      "colFormat": "",
      "colMLType": "TEXT"
    },
    {
      "colName": "Product Id",
      "colType": "INTEGER",
      "colFormat": "",
      "colMLType": "NUMERIC"
    }
  ]
  },
    "sampleData": {
    "headers": [
    "Timestamp",
    "UserId",
    "IP Address",
    " Product Id"
  ],
  "cells": [
    [
      "9:03 AM",
      "275",
      "207.51.113.192",
      "1"
    ],
    [
      "12:57 AM",
      "586",
      "62.34.98.94",
      "2"
    ],
    [
      "2:45 AM",
      "508",
      "20.237.172.182",
      "3"
    ],
    [
      "2:13 PM",
      "378",
      "69.215.255.150",
      "4"
    ],
    [
      "9:27 AM",
      "965",
      "56.101.183.251",
      "5"
    ],
    [
      "8:18 AM",
      "263",
      "9.151.97.180",
      "6"
    ],
    [
      "9:40 AM",
      "670",
      "101.195.1.186",
      "7"
    ],
    [
      "7:14 AM",
      "447",
      "232.29.216.95",
      "8"
    ],
    [
      "12:57 AM",
      "33",
      "85.119.50.62",
      "9"
    ],
    [
      "12:56 AM",
      "589",
      "185.132.243.178",
      "10"
    ],
    [
      "11:04 PM",
      "22",
      "120.212.232.218",
      "11"
    ],
    [
      "8:29 PM",
      "504",
      "226.70.25.117",
      "12"
    ],
    [
      "5:18 PM",
      "228",
      "213.53.100.18",
      "13"
    ],
    [
      "2:56 PM",
      "536",
      "60.65.25.167",
      "14"
    ],
    [
      "3:57 AM",
      "46",
      "149.156.17.120",
      "15"
    ],
    [
      "8:05 AM",
      "812",
      "23.213.182.107",
      "16"
    ],
    [
      "12:02 PM",
      "980",
      "93.20.165.16",
      "17"
    ],
    [
      "12:53 PM",
      "915",
      "24.180.112.147",
      "18"
    ],
    [
      "11:32 AM",
      "814",
      "110.81.139.11",
      "19"
    ],
    [
      "11:01 PM",
      "429",
      "115.123.246.193",
      "20"
    ]
  ]
  },
"json": "{\"id\":65,\"userId\":33,\"uuid\":\"1e13ec2a-4094-405e-a6e7-ffed3bd027f7\",\"version\":0,\"name\":\"Test-dataset\",\"description\":\"Test\",\"header\":true,\"path\":\"/user/sparkflows/Clickthru.csv\",\"delimiter\":\",\",\"datasetType\":\"CSV\",\"datasetSchema\":\"{colNames:[\\\"Timestamp\\\",\\\"UserId\\\",\\\"IP Address\\\",\\\"Product Id\\\"],colTypes:[\\\"STRING\\\",\\\"INTEGER\\\",\\\"STRING\\\",\\\"INTEGER\\\"],colFormats:[\\\"\\\",\\\"\\\",\\\"\\\",\\\"\\\"],colMLTypes:[\\\"TEXT\\\",\\\"NUMERIC\\\",\\\"TEXT\\\",\\\"NUMERIC\\\"]}\",\"dateCreated\":\"Aug 27, 2019 4:37:17 AM\",\"dateLastUpdated\":\"Aug 27, 2019 4:37:17 AM\",\"schemaModel\":{\"schemaColList\":[{\"colName\":\"Timestamp\",\"colType\":\"STRING\",\"colFormat\":\"\",\"colMLType\":\"TEXT\"},{\"colName\":\"UserId\",\"colType\":\"INTEGER\",\"colFormat\":\"\",\"colMLType\":\"NUMERIC\"},{\"colName\":\"IP Address\",\"colType\":\"STRING\",\"colFormat\":\"\",\"colMLType\":\"TEXT\"},{\"colName\":\"Product Id\",\"colType\":\"INTEGER\",\"colFormat\":\"\",\"colMLType\":\"NUMERIC\"}]},\"projectId\":33}",
"projectId": 33
}
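
A few fields in this response need a second step before they are usable. The sketch below works on an abridged copy of the response (sampleData and most other fields are left out) and makes two assumptions drawn from the example itself: dateCreated appears to be epoch milliseconds, and the "json" field is a JSON document encoded as a string.

```python
import json
from datetime import datetime, timezone

# Abridged copy of the example response; only the fields used below are kept.
response_text = r"""
{
  "id": 65,
  "name": "Test-dataset",
  "dateCreated": 1566880637842,
  "schemaModel": {
    "schemaColList": [
      {"colName": "Timestamp", "colType": "STRING", "colFormat": "", "colMLType": "TEXT"},
      {"colName": "UserId", "colType": "INTEGER", "colFormat": "", "colMLType": "NUMERIC"}
    ]
  },
  "json": "{\"id\":65,\"projectId\":33}"
}
"""

dataset = json.loads(response_text)

# Map each column name to its type using schemaModel.schemaColList.
columns = {c["colName"]: c["colType"] for c in dataset["schemaModel"]["schemaColList"]}

# Assumption: dateCreated is milliseconds since the Unix epoch.
created = datetime.fromtimestamp(dataset["dateCreated"] / 1000, tz=timezone.utc)

# The "json" field holds a string, so it has to be decoded a second time.
embedded = json.loads(dataset["json"])

print(columns)
print(created.isoformat())
print(embedded["projectId"])
```

Read as UTC, the converted timestamp falls on Aug 27, 2019 at 4:37:17 AM, which matches the formatted date inside the embedded "json" string.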

Get Dataset Count

Returns the count of datasets available:

curl -X GET --header 'Accept: application/json' --header 'api_key: cookies' 'http://localhost:8080/api/v1/datasets/count'

Get sample data

Returns sample data for the file at the given path. Delimiter and header are optional values.

  • path: data/spam.csv
  • schema: {"colNames":["0.0","this is not a spam","3.0"],"colTypes":["DOUBLE","STRING","DOUBLE"],"colFormats":["","",""],"colMLTypes":["NUMERIC","TEXT","NUMERIC"]}

CURL:

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' --header 'api_key: cookies' -d '{"colNames":["0.0","this is not a spam","3.0"],"colTypes":["DOUBLE","STRING","DOUBLE"],"colFormats":["","",""],"colMLTypes":["NUMERIC","TEXT","NUMERIC"]}' http://localhost:8080/api/v1/datasets/sample-data

Returns schema of the files in the given path using the given delimiter

  • Delimiter and header are optional values
  • path: data/spam.csv
  • schema: {"colNames":["0.0","this is not a spam","3.0"],"colTypes":["DOUBLE","STRING","DOUBLE"],"colFormats":["","",""],"colMLTypes":["NUMERIC","TEXT","NUMERIC"]}

CURL:

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' --header 'api_key: cookies' -d '{"colNames":["0.0","this is not a spam","3.0"],"colTypes":["DOUBLE","STRING","DOUBLE"],"colFormats":["","",""],"colMLTypes":["NUMERIC","TEXT","NUMERIC"]}' http://localhost:8080/api/v1/datasets/schema
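
The schema body used by the sample-data and schema calls stores columns as four parallel arrays, while the dataset objects shown earlier use a schemaColList of per-column records. The following sketch converts one form into the other; to_parallel_schema is a hypothetical helper, and the assumption that both forms describe the same columns in the same order is inferred from the examples rather than stated by the API.

```python
import json


def to_parallel_schema(schema_col_list):
    """Convert a schemaColList (list of column dicts) into parallel arrays."""
    return {
        "colNames": [c["colName"] for c in schema_col_list],
        "colTypes": [c["colType"] for c in schema_col_list],
        "colFormats": [c["colFormat"] for c in schema_col_list],
        "colMLTypes": [c["colMLType"] for c in schema_col_list],
    }


schema_col_list = [
    {"colName": "label", "colType": "DOUBLE", "colFormat": "", "colMLType": "NUMERIC"},
    {"colName": "message", "colType": "STRING", "colFormat": "", "colMLType": "TEXT"},
    {"colName": "id", "colType": "DOUBLE", "colFormat": "", "colMLType": "NUMERIC"},
]

schema = to_parallel_schema(schema_col_list)
print(json.dumps(schema))
```

All four arrays have the same length, and position i in each array refers to the same column.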

Get Latest Five Datasets

Returns the five most recently updated datasets:

curl -X GET --header 'Accept: application/json' --header 'api_key: cookies' 'http://localhost:8080/api/v1/datasets/latest'

Get the list of files/directories in the given path

  • path: data/transaction.csv

CURL:

curl -X GET --header 'Content-Type: application/json' --header 'Accept: application/json' -d 'data/transaction.csv' 'http://localhost:8080/filesInPathJSON' -b /tmp/cookies.txt