Analytic Stack Setup

This page refers to the repository at https://gitlab.com/rainbow-project1/rainbow-installation

Data Storage and Analytic Services on Master

For the data storage and processing stack, users must first run a docker-compose file on the master node that includes all necessary services.

Navigate to the repository and go to the *analytic-stack/master* directory where the docker-compose.yaml file is provided.

The parameters of the docker-compose file are described below. To provide them, edit the *.env* file inside the directory.

NODE_IPV6="..." # Node's IPv6
NODE_IPV4="..." # Node's IPv4
PROVIDER_HOSTS="..." # The IPs of the nodes from which the system will retrieve its data (all nodes' IPs)
NODE_HOSTNAME="..." # The hostname or IP of the node
STORM_NIMBUS_CONFIG_FILE="..." # The path of the Storm Nimbus configuration file
STORAGE_PLACEMENT="..." # Enables or disables the storage placement algorithm. Default is False.
STORAGE_DATA_FOLDER="..." # The folder in which the storage data will be stored
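
For illustration, a filled-in *.env* for the master node might look like the following. All values are placeholders (the IPs, hostname, configuration path, and data folder are assumptions, and the list separator for PROVIDER_HOSTS is shown as a comma); adapt them to your own cluster.

NODE_IPV6="fd00::10"                            # master's IPv6 (placeholder)
NODE_IPV4="10.0.0.10"                           # master's IPv4 (placeholder)
PROVIDER_HOSTS="10.0.0.10,10.0.0.11,10.0.0.12"  # all nodes' IPs (separator assumed)
NODE_HOSTNAME="10.0.0.10"                       # this node's hostname or IP
STORM_NIMBUS_CONFIG_FILE="./storm/storm.yaml"   # path to the Nimbus configuration (illustrative path)
STORAGE_PLACEMENT="False"                       # keep the default placement behaviour
STORAGE_DATA_FOLDER="./storage-data"            # where the storage keeps its data (illustrative path)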

Storm Nimbus and Storm UI Configurations

Generally, the Nimbus configuration needs no alteration. However, users can update the following file as needed and can also add other Apache Storm configuration options. Finally, users can introduce other scheduling strategies (including RAINBOW's strategies) via this configuration file. For instance, if users set storm.scheduler to ResourceAwareScheduler and its strategy to EnergyAwareStrategy, the execution will try to minimize energy consumption.

The configuration file is located in the *analytic-stack/master/storm* directory.

A typical configuration file is the following:

storm.zookeeper.servers:
     - "cluster-head-IP" # update with master's IPv4
nimbus.seeds: [ "cluster-head-IP" ]  # update with master's IPv4
storm.log.dir: "/logs"
storm.local.dir: "/data"
storm.local.hostname: "cluster-head-IP" # update with master's IPv4
supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
     - 6703
nimbus.thrift.max_buffer_size: 20480000
supervisor.thrift.max_buffer_size: 20480000
worker.childopts: "-Xmx%HEAP-MEM%m -Xloggc:artifacts/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=artifacts/heapdump"
topology.component.cpu.pcore.percent: 1000.0
topology.component.resources.onheap.memory.mb: 512.0
#storm.scheduler: "org.apache.storm.scheduler.resource.ResourceAwareScheduler"
#topology.scheduler.strategy: "eu.rainbowh2020.Schedulers.EnergyAwareStrategy"
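
For example, to enable the energy-aware scheduling mentioned above, the last two (commented) lines can be uncommented so that the file contains:

# use Storm's resource-aware scheduler with RAINBOW's energy-aware strategy
storm.scheduler: "org.apache.storm.scheduler.resource.ResourceAwareScheduler"
topology.scheduler.strategy: "eu.rainbowh2020.Schedulers.EnergyAwareStrategy"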

Stack Execution

To start the stack, users only have to run the following command in the *analytic-stack/master* directory.

docker-compose up

The system will then start all services. Note that users must start the master's services first, and only afterwards the storage and processing services on the rest of the nodes.
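
Optionally, the stack can be started in the background and inspected with the standard docker-compose commands shown below (a convenience sketch; the service names depend on the provided docker-compose.yaml):

docker-compose up -d     # start all master services in the background
docker-compose ps        # verify that the containers are running
docker-compose logs -f   # follow the logs of all services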

Monitoring, Data Storage and Analytic Services on Edge Nodes

For the monitoring, data storage, and processing stack on edge nodes, users must run a docker-compose file on all nodes.

Navigate to the repository and go to the *analytic-stack/nodes* directory where the docker-compose.yaml file is provided.

The parameters of the docker-compose file are described below.

MONITORING_CONFIGURATION_FILE="..." # The path of the monitoring agent configuration file
STORAGE_RAINBOW_HEAD="..." # The cluster head's IPv6
STORAGE_NODE_NAME="..." # The hostname or IP of the node
STORAGE_PLACEMENT="..." # Enables or disables the storage placement algorithm. Default is False.
STORAGE_DATA_FOLDER="..." # The folder in which the storage data will be stored
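
As an illustration, a filled-in set of parameters for an edge node might look like the following (assuming they are provided through a *.env* file as on the master). All values are placeholders; adapt them to your own setup.

MONITORING_CONFIGURATION_FILE="./monitoring/agent-config.yaml"  # illustrative path
STORAGE_RAINBOW_HEAD="fd00::10"                                 # the cluster head's IPv6 (placeholder)
STORAGE_NODE_NAME="10.0.0.11"                                   # this node's hostname or IP (placeholder)
STORAGE_PLACEMENT="False"                                       # keep the default placement behaviour
STORAGE_DATA_FOLDER="./storage-data"                            # illustrative path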

Monitoring Agent Configurations

node_id: "node_id"  # users need to provide a node id (e.g., the hostname)

sensing-units:
  general-periodicity: 1s # general sensing rate
  DefaultMonitoring: # node-level metrics are enabled; users can disable them by removing this entry
    periodicity: 1s
#    disabled-groups: # metric groups that the system will not start; available groups include CPU, memory, disk, and network
##      - "disk"
#    metric-groups: # override the sensing preferences for specific groups
#      - name: "memory"
#        periodicity: 15s # change a static periodicity
#      - name: "cpu"
  UserDefinedMetrics: # a specific implementation of the sensing interface for user-defined metrics
    periodicity: 1s
    sources:
      - "/"
  ContainerMetrics: # container-level monitoring metrics
    periodicity: 1s

dissemination-units: # users can enable multiple dissemination units; however, in RAINBOW we use Ignite as the storage
  IgniteExporter:
    hostname: ignite-server
    port: 50000

#adaptivity:   # optional adaptivity properties
#  sensing:  # adaptivity in sensing units
#    DockerProbe:  # e.g., enable adaptivity for the container metrics
#      target_name: demo_test|cpu_ptc  # and set as target metric the cpu percentage of demo_test container
#      minimum_periodicity: 1
#      maximum_periodicity: 15
#      confidence: 0.95
#  dissemination:  # adaptivity in dissemination
#    all:  # the system adaptively sends all metrics to the storage
#      minimum_periodicity: 1
#      maximum_periodicity: 15
#      confidence: 0.95
#    metric_id:  # or it can adaptively send only specific metrics
#    - minimum_periodicity: 5s
#      maximum_periodicity: 35s
#      confidence: 95
#

For more information about the monitoring agent and its configuration, please check its repository: https://gitlab.com/rainbow-project1/rainbow-monitoring

Storm Worker Configurations

Generally, only the IPs in the Storm Worker configuration need to be updated. However, users can edit the following file to alter the processing characteristics of a node; specifically, they can add other Apache Storm configuration options (since we use Storm as the execution engine).

The configuration file is located in the *analytic-stack/nodes/storm* directory.

A typical configuration file is the following:

nimbus.seeds: ["cluster-head-IP"] # the cluster head's IPv4
ui.port: 8080
storm.zookeeper.servers:
   - "cluster-head-IP" # the cluster head's IPv4
storm.local.hostname: "node-IP" # as the hostname we provide the node's IPv4
supervisor.slots.ports:
   - 6700
   - 6701
   - 6702
   - 6703
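
As an example of node-level tuning, a node's available resources can be declared for Storm's resource-aware scheduling by appending standard Apache Storm options such as the following (the values are placeholders, not RAINBOW defaults):

supervisor.cpu.capacity: 400.0          # CPU available to workers, expressed as 100 per core (placeholder)
supervisor.memory.capacity.mb: 2048.0   # memory in MB available to workers on this node (placeholder)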

Stack Execution

To start the stack, users only have to run the following command:

docker-compose up

For execution on ARM processors, users need to use the architecture-specific docker-compose files. For instance, for 32-bit ARM processors, users need to run the following command:

docker-compose -f docker-compose-arm32.yaml up

For 64-bit ARM processors, users should run the following command:

docker-compose -f docker-compose-arm64.yaml up
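
If it is unclear which variant applies to a node, the CPU architecture can be checked and the matching file selected, for instance with a small shell sketch such as:

# pick the compose file that matches this node's CPU architecture
case "$(uname -m)" in
  armv7l)  docker-compose -f docker-compose-arm32.yaml up ;;
  aarch64) docker-compose -f docker-compose-arm64.yaml up ;;
  *)       docker-compose up ;;
esac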

Ports and Networking

Apache Storm needs the following ports to be open (alternatively, users can configure different ports in the Storm configuration files):

| Default Port | Storm Config | Client Hosts/Processes | Server |
| --- | --- | --- | --- |
| 2181 | storm.zookeeper.port | Nimbus, Supervisors, and Worker processes | Zookeeper |
| 6627 | nimbus.thrift.port | Storm clients, Supervisors, and UI | Nimbus |
| 6628 | supervisor.thrift.port | Nimbus | Supervisors |
| 8080 | ui.port | Client Web Browsers | UI |
| 8000 | logviewer.port | Client Web Browsers | Logviewer |
| 3772 | drpc.port | External DRPC Clients | DRPC |
| 3773 | drpc.invocations.port | Worker Processes | DRPC |
| 3774 | drpc.http.port | External HTTP DRPC Clients | DRPC |
| 670{0,1,2,3} | supervisor.slots.ports | Worker Processes | Worker Processes |

Furthermore, the Storage Agents need ports 50000, 47500, and 47100 to be open as well.
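
For reference, on hosts that use ufw the required ports could be opened along the following lines (a sketch only; adapt it to your firewall and to any ports you changed in the Storm configuration):

# Storm, Zookeeper, UI, logviewer, and DRPC ports
sudo ufw allow 2181,6627,6628,8080,8000,3772,3773,3774/tcp
# Storm worker slots
sudo ufw allow 6700:6703/tcp
# storage agent ports
sudo ufw allow 50000,47500,47100/tcp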