Analytic Stack Setup

This page refers to the repository at https://gitlab.com/rainbow-project1/rainbow-installation

Data Storage and Analytic Services on Master

For the data storage and processing stack, users must first run a docker-compose file on the master node that includes all necessary services.

Navigate to the repository and go to the *analytic-stack/master* directory where the docker-compose.yaml file is provided.

The parameters of the docker-compose file are described below. To provide them, edit the *.env* file inside the directory.

NODE_IPV6="..." # Node's IPv6
NODE_IPV4="..." # Node's IPv4
PROVIDER_HOSTS="..." # The IPs of the nodes from which the system will retrieve its data (all nodes' IPs)
NODE_HOSTNAME="..." # The hostname or IP of the node
STORM_NIMBUS_CONFIG_FILE="..." # The path of the Storm Nimbus configuration file
STORAGE_PLACEMENT="..." # Enables or disables the storage placement algorithm. Default is False.
STORAGE_DATA_FOLDER="..." # The folder in which the storage data will be stored
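
For illustration, a filled-in *.env* for the master node might look like the following. All values are placeholders (the IPs, hostname, configuration path, and data folder are assumptions, and the list separator for PROVIDER_HOSTS is shown as a comma); adapt them to your own cluster.

NODE_IPV6="fd00::10"                            # master's IPv6 (placeholder)
NODE_IPV4="10.0.0.10"                           # master's IPv4 (placeholder)
PROVIDER_HOSTS="10.0.0.10,10.0.0.11,10.0.0.12"  # all nodes' IPs (separator assumed)
NODE_HOSTNAME="10.0.0.10"                       # this node's hostname or IP
STORM_NIMBUS_CONFIG_FILE="./storm/storm.yaml"   # path to the Nimbus configuration (illustrative path)
STORAGE_PLACEMENT="False"                       # keep the default placement behaviour
STORAGE_DATA_FOLDER="./storage-data"            # where the storage keeps its data (illustrative path)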

Storm Nimbus and Storm UI Configurations

Generally, the Nimbus configuration needs no alteration. However, users can update the following file as needed and can also add other Apache Storm configuration options. Finally, users can introduce other scheduling strategies (including RAINBOW's strategies) via this configuration file. For instance, if users set storm.scheduler to ResourceAwareScheduler and its strategy to EnergyAwareStrategy, the execution will try to minimize energy consumption.

The configuration file is located in the *analytic-stack/master/storm* directory.

A typical configuration file is the following:

storm.zookeeper.servers:
     - "cluster-head-IP" # update with master's IPv4
nimbus.seeds: [ "cluster-head-IP" ]  # update with master's IPv4
storm.log.dir: "/logs"
storm.local.dir: "/data"
storm.local.hostname: "cluster-head-IP" # update with master's IPv4
supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
     - 6703
nimbus.thrift.max_buffer_size: 20480000
supervisor.thrift.max_buffer_size: 20480000
worker.childopts: "-Xmx%HEAP-MEM%m -Xloggc:artifacts/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=artifacts/heapdump"
topology.component.cpu.pcore.percent: 1000.0
topology.component.resources.onheap.memory.mb: 512.0
#storm.scheduler: "org.apache.storm.scheduler.resource.ResourceAwareScheduler"
#topology.scheduler.strategy: "eu.rainbowh2020.Schedulers.EnergyAwareStrategy"
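
For example, to enable the energy-aware scheduling mentioned above, the last two (commented) lines can be uncommented so that the file contains:

# use Storm's resource-aware scheduler with RAINBOW's energy-aware strategy
storm.scheduler: "org.apache.storm.scheduler.resource.ResourceAwareScheduler"
topology.scheduler.strategy: "eu.rainbowh2020.Schedulers.EnergyAwareStrategy"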

Stack Execution

To start the stack, users only have to run the following command in the *analytic-stack/master* directory.

docker-compose up

The system will then start all services. Note that users must start the master's services first, and only afterwards the storage and processing services on the rest of the nodes.
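
Optionally, the stack can be started in the background and inspected with the standard docker-compose commands shown below (a convenience sketch; the service names depend on the provided docker-compose.yaml):

docker-compose up -d     # start all master services in the background
docker-compose ps        # verify that the containers are running
docker-compose logs -f   # follow the logs of all services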

Monitoring, Data Storage and Analytic Services on Edge Nodes

For the monitoring, data storage, and processing stack on edge nodes, users must run a docker-compose file on all nodes.

Navigate to the repository and go to the *analytic-stack/nodes* directory where the docker-compose.yaml file is provided.

The parameters of the docker-compose file are described below.

MONITORING_CONFIGURATION_FILE="..." # The path of the monitoring agent configuration file
STORAGE_RAINBOW_HEAD="..." # The cluster head's IPv6
STORAGE_NODE_NAME="..." # The hostname or IP of the node
STORAGE_PLACEMENT="..." # Enables or disables the storage placement algorithm. Default is False.
STORAGE_DATA_FOLDER="..." # The folder in which the storage data will be stored
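
As an illustration, a filled-in set of parameters for an edge node might look like the following (assuming they are provided through a *.env* file as on the master). All values are placeholders; adapt them to your own setup.

MONITORING_CONFIGURATION_FILE="./monitoring/agent-config.yaml"  # illustrative path
STORAGE_RAINBOW_HEAD="fd00::10"                                 # the cluster head's IPv6 (placeholder)
STORAGE_NODE_NAME="10.0.0.11"                                   # this node's hostname or IP (placeholder)
STORAGE_PLACEMENT="False"                                       # keep the default placement behaviour
STORAGE_DATA_FOLDER="./storage-data"                            # illustrative path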

Monitoring Agent Configurations

node_id: "node_id"  # users need to provide a node id (e.g., the hostname)

sensing-units:
  general-periodicity: 1s # general sensing rate
  DefaultMonitoring: # node-level metrics are enabled; users can disable them by removing this entry
    periodicity: 1s
#    disabled-groups: # metric groups that the system will not start; available groups include CPU, memory, disk, and network
##      - "disk"
#    metric-groups: # override the sensing preferences for specific groups
#      - name: "memory"
#        periodicity: 15s # change a static periodicity
#      - name: "cpu"
  UserDefinedMetrics: # a specific implementation of the sensing interface for user-defined metrics
    periodicity: 1s
    sources:
      - "/"
  ContainerMetrics: # container-level monitoring metrics
    periodicity: 1s

dissemination-units: # users can enable multiple dissemination units; however, in RAINBOW we use Ignite as the storage
  IgniteExporter:
    hostname: ignite-server
    port: 50000

#adaptivity:   # optional adaptivity properties
#  sensing:  # adaptivity in sensing units
#    DockerProbe:  # e.g., enable adaptivity for the container metrics
#      target_name: demo_test|cpu_ptc  # and set as target metric the cpu percentage of demo_test container
#      minimum_periodicity: 1
#      maximum_periodicity: 15
#      confidence: 0.95
#  dissemination:  # adaptivity in dissemination
#    all:  # the system adaptively sends all metrics to the storage
#      minimum_periodicity: 1
#      maximum_periodicity: 15
#      confidence: 0.95
#    metric_id:  # or it can adaptively send only specific metrics
#    - minimum_periodicity: 5s
#      maximum_periodicity: 35s
#      confidence: 95
#

For more information about the monitoring agent and its configuration, please check its repository: https://gitlab.com/rainbow-project1/rainbow-monitoring

Storm Worker Configurations

Generally, only the IPs in the Storm Worker configuration need to be updated. However, users can edit the following file to alter the processing characteristics of a node; specifically, they can add other Apache Storm configuration options (since we use Storm as the execution engine).

The configuration file is located in the *analytic-stack/nodes/storm* directory.

A typical configuration file is the following:

nimbus.seeds: ["cluster-head-IP"] # the cluster head's IPv4
ui.port: 8080
storm.zookeeper.servers:
   - "cluster-head-IP" # the cluster head's IPv4
storm.local.hostname: "node-IP" # as the hostname we provide the node's IPv4
supervisor.slots.ports:
   - 6700
   - 6701
   - 6702
   - 6703
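
As an example of node-level tuning, a node's available resources can be declared for Storm's resource-aware scheduling by appending standard Apache Storm options such as the following (the values are placeholders, not RAINBOW defaults):

supervisor.cpu.capacity: 400.0          # CPU available to workers, expressed as 100 per core (placeholder)
supervisor.memory.capacity.mb: 2048.0   # memory in MB available to workers on this node (placeholder)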

Stack Execution

To start the stack, users only have to run the following command:

docker-compose up

For execution on ARM processors, users need to use the architecture-specific docker-compose files. For instance, for 32-bit ARM processors, users need to run the following command:

docker-compose -f docker-compose-arm32.yaml up

For 64-bit ARM processors, users should run the following command:

docker-compose -f docker-compose-arm64.yaml up
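
If it is unclear which variant applies to a node, the CPU architecture can be checked and the matching file selected, for instance with a small shell sketch such as:

# pick the compose file that matches this node's CPU architecture
case "$(uname -m)" in
  armv7l)  docker-compose -f docker-compose-arm32.yaml up ;;
  aarch64) docker-compose -f docker-compose-arm64.yaml up ;;
  *)       docker-compose up ;;
esac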

Ports and Networking

Apache Storm needs the following ports to be open (alternatively, users can configure different ports in the Storm configuration files):

| Default Port | Storm Config | Client Hosts/Processes | Server |
| --- | --- | --- | --- |
| 2181 | storm.zookeeper.port | Nimbus, Supervisors, and Worker processes | Zookeeper |
| 6627 | nimbus.thrift.port | Storm clients, Supervisors, and UI | Nimbus |
| 6628 | supervisor.thrift.port | Nimbus | Supervisors |
| 8080 | ui.port | Client Web Browsers | UI |
| 8000 | logviewer.port | Client Web Browsers | Logviewer |
| 3772 | drpc.port | External DRPC Clients | DRPC |
| 3773 | drpc.invocations.port | Worker Processes | DRPC |
| 3774 | drpc.http.port | External HTTP DRPC Clients | DRPC |
| 670{0,1,2,3} | supervisor.slots.ports | Worker Processes | Worker Processes |

Furthermore, the Storage Agents need ports 50000, 47500, and 47100 to be open as well.
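
For reference, on hosts that use ufw the required ports could be opened along the following lines (a sketch only; adapt it to your firewall and to any ports you changed in the Storm configuration):

# Storm, Zookeeper, UI, logviewer, and DRPC ports
sudo ufw allow 2181,6627,6628,8080,8000,3772,3773,3774/tcp
# Storm worker slots
sudo ufw allow 6700:6703/tcp
# storage agent ports
sudo ufw allow 50000,47500,47100/tcp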