Analytic Stack Setup¶
This page references the repository at https://gitlab.com/rainbow-project1/rainbow-installation.
Data Storage and Analytic Services on Master¶
For the data storage and processing stack, users must first execute a docker-compose file on the master node that includes all the necessary services.
Navigate to the repository and go to the *analytic-stack/master* directory, where the docker-compose.yaml file is provided.
The parameters of the docker-compose file are described below. To provide their values, edit the *.env* file inside the directory.
NODE_IPV6="..." # The node's IPv6 address
NODE_IPV4="..." # The node's IPv4 address
PROVIDER_HOSTS="..." # The IPs of the nodes from which the system will retrieve its data (all nodes' IPs)
NODE_HOSTNAME="..." # The hostname or IP of the node
STORM_NIMBUS_CONFIG_FILE="..." # The path of the Storm Nimbus configuration file
STORAGE_PLACEMENT="..." # Enables or disables the storage placement algorithm. Default is False.
STORAGE_DATA_FOLDER="..." # The folder where the storage data will be kept
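A hypothetical filled-in *.env* might look as follows; all values (the IPs, the paths, and the comma separator assumed for PROVIDER_HOSTS) are illustrative, so adjust them to your deployment:

```shell
NODE_IPV6="fd00::10"                      # illustrative IPv6 of the master node
NODE_IPV4="192.168.1.10"                  # illustrative IPv4 of the master node
PROVIDER_HOSTS="192.168.1.10,192.168.1.11,192.168.1.12"  # all nodes' IPs
NODE_HOSTNAME="192.168.1.10"
STORM_NIMBUS_CONFIG_FILE="./storm/storm.yaml"
STORAGE_PLACEMENT="False"
STORAGE_DATA_FOLDER="/var/rainbow/storage"
```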
Storm Nimbus and Storm UI Configurations¶
Generally, the Nimbus configuration needs no alteration. However, users can update the following file as needed, and can also add other configuration options of the Storm Framework. Finally, users can introduce other scheduling strategies (including RAINBOW’s strategies) via this configuration file. For instance, if users set storm.scheduler to ResourceAwareScheduler and its strategy to EnergyAwareStrategy, the execution will try to minimize the energy consumption.
The configuration file is located in the *analytic-stack/master/storm* directory.
A typical configuration file is the following:
storm.zookeeper.servers:
- "cluster-head-IP" # update with master's IPv4
nimbus.seeds: [ "cluster-head-IP" ] # update with master's IPv4
storm.log.dir: "/logs"
storm.local.dir: "/data"
storm.local.hostname: "cluster-head-IP" # update with master's IPv4
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
nimbus.thrift.max_buffer_size: 20480000
supervisor.thrift.max_buffer_size: 20480000
worker.childopts: "-Xmx%HEAP-MEM%m -Xloggc:artifacts/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=artifacts/heapdump"
topology.component.cpu.pcore.percent: 1000.0
topology.component.resources.onheap.memory.mb: 512.0
#storm.scheduler: "org.apache.storm.scheduler.resource.ResourceAwareScheduler"
#topology.scheduler.strategy: "eu.rainbowh2020.Schedulers.EnergyAwareStrategy"
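For instance, enabling the energy-aware scheduling described above amounts to uncommenting the last two lines of the file:

```yaml
storm.scheduler: "org.apache.storm.scheduler.resource.ResourceAwareScheduler"
topology.scheduler.strategy: "eu.rainbowh2020.Schedulers.EnergyAwareStrategy"
```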
Stack Execution¶
To execute the stack, users only need to run the following command in the *analytic-stack/master* directory:
docker-compose up
The system will then start all services. Note that users must start the master's services first, and only afterwards the storage and processing services on the rest of the nodes.
Monitoring, Data Storage and Analytic Services on Edge Nodes¶
For the data storage and processing stack on edge nodes, users must execute a docker-compose file on every node.
Navigate to the repository and go to the *analytic-stack/nodes* directory, where the docker-compose.yaml file is provided.
The parameters of the docker-compose file are described below.
MONITORING_CONFIGURATION_FILE="..." # The path of the monitoring agent configuration file
STORAGE_RAINBOW_HEAD="..." # The cluster head's IPv6 address
STORAGE_NODE_NAME="..." # The hostname or IP of the node
STORAGE_PLACEMENT="..." # Enables or disables the storage placement algorithm. Default is False.
STORAGE_DATA_FOLDER="..." # The folder where the storage data will be kept
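As with the master, a hypothetical filled-in *.env* for an edge node might look like this; every value (paths and IPs alike) is an illustrative assumption:

```shell
MONITORING_CONFIGURATION_FILE="./monitoring/agent-config.yaml"  # illustrative path
STORAGE_RAINBOW_HEAD="fd00::10"          # illustrative cluster-head IPv6
STORAGE_NODE_NAME="192.168.1.11"
STORAGE_PLACEMENT="False"
STORAGE_DATA_FOLDER="/var/rainbow/storage"
```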
Monitoring Agent Configurations¶
node_id: "node_id" # users need to provide a node ID (e.g., the hostname)
sensing-units:
general-periodicity: 1s # general sensing rate
DefaultMonitoring: # enables node-level metrics; users can disable them by removing this section
periodicity: 1s
# disabled-groups: # metric groups that the system will not start; available groups include cpu, memory, disk, and network
## - "disk"
# metric-groups: # override the sensing preferences on specific groups
# - name: "memory"
# periodicity: 15s # change a static periodicity
# - name: "cpu"
UserDefinedMetrics: # a specific implementation of the sensing interface for user-defined metrics
periodicity: 1s
sources:
- "/"
ContainerMetrics: # Container-level monitoring metrics
periodicity: 1s
dissemination-units: # users can enable multiple dissemination units; in RAINBOW, Ignite is used as the storage backend
IgniteExporter:
hostname: ignite-server
port: 50000
#adaptivity: # optional adaptivity properties
# sensing: # adaptivity in sensing units
# DockerProbe: # e.g., enable adaptivity for the container metrics
# target_name: demo_test|cpu_ptc # and set as target metric the cpu percentage of demo_test container
# minimum_periodicity: 1
# maximum_periodicity: 15
# confidence: 0.95
# dissemination: # adaptivity in dissemination
# all: # the system sends adaptively all metrics to the storage
# minimum_periodicity: 1
# maximum_periodicity: 15
# confidence: 0.95
# metric_id: # or it can send adaptively only specific metrics
# - minimum_periodicity: 5s
# maximum_periodicity: 35s
#      confidence: 0.95
For more information about the monitoring agent and its configuration, please check its repository: https://gitlab.com/rainbow-project1/rainbow-monitoring
Storm Worker Configurations¶
Generally, only the IPs in the Storm Worker configuration need to be updated. However, users can modify the following file to alter the processing characteristics of a node. Specifically, users can add other configuration options from the Storm Framework (since Storm is utilized as the execution engine).
The configuration file is located in the *analytic-stack/nodes/storm* directory.
A typical configuration file is the following:
nimbus.seeds: ["cluster-head-IP"] # Update with the cluster head's IPv4
ui.port: 8080
storm.zookeeper.servers:
- "cluster-head-IP" # Update with the cluster head's IPv4
storm.local.hostname: "node-IP" # As the hostname, provide the node's IPv4
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
Stack Execution¶
To execute the stack, users only need to run the following command:
docker-compose up
For execution on ARM processors, users need to run architecture-specific docker-compose files. For instance, for 32-bit ARM processors, users need to run the following command:
docker-compose -f docker-compose-arm32.yaml up
while for 64-bit ARM processors, users should execute the following command:
docker-compose -f docker-compose-arm64.yaml up
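Picking the file by hand is error-prone, so the choice can be scripted. The sketch below selects the compose file from the CPU architecture reported by `uname -m`; the mapping of architecture strings to the 32/64-bit ARM files is an assumption about typical Linux outputs, and the actual `docker-compose` invocation is left commented out:

```shell
# Sketch: choose the docker-compose file matching the CPU architecture.
# File names are the ones referenced above.
ARCH="$(uname -m)"
case "$ARCH" in
  armv6l|armv7l) COMPOSE_FILE="docker-compose-arm32.yaml" ;;
  aarch64|arm64) COMPOSE_FILE="docker-compose-arm64.yaml" ;;
  *)             COMPOSE_FILE="docker-compose.yaml" ;;
esac
echo "Selected $COMPOSE_FILE"
# docker-compose -f "$COMPOSE_FILE" up
```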
Ports and Networking¶
Apache Storm needs the following ports to be open (users can configure different ports in the Storm configuration):
| Default Port | Storm Config | Client Hosts/Processes | Server |
|---|---|---|---|
| 2181 | storm.zookeeper.port | Nimbus, Supervisors, and Worker processes | Zookeeper |
| 6627 | nimbus.thrift.port | Storm clients, Supervisors, and UI | Nimbus |
| 6628 | supervisor.thrift.port | Nimbus | Supervisors |
| 8080 | ui.port | Client Web Browsers | UI |
| 8000 | logviewer.port | Client Web Browsers | Logviewer |
| 3772 | drpc.port | External DRPC Clients | DRPC |
| 3773 | drpc.invocations.port | Worker Processes | DRPC |
| 3774 | drpc.http.port | External HTTP DRPC Clients | DRPC |
| 670{0,1,2,3} | supervisor.slots.ports | Worker Processes | Worker Processes |
Furthermore, the Storage Agents need ports 50000, 47500, and 47100 to be open as well.
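To verify that the required ports are actually reachable on a node, a quick check can be scripted. The helper below is hypothetical (not part of RAINBOW) and relies on bash's /dev/tcp pseudo-device and the coreutils `timeout` command; replace 127.0.0.1 with the node's IP when checking from another host:

```shell
# Sketch: probe the Storage Agent ports listed above.
check_port() {
  # Returns success if a TCP connection to $1:$2 can be opened within 1 second.
  timeout 1 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for port in 50000 47500 47100; do
  if check_port 127.0.0.1 "$port"; then
    echo "port $port is open"
  else
    echo "port $port is closed"
  fi
done
```

The same loop can be extended with the Storm ports from the table if the node also runs a Storm Worker.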