Storm is a free and open source distributed realtime computation system. It processes streams of data and is to realtime processing what Hadoop is to batch processing. It is used for realtime analytics, online machine learning and more. It is also fast: a benchmark clocked it at over a million tuples processed per second per node. It relies on a master (Nimbus) / slave architecture.

In this tutorial you will learn how to easily deploy a Storm environment using ProActive Cloud Automation, then run some computation on it, and finally release this Storm environment.

1 Deploy Storm Instance

  1. Go to ProActive Cloud Automation

  2. Log in with the credentials you were given

  3. Click on the Storm icon

  4. You can keep the default values or adjust the options:

    instance_name: name of the Docker containers that will host your Storm instance
    infrastructure_name: list from which you choose the infrastructure Storm will be deployed on
    number_of_slaves: number of workers that will be deployed
    dashboard_port: host port to which the UI port is mapped

  5. When you have filled in every field, click Submit. You should then see a new line with your Storm instance in the running services area.

    You can now click the endpoint address to access the UI and check that the instance is actually deployed.
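
If you prefer to check the endpoint from a terminal, the sketch below queries the cluster summary resource of the Storm UI REST API. The helper names are hypothetical, and the host and port in the example are assumptions: use the endpoint address shown in ProActive Cloud Automation and the value you chose for dashboard_port.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the cluster summary URL of the Storm UI REST API.
# $1: host of the endpoint, $2: host port chosen as dashboard_port
storm_ui_summary_url() {
  printf 'http://%s:%s/api/v1/cluster/summary\n' "$1" "$2"
}

# Hypothetical helper: succeeds only if the UI answers within 10 seconds.
# On success it prints the JSON summary returned by the UI.
check_storm_ui() {
  curl --silent --fail --max-time 10 "$(storm_ui_summary_url "$1" "$2")"
}

# Example with assumed values (replace with your own endpoint):
#   check_storm_ui localhost 8080
```

A non-zero exit status from check_storm_ui suggests the UI is not reachable yet; the deployment may simply need a little more time.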

2 Run a topology on your Storm instance

Create your first task

  1. Connect to the ProActive Workflow Studio.

  2. Create a new workflow
    Add a job variable called container_name and set it to the same value you used for instance_name when you deployed Storm. It will be the default value. Then create a first bash task:

#first, change your current directory to the one containing the topology to run
cd /home/activeeon/tutorial

#retrieve the name of the Nimbus container (the deployment relies on Docker)
nimbus_container=$(docker ps -a --filter name="^.*${variables_container_name}Nimbus_\d$" --format "{{.Names}}")

#copy the topology to the Nimbus container
docker cp testtopology.jar "$nimbus_container":/testtopology.jar
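
If the lookup finds no container, the docker cp command above fails with a rather unhelpful message. A minimal sketch of a guard follows, assuming the same naming pattern as in the task; find_nimbus_container is a hypothetical helper, not part of ProActive or Docker.

```shell
#!/usr/bin/env bash
# Hypothetical helper wrapping the lookup used in the task above.
# $1: the instance name (the value of the container_name job variable)
find_nimbus_container() {
  local name
  name=$(docker ps -a --filter name="^.*${1}Nimbus_\d$" --format "{{.Names}}")
  if [ -z "$name" ]; then
    echo "No Nimbus container found for instance '$1'" >&2
    return 1
  fi
  # Keep only the first match in case several containers fit the pattern.
  printf '%s\n' "$name" | head -n 1
}

# Possible use inside the task:
#   nimbus_container=$(find_nimbus_container "$variables_container_name") || exit 1
```

With the guard in place, the task fails early and the scheduler shows the reason in the task logs.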

Create your second task

You need to add a second bash task which depends on the first one:

#retrieve the name of the Nimbus container; you can reuse the command from the first task
nimbus_container=$(docker ps -a --filter name="^.*${variables_container_name}Nimbus_\d$" --format "{{.Names}}")

#execute the topology which has been copied
docker exec "$nimbus_container" /bin/sh -c 'echo "ANY" echo "ANY" echo "ANY" | storm jar testtopology.jar storm.topology.TopologyCountBase stormTutorial'

Now you can execute the workflow. For container_name, enter the same value you used for instance_name when deploying Storm. Once the workflow has run, go to the scheduler to verify that no task is faulty. After this verification, you can open the Storm UI and see that a topology is running.
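
The same check can be done from the command line with storm list, which prints the topologies known to the cluster. The sketch below assumes the storm command is on the PATH inside the Nimbus container, as in the task above; topology_is_listed is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given topology name appears
# in the output of "storm list" run inside the Nimbus container.
# $1: Nimbus container name, $2: topology name
topology_is_listed() {
  docker exec "$1" storm list | grep -q -w -- "$2"
}

# Possible use after the workflow has finished:
#   topology_is_listed "$nimbus_container" stormTutorial && echo "topology found"
```

The -w option makes grep match the name as a whole word, so stormTutorial does not match a topology called stormTutorial2.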

To execute another topology, just replace "stormTutorial" with another name in your task.
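
Rather than editing the task each time, the name can be passed as a parameter. This is only a sketch: submit_topology is a hypothetical helper, and topology_name is an assumed second job variable that you would have to add to the workflow yourself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the tutorial topology under a chosen name.
# $1: Nimbus container name, $2: topology name
submit_topology() {
  # The name is handed to the inner shell as its first positional
  # argument ("$1" inside the single quotes), which avoids quoting issues.
  docker exec "$1" /bin/sh -c \
    'echo "ANY" echo "ANY" echo "ANY" | storm jar testtopology.jar storm.topology.TopologyCountBase "$1"' \
    sh "$2"
}

# Possible use, assuming a job variable named topology_name exists:
#   submit_topology "$nimbus_container" "$variables_topology_name"
```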

3 Delete Storm instance

To delete the instance, simply click the small bin icon at the end of your running instance's line in ProActive Cloud Automation. You can then check in the Storm UI that it is no longer running.
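
If you have shell access to the Docker host, you can also count the containers that still carry the instance name. The sketch assumes that deleting the instance removes its containers, which this tutorial does not state explicitly; count_instance_containers is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many containers (running or stopped)
# have a name containing the given instance name.
# $1: the instance name used at deployment
count_instance_containers() {
  docker ps -a --filter name="$1" --format "{{.Names}}" | wc -l | tr -d ' '
}

# Possible use, where my_storm stands for your instance_name:
#   [ "$(count_instance_containers my_storm)" = "0" ] && echo "instance removed"
```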