Multi-Stream support in SCP.NET Storm Topology


Streams are in the core of Apache Storm. In most cases topologies are based on a single input stream, however there are situations when one may need to start the topology with two or more input steams.

User code to emit or receive from distinct streams at the same time is supported in SCP. To support multiple streams, Emit method of the Context object takes an optional stream ID parameter.

SCP.NET provides .Net C# programmability against Apache Storm on Azure HDInsight clusters.Since Microsoft.SCP.Net.SDK version 0.9.4.283 multi-stream is supported in SCP.NE  Two methods in the SCP.NET Context object have been added. They are used to emit Tuple or Tuples to specify StreamId. The StreamId is a string and it needs to be consistent in both C# and the Topology Definition Spec. The emitting to a non-existing stream will cause runtime exceptions.

/* Emit tuple to the specific stream. */

public abstract void Emit(string streamId, List<object> values);

/* for non-transactional Spout only */

public abstract void Emit(string streamId, List<object> values, long seqId);

 

There is a sample SCP.NET storm topology that uses multi-stream in GitHub. You can download the hdinsight-storm-examples from github and navigate to the HelloWorldHostModeMultiSpout sample under SCPNetExamples folder.

It is pretty straight forward. Below I am copy pasting the code from the program.cs file of the sample where it shows how to define topology using the TopologyBuilder when you have multiple spouts.

// Use TopologyBuilder to define a Non-Tx topology

// And define each spouts/bolts one by one

TopologyBuilder topologyBuilder = new TopologyBuilder("HelloWorldHostModeMultiSpout");

// Set a User customized config (SentenceGenerator.config) for the SentenceGenerator

topologyBuilder.SetSpout(

"SentenceGenerator",

SentenceGenerator.Get,

new Dictionary<string, List<string>>()

{

{SentenceGenerator.STREAM_ID, new List<string>(){"sentence"}}

},

1,

"SentenceGenerator.config");

 

topologyBuilder.SetSpout(

"PersonGenerator",

PersonGenerator.Get,

new Dictionary<string, List<string>>()

{

{PersonGenerator.STREAM_ID, new List<string>(){"person"}}

},

1);

 

topologyBuilder.SetBolt(

"displayer",

Displayer.Get,

new Dictionary<string, List<string>>(),

1)

.shuffleGrouping("SentenceGenerator", SentenceGenerator.STREAM_ID)

.shuffleGrouping("PersonGenerator", PersonGenerator.STREAM_ID);

To test the topology, open the HelloWorldHostModeMultiSpout.sln file in Visual Studio and then deploy. For this sample topology I didn't have to make any changes to deploy in my cluster. To deploy in Solution Explorer, right-click the project HelloWorldHostModeMultiSpout, and select Submit to Storm on HDInsight.

You will see a pop-up window with dropdown list. Select your Storm on HDInsight cluster from the Storm Cluster drop-down list, and then select Submit. You can monitor whether the submission is successful by using the Output window.

Once the topology has been successfully submitted you will see it listed under your storm cluster in the Server Explorer from where we can view more information about the running topologies.

For more information on how to develop and deploy storm topologies in Visual Studio using SCP.NET please check Develop C# topologies for Apache Storm on HDInsight using Hadoop tools for Visual Studio and Deploy and manage Apache Storm topologies on Windows-based HDInsight Azure Documentation articles.

Comments (0)

Skip to main content