Using RStudio Server with Microsoft R Server Parcel for Cloudera

In previous releases of Microsoft R Server, parcel installation required downloading two pre-built parcel files. The 9.1 release improves on this by providing a parcel generator script, generate_mrs_parcel.sh, that generates a single MRS-9.1.0-*.parcel file. Here are the complete instructions to install the MRS parcel in a Cloudera cluster. In this article we will look into how to…


Running Pleasingly Parallel workloads using rxExecBy on Spark, SQL, Local and Localpar compute contexts

The RevoScaleR function rxExec() allows you to run arbitrary R functions in a distributed fashion, using available nodes (computers) or available cores (the maximum of which is the sum over all available nodes of the processing cores on each node). The rxExec approach exemplifies the traditional high-performance computing approach: when using rxExec, you largely control how…


Remote Spark Compute Context using PuTTY on Windows

If you are running Microsoft R Server/Microsoft R Client from a Windows computer equipped with PuTTY, you can create a compute context that will run RevoScaleR functions from your local client in a distributed fashion on your Hadoop cluster. You use RxSpark to create the compute context, but use additional arguments to specify your user…


Best practices for executing embarrassingly parallel workloads with R Server on Spark

Introduction: An embarrassingly parallel workload or problem is one where little or no effort is needed to separate the problem into a number of parallel tasks. This is often the case where there is little or no dependency or need for communication between those parallel tasks, or for results between them. In this blog,…