Logging in Microsoft R Server Engine


With 9.1 release we have made improvements that can help us better diagnose and troubleshoot problems that can occur in a customer’s deployments. If you run into any errors/issues with Microsoft R Server and contact Microsoft support you might be asked to collect BxlServer logs in addition to other information. This blog describes how to enable and collect logs for various compute contexts that can be shared with Microsoft support.

How to enable logging

Enable logging for RevoScaleR package is very simple. We use the rxOptions function to set the traceLevel parameter:

rxOptions(traceLevel=<level>)

Logging has the following levels defined

traceLevel value Logging level Description
0 NONE By default logging is disabled
1 ERROR Error messages are logged
2 WARN Warning log messages are logged
3 INFO Informational log messages are logged
4 DEBUG More granular debug messages are logged
5 Reserved currently treated as DEBUG
6 Reserved currently treated as DEBUG
7 R runtime tracing enabled This level also enables some detailed call trace output in R code that implements the RevoScaleR functions. In the engine it is treated same as DEBUG

 

Log file locations

Once you enable logging and run your RevoScaleR computation, the next step is to collect the logs from various nodes involved in the computation for the particular RevoScaleR function. The logs are written to specific places on the node where computation takes place and varies by what compute context is specified for the computation. The naming of the log files follows the following convention:

Client – BxlServer.client.<processid>.log

Master – BxlServer.master.<processid>.log

Worker – BxlServer.worker.<processid>.log

On Hadoop we also have files that indicate whether the files was generated through by map or reduce phase of the computation

Worker mapper – BxlServer.mapper.<processid>.log

Worker reducer – BxlServer.reducer.<processid>.log

 

Local Compute Context

For local compute context the log files are written to the R temp directory (output of tempdir function call), this could be a directory similar to /tmp/RtmpMLrRiq on Linux for example or C:\TEMP\RtmpMLrRiq on Windows. Below is an example from Linux

[dev@host-client ~]$ ll -rth /tmp/RtmpMLrRiq/

total 0

-rw-rw-r-- 1 dev dev 0 Mar 15 10:54 BxlServer.client.45379.log

 

Distributed compute contexts

Since the computations in a distributed compute context could take place on multiple different nodes, we have various ways to find and collect the log files depending on which compute context you are using. For all the compute context there is a client R process that’s either running on a user’s desktop or on an edge node (in case of a Hadoop cluster) and the client communicates with the cluster based on where it’s executing. In all cases there is a job directory (with a GUID name) that is used to store various files that are required for the computation. Once the computation is complete the log files from various nodes are collected into the job directory for the client R process (except in case of Hadoop, see following section). You can get the client job GUID folder using the following object in your R session:

rxgLastPendingJob@clientGUIDDir

 

RxHadoopMR and RxSpark

For Hadoop cluster client R process can be either running on your desktop or on the edge node. If you are on a desktop ssh is used in the background to communicate with an edge node. If you are directly on an edge node of the cluster then no ssh is required.

 

Client (when not on edge node):

Below is an example script to demonstrate the logging output

 

You can then check the shareDir/<GUID> (shareDir is a parameter for the compute context):

C:\ClientShareDir\43E70EA1D18F41AF88DADF13E5156590> dir

Directory: C:\ClientShareDir\43E70EA1D18F41AF88DADF13E5156590

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---         3/15/2017  11:29 AM       1427 0.RInData
-a---         3/15/2017  11:31 AM         31 applicationIds.txt
-a---         3/15/2017  11:31 AM       1445 BxlServer.client.74204.log
-a---         3/15/2017  11:30 AM       4256 BxlServer.master.53954.log
-a---         3/15/2017  11:29 AM        906 cleanupHadoopMRJob.sh
-a---         3/15/2017  11:29 AM       1415 createOutfileChunk.sh
-a---         3/15/2017  11:30 AM       1565 DEV_VM1.ROutData
-a---         3/15/2017  11:29 AM       1448 getHadoopStatus.sh
-a---         3/15/2017  11:29 AM         26 input.R
-a---         3/15/2017  11:29 AM          6 jobIdentifier.txt
-a---         3/15/2017  11:29 AM      16523 jobInfo.RData
-a---         3/15/2017  11:30 AM          9 jobStatus.txt
-a---         3/15/2017  11:29 AM       1397 killRemoteJob.sh
-a---         3/15/2017  11:31 AM          0 lastOutputBlock.txt
-a---         3/15/2017  11:30 AM       5450 MergedOut0.txt
-a---         3/15/2017  11:29 AM       2049 run-r.sh

 

On Edge node:

The master R process log file is in the cluster GUID folder (usually under /var/RevoShare by default and can be changed using shareDir parameter of the compute context). We also write another file called applicationIds.txt that contains the applicationIds of the YARN jobs that have been submitted to the cluster as part of the computation (this will be useful in getting worker logs, see further below). At the end of the computation the files are copied over to the client process’ GUID dir (rxgLastPendingJob@clusterGUIDDir)

[dev@dev-vm1 ~]$ ll -rth /tmp/474DBA206AA144A99025F79D8248C960
total 80
-rw-rw-r-- 1 dev dev  1317 Apr 12 13:04 0.RInData
-rw-rw-r-- 1 dev dev    31 Apr 12 13:05 applicationIds.txt
-rw-rw-r-- 1 dev dev  1785 Apr 12 13:05 BxlServer.client.3853.log
-rw-rw-r-- 1 dev dev  6018 Apr 12 13:05 BxlServer.master.4521.log
-rw-rw-r-- 1 dev dev   904 Apr 12 13:04 cleanupHadoopMRJob.sh
-rw-rw-r-- 1 dev dev  1411 Apr 12 13:04 createOutfileChunk.sh
-rw-rw-r-- 1 dev dev  1562 Apr 12 13:05 DEV_VM.ROutData
-rw-rw-r-- 1 dev dev  1609 Apr 12 13:04 getHadoopStatus.sh
-rw-rw-r-- 1 dev dev    24 Apr 12 13:04 input.R
-rw-rw-r-- 1 dev dev     5 Apr 12 13:04 jobIdentifier.txt
-rw-rw-r-- 1 dev dev 16247 Apr 12 13:04 jobInfo.RData
-rw-rw-r-- 1 dev dev     9 Apr 12 13:05 jobStatus.txt
-rw-rw-r-- 1 dev dev  1349 Apr 12 13:04 killRemoteJob.sh
-rw-rw-r-- 1 dev dev     0 Apr 12 13:05 lastOutputBlock.txt
-rw-rw-r-- 1 dev dev  5505 Apr 12 13:05 MergedOut0.txt
-rw-rw-r-- 1 dev dev  2030 Apr 12 13:04 run-r.sh

 

The client R process log for edge nodes are under tempdir location.

Worker node logs:

For Hadoop worker node processes are managed through YARN containers and as such the logs are written to yarn log directories for the respective containers. To collect the logs from worker nodes we need to use “yarn logs -applicationId <applicationId> command. The applicationId can be obtained from the applicationIds.txt file as shown above (there can be more than one applicationId written to the file depending on if multiple iterations are required).

 

NOTE: yarn logs command will return stdout, stderr, sysout logs as well. Look for log type BxlServer.*.log

[dev@dev-vm1 ~]$ cat /var/RevoShare/dev/cluster-F6AABFC81FAC4E36BA1B28C1881A532A/applicationIds.txt
application_1489107491981_0320

[dev@dev-vm1 ~]$ yarn logs -applicationId application_1489107491981_0320 > yarn-logs.txt

17/03/15 11:12:58 INFO client.RMProxy: Connecting to ResourceManager at dev-vm1/102.123.215.107:8032

[dev@dev-vm1 ~]$ vi yarn-logs.txt
Container: container_1489107491981_0320_01_000002 on dev-vm3.sys-sqlsvr.local_8041
=========================================================================================
LogType:BxlServer.mapper.59762.log
Log Upload Time:Wed Mar 15 10:53:00 -0700 2017
LogLength:1685
Log Contents:
[2017-03-15.10:52:28.438.464][ INFO]Logging configuration completed.  traceLevel=4 process type=WORKER_MAP log dir=/var/log/hadoop-yarn/container/application_148910749198
1_0320/container_1489107491981_0320_01_000002
[2017-03-15.10:52:28.441.100][DEBUG]Server Reserved Memory size=10485760
[2017-03-15.10:52:28.441.429][ INFO]RxUtil request received USE=HdfsConnect
[2017-03-15.10:52:31.630.485][DEBUG]Server Reserved Memory size=10485760
[2017-03-15.10:52:31.630.714][ INFO]RxUtil request received USE=SetSplitInfo
[2017-03-15.10:52:31.634.074][DEBUG]Server Reserved Memory size=10485760
[2017-03-15.10:52:31.634.285][ INFO]RxUtil request received USE=ComputeHadoopTask

Container: container_1489107491981_0320_01_000003 on dev-vm3.sys-sqlsvr.local_8041
=========================================================================================
LogType:BxlServer.reducer.60310.log
Log Upload Time:Wed Mar 15 10:53:00 -0700 2017
LogLength:741
Log Contents:
[2017-03-15.10:52:48.438.501][ INFO]Logging configuration completed.  traceLevel=4 process type=WORKER_REDUCE log dir=/var/log/hadoop-yarn/container/application_148910749
1981_0320/container_1489107491981_0320_01_000003
[2017-03-15.10:52:48.440.055][DEBUG]Server Reserved Memory size=10485760
[2017-03-15.10:52:48.440.441][ INFO]RxUtil request received USE=HdfsConnect
[2017-03-15.10:52:51.699.914][DEBUG]Server Reserved Memory size=10485760
[2017-03-15.10:52:51.700.103][ INFO]RxUtil request received USE=UpdateIRO
[2017-03-15.10:52:51.702.203][DEBUG]Server Reserved Memory size=10485760
[2017-03-15.10:52:51.702.385][ INFO]RxUtil request received USE=WriteIRO
[2017-03-15.10:52:52.431.104][ NONE]Shutting Down Log Processing

 

Teradata

All Log files for master and worker processes are aggregated into the client GUID directory on the R client. The number of worker logs is determined by the number of AMP nodes in your Teradata system.
Below is an example of how you can setup RxInTeradata compute context

After running your script you can check the shareDir for log files

PS C:\> dir C:/User//Documents/A0A8235B879E4AAB94E937C9666541C8

    Directory: C:\User\\Documents\A0A8235B879E4AAB94E937C9666541C8

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---         4/12/2017   3:18 PM       1073 0.RInData
-a---         4/12/2017   3:18 PM      47163 BxlServer.client.10128.log
-a---         4/12/2017   3:18 PM       5778 BxlServer.master.22047.log
-a---         4/12/2017   3:18 PM       4953 BxlServer.worker.24278.log
-a---         4/12/2017   3:18 PM       4953 BxlServer.worker.24280.log
-a---         4/12/2017   3:18 PM        285 input.R
-a---         4/12/2017   3:18 PM      15439 jobInfo.RData
-a---         4/12/2017   3:18 PM         33 MasterOut.txt
-a---         4/12/2017   3:18 PM        218 MergedOut0.txt
-a---         4/12/2017   3:18 PM        218 MergedOut1.txt
-a---         4/12/2017   3:18 PM       2897 sample-ds1.xdf
-a---         4/12/2017   3:18 PM       3969 tdIn.zip
-a---         4/12/2017   3:18 PM       1342 DEVTD_VM.ROutData

 

SQL Server

All log files for client, master and worker processes are aggregated into client GUID directory on client. Below is an example:

After running the example you can check the log files in the shareDir/<GUID> directory

C:\Users\dev\AppData\Local\Temp\RevoJobs\8C25E32014EE468AAC3F036D1FC2FD6E>ls -la

total 53

drwx------+ 1 4294967295 dev     0 Mar 15 13:34 .
drwx------+ 1 4294967295 dev     0 Mar 15 13:34 ..
-rwx------+ 1 4294967295 dev   988 Mar 15 13:34 0.RInData
-rwx------+ 1 4294967295 dev   159 Mar 15 13:34 BxlServer.client.461160.log
-rwx------+ 1 4294967295 dev   242 Mar 15 13:34 BxlServer.master.520864.log
-rwx------+ 1 4294967295 dev   242 Mar 15 13:34 BxlServer.worker.349880.log
-rwx------+ 1 4294967295 dev  1146 Mar 15 13:34 MergedOut.txt
-rwx------+ 1 4294967295 dev   478 Mar 15 13:34 DEV_DESKTOP.ROutData
-rwx------+ 1 4294967295 dev 15388 Mar 15 13:34 jobInfo.RData
-rwx------+ 1 4294967295 dev  1944 Mar 15 13:34 results.zip
-rwx------+ 1 4294967295 dev     3 Mar 15 13:34 state.txt
-rwx------+ 1 4294967295 dev  1235 Mar 15 13:34 tdIn.zip

With more logging in Microsoft R Server we can get a better picture of analytics execution pipeline and this can greatly help with troubleshooting issues by Microsoft when required. Hopefully this blog clarified the way you can collect logs to share with Microsoft support in case you need any help. Feel free to comment if there are any questions.

For a comprehensive view of all the capabilities in Microsoft R Server 9.1, refer to this blog

Comments (0)

Skip to main content