SQL Server on Linux: Quick Performance Monitoring

I have been asked several times about how to get a Performance Monitor like view on Linux.   There are lots of Linux tools available (top, iotop, Grafana, and SQL Sentry just scratch the surface of available options) to monitor the Linux system.  Allow me to share one such example to capture and monitor a system.

Performance Co-Pilot

PMCHART can be used much like you use performance monitor. You can select metrics to capture, record, replay them both locally and remotely. This only requires (pcp and pcp-gui) installs. The web service provides access from Vector and other utilities.  https://pcp.io/docs/lab.pmchart.html   The sample scripts in this post have common install package instructions.

Example: pmchart -c Memory -c CPU -c Disk -c Overview -c Paging -t 5 -h MyMachine

Screen shots from Centos and Ubuntu VMs running on my MacBook

image

pmrep - convert to CSV or other formats:https://pcp.io/man/man1/pmrep.1.html

Vector (Remote Visualization via Browser)

Getting Started: Using Performance Co-Pilot and Vector for Browser Based Metric Visualizations: https://rhelblog.redhat.com/2015/12/18/getting-started-using-performance-co-pilot-and-vector-for-browser-based-metric-visualizations/

Vector

The following installs the web server and vector (static HTML app) If you want just local monitoring. 

Ubuntu

apt-get install nginx
cd /var/www/html
wget https://dl.bintray.com/netflixoss/downloads/1.1.0/vector.tar.gz
tar xvzf vector.tar.gz
Connect to https://localhost:80

Centos/RHEL

yum install cp-webapi pcp-webjs pcp-gui
chkconfig pmwebd on; service pmwebd start
firefox https://localhost:44323/ &

Reference: https://pcp.io/download.html

Is PCP Running?

service pcp status
service pmcd status
service pmlogger status
service pmproxy status
service pmwebd status
netstat -ltp | grep "/p"

tcp 0 0 *:4330 *:* LISTEN 32018/pmlogger
tcp 0 0 *:44321 *:* LISTEN 28269/pmcd
tcp 0 0 *:44322 *:* LISTEN 32199/pmproxy
tcp 0 0 *:44323 *:* LISTEN 24370/pmwebd
tcp6 0 0 [::]:4330 [::]:* LISTEN 32018/pmlogger
tcp6 0 0 [::]:44323 [::]:* LISTEN 24370/pmwebd

Sample Bash Capture Script and GUI Viewing

#======================================================
# Capture performance counters in a log (archive) using pcp
# $1 = Minutes to capture
# $2 = HostName
# $3 = Replay archived data with pmchart - 0 or 1
# $4 = Watch live data with pmchart - 0 or 1
#
# Example ./pcpcapture.sh 10 MyMachine 1 1
#
# Ubuntu Install
# ==============
# sudo su
# apt-get update
# apt-get -y update
# curl --silent 'https://bintray.com/user/downloadSubjectPublicKey?username=pcp' | sudo apt-key add -
# echo "deb https://dl.bintray.com/pcp/xenial xenial main" | sudo tee -a /etc/apt/sources.list
# apt-get update
# apt-get install pcp pcp-webapi
# service pcp start
# service pmcd start
# service pmlogger start
# service pmproxy start
# service pmwebd start
# netstat -ltp | grep "/p"
#
# Centos Install
# ==============
# yum install pcp pcp-gui
# service pmcd start
# service pmlogger start
# service pmie start
#
#======================================================
now=$(date +"%m_%d_%Y_%H_%M_%S")
tmpdir=/tmp/pcpcapture/$now
folio=$tmpdir/$now.folio
view=$tmpdir/$now.view
config=$tmpdir/$now.config
rm -rf /tmp/pcpcapture
mkdir -p $tmpdir

#======================================================
# Create Config file
#
echo '#pmlogger Version 1' > $config
echo >> $config
echo 'log mandatory on default {' >> $config
echo ' kernel.all.cpu.user' >> $config
echo ' kernel.all.cpu.sys' >> $config
echo ' kernel.all.cpu.nice' >> $config
echo ' kernel.all.cpu.intr' >> $config
echo ' kernel.all.cpu.wait.total' >> $config
echo ' kernel.all.cpu.steal' >> $config
echo ' kernel.all.cpu.idle' >> $config
echo '}' >> $config
echo 'log mandatory on once {' >> $config
echo ' hinv.ncpu' >> $config
echo '}' >> $config
echo 'log mandatory on default {' >> $config
echo ' kernel.all.load [ "1 minute" ]' >> $config
echo '}' >> $config
echo 'log mandatory on default {' >> $config
echo ' disk.all.read' >> $config
echo ' disk.all.write' >> $config
echo '}' >> $config
echo 'log mandatory on default {' >> $config
echo ' network.interface.in.bytes [ "eth2" ]' >> $config
echo ' network.interface.in.bytes [ "eth1" ]' >> $config
echo ' network.interface.in.bytes [ "docker0" ]' >> $config
echo ' network.interface.in.bytes [ "eth0" ]' >> $config
echo ' network.interface.out.bytes [ "eth2" ]' >> $config
echo ' network.interface.out.bytes [ "eth1" ]' >> $config
echo ' network.interface.out.bytes [ "docker0" ]' >> $config
echo ' network.interface.out.bytes [ "eth0" ]' >> $config
echo '}' >> $config
echo 'log mandatory on default {' >> $config
echo ' mem.util.cached' >> $config
echo ' mem.util.bufmem' >> $config
echo ' mem.util.other' >> $config
echo ' mem.util.free' >> $config
echo '}' >> $config
echo 'log mandatory on default {' >> $config
echo ' mem.util.cached' >> $config
echo ' mem.util.bufmem' >> $config
echo ' mem.util.other' >> $config
echo ' mem.util.free' >> $config
echo '}' >> $config
echo 'log mandatory on default {' >> $config
echo ' kernel.all.cpu.user' >> $config
echo ' kernel.all.cpu.sys' >> $config
echo ' kernel.all.cpu.nice' >> $config
echo ' kernel.all.cpu.intr' >> $config
echo ' kernel.all.cpu.wait.total' >> $config
echo ' kernel.all.cpu.steal' >> $config
echo ' kernel.all.cpu.idle' >> $config
echo '}' >> $config
echo 'log mandatory on default {' >> $config
echo ' disk.all.read' >> $config
echo ' disk.all.write' >> $config
echo '}' >> $config
echo 'log mandatory on default {' >> $config
echo ' swap.pagesout' >> $config
echo ' swap.pagesin' >> $config
echo '}' >> $config

#======================================================
# Create Folio file
#
echo PCPFolio > $folio
echo Created: $now >> $folio
echo Creator: pmchart $view >> $folio
echo Version: 1 >> $folio
echo Archive: $2 $tmpdir/$now >> $folio

#======================================================
# Create View file
#
echo '#kmchart' > $view
echo version 1 >> $view
echo global width 1300 >> $view
echo global height 600 >> $view
echo global points 60 >> $view
echo global xpos 95 >> $view
echo global ypos 45 >> $view
echo >> $view
echo 'chart title "CPU Utilization [%h]" style utilization' >> $view
echo ' plot legend "User" color #2d2de2 metric kernel.all.cpu.user' >> $view
echo ' plot legend "Sys" color #e71717 metric kernel.all.cpu.sys' >> $view
echo ' plot legend "Nice" color #c2f3c2 metric kernel.all.cpu.nice' >> $view
echo ' plot legend "Intr" color #cdcd00 metric kernel.all.cpu.intr' >> $view
echo ' plot legend "Wait" color #00cdcd metric kernel.all.cpu.wait.total' >> $view
echo ' plot legend "Steal" color #fba2f5 metric kernel.all.cpu.steal' >> $view
echo ' plot legend "Idle" color #16d816 metric kernel.all.cpu.idle' >> $view
echo 'chart title "Average Load [%h]" style plot antialiasing off' >> $view
echo ' plot legend "1 min" color #ffff00 metric kernel.all.load instance "1 minute"' >> $view
echo ' plot legend "# cpus" color #0000ff metric hinv.ncpu' >> $view
echo 'chart title "IOPS over all Disks [%h]" style stacking' >> $view
echo ' plot legend "Reads" color #ffff00 metric disk.all.read' >> $view
echo ' plot legend "Writes" color #ee82ee metric disk.all.write' >> $view
echo 'chart title "Network Interface Bytes [%h]" style stacking' >> $view
echo ' plot legend "in %i" color #ffff00 metric network.interface.in.bytes instance "eth2"' >> $view
echo ' plot legend "in %i" color #0000ff metric network.interface.in.bytes instance "eth1"' >> $view
echo ' plot legend "in %i" color #ff0000 metric network.interface.in.bytes instance "docker0"' >> $view
echo ' plot legend "in %i" color #008000 metric network.interface.in.bytes instance "eth0"' >> $view
echo ' plot legend "out %i" color #ee82ee metric network.interface.out.bytes instance "eth2"' >> $view
echo ' plot legend "out %i" color #aa5500 metric network.interface.out.bytes instance "eth1"' >> $view
echo ' plot legend "out %i" color #666666 metric network.interface.out.bytes instance "docker0"' >> $view
echo ' plot legend "out %i" color #aaff00 metric network.interface.out.bytes instance "eth0"' >> $view
echo 'chart title "Real Memory Usage [%h]" style stacking' >> $view
echo ' plot color #9cffab metric mem.util.cached' >> $view
echo ' plot color #fe68ad metric mem.util.bufmem' >> $view
echo ' plot color #ffae2c metric mem.util.other' >> $view
echo ' plot color #00ff00 metric mem.util.free' >> $view
echo 'chart title "Real Memory Usage [%h]" style stacking' >> $view
echo ' plot color #9cffab metric mem.util.cached' >> $view
echo ' plot color #fe68ad metric mem.util.bufmem' >> $view
echo ' plot color #ffae2c metric mem.util.other' >> $view
echo ' plot color #00ff00 metric mem.util.free' >> $view
echo 'chart title "CPU Utilization [%h]" style utilization' >> $view
echo ' plot legend "User" color #2d2de2 metric kernel.all.cpu.user' >> $view
echo ' plot legend "Kernel" color #e71717 metric kernel.all.cpu.sys' >> $view
echo ' plot legend "Nice" color #c2f3c2 metric kernel.all.cpu.nice' >> $view
echo ' plot legend "Intr" color #cdcd00 metric kernel.all.cpu.intr' >> $view
echo ' plot legend "Wait" color #00cdcd metric kernel.all.cpu.wait.total' >> $view
echo ' plot legend "Steal" color #fba2f5 metric kernel.all.cpu.steal' >> $view
echo ' plot legend "Idle" color #16d816 metric kernel.all.cpu.idle' >> $view
echo 'chart title "IOPS over all Disks [%h]" style stacking' >> $view
echo ' plot legend "Reads" color #ffff00 metric disk.all.read' >> $view
echo ' plot legend "Writes" color #ee82ee metric disk.all.write' >> $view
echo 'chart title "VM Activity - Page Migrations [%h]" style plot' >> $view
echo ' plot legend "Out" color #ffff00 metric swap.pagesout' >> $view
echo ' plot legend "In" color #0000ff metric swap.pagesin' >> $view

#======================================================
# Watch live data
#
if [ $4 == 1 ]; then
echo Launching live data graphs
pmchart -h $2 -c Overview -c CPU -c Memory -c Disk -c Paging &
fi

#======================================================
# Start the logging on the designated host
#
echo "====================================================="
echo "Logging performance values on host $2 for for $1 minutes"
echo "====================================================="
echo "---------- To check logging status use: pmlc -h $2"
pmlogger -u -T $1mins -r -c $config -h $2 -l $tmpdir/pcplogger.log -t 5.0 $tmpdir/$now

#======================================================
# Replay captured data
#
if [ $3 == 1 ]; then
echo Launching data replay
pmafm $folio check
pmafm $folio list
pmdumplog -all -d -i -m $tmpdir/$now.0
pmafm $folio replay &
fi

Sample Bash Capture Script To CSV and TXT

#======================================================
tmpdir=/home/user/pcpcapture
stopFile=$tmpdir/Collect.Stop

if [ "${1}" = "STARTDETACHED" ]; then
echo "Launching pcpcapture DETACHED with nohup"

    launchPath=`pwd`
rm -f $launchPath/nohup.out
rm -f $launchPath/pcpcapture.log
nohup $launchPath/pcpcapture.sh START & > $launchPath/pcpcapture.log

    # Make sure nohup has launched DETACHED
#
echo "Launched capture"
sleep 6
exit 0
fi

#======================================================
# WaitForExit(name)
#
function WaitForExit()
{
name=$1

    pid=`ps aux | grep ${name} | grep -v grep | awk '{print $2}'`
while [ "$pid" != "" ];
do
kill -SIGUSR1 $pid
sleep 1
pid=`ps aux | grep pmrep | grep -v grep | awk '{print $2}'`
done
}

#======================================================
# STOP or START
#
if [ "${1}" = "STOP" ]; then
touch $stopFile
WaitForExit pmrep
WaitForExit pmstat
else
touch $stopFile
WaitForExit pmrep
WaitForExit pmstat

    # Remove any previous files
#
rm -f $stopFile
rm -f $tmpdir/.Metrics*
rm -f $tmpdir/.Stats*

    mkdir -p $tmpdir

    #======================================================
# Output all counters for import into performance warehouse
# https://access.redhat.com/articles/2372811
#
runtime="-s 20 -t 15s"
host=`hostname`

    while [ ! -f $stopFile ];
do
now=$(date +"%m_%d_%Y_%H_%M_%S")
echo "$now Starting performance capture"

        counters=`pmprobe | awk 'BEGIN{FS=".";} {print $1}' | sort | uniq | awk '{print}' ORS=' '`
counters+=" :pidstat :vmstat-w :proc-cpu-ext :proc-io-ext :proc-mem-ext :proc-info-ext"

        echo "Launching: pmrep -x $runtime -k -p --dynamic-header -F $tmpdir/.MetricsCapture.$now.$host.csv -o csv ${counters}"
nohup pmrep -x $runtime -k -p --dynamic-header -F $tmpdir/.MetricsCapture.$now.$host.csv -o csv ${counters} & > $tmpdir/.MetricsCapture.$now.$host.log

        #======================================================
# https://man7.org/linux/man-pages/man1/pmstat.1.html
        #
echo "Launching: pmstat -x $runtime -x > $tmpdir/.StatsCapture.$now.$host.txt"
pmstat -x $runtime -x > $tmpdir/.StatsCapture.$now.$host.txt

        WaitForExit pmrep

        echo "Renaming files"
mv $tmpdir/.MetricsCapture.$now.$host.csv $tmpdir/MetricsCapture.$now.$host.csv
mv $tmpdir/.StatsCapture.$now.$host.txt $tmpdir/StatsCapture.$now.$host.txt
done

fi

Documentation References

https://www.unix.com/man-page/centos/1/pmlogger/
https://www.systutorials.com/docs/linux/man/1-PCPIntro/
https://menehune.opt.wfu.edu/Kokua/Irix_6.5.21_doc_cd/usr/share/Insight/library/SGI_bookshelves/SGI_Admin/books/PCP_IRIX/sgi_html/ch06.html
https://hostpresto.com/community/tutorials/how-to-monitor-your-server-performance-with-pcp-and-vector-on-ubuntu-14-04/
https://groups.google.com/forum/#!topic/vector-users/aFZn7BzxMrU
https://medium.com/netflix-techblog/introducing-vector-netflixs-on-host-performance-monitoring-tool-c0d3058c3f6f
https://hostpresto.com/community/tutorials/how-to-monitor-your-server-performance-with-pcp-and-vector-on-ubuntu-14-04/
https://vectoross.io/

Bob Dorr - Principal Software Engineer SQL Server