The OS agent provides additional metrics that can only be obtained at the operating system level.
CPU | CPU Queue | Memory | LAN | SAN | SAN IOPS | SAN Latency
OS agent metrics and features
- OS CPU utilization of user/sys/IO wait/idle in %
- CPU queue: load average, blocked processes / raw / direct IO
- Memory utilization of used/FS cache/free memory in MB
- Paging rate in MB/sec
- Paging space utilization in %
- SAN (FC & vSCSI) throughput per adapter
  - data in MB/sec
  - IO/sec
  - response time (latency)
  - error
- LAN (ethernet) throughput per adapter
  - data in MB/sec
  - packet count
  - error
- Total IO throughput (Linux)
  - IOPS
  - data in MB/sec
  - response time (latency)
- Filesystem capacity utilization
- AIX SEA (Shared Ethernet Adapter) throughput per adapter in MB/sec (IBM Power only)
- SAN multipath monitoring
- JOB TOP: CPU and memory tracking of running processes visually over time
Operating systems
- AIX 5.1+
- Linux on Power
- Linux x86
Implementation
It is implemented as a simple client/server application.
A XorMon NG daemon listens on port 8162 on the host where the XorMon NG server is running.
Each LPAR has a simple Perl-based agent installed. This agent is started every minute from the crontab and saves memory and paging statistics into a temporary file.
The agent contacts the server every 5 - 20 minutes and sends all locally stored data for that period.
Agent prerequisites
- Perl interpreter. Most Unix/Linux systems include Perl in the base installation.
- The agent may run under any user account; it does not need any special privileges in the OS.
- Open TCP communication from each LPAR to the XorMon NG server on port 8162.
- Connections are initiated from the monitored AIX / Linux hosts only.
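The prerequisites above can be checked from a monitored host before the agent is installed. The following is a minimal sketch, not part of the agent: the function names `have_cmd` and `can_connect` are made up for illustration, and it assumes `bash` and a `timeout` command (GNU coreutils) are available, which is typical on Linux but not guaranteed on AIX.

```shell
# Pre-flight check for the agent prerequisites (illustrative helper names).

# Succeeds when the given command is found on PATH, e.g. "have_cmd perl".
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds when a TCP connection to host:port can be opened within 5 seconds.
# Relies on bash's /dev/tcp redirection, so no extra network tools are needed.
# The port defaults to 8162, the standard port named in this document.
can_connect() {
    local host=$1 port=${2:-8162}
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example with a hypothetical server name:
#   have_cmd perl && can_connect xormon.example.com && echo "prerequisites OK"
```

Because connections are initiated by the agent only, the check needs to be run on the monitored host, not on the server.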
Usage
perl lpar2rrd-agent.pl [-s ] [-d] [-c] [-n ] [-b ] [-i ] <XorMon NG server hostname/IP>[:<PORT>]
-d forces sending out data immediately to check communication channel (DEBUG purposes)
-c agent collects & sends only internal HMC data
-n agent sends only NMON data from NMON directory <NMON_DIR>
-b path to Hitachi HvmSh API
-i IP address of HVM (Hitachi Virtualization Manager)
-t <max send time in seconds>
-s <step in seconds>; do not set below 60, and update the crontab line accordingly, e.g. -s 300 means */5 in the crontab minutes field
-m use sudo for multipath (only root can run it): "sudo multipath -l"; add this to sudoers: lpar2rrd ALL = (root) NOPASSWD: /usr/sbin/multipath -ll
options -c and -n are mutually exclusive
options -b and -i are both required for Hitachi agent
no option - agent collects & sends standard OS agent data
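The relation between the -s step and the crontab minutes field can be worked out mechanically. The sketch below is only an illustration: `step_to_cron_minutes` is a hypothetical helper, not something shipped with the agent, and it assumes the step should be a whole number of minutes that divides an hour evenly so that cron fires at regular intervals.

```shell
# Print the crontab minutes field that matches an agent step given in seconds.
# Returns non-zero for steps below 60 seconds or steps that cannot be
# expressed as an even */N minute schedule.
step_to_cron_minutes() {
    local step=$1 minutes
    # Accept plain decimal numbers only (no sign, no leading zero).
    case $step in ''|0*|*[!0-9]*) return 1 ;; esac
    [ "$step" -ge 60 ] || return 1              # the documentation says not to go below 60
    [ $((step % 60)) -eq 0 ] || return 1        # cron granularity is one minute
    minutes=$((step / 60))
    [ "$minutes" -lt 60 ] || return 1           # the minutes field only covers 0-59
    [ $((60 % minutes)) -eq 0 ] || return 1     # keep intervals even across the hour
    if [ "$minutes" -eq 1 ]; then
        echo '*'
    else
        echo "*/$minutes"
    fi
}

# Example: step_to_cron_minutes 300 prints */5
```

With -s 300 the agent would therefore be scheduled with `*/5 * * * *` instead of `* * * * *`.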
Crontab entry for scheduling; preferably use a non-admin account
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
The agent collects data and sends it to the XorMon NG server every 5 - 20 minutes.
If you use a port other than the standard XorMon NG port, append it to the server hostname/IP, separated by ':'
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server hostname/IP>:<PORT> > /var/tmp/lpar2rrd-agent.out 2>&1
To send data to several XorMon NG server instances (the number is not restricted), list them all
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server 1 hostname/IP> <XorMon NG server 2 hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
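The host[:port] notation used in the entries above can be illustrated with a small parser. This is a sketch for checking crontab arguments by hand, not the agent's own parsing code: `parse_target` is a hypothetical name, the fallback to 8162 follows the standard port given in this document, and IPv6 literals (which contain colons themselves) are not handled.

```shell
# Split a target of the form host[:port] and print "host port".
# Falls back to port 8162 when no port is given.
# Returns non-zero when the host is empty or the port is not numeric.
parse_target() {
    local target=$1 host port
    case $target in
        *:*) host=${target%:*}; port=${target##*:} ;;
        *)   host=$target;      port=8162 ;;
    esac
    case $port in ''|*[!0-9]*) return 1 ;; esac
    [ -n "$host" ] || return 1
    echo "$host $port"
}

# Example with hypothetical server names, one call per configured server:
#   for t in xormon1.example.com xormon2.example.com:9000; do
#       parse_target "$t"
#   done
```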
Enhanced setting
- By default, the agent sends data to the XorMon NG server at random intervals of 5 - 20 minutes. You can specify a maximum time limit for sending data with -t; the minimum is 5 minutes.
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -t <max send time in seconds> <XorMon NG server hostname/IP>
- How to avoid SAN checks via fcstat (they may cause problems, although this should not happen in v4.50+)
* * * * * FCSTAT=/bin/true /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
- By default, only interfaces that have an IP address assigned are reported. This can be overridden with the XorMon NG_LAN_INT environment variable, which selects the interfaces to report; regular expressions are supported on Linux only. Be careful not to stack interfaces from different virtualization levels in one graph: counting the same traffic more than once inflates the presented traffic.
* * * * * XorMon NG_LAN_INT="eth.*0$,bond.*,rhevm,9.*" /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
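Since -t takes seconds while the limits in this section are stated in minutes, a small conversion with a sanity check can help when editing the crontab. The sketch below is an assumption-labelled illustration: `send_time_seconds` is a hypothetical helper, and the only rule it enforces is the 5-minute minimum stated above.

```shell
# Convert a maximum send time in minutes to the seconds value expected by -t.
# Returns non-zero for non-numeric input or values below the 5-minute minimum.
send_time_seconds() {
    local minutes=$1
    # Accept plain decimal numbers only (no sign, no leading zero).
    case $minutes in ''|0*|*[!0-9]*) return 1 ;; esac
    [ "$minutes" -ge 5 ] || return 1
    echo $((minutes * 60))
}

# Example: send_time_seconds 10 prints 600, to be used as "-t 600"
```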
Debug