Implementation is done through an OS agent running on each Oracle Solaris host (LDOM/CDOM/Global Zone/Zone).
Working modes
- Install OS agents on all Control Domains (CDOMs) only
- Install OS agents on all LDOMs and Global Zones
- Install OS agents on all LDOMs, Global Zones and Zones
1) gets you all CDOM data and a limited performance data set for all its LDOMs (CPU/Mem/Net).
2) brings you more details about each LDOM.
3) adds monitoring of all Zones from the OS point of view.
Installation summary
- Ensure your network allows TCP connections initiated from the OS agents to the XorMon server on port 8162
- Install the OS agent on all LDOMs, CDOMs and Global Zones
- Optionally install the OS agent on all Zones to get additional OS based metrics
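A quick way to verify the firewall rule before installing anything is a manual TCP check from one of the hosts; a minimal sketch, assuming telnet is available (the same check is shown in the Testing connection section below):
$ telnet <XorMon-SERVER> 8162
Connected to <XorMon-SERVER>.
Escape character is '^]'.
If the connection is refused or times out, fix the network path first (exit the session with Ctrl-C or ^]).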
OS agent install on an LDOM/CDOM
- Create the user lpar2rrd with the solaris.ldoms.read authorization
- Installation under root:
# gunzip lpar2rrd-agent-6.00-0.solaris-sparc.tar.gz
# tar xf lpar2rrd-agent-6.00-0.solaris-sparc.tar
# pkgadd -d .
The following packages are available:
1 lpar2rrd-agent LPAR2RRD OS agent 6.00
...
Upgrade (remove the original package first, then install the new one):
# pkgrm lpar2rrd-agent
# pkgadd -d .
- Assign the LDOM/CDOM read authorization solaris.ldoms.read to the user (lpar2rrd) which will run the agent:
# usermod -A solaris.ldoms.read lpar2rrd
Verify the rights are correct: "/sbin/ldm ls -p" must not return "Authorization failed".
# su - lpar2rrd
$ /sbin/ldm ls -p
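If you want to verify the authorization non-interactively, you can grep the command output for the error string; a minimal sketch using only the commands shown above:
# su - lpar2rrd -c "/sbin/ldm ls -p" 2>&1 | grep "Authorization failed" && echo "rights NOT OK - re-run usermod" || echo "rights OK"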
OS agent install on Zone/Global Zone
Use any unprivileged user (preferably lpar2rrd) to install and run the agent.
Use the same Solaris package as in the LDOM example above.
On the x86 platform use the Solaris x86 package: lpar2rrd-agent-6.00-0.solaris-i86pc.tar
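The install itself mirrors the SPARC steps above; a sketch with the x86 package name (assuming the download is gzipped like the SPARC package):
# gunzip lpar2rrd-agent-6.00-0.solaris-i86pc.tar.gz
# tar xf lpar2rrd-agent-6.00-0.solaris-i86pc.tar
# pkgadd -d .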
Testing connection
- Test the connection to the XorMon server
$ /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <XorMon-SERVER>
...
OS agent working for server: <XorMon-SERVER>
store file for sending is /var/tmp/lpar2rrd-agent-<XorMon-SERVER>-lpar2rrd.txt
This means the data has been sent to the server and all is fine.
Here is an example when the agent is not able to send data:
$ /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <XorMon-SERVER>
...
OS agent working for server: <XorMon-SERVER>
store file for sending is /var/tmp/lpar2rrd-agent-<XorMon-SERVER>-lpar2rrd.txt
Agent timed out after : 50 seconds /opt/lpar2rrd-agent/lpar2rrd-agent.pl:265
It means that the agent could not contact the server.
Check communication (whether firewalls are open), DNS resolution of the server, etc.
Schedule the OS agent in lpar2rrd's crontab
# su - lpar2rrd
$ crontab -e
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon-SERVER> > /var/tmp/lpar2rrd-agent.out 2>&1
Replace <XorMon-SERVER> with the hostname of your XorMon server.
You might need to add the lpar2rrd user into cron.allow if the 'crontab -e' command fails (on Solaris: /etc/cron.d/cron.allow; on Linux: /etc/cron.allow; on AIX: /var/adm/cron/cron.allow).
Allow it for the lpar2rrd user as the root user, e.g. on Solaris:
# echo "lpar2rrd" >> /etc/cron.d/cron.allow
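If you prefer to add the schedule non-interactively (e.g. when rolling out to many hosts), you can append the line to the existing crontab; a minimal sketch, assuming the entry is not already present:
$ (crontab -l 2>/dev/null; echo '* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon-SERVER> > /var/tmp/lpar2rrd-agent.out 2>&1') | crontab -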
You will see your Solaris boxes in the UI under the Solaris folder within an hour (Ctrl-F5 in the web browser).
XorMon is able to monitor stand-alone MS Windows servers and MS Hyper-V performance metrics (hosts and VMs).
Implementation is done through a single Windows OS agent running on any Windows host in the Windows domain.
This OS agent gets all required configuration from the AD, and the performance data of monitored hosts through WMI.
It passes that data to the XorMon server, where it is saved and presented.
The LPAR2RRD Windows OS agent is used; it works everywhere PowerShell 3 or higher is available.
It does not directly depend on the Windows version: if you are able to upgrade PowerShell to 3.0+ on older machines, it will work.
Installation summary
- Allow TCP connections initiated from the Windows LPAR2RRD Hyper-V agent server to the XorMon server on port 8162
- Only PowerShell version 3 and higher is supported on Windows hosts
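Both prerequisites can be verified up front in a PowerShell session on the agent host; a minimal sketch (<HOST> is a placeholder for any server that is to be monitored):
PS> $PSVersionTable.PSVersion                                  # Major must be 3 or higher
PS> Get-WmiObject Win32_OperatingSystem -ComputerName <HOST>   # tests WMI access with the agent user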
User creation
- Create the user in the AD with membership in these groups:
- Event Log Readers
- Hyper-V Administrators
- Performance Log Users
- Performance Monitor Users
This must be done for all Hyper-V nodes and MS Windows servers that are supposed to be monitored (this can be set globally in AD).
- Set rights in GPO and AD using this manual
- Give local admin rights to the user on the Windows server where the LPAR2RRD Hyper-V agent will be running.
Add the user into the Domain Users group.
- Assign read-only rights for monitored Hyper-V clusters to the user
OS agent installation
Unzip LPAR2RRD-Win-agent-1.3.3.zip
Unblock these files by right-clicking them and checking "Unblock":
- Setup.vbs
- Configuration.vbs
- LPAR2RRD-agent.ps1
- LPAR2RRD-agent-Configuration.ps1
- LPAR2RRD-agent-Installer.ps1
When 'Unblock' does not help and the installation does not start (the install window immediately disappears), you might need to use Set-ExecutionPolicy to enable running PowerShell scripts for the current user.
Run Setup.vbs
Select the installation directory
Put in the hostname of the XorMon server
Test the connection to the XorMon server
Put in the user which will run the LPAR2RRD OS agent on this machine
Select the monitored mode (agent v1.3.3+)
You can also use manual OS agent installation.
In the case of a cluster, add the cluster name and the names of all nodes as well, like: cluster1,node1,node2,cluster2,node...
- Wait about 30 minutes, then press Ctrl-F5 in your XorMon UI and you should see the Hyper-V folder in the main menu
Monitored modes
The agent can run in these modes:
- leave it in the default mode and monitor just the server where it is installed
- monitor all visible servers from the AD
- monitor only specific servers (recommended), added into the cfg file;
for monitoring of a cluster, add its name to the servers/nodes too, like: cluster1,node1,node2,cluster2,node...
The OS agent is an add-on feature for monitoring from the operating system level.
It monitors CPU and memory utilization, paging, and LAN and SAN traffic on all adapters.
It requires the OS agent deployment to every monitored LPAR.
The agent is written in Perl and calls basic OS commands like vmstat, lparstat and svmon to obtain the required statistics.
You can even use already installed LPAR2RRD agents and direct them to the XorMon host.
Additional information about the OS agent can be found in the
documentation
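For illustration, these are the kinds of commands the agent wraps; the agent's exact invocations and flags may differ:
$ vmstat 1 2     # CPU and paging snapshot
$ lparstat -i    # LPAR configuration and entitlement (AIX)
$ svmon -G       # global memory usage (AIX)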
Prerequisites
OS agent installation (client)
- Get the latest OS agent from the download page (a post-install sanity check is sketched at the end of this section)
- Linux RedHat, Rocky
# rpm -Uvh lpar2rrd-agent-7.60-0.noarch.rpm
# rpm -qa|grep lpar2rrd-agent
lpar2rrd-agent-7.60-0
- AIX / VIOS
# rpm -Uvh lpar2rrd-agent-7.60-0.ppc.rpm
# rpm -qa|grep lpar2rrd-agent
lpar2rrd-agent-7.60-0
- Linux Debian
# apt-get install ./lpar2rrd-agent_7.60-0_all.deb
lpar2rrd-agent-7.60-0
- Schedule its run every minute from the crontab on every LPAR.
This line must be placed into the lpar2rrd user's crontab:
# su - lpar2rrd
$ crontab -e
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon_SERVER.your-domain.com> > /var/tmp/lpar2rrd-agent.out 2>&1
Replace <XorMon_SERVER> with the hostname of your XorMon server.
Preferably use the FQDN as the XorMon hostname; a short hostname might have resolving problems.
If you want to direct the agent data to more servers (or ports), add a second or more hosts on the command line.
The cfg below will collect data once and send it to 3 hosts (Host1 port 8162, Host2 port 8162 and Host3 port 7162):
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <Host1> <Host2> <Host3>:7162 > /var/tmp/lpar2rrd-agent.out 2>&1
- You might need to add the lpar2rrd user into /var/adm/cron/cron.allow (AIX) or /etc/cron.allow (Linux) as the root user if the above "crontab -e" fails:
# echo "lpar2rrd" >> /etc/cron.allow
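Once the package is installed and the cron entry is in place, a quick sanity check confirms the agent is present and producing output; a minimal sketch using only paths from the examples above:
# rpm -q lpar2rrd-agent        # rpm-based systems; on Debian use: dpkg -s lpar2rrd-agent
# ls -l /opt/lpar2rrd-agent/lpar2rrd-agent.pl
# cat /var/tmp/lpar2rrd-agent.out    # should show no errors after the first cron run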
Troubleshooting
Client (agent) side:
- Test that communication through the LAN is allowed:
$ telnet <XorMon_SERVER> 8162
Connected to 192.168.1.1.
Escape character is '^]'.
This is OK; exit with either Ctrl-C or ^].
- Check the following agent files:
data store: /var/tmp/lpar2rrd-agent-*.txt
error log: /var/tmp/lpar2rrd-agent-*.err
output log: /var/tmp/lpar2rrd-agent.out
- Run the agent from the command line:
$ /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <XorMon_SERVER.your-domain.com>
...
Agent send : yes : forced by -d
Agent send slp: sending wait: 4
OS/HMC agent working for server: <XorMon_SERVER>
store file for sending is /var/tmp/lpar2rrd-agent-<XorMon_SERVER.your-domain.com>-lpar2rrd.txt
This means the data has been sent to the server and all is fine.
Here is an example when the agent is not able to send data:
$ /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <XorMon_SERVER.your-domain.com>
...
Agent send : yes : forced by -d
Agent send slp: sending wait: 1
OS/HMC agent working for server: <XorMon_SERVER>
store file for sending is /var/tmp/lpar2rrd-agent-<XorMon_SERVER>-lpar2rrd.txt
Agent timed out after : 50 seconds /opt/lpar2rrd-agent/lpar2rrd-agent.pl:265
It means that the agent could not contact the server.
Check communication (the port, the telnet example above), DNS resolution of the server, etc.
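The checks above can be run in sequence as a minimal troubleshooting sketch (hostnames are placeholders):
$ nslookup <XorMon_SERVER.your-domain.com>    # DNS resolution of the server
$ telnet <XorMon_SERVER> 8162                 # port reachability through firewalls
$ tail /var/tmp/lpar2rrd-agent-*.err          # recent agent-side errors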
Note that you do not need to upgrade the OS agent with every LPAR2RRD server upgrade. Check here if you need any fix or new feature.
- AIX or VIOS: lpar2rrd-agent-XXX.ppc.rpm
- Linux ppc/x86: lpar2rrd-agent-XXX.noarch.rpm
- Linux Debian: lpar2rrd-agent-XXX_all.deb
- Solaris x86: lpar2rrd-agent-XXX.solaris-i86pc.tar.gz
- Solaris Sparc: lpar2rrd-agent-XXX.solaris-sparc.tar.gz
Release notes
- 8.20-0
- added the lpar2rrd-AIX-wrapper.sh wrapper to avoid a Dynatrace OneAgent issue
- Linux: OS info is added
- 8.10-10
- 8.10-7
- AIX: proper detection of Huawei volume id
- Hostname was saved just for AIX; fixed also for x86
- 8.10-6
- Solaris: cpu_sys and cpu_usr were swapped, fixed
- 8.10-4
- AIX: LVM: discovery runs once per 6.5 hours instead of every minute
- 8.10-2
- AIX: LVM: unmounted filesystems are skipped
- AIX LVM: Concurrent VGs were skipped, fixed
- 8.00-7
- Agent sends data after 10 minutes instead of 20 minutes plus a random time (0 - 9 minutes)
- 8.00-5
- 8.00-2
- AIX: getting Hitachi storage volume id
- bulk data transfers (all data lines in a single chunk)
- 7.91-7
- Linux: port WWN is now sent
- 7.91-1
- Linux: skipping disk capacity for /dev/loop*
- 7.90-1
- AIX: fixed a bug from the previous version where FC adapter throughput was missing
- 7.90-0
- Monitoring of AIX FC adapter errors
- 7.60-14
- Monitoring of AIX Eth adapter errors
- 7.60-5
- Excluded Perl dependency on File::Copy which might not exist on Fedora or RHEL
- RHEL 9: lscpu has a different syntax; CPU MHz is not in the standard output and "lscpu -e=MHZ" must be used separately, otherwise CPU core graphs do not work
- AIX: hdisk size is transferred as well
- 7.60-1
- AIX FC errors in graphs (FC physical adapters only)
- Linux: SAN data: support for more FC driver types where some do not provide the counters we use but provide other counters
- Linux: CPU model is sent and saved on the server side
- Support for Solaris 8 (memory usage now works, however without FS cache)
- 7.50-1
- Linux: support for multipath health status alerting
- AIX / Linux: fixes for getting the volume ID of disks
- Solaris: SARMON enhanced with IOPS, MB/sec and latency
- 7.40-1
- Solaris sarmon CPU queue implemented
- Linux: added CPU stolen metric
- 7.30-1
- Linux: added CPU core and GHz utilisation graphs for Linux (x86 only)
- Linux Total: fixed data when there were multiple adapters
- AIX: wrong identification of iSCSI disks in multipath checking
- 7.20-1
- Linux fix: the agent loses the VM UUID (/opt/lpar2rrd-agent/.uuid) after an upgrade (it then cannot map the VM to the OS agent data in the product UI); it is also not refreshed after a reboot.
To avoid it, do not use "rpm -Uvh ...", but remove the original rpm and install the new one like here:
rpm -e lpar2rrd-agent-7.20-0.noarch; rpm -i lpar2rrd-agent-7.20-1.noarch.rpm
For future upgrades you can again use rpm -Uvh ...
- Solaris10: fixed an issue with zones in the prstat command (agent not sending data)
- 7.20-0
- Linux: new total IOPS, Data and Latency graphs
- Linux multipath: using sudo for the "multipath" command (only root can run it)
- 7.00-4
- AIX: SAN: fcstat is parsed correctly even if there are NVMe stats
- AIX: fixed SMT number recognition
- Solaris10: fix for: swap -lk
- 7.00-1
- Linux: fix in sending of volume_id; not all disks were selected
- 7.00-0
- 6.15-2
- AIX VIOS: unused LAN adapters might produce "SEA LOOP" error log entries, fixed
- AIX rpm install error fixed: "Error in PREUN scriptlet in rpm package lpar2rrd-agent-6.XX..."
- Solaris CPU queue stats: the number of virtual CPUs was wrong
- 6.15-0
- AIX/Linux/Solaris: support of CPU queue statistics
- 6.11-2
- Solaris: support for new menu structures in the UI
- 6.10-0
- Solaris: support of Pools
- NMON: support of Linux on Power
- Linux: shmmem fix
- 6.01-0
- AIX: fix for filesystem utilization; it was not reported when NFS filesystems were present
- Linux: fix for filesystem utilization; problem with long filesystem names
- 6.00-0
- AIX WLM (Workload Manager) monitoring
- Monitoring and alerting for filesystem space utilization
- 5.05-7
- AIX pinned memory did not count large pages (an issue since the "vmstat -v" implementation for mem data in 4.95-7)
- 5.05-6
- AIX WPAR memory was fixed (an issue since the "vmstat -v" implementation for mem data in 4.95-7)
- 5.05-5
- Linux memory: buffers memory has been added to cached (FS cache)
- 5.05-2
- VMware support for RHEL 5.x where the UUID is not accessible from /proc
- 5.05-0
- VMware support: OS agents running on VMs are mapped directly to the VMs in the VMware menu tree
- JOB TOP: graphical tracking of CPU and memory of running processes over time
- 5.00-4
- AIX: skipping fcs interfaces which have scsi_id=0x0 (lsattr -El); they might lead to hanging fcstat and entstat commands
- 5.00
- 4.96
- Linux: LAN write throughput was wrong; it showed the same value as read
- AIX: added support for native SCSI adapters: sisioa0
- 4.95-7
- Linux: switched to using `cat /proc/meminfo` instead of `free` for memory stats
- AIX: memory stats switched to "vmstat -v" instead of "svmon -G" as before (svmon counts FS cache in a different way)
- AIX LAN enhanced with in/out packet counts; presented in the UI under the LAN tab (below the usual MB/sec graphs, LPAR2RRD server v4.95-6+)
- 4.95-4
- Solaris: fixed a few issues with memory reporting on SPARC HW
- WPAR: avoids commands which do not work in a WPAR environment and just report errors in the log
- 4.90-0
- Network statistics ported to CentOS 7 & Red Hat 7 (new "ifconfig -a" format)
- 4.84-3
- Randomization of agent connections to LPAR2RRD could lead to long delays; introduced a max time (20 mins) for pushing data
- 4.84-2
- The HMC agent could stop working under some conditions since agent version 4.81, fixed
- 4.84-1
- Fixed memory graphs for some Linux x86 systems like CentOS 6 which had wrong (negative) data. Most Linux x86 systems were not affected
- 4.84-0
- 4.81-1
- 4.80 and 4.81 reported FCS adapters only if a disk was locally attached, which NPIV and tape-dedicated FCS are not
- 4.81
- The OS agent contacts the daemon at a random point within 5 minutes to spread daemon connections over time
- 4.80
- vSCSI adapter support
- response time monitoring for all vSCSI and FCS adapters (requires LPAR2RRD server 4.74+)
- SMT info from LPARs is passed to the server to support SMT-based CPU Workload Estimations
- as prevention against hanging external shell commands executed from the agent, the number of concurrently running copies of each particular external command is limited
- SEA fix (it did not send SEA data for some NMON versions which use a slightly different keyword in the nmon file)
- introduced a variable which prefers the hostname over the LPAR name in data sent to the daemon (useful for NMON loads from unmanaged full LPAR partitions where it is not possible to change the LPAR name):
LPAR2RRD_HOSTNAME_PREFER=1 /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -n /home/nmon_dir <LPAR2RRD server>
- 4.70
- fixed a paging issue on AIX where paging of 64kB pages was counted as standard 4kB pages (real paging was higher than displayed)
- many other small fixes; upgrade to 4.70 if possible
- 4.60
- implemented support for the integrated NMON online grapher (aka nmon.lpar2rrd.com) in your local instance
- a couple of other small fixes; you may keep using 4.50 if it is working fine
- 4.50
- NMON data support
- It is able to use already generated NMON data as a source (online and batch mode processing is possible) instead of using agent-called commands to get statistical data.
- It can use NMON data and its own data as sources concurrently.
- It is able to process collected NMON data from many LPARs in one location as a batch job once a day.
- WPAR support
- HMC monitoring
- It is possible to feed more than one LPAR2RRD server concurrently via one OS agent
- 4.07
- fixes FCS problems (it avoids non-connected fcs adapters which caused problems like long-running queries and errpt warnings)
- 4.06
- it uses an internal timestamp to record the time of the last data saved on the server side, then sends newer data only
- 4.05
- limits concurrently running OS agents to 10 on a single OS (so as not to exhaust the maximum number of running processes if anything gets stuck)
- possibility to exclude fcstat (SAN stats) when the command causes problems; crontab line:
* * * * * FCSTAT=/bin/true; export FCSTAT; /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD-SERVER> <PORT> > /var/tmp/lpar2rrd-agent.out 2>&1
- 4.04
- fix for LPARs (servers) not managed by the HMC; it did not work correctly before
- 4.03
- fix for AME support; it did not work in 4.00, and the LPAR2RRD server must also be upgraded to at least the 4.03 release
- fix for memory usage data if AMS is being used
- 4.00
- New communication protocol (requires 4.00+ LPAR2RRD server)
- Data collection enhanced with LAN (eth), SAN (fcs), SEA (VIOS only), allocated paging, and AME
- 1.02
- initial release which supports memory and paging data only