I’ve posted the translation of this article on HyperForge, and I think it is also worth posting on my blog.
Starting with version 3.2, Hyperic HQ offers a new function called HQ Health. You can find it in the Administration menu: Administration -> Plugins -> HQ Health
HQ Health offers a lot of metrics, such as the memory usage of the server, information about caching parameters and load, and a complete list of all HQ Agents connected to this particular HQ Server.
If you experience connection problems between HQ Server and Agent, or if a platform shows up as unavailable in HQ when it’s definitely available, HQ Health is the first tool I recommend for debugging the problem.
HQ Health offers the following information about connected Agents:
FQDN
This is the identifier of the platform within HQ Server.
Address
IP address used for communication from Server to Agent.
Port
Port on the Agent side for communication from Server to Agent. The default port is 2144. You have to allow connections from HQ Server to the Address and the Port. Checking this by running telnet <Address> <Port> is a good idea. This is an example of a successful connection:
mpluhar@biollante:/tmp$ telnet 192.168.1.114 2144
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET
Connection closed by foreign host.
If you get a timeout or no connection, check your HQ Agent configuration and your firewall and network configuration.
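If you want to script the same check instead of running telnet by hand, here is a minimal sketch in Python. It only tests whether a TCP connection to the Agent's Address and Port can be opened, which is what the telnet test above tells you. The function name agent_port_reachable and the 5 second timeout are my own choices, not anything Hyperic HQ provides.

```python
import socket


def agent_port_reachable(address: str, port: int = 2144, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to address:port can be opened.

    2144 is the default Agent port mentioned above. The timeout value
    is an arbitrary assumption; adjust it for your network.
    """
    try:
        # create_connection resolves the address and performs the TCP handshake.
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return False
```

Run it on the HQ Server host, for example agent_port_reachable("192.168.1.114"), so that the check follows the same network path as the Server. A result of False means the same as a telnet timeout: look at the Agent configuration, the firewall and the network.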
Version
Shows the version number of your HQ Agent. Hyperic HQ is backward compatible, so you may run Hyperic HQ Server 3.2.x with 3.1.x Agents, but I strongly encourage you to run the latest version. If you have any problems with Hyperic HQ, upgrade Server and Agents to the latest version and check the version of your Agents with HQ Health.
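If you have many Agents, comparing the Version column by eye gets tedious. The sketch below shows one way to list Agents that are older than the Server, assuming you have copied the FQDN and Version values from HQ Health into a dictionary yourself. The function names and the example hosts are hypothetical, and I assume plain dotted numeric version strings such as 3.2.1.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string like '3.2.1' into (3, 2, 1).

    Assumes purely numeric components; suffixes are not handled.
    """
    return tuple(int(part) for part in version.split("."))


def outdated_agents(server_version: str, agent_versions: Dict[str, str]) -> List[str]:
    """Return the FQDNs of all Agents running a version older than the Server."""
    server = parse_version(server_version)
    return sorted(
        fqdn
        for fqdn, version in agent_versions.items()
        if parse_version(version) < server
    )


if __name__ == "__main__":
    # Example values typed in by hand from the HQ Health Agent list.
    agents = {"web01.example.com": "3.1.4", "db01.example.com": "3.2.1"}
    print(outdated_agents("3.2.1", agents))
```

Comparing tuples of integers rather than raw strings matters here, because as strings "3.10.0" would sort before "3.9.0".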
Creation Time
Creation Time shows the date when you initially set up the platform within Hyperic HQ Server.
#Platforms
The number of platforms this Agent is collecting data for. For example, an agentless device like an SNMP device is a platform, too.
#Metrics
This is the number of metrics an Agent collects. Try to balance the number of metrics across your HQ Agents, and avoid making a single Agent collect thousands of metrics for all SNMP devices in your network. Otherwise you create a single point of failure: if the Agent platform fails, you’re in trouble monitoring the child devices. Furthermore, if the host is not a dedicated monitoring host, you may degrade the performance of other services running on it.
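To illustrate what balancing could look like, here is a small planning sketch. It is not a Hyperic HQ feature: it simply takes an estimated metric count per SNMP device and hands each device to the Agent with the lowest load so far (a simple greedy heuristic). The device names, Agent names and metric counts are made up for the example.

```python
import heapq
from typing import Dict, List


def assign_devices(device_metrics: Dict[str, int], agents: List[str]) -> Dict[str, List[str]]:
    """Greedily spread devices over Agents so metric counts stay roughly even."""
    # Min-heap of (current metric load, agent name); the lightest Agent is on top.
    heap = [(0, agent) for agent in agents]
    heapq.heapify(heap)
    assignment: Dict[str, List[str]] = {agent: [] for agent in agents}

    # Place the devices with the most metrics first, which usually balances better.
    ordered = sorted(device_metrics.items(), key=lambda item: item[1], reverse=True)
    for device, metric_count in ordered:
        load, agent = heapq.heappop(heap)
        assignment[agent].append(device)
        heapq.heappush(heap, (load + metric_count, agent))
    return assignment


if __name__ == "__main__":
    devices = {"sw1": 400, "sw2": 300, "sw3": 300, "rtr1": 200}
    print(assign_devices(devices, ["agent-a", "agent-b"]))
```

The result is only a suggestion for which Agent should monitor which device. You still configure the platforms in HQ yourself, and afterwards you can verify the outcome in the #Metrics column of HQ Health.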
Time Offset
Time Offset shows the offset in ms between HQ Server and Agent. Time synchronisation between HQ Server and HQ Agents is very important for determining the availability of platforms and services correctly. If you see huge values here, you’re in trouble. Try to set up NTP daemons on your Server and Agent hosts. Of course, you can monitor your NTP daemon with Hyperic HQ and raise an alert if the offset becomes too big. Single- or double-digit values are okay. If you see a question mark, your HQ Agent seems to be inactive.
Currently with Hyperic HQ 3.2.x it’s not possible to run HQ Agents and Server in different time zones.
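The rule of thumb for Time Offset can be written down as a tiny helper, for example if you keep your own notes or a spreadsheet export of the Agent list. This is a sketch under my own assumptions: the function name and the labels are invented, None stands for the question mark shown in HQ Health, and I treat everything with three or more digits as worth a look, although the text above only says that huge values mean trouble.

```python
from typing import Optional


def classify_time_offset(offset_ms: Optional[int]) -> str:
    """Classify a Time Offset value (in ms) from the HQ Health Agent list."""
    if offset_ms is None:
        # HQ Health shows a question mark: the Agent seems to be inactive.
        return "inactive"
    if abs(offset_ms) < 100:
        # Single- or double-digit offsets are okay.
        return "ok"
    # Three or more digits: check NTP on the Server and the Agent host.
    return "check-ntp"


if __name__ == "__main__":
    for value in (None, 7, 42, 15000):
        print(value, classify_time_offset(value))
```

The threshold of 100 ms is just the literal reading of "single or double digit", so adjust it if your environment tolerates more drift.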