Monitor | SNMP OID | Details |
|
System Health State | .1.3.6.1.4.1.2.3.51.2.2.7.1.0 | This monitor reports the health of the system in which the ASM resides. It returns one of the following values:
- critical: a severe error has occurred and the system may not be functioning.
- nonCritical: an error has occurred, but the system is currently functioning properly.
- systemLevel: a condition has occurred that may change the state of the system in the future, but the system is currently working properly.
- normal: the system is operating normally.
|
(LED) Blade Health State | .1.3.6.1.4.1.2.3.51.2.2.8.2.1.1.5 | The overall health of the server blade is monitored by querying this OID. It returns one of the following values: 1 = good, 2 = warning, 3 = bad. |
Switch Module Health State | .1.3.6.1.4.1.2.3.51.2.22.3.1.1.1.15 | The switch module carries information to and from the BladeCenter modules over Ethernet. The LED status of the switch module indicates its health. Performance degradation of this module is detected by querying this variable, which returns: 0 = unknown, 1 = good, 2 = warning, 3 = bad. |
Health State for the Power Module | .1.3.6.1.4.1.2.3.51.2.2.4.1.1.3 | The power health module contains the power health information for each power module. This module is responsible for the cooling of the chassis. This monitor returns the state of the power module: 0 = unknown, 1 = good, 2 = warning, 3 = not available. |
|
Temperature | .1.3.6.1.4.1.2.3.51.2.2.1.1.2.0 | The chassis temperature, which rises with the heat generated by the active blades, must be kept at an acceptable level, and the administrator must be notified if it exceeds a set threshold. The whole unit is sometimes shut down when the temperature exceeds its limit, which makes monitoring it essential. The BladeServer functions optimally only with effective chassis cooling.
This monitor reports the module temperature in degrees centigrade (C). |
- Blower 1 State
- Blower 2 State
| - .1.3.6.1.4.1.2.3.51.2.2.3.10
- .1.3.6.1.4.1.2.3.51.2.2.3.11
| Performance degradation of the blower is determined by watching this monitor. When queried, it returns one of the following outputs: 0 = unknown, 1 = good, 2 = warning, 3 = bad. |
Blade Power State (LED) | .1.3.6.1.4.1.2.3.51.2.2.8.2.1.1.4 | This monitor indicates whether the particular server blade is powered on or not. When queried, it returns a value of 0 if powered off and 1 if powered on. |
- Blower 1 Speed
- Blower 2 Speed
| - .1.3.6.1.4.1.2.3.51.2.2.3.1
- .1.3.6.1.4.1.2.3.51.2.2.3.2
| The blowers cool the chassis. This monitor reports the blower speed as a percentage (%) of maximum RPM. The value is an octet string of the form 'ddd% of maximum', where d is a decimal digit or a blank space in place of a leading zero. If the blower is not running or its speed cannot be determined, the string reads 'Offline'. The system will be shut down if condition (4) occurs. |
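
The integer codes and the blower speed string described in the rows above could be turned into readable values along the following lines. This is a minimal sketch, not the product's own logic: the value tables are copied from the Details column, while the names `STATE_LABELS`, `decode_state`, and `parse_blower_speed` are hypothetical, and the raw values are assumed to have already been fetched by whatever SNMP client is in use.

```python
import re
from typing import Optional

# Value tables transcribed from the Details column of the table above.
# The dictionary keys are illustrative names, not identifiers from the MIB.
STATE_LABELS = {
    "blade_health": {1: "good", 2: "warning", 3: "bad"},
    "switch_module_health": {0: "unknown", 1: "good", 2: "warning", 3: "bad"},
    "power_module_health": {0: "unknown", 1: "good", 2: "warning", 3: "not available"},
    "blower_state": {0: "unknown", 1: "good", 2: "warning", 3: "bad"},
    "blade_power": {0: "powered off", 1: "powered on"},
}


def decode_state(monitor: str, value: int) -> str:
    """Translate a raw integer into the label documented for that monitor."""
    return STATE_LABELS[monitor].get(value, f"undocumented ({value})")


# Matches strings such as '100% of maximum' or ' 67% of maximum'.
_SPEED_RE = re.compile(r"^\s*(\d{1,3})\s*%")


def parse_blower_speed(raw: str) -> Optional[int]:
    """Return the speed as a percentage, or None for 'Offline' or unparseable input."""
    text = raw.strip()
    if text.lower().startswith("offline"):
        return None
    match = _SPEED_RE.match(text)
    return int(match.group(1)) if match else None
```

For example, `decode_state("blower_state", 2)` yields `"warning"`, and `parse_blower_speed(" 67% of maximum")` yields `67`. Returning `None` for 'Offline' keeps the "not running or unknown" case distinct from a genuine reading of 0%.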
System Resource & Traffic Monitors | | |
CPU Utilization |
| Each blade has its own CPU, memory, and disk, and these are the critical system resources that need monitoring on the individual blades. Most blades implement the Host Resources MIB. Depending on the OS running on the blades, you can also monitor these resources using CLI or WMI. |
Memory Utilization |
|
|
Disk Utilization |
|
|
Rx Traffic | .1.3.6.1.2.1.2.2.1.10 | Rx utilization is the percentage of the network bandwidth currently used by received traffic. Consistently high utilization indicates bottlenecks in the network and calls for further troubleshooting.
|
Tx Traffic | .1.3.6.1.2.1.2.2.1.16 | Tx utilization is the percentage of the network bandwidth used by transmitted traffic. Again, high utilization indicates network performance bottlenecks. In-depth traffic analysis using the NetFlow module helps identify the cause and free up bandwidth quickly. |
Rx/Tx Errors | Rx - .1.3.6.1.2.1.2.2.1.14 Tx - .1.3.6.1.2.1.2.2.1.20 | The number of inbound (Rx) or outbound (Tx) packets that contained errors preventing them from being delivered to a higher-layer protocol. |
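
The Rx and Tx traffic OIDs listed above are the standard interface octet counters (`ifInOctets` and `ifOutOctets`), so a utilization percentage has to be derived from two successive samples. The table does not say how the product performs this calculation; the sketch below shows one common approach, assuming 32-bit counters, a known polling interval, and a link speed in bits per second (for instance from `ifSpeed`). The function names are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # assumes the 32-bit counters polled via the OIDs above


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Difference between two counter samples, tolerating a single wrap-around."""
    return (current - previous) % modulus


def utilization_percent(prev_octets: int, curr_octets: int,
                        interval_seconds: float, link_speed_bps: float) -> float:
    """Share of link bandwidth used between two octet-counter samples, in percent."""
    if interval_seconds <= 0 or link_speed_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    bits_transferred = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits_transferred / (interval_seconds * link_speed_bps)


def errors_per_second(prev_errors: int, curr_errors: int, interval_seconds: float) -> float:
    """Average error rate between two samples of an Rx or Tx error counter."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return counter_delta(prev_errors, curr_errors) / interval_seconds
```

As an illustration, 1,250,000 octets received in 10 seconds on a 10 Mbit/s link gives `utilization_percent(0, 1_250_000, 10, 10_000_000) == 10.0`. On fast links a 32-bit counter can wrap more than once between polls, in which case shorter polling intervals or 64-bit counters would be needed for a correct result.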