
Monitoring IBM BladeCenter H Series


Summary
Device Type Object Identifier Mibs
IBM BladeCenter H Series

.1.3.6.1.4.1.2.6.158.5


What to Monitor
Monitor SNMP OID Details
Health Monitors
System Health State .1.3.6.1.4.1.2.3.51.2.2.7.1.0

 

This monitor gives the status of system health for the system in which the ASM resides. It returns one of the following values:

 

  • critical: a severe error has occurred and the system may not be functioning.
  • nonCritical: an error has occurred but the system is currently functioning properly.
  • systemLevel: a condition has occurred that may change the state of the system in the future, but the system is currently working properly.
  • normal: the system is operating normally.
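
The four states above can be mapped onto alarm severities in a poller. The following is a minimal sketch only: the integer codes in `ASSUMED_HEALTH_CODES` and the severity names are assumptions that do not come from this page, so check them against the MIB supplied with the management module before relying on them.

```python
# Assumed integer encodings for the System Health State values.
# These numbers are NOT given on this page; verify them against the MIB.
ASSUMED_HEALTH_CODES = {
    0: "critical",
    2: "nonCritical",
    4: "systemLevel",
    255: "normal",
}

# Hypothetical alarm severities, chosen only for illustration.
SEVERITY_BY_STATE = {
    "critical": "Critical",
    "nonCritical": "Trouble",
    "systemLevel": "Attention",
    "normal": "Clear",
}


def classify_system_health(raw_value, codes=ASSUMED_HEALTH_CODES):
    """Return (state, severity) for a polled System Health State value."""
    state = codes.get(int(raw_value), "unknown")
    severity = SEVERITY_BY_STATE.get(state, "Unknown")
    return state, severity


if __name__ == "__main__":
    for value in (255, 2, 7):
        print(value, classify_system_health(value))
```

Passing the code table as a parameter keeps the lookup easy to correct if the MIB defines different values.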

 

 

 

(LED) Blade Health State

.1.3.6.1.4.1.2.3.51.2.2.8.2.1.1.5

 

The overall health of each server blade is monitored by querying this OID. It returns one of the following values:

1 = good, 2 = warning, 3 = bad

 

Switch Module Health State

.1.3.6.1.4.1.2.3.51.2.22.3.1.1.1.15

The switch module carries information to and from the BladeCenter modules over Ethernet. The LED status of the switch module indicates its health. Performance degradation of this module is detected by querying this variable, which returns one of the following values:

0 = unknown, 1 = good, 2 = warning, 3 = bad.

Health State for the Power Module

.1.3.6.1.4.1.2.3.51.2.2.4.1.1.3

This monitor holds the health information for each power module. This module is responsible for the cooling of the chassis. The monitor returns the state of the power module as one of the following values:

0 = unknown, 1 = good, 2 = warning, 3 = not available.
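
The blade, switch module, and power module monitors above all return small integer codes, but the meaning of code 3 differs between them. A minimal sketch of decoding them with one table per monitor follows; the function names and the "alert on anything other than good" policy are illustrative assumptions.

```python
# Value tables as listed on this page for each health monitor.
BLADE_HEALTH = {1: "good", 2: "warning", 3: "bad"}
SWITCH_HEALTH = {0: "unknown", 1: "good", 2: "warning", 3: "bad"}
POWER_HEALTH = {0: "unknown", 1: "good", 2: "warning", 3: "not available"}


def decode_state(value, table):
    """Translate a polled integer into its label; unlisted codes become 'unknown'."""
    return table.get(int(value), "unknown")


def needs_alert(value, table):
    """Hypothetical policy: raise an alert for anything other than 'good'."""
    return decode_state(value, table) != "good"


if __name__ == "__main__":
    print(decode_state(3, SWITCH_HEALTH))
    print(decode_state(3, POWER_HEALTH))
    print(needs_alert(1, BLADE_HEALTH))
```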
Power/LED Monitors
Temperature

.1.3.6.1.4.1.2.3.51.2.2.1.1.2.0

 

The chassis temperature, which rises with the heat generated by the active blades, must be kept at an acceptable level, and the administrator must be notified when it exceeds a set threshold. The whole unit may be shut down if the temperature exceeds its limit, which makes monitoring it essential. The BladeServer functions optimally only with effective chassis cooling.


This monitor reports the module temperature in degrees centigrade (C).
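
Threshold alerting on this reading can be sketched as below. The threshold values are hypothetical, and the assumption that the agent may return the reading as text with a unit suffix (for example `'27.50 Centigrade'`) is not stated on this page; the parser simply takes the first number it finds.

```python
import re

# Hypothetical thresholds in degrees C; choose values suited to your chassis.
WARNING_C = 38.0
CRITICAL_C = 45.0


def parse_temperature(reading):
    """Extract the first number from a reading such as '27.50 Centigrade'."""
    match = re.search(r"-?\d+(?:\.\d+)?", str(reading))
    if match is None:
        raise ValueError(f"no temperature found in {reading!r}")
    return float(match.group())


def temperature_status(reading, warning=WARNING_C, critical=CRITICAL_C):
    """Classify a temperature reading against the two thresholds."""
    degrees = parse_temperature(reading)
    if degrees >= critical:
        return "critical"
    if degrees >= warning:
        return "warning"
    return "normal"


if __name__ == "__main__":
    print(temperature_status("27.50 Centigrade"))
```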

 

 

  • Blower 1 State
  • Blower 2 State

 

 

 

  • .1.3.6.1.4.1.2.3.51.2.2.3.10
  • .1.3.6.1.4.1.2.3.51.2.2.3.11

 

 

 

Performance degradation of the blower is determined by watching this monitor. When queried, it returns one of the following outputs:

0 = unknown, 1 = good, 2 = warning, 3 = bad.

 

Blade Power State (LED) .1.3.6.1.4.1.2.3.51.2.2.8.2.1.1.4

This monitor indicates whether the particular server blade is powered on or not. When queried, it returns a value of 0 if powered off and 1 if powered on.
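
Because this OID is a table column, a walk returns one instance per blade. The sketch below lists the blades that report 0 (powered off). It assumes the walk results are already available as a mapping of instance OID to value, and that the trailing row index is a single integer identifying the blade; both are assumptions made for illustration.

```python
BLADE_POWER_COLUMN = ".1.3.6.1.4.1.2.3.51.2.2.8.2.1.1.4"


def blade_index(oid, column=BLADE_POWER_COLUMN):
    """Return the trailing row index of an instance OID under the column."""
    prefix = column.rstrip(".") + "."
    if not oid.startswith(prefix):
        raise ValueError(f"{oid} is not under {column}")
    # Assumption: the row index is a single integer.
    return int(oid[len(prefix):])


def powered_off_blades(walk_result):
    """walk_result maps instance OID -> polled value (0 = off, 1 = on)."""
    return sorted(
        blade_index(oid)
        for oid, value in walk_result.items()
        if int(value) == 0
    )


if __name__ == "__main__":
    sample = {
        BLADE_POWER_COLUMN + ".1": 1,
        BLADE_POWER_COLUMN + ".2": 0,
        BLADE_POWER_COLUMN + ".3": 1,
    }
    print(powered_off_blades(sample))
```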

 

  • Blower 1 Speed
  • Blower 2 Speed
  • .1.3.6.1.4.1.2.3.51.2.2.3.1
  • .1.3.6.1.4.1.2.3.51.2.2.3.2
The blowers cool the chassis. Each monitor reports the blower speed as a percentage (%) of maximum RPM. The value is an octet string of the form 'ddd% of maximum', where d is a decimal digit or a blank space in place of a leading zero. If the blower is not running or its speed cannot be determined, the string reads 'Offline'.

The system will be shut down if condition (4) occurs.
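
Since the blower speed arrives as text rather than a number, it has to be parsed before it can be graphed or compared with a threshold. A minimal sketch of such a parser follows; the function name and the choice to return `None` for 'Offline' are illustrative assumptions.

```python
import re

# Matches strings such as ' 45% of maximum' or '100% of maximum'.
_SPEED_PATTERN = re.compile(r"^(\d{1,3})\s*%\s*of\s+maximum$", re.IGNORECASE)


def parse_blower_speed(raw):
    """Return the speed as an integer percent, or None when the blower is 'Offline'."""
    if isinstance(raw, bytes):
        text = raw.decode("ascii", "replace")
    else:
        text = str(raw)
    text = text.strip()  # drops the blank that stands in for a leading zero
    if text.lower() == "offline":
        return None
    match = _SPEED_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised blower speed: {raw!r}")
    return int(match.group(1))


if __name__ == "__main__":
    for sample in (" 45% of maximum", "100% of maximum", "Offline"):
        print(repr(sample), parse_blower_speed(sample))
```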
System Resource & Traffic Monitors
  • CPU Utilization
  • Memory Utilization
  • Disk Utilization

Each blade has its own CPU, memory, and disk, and these are the critical system resources to monitor on the individual blades. Most blades implement the Host Resources MIB. Depending on the operating system on the blades, you can also monitor these resources using CLI or WMI.

Rx Traffic .1.3.6.1.2.1.2.2.1.10 Rx utilization is the percentage of the network bandwidth currently used by received traffic. Consistently high utilization indicates a bottleneck in the network and needs further troubleshooting.
Tx Traffic .1.3.6.1.2.1.2.2.1.16 Tx utilization is the percentage of the network bandwidth used by transmitted traffic. Again, high utilization indicates a network performance bottleneck. In-depth traffic analysis using the NetFlow module helps identify the cause and free up bandwidth quickly.
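
The two OIDs above are the interface octet counters (ifInOctets and ifOutOctets), so a utilization percentage has to be derived from two successive samples and the link speed. The sketch below shows one common way to do that, assuming 32-bit counters that wrap at most once between polls; the function names and the sample figures are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Difference between two counter samples, allowing for a single wrap."""
    return (current - previous) % modulus


def utilization_percent(previous_octets, current_octets,
                        interval_seconds, link_speed_bps):
    """Percentage of link bandwidth used between two octet-counter samples."""
    if interval_seconds <= 0 or link_speed_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    bits = counter_delta(previous_octets, current_octets) * 8
    return 100.0 * bits / (interval_seconds * link_speed_bps)


if __name__ == "__main__":
    # 1,250,000 octets in 10 seconds on a 10 Mbit/s link.
    print(utilization_percent(0, 1_250_000, 10, 10_000_000))
```

The same delta calculation applies to the Rx and Tx counters alike; only the OID that supplies the samples changes.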
Rx/Tx Errors

Rx: .1.3.6.1.2.1.2.2.1.14
Tx: .1.3.6.1.2.1.2.2.1.20

The number of inbound packets (Rx) or outbound packets (Tx) that contain errors preventing them from being delivered to the next-layer protocol.
Related Topics

Monitoring HP Proliant Servers

