I just wanted to add a small addendum to that posting. I have seen severe issues when the MegaRAID controller drops into writethrough mode. This matters most on the compute nodes, where under some circumstances it can lead to disk corruption on the node’s drives. I have felt the pain of this ending in a so-called Bare Metal Restore of the affected node.
I’ve also had the pleasure of being involved with the replacement of around 50 Exadata V2 batteries in the last couple of months. This is almost certainly due to the age of the batteries. The batteries were due to be replaced in all Exadatas after 2 years, but these ones just failed to go the distance.
One of the MegaCLI commands Andy highlighted provides a wealth of information:
[root@db01 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -a0

BBU status for Adapter: 0

BatteryType: iBBU08
Voltage: 4040 mV
Current: 0 mA
Temperature: 50 C

BBU Firmware Status:
  Charging Status : None
  Voltage : OK
  Temperature : OK
  Learn Cycle Requested : No
  Learn Cycle Active : No
  Learn Cycle Status : OK
  Learn Cycle Timeout : No
  I2c Errors Detected : No
  Battery Pack Missing : No
  Battery Replacement required : No
  Remaining Capacity Low : No
  Periodic Learn Required : No
  Transparent Learn : No

Battery state:

GasGuageStatus:
  Fully Discharged : No
  Fully Charged : No
  Discharging : No
  Initialized : Yes
  Remaining Time Alarm : No
  Remaining Capacity Alarm: Yes
  Discharge Terminated : No
  Over Temperature : No
  Charging Terminated : No
  Over Charged : No
  Relative State of Charge: 100 %
  Charger System State: 1
  Charger System Ctrl: 0
  Charging current: 0 mA
  Absolute state of charge: 0 %
  Max Error: 0 %

BBU Capacity Info for Adapter: 0
  Relative State of Charge: 100 %
  Absolute State of charge: 87 %
  Remaining Capacity: 1341 mAh
  Full Charge Capacity: 1353 mAh
  Run time to empty: Battery is not being discharged
  Average time to empty: 161 min
  Average Time to full: Battery is not being charged
  Cycle Count: 2
  Max Error: 0 %
  Remaining Capacity Alarm: 0 mAh
  Remaining Time Alarm: 0 Min

BBU Design Info for Adapter: 0
  Date of Manufacture: 06/02, 2011
  Design Capacity: 1530 mAh
  Design Voltage: 4100 mV
  Specification Info: 0
  Serial Number: 2080
  Pack Stat Configuration: 0x0000
  Manufacture Name: LS36681
  Device Name: bq27541
  Device Chemistry: LPMR
  Battery FRU: N/A

BBU Properties for Adapter: 0
  Auto Learn Period: 2592000 Sec
  Next Learn time: 384645185 Sec
  Learn Delay Interval:0 Hours
  Auto-Learn Mode: Enabled

Exit Code: 0x00
This is from a V2 that has had its battery replaced. First thing to highlight is the battery type:
BatteryType: iBBU08
Earlier batteries were the 07 model, and perhaps did not last the full 2 years before preventative maintenance was due. I’d be extra vigilant if you have the 07 model. It will show as iBBU on a storage cell and unknown on a compute node.
Next up is the temperature:
Temperature: 50 C
You really want this to be under 55 C; otherwise either something is wrong with the environment (use ipmitool to check the ambient temperature) or the battery is overheating.
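If you would rather script that check than eyeball it, something along these lines might do. This is only a rough Python sketch: the function name `bbu_temperature_c` is made up for illustration, the sample lines are copied from the output above, and the 55 C limit is the rule of thumb from this post rather than an official figure.

```python
import re

# A few lines copied from the -AdpBbuCmd output shown earlier in the post.
SAMPLE = """\
BatteryType: iBBU08
Voltage: 4040 mV
Current: 0 mA
Temperature: 50 C
BBU Firmware Status:
  Temperature : OK
"""

TEMP_LIMIT_C = 55  # threshold suggested in the post, not an official limit


def bbu_temperature_c(text):
    """Return the first numeric 'Temperature: NN C' reading, or None."""
    match = re.search(r"^\s*Temperature\s*:\s*(\d+)\s*C\s*$", text, re.MULTILINE)
    return int(match.group(1)) if match else None


temp = bbu_temperature_c(SAMPLE)
if temp is None:
    print("no temperature reading found")
elif temp >= TEMP_LIMIT_C:
    print(f"WARNING: BBU at {temp} C, check ambient temperature and battery")
else:
    print(f"BBU temperature {temp} C is under {TEMP_LIMIT_C} C")
```

The pattern only accepts a numeric reading, so the "Temperature : OK" line in the firmware status section is skipped.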
You can tell whether your battery is charging from either of these lines:
Charging Status : None
Average Time to full: Battery is not being charged
Both lines would show charging, and a time to full, if the battery were charging. One possible reason for a low battery charge is a learn cycle.
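The charging state can be picked out of the text in much the same way. The Python below is a minimal sketch with hypothetical helper names (`charging_status`, `is_charging`); it assumes a charging battery reports the literal word "Charging" on the status line, which is my reading of the behaviour described above and not something taken from documentation.

```python
# Two lines copied from the -AdpBbuCmd output shown earlier in the post.
SAMPLE = """\
  Charging Status : None
  Average Time to full: Battery is not being charged
"""


def charging_status(text):
    """Return the value of the 'Charging Status' line, or None if absent."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Charging Status":
            return value.strip()
    return None


def is_charging(text):
    # Assumption: a charging battery reports the literal word "Charging" here.
    status = charging_status(text)
    return status is not None and status.lower() == "charging"


print("Charging Status:", charging_status(SAMPLE))
print("charging" if is_charging(SAMPLE) else "not charging")
```

Note that a status of "None" here is the controller's own wording for an idle battery, which is different from the line being missing altogether.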
Charge capacity determines whether the writeback or writethrough mode is in use:
Full Charge Capacity: 1353 mAh
This is a relatively new battery and has a good amount of charge.
As you approach 700 mAh, you may want to take proactive action and schedule a battery replacement.
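To put the numbers in context: 1353 mAh against a design capacity of 1530 mAh is roughly 88% of the original capacity. A small Python sketch of that arithmetic and of the 700 mAh check follows; the helper names `mah` and `needs_replacement` are invented for illustration, and the threshold is the rule of thumb from this post, not a vendor specification.

```python
import re

# Two lines copied from the -AdpBbuCmd output shown earlier in the post.
SAMPLE = """\
Full Charge Capacity: 1353 mAh
Design Capacity: 1530 mAh
"""

REPLACE_BELOW_MAH = 700  # rule of thumb from the post


def mah(text, field):
    """Return the integer value of a 'field: NNN mAh' line, or None."""
    pattern = rf"^\s*{re.escape(field)}\s*:\s*(\d+)\s*mAh"
    match = re.search(pattern, text, re.MULTILINE)
    return int(match.group(1)) if match else None


def needs_replacement(full_charge_mah):
    """True once full charge capacity has dropped below the threshold."""
    return full_charge_mah < REPLACE_BELOW_MAH


full = mah(SAMPLE, "Full Charge Capacity")
design = mah(SAMPLE, "Design Capacity")
percent_of_design = 100 * full / design

print(f"Full charge capacity: {full} mAh ({percent_of_design:.1f}% of design)")
print("schedule a replacement" if needs_replacement(full) else "capacity looks fine")
```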
You also want to ensure the Max Error is low:
Max Error: 0 %
The last thing I’m going to highlight is the battery chemistry. On this V2 it shows:
Device Chemistry: LPMR
While on an X2-2 it displays:
Device Chemistry: LION
Both show bq27541 as the Device Name, the chip used for determining the charge level. Apart from the chemistry line, there appears to be little difference between the output on a V2 and an X2.
Just to reemphasise: keep an eye on your batteries and make sure your MegaRAID controller is in writeback!
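For the writeback check itself, MegaCli can report the cache policy per logical drive with `MegaCli64 -LDInfo -Lall -aALL`. That command is not shown in this post, and the sample lines in the Python sketch below are typed from what that output typically looks like rather than captured from one of these systems, so treat the exact wording as an assumption and compare it against your own output. The function name `current_cache_policies` is likewise just illustrative.

```python
# Illustrative sample only: NOT captured from the system in this post.
SAMPLE = """\
Virtual Drive: 0 (Target Id: 0)
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
"""


def current_cache_policies(ldinfo_text):
    """Return the write policy from each 'Current Cache Policy' line."""
    policies = []
    for line in ldinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Current Cache Policy":
            # The write policy is the first comma-separated item on the line.
            policies.append(value.split(",")[0].strip())
    return policies


for number, policy in enumerate(current_cache_policies(SAMPLE)):
    if policy != "WriteBack":
        print(f"logical drive entry {number}: {policy}, go and check the battery")
    else:
        print(f"logical drive entry {number}: WriteBack")
```

Comparing the current policy with the default one is the useful part: a default of WriteBack with a current policy of WriteThrough suggests the controller has fallen back, which is when the battery is the first thing to check.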