To analyze the provided panic error, the two sections of AutoSupport that are essential for investigation are:
1. HA-RASTRACE.TGZ
What it is: HA-RASTRACE.TGZ contains HA (High Availability) system trace logs. It records hardware diagnostics, error traces, and the HA system's response to hardware events. These logs are critical when analyzing hardware-related panics, including those caused by PCI errors.
Why it's relevant to the panic: In the given panic message, the NMI (Non-Maskable Interrupt) originates from a Qlogic FC 16G adapter. HA-RASTRACE.TGZ provides detailed diagnostics, including error reporting from the HA interconnect. Specifically, it may show how the system detected the PCI fault and what actions it took to protect the system state.
How to analyze:
Extract the HA-RASTRACE.TGZ file from the AutoSupport bundle.
Review hardware-related trace messages for entries associated with the PCI bus or the Qlogic FC adapter.
Look for specific error codes or keywords like PCI Error, NMI, or Qlogic.
NetApp's 'AutoSupport Logs and Diagnostics Guide' highlights HA-RASTRACE.TGZ as a primary resource for debugging hardware faults.
The 'Panic Troubleshooting Guide' for ONTAP systems specifies HA-RASTRACE as a key source for identifying NMI-related errors.
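The HA-RASTRACE.TGZ steps above (extract, then search for keywords) can be sketched in Python. This is a minimal illustration, not a NetApp tool: it assumes the archive is an ordinary gzip-compressed tar file whose members contain at least some readable text, and the function name scan_tgz is hypothetical. The keywords are the ones named in the steps (PCI Error, NMI, Qlogic).

```python
import re
import tarfile

# Keywords taken from the analysis steps; matching is case-insensitive.
KEYWORDS = re.compile(r"PCI Error|NMI|Qlogic", re.IGNORECASE)


def scan_tgz(fileobj):
    """Return (member_name, line_number, line) for every line that matches.

    Assumption: fileobj is a seekable binary file object holding a
    gzip-compressed tar archive. Bytes that are not valid UTF-8 are
    replaced rather than raising an error.
    """
    hits = []
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and so on
            handle = archive.extractfile(member)
            if handle is None:
                continue
            for number, raw in enumerate(handle, start=1):
                line = raw.decode("utf-8", errors="replace").rstrip()
                if KEYWORDS.search(line):
                    hits.append((member.name, number, line))
    return hits
```

To use it, open the archive in binary mode, for example with open("HA-RASTRACE.TGZ", "rb") as f: hits = scan_tgz(f), and then review the returned lines. If the trace files are in a binary format, they would need to be decoded first with whatever tooling is appropriate.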
2. SSRAM-LOG
What it is: SSRAM-LOG records low-level hardware error details, including PCI device register states and uncorrectable memory errors. It is particularly useful for analyzing errors originating in peripheral hardware like network or storage adapters connected via PCI.
Why it's relevant to the panic: The panic message explicitly references a PCI Error NMI caused by a Qlogic FC adapter. SSRAM-LOG captures detailed state information for PCI devices, which can help identify whether the fault originated in the adapter hardware, the PCI bus, or another related component.
How to analyze:
Extract the SSRAM-LOG from the AutoSupport bundle.
Search for PCI-related errors, including the specific error source IDs (e.g., ErrSrcID(CorrSrc(0xf00),UCorrSrc(0x18))).
Review the log entries to confirm the root cause of the NMI.
The 'Hardware Diagnostics and Troubleshooting Guide for ONTAP' lists SSRAM-LOG as a key file for debugging PCI errors.
NetApp's documentation on PCI diagnostics emphasizes the use of SSRAM-LOG for validating hardware-level faults.
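When searching for the error source IDs mentioned in the SSRAM-LOG steps, it can help to split each ID into its parts. The sketch below pulls the two values out of a string such as ErrSrcID(CorrSrc(0xf00),UCorrSrc(0x18)). It then decodes each one as a 16-bit PCI Express requester ID, with the bus in bits 15:8, the device in bits 7:3, and the function in bits 2:0. That these fields are requester IDs is an assumption based on the PCIe Advanced Error Reporting register layout, not something the panic message itself states. The function names are hypothetical.

```python
import re

# Matches the ErrSrcID(...) fragment quoted in the steps above.
ERR_SRC = re.compile(
    r"ErrSrcID\(CorrSrc\((0x[0-9a-fA-F]+)\),\s*UCorrSrc\((0x[0-9a-fA-F]+)\)\)"
)


def decode_requester_id(value):
    """Split a 16-bit requester ID into bus, device, and function numbers."""
    return {
        "bus": (value >> 8) & 0xFF,
        "device": (value >> 3) & 0x1F,
        "function": value & 0x7,
    }


def parse_err_src(text):
    """Return decoded correctable/uncorrectable source IDs, or None if absent."""
    match = ERR_SRC.search(text)
    if match is None:
        return None
    corr, uncorr = (int(group, 16) for group in match.groups())
    return {
        "correctable": decode_requester_id(corr),
        "uncorrectable": decode_requester_id(uncorr),
    }
```

Under that assumption, the example string decodes to bus 15, device 0, function 0 for the correctable source and bus 0, device 3, function 0 for the uncorrectable source. These values can then be compared with the slot that holds the FC adapter.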