Interpreting the output from smartctl is tricky because some of the parameters are packed values that have to be unpacked first. Seagate, for instance, describes its SMART parameters in its hard-drive SMART attribute documentation, which lists all the parameters and how their values are packed.
For example, the raw values of Raw_Read_Error_Rate or Seek_Error_Rate look alarming because they are so large, but each raw value has to be converted into hexadecimal and then broken up into several sub-values in order to obtain meaningful readings.
The following script follows the Seagate attribute manual and computes the real number of failures. Alternatively, for graphical interfaces, gsmartcontrol is a GUI for smartctl that can also highlight suspicious or bad values while accounting for the values that merely look alarming.
#!/bin/sh
#
# SWEETGOOD
# IT consulting with a focus on data security
# www.sweetgood.de
#
# Copyright        : All rights reserved!
# Repository url   : https://codeberg.org/SWEETGOOD/shell-scripts/
# Author           : SWEETGOOD
# Filename         : parse-raw-smart-values-seagate.sh
# Created at       : 13.05.2024
# Last changed at  : 13.05.2024
# Version          : 1.0
# License          : CC BY-SA 4.0 Deed
# Description      : Checks RAW encoded prefailure values on SEAGATE drives
# Inspiration from : https://www.disktuna.com/big-scary-raw-s-m-a-r-t-values-arent-always-bad-news
# Requirements     : smartctl bc

# If no parameter was given
if [ $# = 0 ]
then
    exit 255
fi

LC_ALL=de_DE.UTF-8

# Function to split a number into two 32-bit integers (created using https://codingfleet.com/code-converter/javascript/bash/)
split() {
    # Convert the number to binary string
    n=$(echo "obase=2; $1" | bc)

    # Pad with zeros to make it 64-bit
    while [ ${#n} -lt 64 ]; do
        n="0$n"
    done

    # Split the binary string into two 32-bit parts:
    # characters 1-32 are the upper half, characters 33-64 the lower half
    hi=$(echo "${n}" | cut -c1-32)
    lo=$(echo "${n}" | cut -c33-64)

    echo "$hi $lo"
}

# Check device model
DEVICE=$(/usr/sbin/smartctl -A -i /dev/"$1" | awk -F ' ' '/Device Model/{print $3}')

if [ "$(echo "${DEVICE}" | cut -c1-2)" != "ST" ]; then
    echo "No Seagate drive. Exiting."
    exit
fi

printf '%s\n' "Device Model: ${DEVICE}"
printf '%s\n' "Serial Number: $(/usr/sbin/smartctl -A -i /dev/"$1" | awk -F ' ' '/Serial Number/{print $3}')"

# Read the raw values (column 10 of the attribute table) and split them
RRER=$(split "$(/usr/sbin/smartctl -A -i /dev/"$1" | awk -F ' ' '/Raw_Read_Error_Rate/{print $10}')")
SER=$(split "$(/usr/sbin/smartctl -A -i /dev/"$1" | awk -F ' ' '/Seek_Error_Rate/{print $10}')")
HECCR=$(split "$(/usr/sbin/smartctl -A -i /dev/"$1" | awk -F ' ' '/Hardware_ECC_Recovered/{print $10}')")

# First part: number of errors, second part: number of operations
RRER1=$(echo "${RRER}" | awk '{ print $1; }')
RRER2=$(echo "${RRER}" | awk '{ print $2; }')
echo "Raw_Read_Error_Rate: $(echo "ibase=2;obase=A;${RRER1}" | bc) errors in $(numfmt --grouping "$(echo "ibase=2;obase=A;${RRER2}" | bc)") operations."

SER1=$(echo "${SER}" | awk '{ print $1; }')
SER2=$(echo "${SER}" | awk '{ print $2; }')
echo "Seek_Error_Rate: $(echo "ibase=2;obase=A;${SER1}" | bc) errors in $(numfmt --grouping "$(echo "ibase=2;obase=A;${SER2}" | bc)") operations."

HECCR1=$(echo "${HECCR}" | awk '{ print $1; }')
HECCR2=$(echo "${HECCR}" | awk '{ print $2; }')
echo "Hardware_ECC_Recovered: $(echo "ibase=2;obase=A;${HECCR1}" | bc) errors in $(numfmt --grouping "$(echo "ibase=2;obase=A;${HECCR2}" | bc)") operations."

# If there is any error
if [ "$(echo "ibase=2;obase=A;${RRER1}" | bc)" != 0 ] || [ "$(echo "ibase=2;obase=A;${SER1}" | bc)" != 0 ] || [ "$(echo "ibase=2;obase=A;${HECCR1}" | bc)" != 0 ]; then
    exit 1
fi
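The split that the script performs can also be illustrated with plain shell arithmetic. The sketch below is not part of the original script: decode_raw is a hypothetical helper and the sample value is made up. It assumes that the shell's $(( )) arithmetic is 64 bits wide, which holds for common dash and bash builds but is not guaranteed by POSIX. It mirrors the script's interpretation, where the lower 32 bits hold the number of operations and the bits above them hold the number of errors.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not the author's code):
# split a packed raw value into "<errors> <operations>".
# Assumption: $(( )) arithmetic is at least 64 bits wide.
decode_raw() {
    errors=$(( $1 >> 32 ))              # bits above the lower 32
    operations=$(( $1 & 0xFFFFFFFF ))   # lower 32 bits
    echo "$errors $operations"
}

# Made-up sample: 2 errors packed together with 1000000 operations
sample=$(( (2 << 32) + 1000000 ))
echo "raw value: $sample"
echo "decoded:   $(decode_raw "$sample")"
```

The raw value of the sample is 8590934592, which looks alarming on its own, yet it decodes to only 2 errors in 1000000 operations.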
Some parameters seem to be universal across hard-drive brands, and it is a good idea to monitor them in order to be able to tell whether the drive is failing:
- Reallocated_Sector_Count: 1-4, keep an eye on it; more than 4, replace
- Reported_Uncorrect: 1 or more, replace
- Command_Timeout: 1-13, keep an eye on it; more than 13, replace
- Current_Pending_Sector_Count: 1 or more, replace
- Offline_Uncorrectable: 1 or more, replace
And for SSDs:
- Reallocate_NAND_Blk_Cnt: these are blocks that cannot be reprogrammed anymore; too many of these and the drive is ageing
- Percent_Lifetime_Remain, to be read as
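The hard-drive thresholds above can be checked mechanically. The following is a minimal sketch, not a definitive tool: classify_attribute and classify_report are hypothetical helpers. It assumes that the attribute name is in column 2 and the raw value in column 10 of the smartctl -A table, and that smartctl abbreviates two of the names as Reallocated_Sector_Ct and Current_Pending_Sector, so both spellings are accepted. The SSD attributes are left out because no numeric thresholds are given for them.

```shell
#!/bin/sh
# Hypothetical helper: map an attribute name and raw value to
# "ok", "watch", "replace" or "unknown", using the thresholds listed above.
classify_attribute() {
    name=$1
    raw=$2
    case "$name" in
        Reallocated_Sector_Count|Reallocated_Sector_Ct) replace_above=4 ;;
        Command_Timeout) replace_above=13 ;;
        Reported_Uncorrect|Offline_Uncorrectable) replace_above=0 ;;
        Current_Pending_Sector_Count|Current_Pending_Sector) replace_above=0 ;;
        *) echo "unknown"; return 0 ;;
    esac
    # Only plain non-negative integers are classified
    case "$raw" in
        ''|*[!0-9]*) echo "unknown"; return 0 ;;
    esac
    if [ "$raw" -gt "$replace_above" ]; then
        echo "replace"
    elif [ "$raw" -ge 1 ]; then
        echo "watch"
    else
        echo "ok"
    fi
}

# Hypothetical helper: read a smartctl -A style table on stdin and
# print a verdict for every attribute covered by the list.
# Assumption: name in column 2, raw value in column 10.
classify_report() {
    awk '$1 ~ /^[0-9]+$/ { print $2, $10 }' | while read -r name raw; do
        verdict=$(classify_attribute "$name" "$raw")
        if [ "$verdict" != "unknown" ]; then
            echo "$name: $raw ($verdict)"
        fi
    done
}
```

A possible invocation, assuming the drive is /dev/sda, would be: smartctl -A /dev/sda | classify_report. Packed raw values, such as those decoded by the Seagate script, would have to be unpacked before being passed to classify_attribute.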