Skip to content

Instantly share code, notes, and snippets.

@myshkin-uk
Created March 1, 2019 18:02
Show Gist options
  • Save myshkin-uk/f1a5a0f081ada2fe38df64029f13d641 to your computer and use it in GitHub Desktop.
Save myshkin-uk/f1a5a0f081ada2fe38df64029f13d641 to your computer and use it in GitHub Desktop.
Wind up and point all the optional safety checks which bash offers for scripts
Usage - include this script before doing serious work in another script.
you will need to leave your script using do_exit, not plain old exit - because bash double-uses exit as a way of flagging faults.
# No shebang because we are included in another sh file
# Files which include this one MUST use bash, since we use the 'source' command below.
# @FIXME DGC 16-Aug-2016
# I'm not clear why this script doesn't use a guard variable as per the coding standard.
# @file script-start.inc.sh
#
# @brief This file should be invoked at start of almost all script files in our project
# to trap and log script errors so we can find what the nature of the collapse was.
#
# Note that you will save a lot o f grief if you get your script to pass the tests here first:
# https://www.shellcheck.net/
# it's totally tedious, and totally worth the effort.
#
# This script should be usable anywhere.
# However if it can find the files:
# /usr/bin/NetrixFileNames.inc.sh
# $configIncAPAFN which will normally be /etc/dexdyne/config/webcontrol.conf
# it includes their source.
# The 'file names' file can provide $consoleLogsAPADN which will then be used below in crash reporting.
#
# This script:
# Can't run if you already have an EXIT trap active, because this script needs to set that for itself.
# Sets the flag so that 'failure' ( non-zero return codes )
# of intermediate stages in a pipeline abort the pipeline.
# That isn't standard bash behaviour, and we may need to fix existing code
# which expects to get away with such things.
#
# @param notrap If passed "notrap", the file will run as normal, but without setting -e and traps.
# this may ( or may not ) be useful in debugging the exact reasons for a failure.
#
# @return No return code.
# Some unrecoverable errors will exit the calling script.
# They should write a failure message to the event log.
#
# @copyright Copyright Dexdyne Ltd. 2013-2019. All Rights Reserved.
#
# @author DGC
# This script sets up many "do something on trap" mechanisms, which write to the event log file
# so at least we'll have a record that a script died when we didn't expect it to.
# It also makes the script run with "stop on any error",
# which is a tough criterion to meet - some commands can return exit codes
# which look for all the world like error flags,
# beware of the sleep command, for instance
# and for stuff you wouldn't want to abort everything on.
#
# see:
# http://www.davidpashley.com/articles/writing-robust-shell-scripts.html
#
# See more detail near bottom of file on how we get round this using sub-shells.
# Actually we now tend to encapsulate things with " if xxx.sh ; then : ;fi ", which should also fix the issue.
#
# The shell interpreter in busybox is basically undocumented.
# The developer says "if it doesn't behave in a posix-compatible way,
# then complain and it might get fixed".
#
# There is a busybox forum where you can complain :-(
#
# a sort of manual for the ash shell can be found at
# http://www.digipedia.pl/man/doc/view/ash.1
# see also
# http://linux.die.net/man/1/ash
# and
# http://www.manpages.com/man/sh-linux-man-page/
# usage
# silentIfTerminatedBySigterm=true
# or silentIfInterruptedBySigint=true
# or silentIfTerminatedOrIfInt=true
# source /sbin/script-start.inc
#
# Set the quiet parameters to stop THIS script putting anything about receiving SIGINT and/or SIGTERM in the error log.
# They may be omitted and the behaviour is then as if you had set them false.
#
# It is VERY difficult to understand how bash works in this area,
# and even more to work out what parts of it the ash interpreter in busybox implements.
#
# I guess one could try reading the busybox source code, but good luck on that.
#
# This script is intended to work on bash and surprising amounts of it used to work on busybox on other models.
# It has not at present been tried with dash.
#
# It is worth defining some terms:
# primary script --- a top-level script, which DOES NOT INCLUDE THIS FILE, usually run from start-up scripts or an exe file.
# secondary script --- a script which DOES NOT INCLUDE THIS FILE, run by a primary script, or another secondary script
# service script --- a script run by a primary or secondary script which DOES NOT INCLUDE THIS FILE
#
# We think that in a quiet system, with no attached consoles, we can expect to see other processes
# request our primary script to die by issuing SIGTERM. As far as we can see, that will only be received
# by the primary script itself, between commands in it
# - it will not be passed to any commands or secondary/service scripts
# we may happen to be executing at the moment the signal is raised.
#
# So if we are hung up in a script or command we've called, we will pretty much ignore a SIGTERM sent to us.
# That certainly applies to "sleep n" commands - they are non-interruptible and hold off the processing of SIGTERM.
#
# If we receive an interrupt from an attached console, however, it appears that will go to the process
# which is live at that very moment - in other words it MAY be a secondary script.
# That script will then quit, with a simple error code, disguising the ACTUAL signal,
# ( EVEN IF IT INCLUDED ITS OWN COPY OF THIS FILE )
# and we will report it as an error in THIS script, just as if a command we called went wrong.
#
# There's no point in making too much of this - it will cause double-reporting, not under-reporting.
# HOWEVER - it makes sense if scripts that expect, in the architecture we have designed, to be sent a SIGTERM
# do not then report that to the error log.
# We try to achieve this with the silentIfTerminatedBySigterm and silentIfInterruptedBySigint variables.
# This file uses the convention that functions and variables to be used ONLY within this file
# have names ending in double-underline.
# Full bash supports ${parameter=default} and ${parameter:=default} to force undefined parameters
# but it looks like ash / busybox doesn't.
# We can get around that easily enough
#
# we can use
# ${1-} meaning - tolerate missing first parameter - treat it as ""
# ${1-fred} meaning - tolerate missing first parameter - treat it as "fred"
#
# The next few lines ensure all 3 variables are set to true or false
# so we have no issues later once we start trapping unset variables...
# At this moment undefined variable trapping must be OFF.
# You can remove this to try going back to ash/dash/busybox.
. /usr/bin/ensure-shell-is-bash.inc.sh
# Error trapping is OFF at present.
doTrapErrors=true
if [ "${1-}" = "notrap" ]
then
doTrapErrors=false
fi
{
if [ "true" != "${silentIfTerminatedBySigterm-false}" ]
then
silentIfTerminatedBySigterm="false"
fi
if [ "true" != "${silentIfInterruptedBySigint-false}" ]
then
silentIfInterruptedBySigint="false"
fi
if [ "true" = "${silentIfTerminatedOrIfInt-false}" ]
then
silentIfTerminatedBySigterm="true"
silentIfInterruptedBySigint="true"
fi
}
# @FIXME DGC 26-Oct-2017
# This script should really be independent of running on a "Netrix box".
# so the following few lines import Netrix files as-and-when they are available.
# shellcheck disable=SC1091
{
# You can't create a function "sourceIfAvailable()" because
# if any of the code sourced tries to create a 'global variable' it will only succeed
# in creating a temporary one local to the function.
# So we do the equivalent with an "executable string", which runs in the current context.
sourceIfAvailable='if [ -r "${target}" ] ; then source "${target}" ; fi'
target="/usr/bin/NetrixFileNames.inc.sh" ; eval ${sourceIfAvailable}
# Force the standard command path.
# There is no code in our system that uses a special path, so this is reasonable.
target="/sbin/set-std-path.inc.sh" ; eval ${sourceIfAvailable}
# We may need to alter the behaviour below based on the model we are running on
# as at this moment they use different issues of busybox.
#
# Currently the only place that can tell us that is webcontrol.conf
# though an alternative mechanism with a dedicated file ( say in / ) would be better.
#
# Import Control Centre configuration file
target="${configIncAPAFN}" ; eval ${sourceIfAvailable}
}
# set true/false to echo entry/exit/abort messages on scripts
echoOnEntryAndExit__=false
# Output to stdout the tails of all files matching a wildcard
#
# @param fileNamePattern Wildcard to match.
# @param lines Number of lines per file to output.
# @param logWidth Maximum number of character columns to print.
#
listFileTails()
{
local fileNamePattern=$1
local lines=$2
local logWidth=$3
local fileName
# These log files may not exist - which would cause script abort when running with set -e
# if we failed to test.
#
# shellcheck disable=SC2086
if ls ${fileNamePattern} >/dev/null 2>&1
then
#
# Stepping through the output of ls is normally deprecated if there is any chance of spaces
# or other 'interesting' characters in the file names.
# We have no such issues with our log files, on which this function is used.
#
for fileName in ls ${fileNamePattern}
do
echo "- - - - - - - ${fileName} - - - - - -"
tail -n"${lines}" "${fileName}" | cut -c1-"${logWidth}"
done
fi
}
# Appends a pile of 'useful' information about the state of the machine to a file.
#
# @param filename to be written to.
#
outputDebugInfoToFile()
{
local outputOPAFN=$1
# FIXME DGC 17-1-2013
# ought to defend ourselves against $1 being blank
# maybe even check the path part exists
#
# Not a serious issue, but it's slack programming.
#
{
# You can get the ACTUAL console width under bash.
local logWidth; logWidth=130 # enough to catch useful stuff, but excludes long long lines from webcore
# FIXME DGC 17-1-20913
#
# adding file handle counts here would be good
echo "-------------------------- date -----------------------------"
date
echo "------------------------- ps ax -----------------------------"
# FIXME DGC 3-4-2012
# nasty shortcut here - we happen to know that busybox ignores a "ax" parameter
# even though that's not documented behaviour.
# but it is required on big linux ps to get all users.
#
# show full output of ps, not simplified ps.sh columns, - in case the rest is useful
#
ps ax | cut -c1-${logWidth}
echo "------------------------- whoami ----------------------------"
whoami
echo "--------------------------- df ------------------------------"
# @FIXME DGC 16-Mar-2016
# See big FIXME in routine 'getRootFileSystemName()'
# In the NetrixShellMacros.inc.sh file.
#
df 2>/dev/null
echo "--------------------- cat /proc/meminfo ---------------------"
cat /proc/meminfo
echo "------------------------ route -n ---------------------------"
/sbin/route -n
echo "------------------------ ifconfig ---------------------------"
/sbin/ifconfig
echo "------------------- cat /etc/resolv.conf --------------------"
cat /etc/resolv.conf
echo "--------------------- Tail of dmesg. ------------------------"
dmesg | grep -v termios | tail -n 30 | cut -c1-${logWidth}
echo
echo "-- lines in process logs possibly reporting a script error --"
echo
if [ -n "${consoleLogsAPADN}" ]
then
# @FIXME DGC 27-Feb-2017
# If any of the .log files are not readable by the current user, this emits an error.
# It may be that we need to do a 'find' for files we have the right to read before we grep.
# Obviously will all work fine when we are root, which we usually are.
#
find "${consoleLogsAPADN}" -mount -mindepth 2 -maxdepth 2 -readable -name "*.log" \
-exec grep -e ": line" {} \;
echo
echo "---------------- Tails of all process logs. -----------------"
# shellcheck disable=SC2016
{
listFileTails '${consoleLogsAPADN}/*/current' 10 ${logWidth}
listFileTails '${consoleLogsAPADN}/ppp/ip-*.txt' 20 ${logWidth}
listFileTails '${consoleLogsAPADN}/openvpn/vpn*.txt' 20 ${logWidth}
}
fi
if [ -n "${pppdScriptsLogAPADN}" ]
then
# @FIXME DGC 27-Feb-2017
# If any of the .log files are not readable by the current user, this emits an error.
# It may be that we need to do a 'find' for files we have the right to read before we grep.
# Obviously will all work fine when we are root, which we usually are.
#
echo "---------------- Tail of all ppd scripts log. -----------------"
# shellcheck disable=SC2016
{
listFileTails '${pppdScriptsLogAPADN}/auth-*.txt' 20 ${logWidth}
}
fi
} >> "${outputOPAFN}"
sync
}
# Set these only to 'true' or 'false'.
stopOnUnexpectedSignals=true
logUnexpectedSignals=true
# Trap routine for most signals/event in scripts.
#
# @param Dexdyne-defined text corresponding to the name of the signal.
# @param The script in which the issue occurred.
# @param The line number at which the issue occurred.
# @param Use true/false to decide if the trap routine should force a script exit.
# @param If we do force an exit, this is the script exit code.
#
# If this routine does not explicitly 'exit' then on return the script will
# 'go on doing what it was going to do'
# For some signals that won't be much as most are disregarded anyway, so the script will
# just keep on executing.
# For signals like int/term, the signal will have the effect it would have had if you hadn't trapped it.
#
unexpectedTrap__() # Routine name is deliberately non-conformant, so it should not clash with any user routine.
{
local signalName="$1"
local faultFile="$2"
local faultLineNumber="$3"
local doExit="$4"
local exitCode="$5"
#
# Bash apparently provides the $LINENO variable,
# but other interpreters ( like the busybox ash we expect to be using on the AVR32/N7000 ) may not.
#
echo "Unexpected ${signalName} received at line ${faultLineNumber:-(unknown)} in script ${faultFile}"
if [ "${signalName}" != "SIGWINCH" ] # The user changing the size of the window we are printing into is NOT a serious issue
# and we just ignore it. Wont see it in an embedded situation anyway.
then
filename="/reboots/last-unexpected-trap-in-script"
echo "Unexpected ${signalName} received at line ${faultLineNumber:-(unknown)} in script ${faultFile}" > ${filename}
outputDebugInfoToFile ${filename}
# Allow for the option not to log unexpected signals to error log.
if ${logUnexpectedSignals}
then
# it may or may not be reasonable to try to append to the event log
# - depending on the exact disaster we have suffered.
# but may as well give it a try
${appendEventLogExeAPAFN} "unexpected ${signalName} received in script ${faultFile}"
fi
# Allow for the option to treat script errors as soft.
if ${stopOnUnexpectedSignals}
then
# Exclude signals which we have seen and don't seem serious.
#
# Assume we always get the SIGxxx format here, whether the user invoked it with or without the "SIG" part.
#
if ${echoOnEntryAndExit__}
then
echo "Aborting script ${faultFile} because of unexpected trap,"
fi
# Just in case it all goes wrong, and we don't exit
# set this global flag that can be tested by the script we're servicing.
G_exitWanted=1
# This exits the offending script that included us!!!
# No idea if this is always the correct action - but we can't afford to loop adding to the event log.
#
do_exit 99 "${faultFile}" "${faultLineNumber}"
fi
fi
if ${doExit}
then
exit "${exitCode}"
fi
}
# Trap routine script errors,
# which will never be executed if the set -e flag is active, as that turns errors into exits.
#
# We do what we can by making a permanent record of the last few lines of the console log
# of every running process which is logging to ${consoleLogsAPADN}
# one of those should contain the info we need to see what happened
#
unexpectedError__() # Routine name is deliberately non-conformant, so it should not clash with any user routine.
{
local faultFile="$1"
local faultLineNumber=$2
# @FIXME DGC 26-Oct-2017
# This gives incorrect information.
# The line number is correct, but if a file is included in another
# then $faultFile refers to the including, not the included, file.
echo "Error was trapped at line ${faultLineNumber} in script ${faultFile}"
filename="/reboots/last-script-error-or-bare-exit"
echo "Error was trapped at line ${faultLineNumber} in script ${faultFile}" > ${filename}
outputDebugInfoToFile ${filename}
# Allow for the option to treat script errors as soft.
if true
then
# It may or may not be reasonable to try to append to the event log
# - depending on the exact disaster we have suffered.
# but may as well give it a try
# @FIXME DGC 27-Feb-2017
# At present only the root user can write the event log.
# That is actually a bug I think.
# The following lines will all echo error messages to stdout, which I assume is harmless
# - if you had to avoid them you could test 'whoami' but that risks
# silliness if we fix the underlying bug.
#
${appendEventLogExeAPAFN} "Script error noted - request Dexdyne support to examine logs."
${appendEventLogExeAPAFN} "\$1: Error was trapped at line ${faultLineNumber} in script ${faultFile}"
${appendEventLogExeAPAFN} "\$1: Debug info can be found in file ${filename}"
fi
# We could just log the problem and continue the script.
# At present we choose to force the script to exit.
do_exit 98
}
# Trap routine for exit from script,
# which also catches reading uninitialised variables when the "set -u" flag is active.
# It would catch all errors if we set "set -e", but we currently don't.
#
# If you trap EXIT, all that happens is that you execute these commands on the way out
# - you can't/don't avoid doing the exit.
#
# If you run another exit, that over-rides the return code,
# otherwise the code returned is the one that originally brought us here,
# so if a command returns 47, and thereby causes a trap under "set -e"
# we will return 47 from this script, even though we run commands here,
# unless we make an effort not to.
#
# It appears we can't access the rc or "exit code"
# of the last command or error before we came here - shame.
# So though we will return it - we don't know what it is.
#
# Nor can we, under ash, read the line number $LINENO in which an error occurred.
#
# We do what we can by making a permanent record of the last few lines of the console log
# of every running process which is logging to ${consoleLogsAPADN}
# one of those should contain the info we need to see what happened
#
unexpectedExit__() # Routine name is deliberately non-conformant, so it should not clash with any user routine.
{
local faultFile="$1"
local faultLineNumber=$2
# @FIXME DGC 28-Feb-2017
# Line number always seems to be 1 for any exit, though it works for errors.
# If we become convinced of that, stop looking at it.
if ${errorTrapDefused__}
then
return
fi
echo "Bare exit ( or uninitialised variable access ) was trapped in script ${faultFile}."
runExitScript_niu
filename="/reboots/last-script-error-or-bare-exit"
echo "Bare exit ( or uninitialised variable access ) encountered at line ${faultLineNumber} in script ${faultFile}" > ${filename}
outputDebugInfoToFile ${filename}
# Allow for the option to treat script errors as soft.
if true
then
# It may or may not be reasonable to try to append to the event log
# - depending on the exact disaster we have suffered
# but may as well give it a try
# @FIXME DGC 27-Feb-2017
# At present only the root user can write the event log.
# That is actually a bug I think.
# The following lines will all echo error messages to stdout, which I assume is harmless
# - if you had to avoid them you could test 'whoami' but that risks
# silliness if we fix the underlying bug.
#
${appendEventLogExeAPAFN} "Script exit/uninitialised var noted - request Dexdyne support to examine logs."
${appendEventLogExeAPAFN} "\$1: Bare exit encountered, or uninitialised variable read, at line ${faultLineNumber} in script ${faultFile}"
${appendEventLogExeAPAFN} "\$1: Debug info can be found in file ${filename}"
if ${echoOnEntryAndExit__}
then
echo "Quitting script ${faultFile}"
fi
fi
# If we reach this point then the trap will continue on and exit the script with the original exit code.
# @FIXME DGC 28-Feb-2017
# If what we encountered was a script 'running-off-the-end' or just executing 'exit'
# then the exit code is zero.... which indicates success.
# Unfortunately I think we've looked into this before, and we cannot access that code,
# we can only quit, and then it will become apparent what it was.
# So we have a choice here:
# - we can quietly quit, and return the original code,including zero.
# - we can force an exit with a code of 97 or something.
# Currently we choose the latter - as it at least makes sure the caller sees a failure.
exit 97
}
# Function executed asynchronously on receipt of a SIGINT signal
#
# This is what we receive if someone types CTRL-C on a terminal connected to the process.
#
# In Netrix this doesn't happen at run-time, so my interest in doing the following perfectly is limited.
#
# See notes at the top about the fact that while THIS process may receive
# and handle a SIGINT - the process to which we return may see the issue as a
# trapped error ( because of our exit code ) - not as a SIGINT.
#
# IT IS POSSIBLE that we should react to this signal by sending an equivalent signal to our parent process.
# However I see descriptions of something called a "process group",
# which appear to be broadcast signals like SIGTERM so they all get it - more investigation needed
# if the issue becomes significant.
#
receivedSigint__() # Routine name is deliberately non-conformant, so it should not clash with any user routine.
{
local faultFile="$1"
local faultLineNumber=$2
runExitScript_niu
if ! ${silentIfInterruptedBySigint}
then
# It may or may not be reasonable to try to append to the event log
# - depending on the exact disaster we have suffered.
# but may as well give it a try.
# The trigger is external, so I assume $faultLineNumber is not significant.
${appendEventLogExeAPAFN} "\$1: unexpected SIGINT received in script ${faultFile} - terminating"
if ${echoOnEntryAndExit__}
then
echo "Terminating script ${faultFile} because of unexpected SIGINT."
fi
fi
# I believe that trapping this signal stops the otherwise default action which would terminate the script.
# We would like to go ahead and exit, so we have to do it for ourselves with an exit statement.
# Just in case it all goes wrong, and we don't exit
# set this global flag that can be tested by the script we're servicing.
G_exitWanted=1
# We can quietly quit, and return the original code,
do_exit 99 "${faultFile}" "${faultLineNumber}" # Exits the including script.
}
# Function executed asynchronously on receipt of a SIGTERM signal
#
# This is what we receive if someone/something simply says "kill 123"
#
# Netrix DOES use this to stop "watcher and shepherd scripts" so we must do it properly.
# In particular we shouldn't moan in the error log about something we expected to happen.
#
# Currently no Netrix scripts trap SIGTERM for any practical purpose ( like releasing lock files or similar )
# so we don't have to consider a script which over-rides this trap.
#
receivedSigterm__() # Routine name is deliberately non-conformant, so it should not clash with any user routine.
{
local faultFile="$1"
local faultLineNumber=$2
runExitScript_niu
if ! ${silentIfTerminatedBySigterm}
then
# It may or may not be reasonable to try to append to the event log
# - depending on the exact disaster we have suffered.
# but may as well give it a try.
# @FIXME DGC 7-8-2013
# This is not the appropriate test on the 8000 box.
# We would need to check for the use of 'service [netrix | comms | networking} stop'.
# @FIXME DGC 27-Feb-2017
# The very helpful 'shellcheck' suggests
# pgrep -f "K06netrix.sh"
# as a better alternative to grepping the output of ps aux
#
# @FIXME DGC 1-Feb-2018
# I assume this logic all goes to hell in a handcart on systemd machines.
# shellcheck disable=SC2009
if ps aux | grep "K..netrix" # crude synonym for "box is shutting down - should catch K06netrix and K??netrixshutdown"
then
: # If unit is closing down we don't take any notice of SIGTERM reports
# the scripts OUGHT to know how to die without reporting an error, but no reason to blather in the event log.
else
# the trigger is external, so I assume $faultLineNumber is not significant
${appendEventLogExeAPAFN} "\$1: unexpected SIGTERM received in script ${faultFile} - terminating"
if ${echoOnEntryAndExit__}
then
echo "Terminating script ${faultFile} because of unexpected SIGTERM."
fi
fi
fi
# I believe that trapping this signal stops the otherwise default action which would terminate the script.
# We would like to go ahead and exit, so we have to do it for ourselves with an exit statement.
# Just in case it all goes wrong, and we don't exit
# set this global flag that can be tested by the script we're servicing.
# shellcheck disable=SC2034
G_exitWanted=1
# This exits the including script.
do_exit 99 "${faultFile}" "${faultLineNumber}"
}
# If a script receives SIGPIPE, it's expected to shut down
#
# We honestly don't know whether having trapped this we could choose to soldier on,
# or whether this is just a temporary diversion, and we will exit the script when the trap exits.
#
# Don't think any netrix code pipes output into a script at present,
# ( though I think we once used it as a substitute for svlogd when it wasn't available )
# so we will probably never encounter this.
#
receivedSigpipe__() # Routine name is deliberately non-conformant, so it should not clash with any user routine.
{
local faultFile="$1"
local faultLineNumber="$2"
# Next line is for debug only.
echo "SIGPIPE received in script ${faultFile}"
runExitScript_niu
# The trigger is external, so assume $faultLineNumber contains no useful information.
${appendEventLogExeAPAFN} "\$1: SIGPIPE received in script ${faultFile} - process our output was being piped to died on us."
if ${echoOnEntryAndExit__}
then
echo "Terminating script ${faultFile} because of SIGPIPE."
fi
# Just in case it all goes wrong, and we don't exit
# set this global flag that can be tested by the script we're servicing.
# shellcheck disable=SC2034
G_exitWanted=1
# This exits the including script.
do_exit 99 "${faultFile}" "${faultLineNumber}"
}
runExitScript_niu()
{
# Use of this is suspended until we find it something to do that we can't do inline here!
return 0
}
# Function which should be called by scripts that have included this file, in order to exit.
#
# We cannot distinguish between an error trapped by "set -e" ( or set_u ??? ) and an exit statement,
# so we forbid the use of simple 'exit' in scripts working with us.
#
# We provide this function as a way for them to exit
#
# @param [opt] Desired exit code - defaults to zero.
#
do_exit()
{
local rc="${1-0}" # The exit code we ought to return from the script.
# echo "Running do_exit ${rc} in script ${faultFile}"
# At one time we used to turn exit trapping off, but that's not compatible with stacked handling.
# So all we do now is to tell our trap to do nothing.
errorTrapDefused__=true
# we have to trust the rest of this function, to avoid recursing if there's a further error.
allowUninitialisedVariables
runExitScript_niu
# Next line is totally unnecessary - all traps are nullified by exiting the script.
trap - SIGINT SIGTERM
# Leave the exit trap running, though now the trap routine will do nothing.
if ${echoOnEntryAndExit__}
then
echo "do_exit is exiting script ${faultFile} with return code $rc."
fi
# Just in case it all goes wrong, and we don't exit
# set this global flag that can be tested by the script we're servicing.
# shellcheck disable=SC2034
G_exitWanted=1
exit "${rc}" # Exit THE INCLUDING SCRIPT, via whatever exit trap(s) are in place.
}
allowErrors()
{
set +e
}
allowUninitialisedVariables()
{
set +u
}
# Find the active trap for a signal, if there is one.
#
# @param The name of something which can be trapped by the bash 'trap' command.
# In the short format, not including a leading 'SIG'
# EXIT is assumed if the parameter is missing.
#
# @return Success if the signal is being trapped.
# On success:
# The global variable "$G_reinstateActiveTrapCommand" is set to a command which can be used to reinstate the current trap.
# The global variable "$G_activeTrapRoutine" is set to a the name of the current trap routine.
#
# If no trap is active, they will return an empty string, which is also a legal bash command.
#
findActiveTrapFor()
{
trapName="${1-EXIT}" # Setting a default is easier than providing a validity test.
# There are 4 'special' traps in addition to the standard signals.
# The output of 'kill -l' ( which mirrors 'trap -l' ) includes the SIG prefix.
#
if [ "EXIT" != "${trapName}" ] \
&& [ "DEBUG" != "${trapName}" ] \
&& [ "RETURN" != "${trapName}" ] \
&& [ "ERR" != "${trapName}" ] \
&& ! kill -l | grep -q "SIG${trapName}"
then
echo "Routine was asked for active trap for signal '${trapName}',"
echo " but that isn't a special KILL / DEBUG / RETURN / ERR token and"
echo " doesn't appear in the output of 'kill -l'"
do_exit 1
fi
# The output from 'trap -p' is in the format
# trap -- 'exitRoutine' EXIT
# which is usable as a command to reinstate the trap in question.
G_reinstateActiveTrapCommand="$(trap -p | grep "${trapName}")"
G_activeTrapRoutine=""
if [ -n "${G_reinstateActiveTrapCommand}" ]
then
# @FIXME DGC 10-Apr-2017
# There is some slicker code to do this using parameter extraction rather than 'cut',
# in shell-macros.inc.sh.
G_activeTrapRoutine=$(echo ${G_reinstateActiveTrapCommand} | cut -d' ' -f3 | cut -d"'" -f2)
fi
[ -n "${G_reinstateActiveTrapCommand}" ]
}
#################### end of function definitions - execution starts here ####################################
# shellcheck disable=SC2034
G_exitWanted=0
if ${doTrapErrors} && findActiveTrapFor EXIT
then
${appendEventLogExeAPAFN} "script-start.inc.sh found that a trap on EXIT was previously set. Can't work with that."
# Consider a stack-dump here?
exit 1 # Quit the whole script which included us.
fi
# This script has to be trusted - do this for clarity, though probably already the case.
allowErrors
allowUninitialisedVariables
# If this variable is set, we are running for the second time in the same script file - we shouldn't.
if [ -n "${haveRunScriptStart__}" ]
then
echo "****************************** script-start.inc.sh used repeatedly - that's an error. *********************************"
fi
haveRunScriptStart__="true"
if ${echoOnEntryAndExit__}
then
# This would cause an 'uninitialised variable' error if that situation was trapped already.
# Bash now allows included scripts to be given parameters on the same line.
# I think the following line was intended to print the positional parameters of the including script.
# and I'm not sure who that interacts with the new capability.
# It's also worth noting that for this to work we must include this script
# BEFORE decoding the script positional parameters, since we now use 'shift' on each one
# which would make them unavailable afterwards.
#
# @FIXME DGC 1-Feb-2018
# UM - should this be using "$@" or similar?
#
echo "Starting script $0 $1 $2 $3 $4 $5 $6 $7 $8"
fi
# This setting forces the return code of a pipeline to be 'fail' if any of the commands in the pipe fail.
# I have tested that this does what it says - without it the script does not notice a non-zero return code
# from an intermediate stage.
# Despite that simple testing this remains new-and-experimental!
# We may find places in our scripts where the other behaviour is required,
# and we actually DESIRE to tolerate intermediate failures in a pipeline.
#
set -o pipefail
# THIS COMMENT IS TENTATIVE, AND MAY BE WRONG.
# "set -o pipefail" forces an error in any stage of a
# pipeline to cause the return of an error.
# That means that if we go, say, "stuff="$(route -n | grep xxx | sed .... "
# the absence of the text xxx in the output of 'route' will return a non-zero code,
# which is interpreted as an error, and in turn causes an error trap.
# So if we intend to quietly return an empty string when the xxx is not present we need to append " || true "
# BUT - I don't think it is adequate to do
# stuff="$(route -n | grep xxx | sed .... || true )"
# because the failure of grep will cause the entire pipeline to return an error
# and I haven't been able to find out whether the "|| true" is applied to 'sed' or the entire pipeline.
# Therefore we should do:
# stuff="$(route -n | grep xxx | sed .... )" || true
# which seems to do the trick.
if ${doTrapErrors}
then
# There is a facility in bash to make all errors which would be trapped as "ERR" trigger an exit instead.
# Now as it happens the "which line number" feature seems to work for errors but not exits.
# Plus we can set up a separate trap for errors, and issue a better report than cramming everything into one.
# Therefore we don't do "set -e" - in fact we positively turn it off.
set +e
# Set up features to allow exit to be forced on reading from uninitialised variables.
# @FIXME DGC 28-Feb-2017
# It seems obvious to a blind man running that this should turn into an ERR, not an EXIT
# but that's not the way it works in bash :-(
set_u="-u"
restoreExitOnUninitialisedVariableTrapping()
{
set ${set_u}
}
setUpExitOnUninitialisedVariableTrapping()
{
restoreExitOnUninitialisedVariableTrapping
}
#
# The semantics of "trap" is that on receiving the signal named as the 2nd parameter
# we run the text string given as the first parameter as a script.
# Use single quotes so that things like $0 and $LINENO are interpreted as if during the line which failed
# not the line which sets up the trap.
# ( LINENO seems to be badly defined for exits; it works OK for ERR traps. )
# NOTE - the routine name given is NOT evaluated until used.
# so it is perfectly possible to set up a trap for a mis-typed routine name
# and never find out until such an error is trapped for the first time years later.
# Extra care is needed to ensure the routines we invoke actually exist.
# It's possible we could fix that by holding the function names in variables,
# and trapping uninitialised variables, but we haven't tried, and it would make for difficult reading.
#
trap ' unexpectedError__ ${0} ${LINENO} ' ERR # ?
# NB when access to an uninitialised variable is faulted, it forces an exit, not an error.
trap ' unexpectedExit__ ${0} ${LINENO} ' EXIT # 0
errorTrapDefused__=false
# Since all ( ??? ) other signals are asynchronous and external to the actions of the script itself,
# there is little point in passing the line number at which the script stopped running,
# However if we are relying on an entry written to the event log, we will want to know the name of the script.
trap ' receivedSigint__ ${0} ${LINENO} ' SIGINT # 2
trap ' receivedSigterm__ ${0} ${LINENO} ' SIGTERM # 15
trap ' receivedSigpipe__ ${0} ${LINENO} ' SIGPIPE # 13 process we are piping to has stopped.
# FIXME DGC 16-3-2012
#
# NOTE THAT THE FOLLOWING IS IS LAZY CODING
#
# The perfect way to do this is to run kill -l which spits out a list like the one below from a DX2, and decode it.
# Note that the numbers 1 to 9 are fixed since early Linux (shell ?) versions,
# but higher numbers have shifted about a bit across versions encountered by Dexdyne
# ( and some signals have come and gone )
# so only the textual labels offer any certainty of selecting the desired signal.
#
# 1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP
# 6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1
# 11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM
# 16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
# 21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
# 26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR
# 31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3
# 38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8
# 43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
# 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
# 53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7
# 58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2
# 63) SIGRTMAX-1 64) SIGRTMAX
#
# then process that so as to trap all signals we don't otherwise deal with
# -- leave that as an exercise for another day.
# Beware - only the versions without "SIG" on the front
# are acceptable to kill commands in shell scripts on the n8000.
# A reminder of the parameters to unexpectedTrap__:
#
# @param Dexdyne-defined text corresponding to the name of the signal.
# @param The script in which the issue occurred.
# @param The line number at which the issue occurred.
# @param true/false to decide if the trap routine should force a script exit.
# @param if we do force an exit, this is the script exit code.
# numeric
# value
# on n8000
trap ' unexpectedTrap__ SIGHUP ${0} ${LINENO} true 1 ' SIGHUP # 1
trap ' unexpectedTrap__ SIGQUIT ${0} ${LINENO} true 1 ' SIGQUIT # 3
trap ' unexpectedTrap__ SIGILL ${0} ${LINENO} true 1 ' SIGILL # 4
trap ' unexpectedTrap__ SIGTRAP ${0} ${LINENO} true 1 ' SIGTRAP # 5
trap ' unexpectedTrap__ SIGABRT ${0} ${LINENO} true 1 ' SIGABRT # 6
trap ' unexpectedTrap__ SIGBUS ${0} ${LINENO} true 1 ' SIGBUS # 7
trap ' unexpectedTrap__ SIGFPE ${0} ${LINENO} true 1 ' SIGFPE # 8
# @FIXME DGC 27-Feb-2017
# According to the very helpful "shellcheck", this is a waste of time,
# as SIGKILL and SIGSTOP cannot be trapped :-)
# shellcheck disable=SC2173
trap ' unexpectedTrap__ SIGKILL ${0} ${LINENO} true 1 ' SIGKILL # 9
trap ' unexpectedTrap__ SIGUSR1 ${0} ${LINENO} true 1 ' SIGUSR1 # 10
trap ' unexpectedTrap__ SIGSEGV ${0} ${LINENO} true 1 ' SIGSEGV # 11
trap ' unexpectedTrap__ SIGUSR2 ${0} ${LINENO} true 1 ' SIGUSR2 # 12
trap ' unexpectedTrap__ SIGALRM ${0} ${LINENO} true 1 ' SIGALRM # 14
trap ' unexpectedTrap__ SIGSTKFLT ${0} ${LINENO} true 1 ' SIGSTKFLT # 16
# It's normal that a script has child processes that terminate - so this should not be trapped.
#
# we could trap it and take no action, but it would just waste processor power.
#
#trap ' unexpectedTrap__ SIGCHLD ${0} ${LINENO} false 0 ' SIGCHLD # 17
trap ' unexpectedTrap__ SIGCONT ${0} ${LINENO} true 1 ' SIGCONT # 18
# @FIXME DGC 27-Feb-2017
# According to the very helpful "shellcheck", this is a waste of time,
# as SIGKILL and SIGSTOP cannot be trapped :-)
# shellcheck disable=SC2173
trap ' unexpectedTrap__ SIGSTOP ${0} ${LINENO} true 1 ' SIGSTOP # 19
trap ' unexpectedTrap__ SIGSTP ${0} ${LINENO} true 1 ' SIGTSTP # 20
trap ' unexpectedTrap__ SIGTTIN ${0} ${LINENO} true 1 ' SIGTTIN # 21
trap ' unexpectedTrap__ SIGTTOU ${0} ${LINENO} true 1 ' SIGTTOU # 22
trap ' unexpectedTrap__ SIGURG ${0} ${LINENO} true 1 ' SIGURG # 23
trap ' unexpectedTrap__ SIGXCPU ${0} ${LINENO} true 1 ' SIGXCPU # 24
trap ' unexpectedTrap__ SIGXFSZ ${0} ${LINENO} true 1 ' SIGXFSZ # 25
trap ' unexpectedTrap__ SIGVTALRM ${0} ${LINENO} true 1 ' SIGVTALRM # 26
trap ' unexpectedTrap__ SIGPROF ${0} ${LINENO} true 1 ' SIGPROF # 27
# We have seen this - it informs a process that "it's window size has changed"
# it happens when we resize the terminal window during debugging - leave it trapped for now
# so it could be logged ( it shouldn't be happening on an embedded system!!! )
#
# I have amended the code so it doesn't write to the last-signal file
# it only outputs one line on the script's console and carries on.
#
# I have added code above so that it cannot cause a script abort
trap ' unexpectedTrap__ SIGWINCH ${0} ${LINENO} false 0 ' SIGWINCH # 28
# @FIXME - see above
# But be careful how much run-time we add - this is an overhead at the start-up of ALL scripts.
# It could be better to do 'kill -l' on the first encounter only
# and store the flags in /tmp/kill-supports-these, or a global shell variable
# then test it with shell pattern-matching.
#
if kill -l | grep -q SIGPOLL
then
# Older script processors have this.
trap ' unexpectedTrap__ SIGPOLL ${0} ${LINENO} true 1 ' SIGPOLL # 29
else
# The more recent bash script processors accept this.
trap ' unexpectedTrap__ SIGIO ${0} ${LINENO} true 1 ' SIGIO # 29
fi
trap ' unexpectedTrap__ SIGPWR ${0} ${LINENO} true 1 ' SIGPWR # 30
trap ' unexpectedTrap__ SIGSYS ${0} ${LINENO} true 1 ' SIGSYS # 31
# on the N8000 kill -l lists the following which we do not yet bother with
#
# SIGRTMIN # 34
# SIGRTMIN+1 # 35
# SIGRTMIN+2 # 36
# SIGRTMIN+3 # 37
# SIGRTMIN+4 # 38
# SIGRTMIN+5 # 39
# SIGRTMIN+6 # 40
# SIGRTMIN+7 # 41
# SIGRTMIN+8 # 42
# SIGRTMIN+9 # 43
# SIGRTMIN+10 # 44
# SIGRTMIN+11 # 45
# SIGRTMIN+12 # 46
# SIGRTMIN+13 # 47
# SIGRTMIN+14 # 48
# SIGRTMIN+15 # 49
# SIGRTMAX-14 # 50
# SIGRTMAX-13 # 51
# SIGRTMAX-12 # 52
# SIGRTMAX-11 # 53
# SIGRTMAX-10 # 54
# SIGRTMAX-9 # 55
# SIGRTMAX-8 # 56
# SIGRTMAX-7 # 57
# SIGRTMAX-6 # 58
# SIGRTMAX-5 # 59
# SIGRTMAX-4 # 60
# SIGRTMAX-3 # 61
# SIGRTMAX-2 # 62
# SIGRTMAX-1 # 63
# SIGRTMAX # 64
# You would think it would be "generally helpful" to force noticing of errors in scripts
# ( by invoking -e 'quit on error' )
# but this means that ANYTHING which does exit 1 ( or any other non-zero value )
# will immediately abort the script,
# even if we had every intention of testing the return code in the following line
#
# There's no way to say "only abort if I don't test the return value myself quite soon"....
#
# BUT - there's a way around this - if we use
#
# ( command ) ; rc=$?
#
# Then the exit because of -e is NOT taken, but we can still pick up the return code
#
# You can also put the command in a "true/false" situation - so kludges like
#
# command && true
# false || command
#
# will suppress the trap.
#
# Having experimented a bit, we now prefer:
# if command; rc=$? ; then : ; fi
#
# Exit ( and therefore trap ) on reading uninitialised variables.
setUpExitOnUninitialisedVariableTrapping
fi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment