May 7th, 2020

Unofficial Bash Strict Mode – errexit

by Robert Golabek in Technology

Part one: errexit

Improve your code behaviour by setting the errexit flag
With errexit set, a script stops at the first failing command; the error message itself still comes from that command, on stderr

The problem

Look at this Bash script:

#!/usr/bin/env bash

cat /tmp/i_do_not_exist
echo "Hey"

Given that the file does not exist, should the script run all the way through, or should it fail?

Turns out the script continues running just fine!
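To see this for yourself, here is a minimal sketch that writes the script above to a temporary file, runs it, and prints its exit status. The temporary file (created with mktemp) and the variable names are assumptions made for illustration.

```shell
#!/usr/bin/env bash

# Write the example script to a temporary file (path chosen by mktemp).
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env bash

cat /tmp/i_do_not_exist
echo "Hey"
EOF

# Capture stdout only; cat's error message still goes to stderr.
output=$(bash "$script")
status=$?

echo "output: $output"   # "Hey" is printed despite the failed cat
echo "status: $status"   # the script's status is that of its last command, echo

rm -f "$script"
```

The script reports success because a script's exit status is the status of the last command it ran, and echo succeeded.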

When tackling problems like this, I tend to favour the “Fail Fast” approach. From the Fail Fast – C2 wiki:

This is done upon encountering an error so serious that it is possible that the process state is corrupt or inconsistent, and immediate exit is the best way to ensure that no (more) damage is done.

Sounds reasonable. So why is “fail silently” the normal behaviour in a shell script? Well, in an interactive shell you do NOT want to exit when there’s an error (imagine your shell crashing because you ran cat on a file that doesn’t exist). That behaviour was simply carried over to the non-interactive shell.

The solution

How can we improve this behaviour? By setting the flag errexit:

set -o errexit

Or use the shorthand version (more commonly used):

set -e

What does this do? According to the manual, it makes the shell:

Exit immediately if a pipeline (…) returns a non-zero status.

Going back to our example, we would do this instead:

#!/usr/bin/env bash

set -e

cat /tmp/i_do_not_exist
echo "Hey"

…which would then fail. Since the file doesn’t exist, cat returns a non-zero exit code. This behaviour is described in the following Bats unit test:

#!/usr/bin/env bats

load '../../../node_modules/bats-support/load'
load '../../../node_modules/bats-assert/load'

@test "runs fine even though file does not exist" { 
	run "$BATS_TEST_DIRNAME/errexit.sh"
	[ "$status" -eq 0 ]
}

@test "fails since file does not exist AND errexit is turned on" {
	run "$BATS_TEST_DIRNAME/errexit2.sh"
	[ "$status" -ne 0 ]
	[ "$output" == "cat: /tmp/i_do_not_exist: No such file or directory" ]
}

That works and, in my opinion, will definitely help you, but watch out for the quirks.

The quirks

Quirk #1: Programs that return a non-zero status

Not all commands return 0 on successful runs. The most prominent example is grep. From the manual:

Normally the exit status is

0 if a line is selected,
1 if no lines were selected,
and 2 if an error occurred.

However, if the -q or --quiet or --silent is used and a line is selected, the exit status is 0 even if an error occurred.

So, in the example below, echo will never be run.

#!/usr/bin/env bash

set -e

# The substitution captures grep's output; its non-zero exit status triggers errexit
matches=$(grep non_existant_word /dev/null)
echo "Hello world"

What can we do in this situation? Thankfully, there’s a bit in the Bash manual in the errexit section that can help (reformatted for clarity):

The shell does not exit if the command that fails is:

1. part of the command list immediately following a while or until keyword,
2. part of the test in an if statement,
3. part of any command executed in a && or || list except the command following the final && or ||,
4. any command in a pipeline but the last, or if the command’s return status is being inverted with !.
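The exemptions listed above can be sketched as a small script. Each line uses false as a stand-in for a failing command; the count variable is a hypothetical counter added only to show that execution continues past every case.

```shell
#!/usr/bin/env bash
set -e

count=0

# Command list following a while keyword: the failure just ends the loop.
while false; do :; done
count=$((count + 1))

# Test in an if statement: the failure selects the (empty) else path.
if false; then :; fi
count=$((count + 1))

# && / || list: false is not the command following the final ||.
false || true
count=$((count + 1))

# Pipeline: only the last command's status counts (pipefail is off).
false | true
count=$((count + 1))

# Return status inverted with !.
! false
count=$((count + 1))

echo "survived $count failing commands with errexit on"
```

A bare false on its own line, outside any of these contexts, would end the script immediately.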

In our case, we can simply rewrite to comply with #2:

#!/usr/bin/env bash

set -e

if grep non_existant_word /dev/null; then
	echo "Hello world"
else
	echo "Does not exist"
fi

This behaviour can be verified by the following Bats test:

#!/usr/bin/env bats

load '../../../node_modules/bats-support/load'
load '../../../node_modules/bats-assert/load'

@test "fails since grep returns non 0" {
	run "$BATS_TEST_DIRNAME/grep_fail.sh"
	[ "$status" -ne 0 ]
}

@test "runs fine since grep is in a if statement" {
	run "$BATS_TEST_DIRNAME/grep_correct.sh"
	[ "$status" -eq 0 ]
	[ "$output" == "Does not exist" ]
}

Source: /posts/bash-strict-mode/grep.bats
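An if statement treats grep's 1 (no match) and 2 (error) the same way. If you need to tell them apart while keeping errexit on, one possible approach is sketched below. The search helper, the sample file, and the variable names are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical helper: labels grep's result and only returns non-zero
# when grep reported a real error (status 2 or higher).
search() {
	local status=0
	# The || branch records the status, so errexit does not fire here.
	grep -q -- "$1" "$2" 2>/dev/null || status=$?
	case "$status" in
		0) echo "found" ;;
		1) echo "not found" ;;
		*) echo "grep error (status $status)"; return "$status" ;;
	esac
}

sample=$(mktemp)
echo "hello world" > "$sample"

found=$(search hello "$sample")
missing=$(search non_existant_word "$sample")
# The file below does not exist, so grep reports an error; || true keeps us running.
broken=$(search hello "$sample.missing") || true

rm -f "$sample"
echo "$found / $missing / $broken"
```

Recording the status in the || branch keeps the failing grep inside an exempt context while still letting the script decide what each status means.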

Quirk #2: What if you are ok with a command failing or returning non-zero?

In that case, simply run an OR operation with true:

rm *.log || true

…because we don’t want to fail if there are no log files.

Why does this work? Recall #3 from the manual:

The shell does not exit if the command that fails is (…)

3. part of any command executed in a && or || list except the command following the final && or ||

As the command following the final || is true, there’s no way for the whole line to fail.
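Note that || true throws the exit status away. If you still want to inspect it, a possible variation (a sketch, not something the manual prescribes) is to record the status in the || branch instead. The scratch directory and the status variable are assumptions for the example.

```shell
#!/usr/bin/env bash
set -e

# Scratch directory that contains no .log files.
workdir=$(mktemp -d)

status=0
# The glob matches nothing, so rm fails; the || branch saves the status
# instead of discarding it the way `|| true` does.
rm "$workdir"/*.log 2>/dev/null || status=$?

echo "rm exited with status $status, and the script carries on"

rmdir "$workdir"
```

The assignment in the || branch always succeeds, so the list as a whole cannot fail, just as with true.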

Another option would be to turn it off briefly:

set +e
command_allowed_to_fail
set -e

The + syntax means “remove” and - means “add” (counterintuitive, yes). Therefore, we’re simply disabling errexit while command_allowed_to_fail runs, then turning it back on.
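One caveat with this pattern: the closing set -e turns errexit on even if it was off before. A sketch of a more careful variant follows, which checks the special parameter $- (the current option flags) and restores the previous state. Here command_allowed_to_fail is a hypothetical stand-in that always returns 3, and errexit_was_on is an illustrative name.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical stand-in for a command that is allowed to fail.
command_allowed_to_fail() { return 3; }

# $- lists the single-letter options currently set; "e" means errexit.
case $- in
	*e*) errexit_was_on=1 ;;
	*)   errexit_was_on=0 ;;
esac

set +e
command_allowed_to_fail
status=$?   # read the status right away, before another command overwrites it

# Re-enable errexit only if it was on to begin with.
if [ "$errexit_was_on" -eq 1 ]; then
	set -e
fi

echo "command_allowed_to_fail exited with status $status"
```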

Bonus: How do I know which command failed?

This is not specific to errexit, but you often need to know where in a script a command failed.

#!/usr/bin/env bash

set -e

function random_bytes {
	echo $(head -c "$1" /dev/random | base64)
}

random_bytes 10
random_bytes 50
random_bytes 
random_bytes 5

The third call omits the byte count, so head complains on stderr. In a real script, though, how can you tell which command failed (apart from spotting the very obvious mistake)?

1. echo everything you’re doing
   Pros: straightforward
   Cons: quite boring to do
2. set -x, which will print every instruction
   Pros: simple to add
   Cons: you may end up exposing more than you want (imagine printing a variable with secrets…now imagine that running in a CI environment)
3. put a trap to print the line number when a command fails
   Pros: can be added globally
   Cons: a bit verbose
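The trap option can be sketched as follows. This is a minimal illustration rather than a definitive recipe: it relies on Bash's ERR trap together with the LINENO and BASH_COMMAND variables, and the demo script, temporary file, and message format are all assumptions. The demo runs in a child bash so that the failure does not end the calling shell.

```shell
#!/usr/bin/env bash

# Write a small demo script to a temporary file (path chosen by mktemp).
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env bash
set -e
# The ERR trap runs when a command fails in a way errexit would act on.
trap 'echo "line $LINENO: \"$BASH_COMMAND\" exited with status $?" >&2' ERR

echo "step 1"
false
echo "never printed"
EOF

status=0
# Merge stderr into the captured output so the trap's message is included.
report=$(bash "$script" 2>&1) || status=$?

echo "$report"
echo "script exited with status $status"

rm -f "$script"
```

The trap line only needs to appear once, near the top of a script, which is what makes this option easy to apply globally.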