November 26th, 2020

Unofficial Bash Strict Mode – pipefail

by in Technology

Part two: pipefail in Bash

Welcome to part 2 of our 3-part series on the “Unofficial Bash Strict Mode”. This post covers the notable behaviours of pipefail and how to work around them. We will also refer to the Bash reference manual, where you will find more useful tips about Bash.

In our previous post we blogged about Bash Strict Mode and errexit: how setting the errexit flag can improve your code’s behaviour, how it places output on the standard error stream (stderr), and more. You can find it here. In this post, we introduce pipefail and its notable behaviours.

The Problem

You are trying to do this:

#!/usr/bin/env bash

set -e

non_existent_cmd | another_non_existent_cmd | cat
echo "Hello"

This runs just fine: both missing commands fail, yet the script still prints Hello and exits with status 0!

Unfortunately, errexit is not enough to save us here. As we recall from the manual:

The shell does not exit if the command that fails is: (…)

4. any command in a pipeline but the last, or if the command’s return status is being inverted with !.

The exit status of a pipeline is the exit status of the last command in the pipeline.
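
To see both rules from the manual in action, here is a minimal sketch. It uses the builtins false and true as stand-ins for a failing and a succeeding stage; they are illustrative and not part of the original script.

```bash
#!/usr/bin/env bash

set -e

# The first stage fails, but it is not the last command in the pipeline,
# so errexit does not stop the script.
false | true
pipeline_status=$?   # status of the last stage only

echo "pipeline status: $pipeline_status"
echo "still running"
```

Both echo lines are reached, and the recorded status is 0 even though the first stage failed.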

The Solution

Let’s set pipefail. From the manual:

If pipefail is enabled, the pipeline’s return status is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands exit successfully.

In other words, it will only return 0 if all parts of the pipeline return 0.

As opposed to errexit, pipefail can only be set by its full form:

set -o pipefail
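
As a rough sketch of the difference, the snippet below runs the same pipeline with the option off and then on. It again uses false and true as stand-in stages, and `shopt -qo` to check whether the option is currently enabled. errexit is left off on purpose so that the statuses can be printed.

```bash
#!/usr/bin/env bash

false | true
without_pipefail=$?     # the last stage succeeded, so this is 0

set -o pipefail         # there is no single-letter form of this option
false | true
with_pipefail=$?        # the failing stage now decides the status

# shopt -qo quietly reports whether a `set -o` option is enabled.
if shopt -qo pipefail; then
        pipefail_state="on"
else
        pipefail_state="off"
fi

echo "without pipefail: $without_pipefail"
echo "with pipefail: $with_pipefail"
echo "pipefail is $pipefail_state"
```

Running `set +o pipefail` switches the option off again.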

Let’s fix the example from before:

#!/usr/bin/env bash

set -eo pipefail

non_existent_cmd | another_non_existent_cmd | cat
echo "Hello"

Both behaviours are verified by the following Bats test:

#!/usr/bin/env bats

load '../../../node_modules/bats-support/load'
load '../../../node_modules/bats-assert/load'

@test "runs fine since 'pipefail' is not set" {
        run "$BATS_TEST_DIRNAME/pipefail_first.sh"
        [ "$status" -eq 0 ]
        [ "${lines[2]}" == 'Hello' ]
}

@test "fails since 'pipefail' is set" {
        run "$BATS_TEST_DIRNAME/pipefail_first_correct.sh"
        [ "$status" -ne 0 ]
        [ "${lines[2]}" != 'Hello' ]
}

The Quirks

Quirk #1: the pipeline’s return status is the value of the last (rightmost) command to exit with a non-zero status

#!/usr/bin/env bash

set -eo pipefail

cat non_existing_file | xargs curl -qs

cat’s exit code is 1 when the file doesn’t exist, and xargs’s exit code is 123 “if any invocation of the command exited with status 1-125”. Obviously, both are broken, but what exit code do we get here?

The answer is 123, the status of xargs. That is not ideal, because it hides the root cause: the missing file.
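
If you need to know which stage actually failed, Bash records the status of every stage of the most recent pipeline in the PIPESTATUS array. The sketch below is only an illustration: stage_one and stage_two are hypothetical stand-ins that mimic the exit codes of cat (1) and xargs (123) from the example. errexit is left off so that the array can still be read after the pipeline fails.

```bash
#!/usr/bin/env bash

set -o pipefail

stage_one() { return 1; }     # hypothetical: mimics `cat non_existing_file`
stage_two() { return 123; }   # hypothetical: mimics the failing xargs

stage_one | stage_two | cat
statuses=("${PIPESTATUS[@]}")   # copy right away; the next command overwrites it

echo "per-stage statuses: ${statuses[*]}"
```

This prints `per-stage statuses: 1 123 0`, which shows that the first stage failed as well.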

Our recommendation is to simply break it down into different instructions:

#!/usr/bin/env bash

set -eo pipefail

contents=$(cat non_existing_file)
curl -qs "$contents"

This behaviour can be confirmed by the following Bats test:

#!/usr/bin/env bats

load '../../../node_modules/bats-support/load'
load '../../../node_modules/bats-assert/load'

@test "returns exit code of xargs" {
        run "$BATS_TEST_DIRNAME/pipefail_quirk_1.sh"
        [ "$status" -eq 123 ]
}

@test "returns exit code of cat" {
        run "$BATS_TEST_DIRNAME/pipefail_quirk_1_correct.sh"
        [ "$status" -eq 1 ]
}

Quirk #2: Be careful with what you pipe

#!/usr/bin/env bash

set -eo pipefail

function all_hosts() {
        echo 'host-1
host-2
host-a
host-b'
}


function remove_hosts() {
        hosts=$(all_hosts | tr '\n' ' ')
        whitelist=$(cat)  # read the whitelist from stdin (the pipe)
        echo "
Removing hosts: $hosts

Whitelist: '$whitelist'
        "

        # Imagine we are passing those two parameters
        # To another command
}

cat non_existent_whitelist_file | remove_hosts

In this example, we’re loading a whitelist file, feeding it to another command (implemented here as a function), and that passes it to yet another service (e.g., a CLI tool). Even though the file doesn’t exist, remove_hosts still runs. The script does exit with a non-zero status afterwards, but by then it is too late: remove_hosts has already received an empty whitelist, which could have catastrophic effects (deleting more than you expect)!
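
The reason is that Bash starts every stage of a pipeline whether or not an earlier stage fails; pipefail only changes the status reported at the end. Here is a minimal sketch, using false as a stand-in for the failing cat and echo as a stand-in for remove_hosts (errexit is left off so that the status can be printed):

```bash
#!/usr/bin/env bash

set -o pipefail

# The upstream stage fails immediately, yet the downstream stage still runs.
downstream_output=$(false | echo "downstream still ran")
pipeline_status=$?   # non-zero because of pipefail, but only known afterwards

echo "$downstream_output"
echo "pipeline status: $pipeline_status"
```

The downstream message is printed, followed by a pipeline status of 1.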

Ideally, you’d want to fail as soon as possible. The best way to do that is to break the pipeline into separate steps and validate the input explicitly before acting on it.

#!/usr/bin/env bash

set -eo pipefail

function all_hosts() {
        echo 'host-1
host-2
host-a
host-b'
}


function remove_hosts() {
        hosts=$(all_hosts | tr '\n' ' ')
        whitelist=$(cat)  # read the whitelist from stdin (the pipe)
        echo "
        Removing hosts:
        $hosts

        Whitelist:
        '$whitelist'
        "

        # Imagine we are passing those two parameters
        # To another command
}

readonly whitelist_file="non_existent_whitelist_file"

if [ ! -f "$whitelist_file" ]; then
        echo "Whitelist file does not exist"
        exit 1
fi

cat "$whitelist_file" | remove_hosts

As always, this behaviour is described by the following Bats file:

#!/usr/bin/env bats

load '../../../node_modules/bats-support/load'
load '../../../node_modules/bats-assert/load'

@test "still runs remove_hosts even though file does not exist" {
        run "$BATS_TEST_DIRNAME/pipefail_quirk_2.sh"
        [ "$status" -ne 0 ]
        [ "${lines[2]}" == "Whitelist: ''" ]
}

@test "fails since we verify file presence" {
        run "$BATS_TEST_DIRNAME/pipefail_quirk_2_correct.sh"
        [ "$status" -eq 1 ]
        [ "$output" == "Whitelist file does not exist" ]
}

Check out some more examples of why pipefail is really important to use.

Other parts in this blog series:
Part 1 – errexit
Part 3 – Coming soon!
