Pipes and Redirection — Bash and Shell Scripting | CertQnA

The Unix philosophy favors small tools that each do one thing well, connected by pipes. The shell makes that connection trivial. Mastering pipes and redirection turns the shell into a programmable data-processing environment.

The Three Streams

FD	Name	Default
0	stdin	keyboard
1	stdout	terminal
2	stderr	terminal

Programs write normal output to stdout and errors to stderr. They read input from stdin. Redirection rewires these streams.
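
As a minimal sketch of the two output streams, the following uses a hypothetical function named report (not part of the lesson) that writes one line to each stream, then captures them separately:

```shell
#!/usr/bin/env bash
# report is a hypothetical helper: one line to stdout, one to stderr.
report() {
  echo "normal output"
  echo "something went wrong" >&2
}

# Command substitution captures stdout only; stderr is discarded here
# so that it does not reach the terminal.
captured_out=$(report 2>/dev/null)

# Swap the roles: point stderr at the capture, then discard stdout.
captured_err=$(report 2>&1 >/dev/null)

echo "stdout was: $captured_out"
echo "stderr was: $captured_err"
```

The two captures come back different, which shows that the streams are independent channels even though both normally land on the terminal.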

Output Redirection

ls > files.txt           # write stdout to file (overwrite)
ls >> files.txt          # append
ls 2> errors.txt         # write stderr to file
ls 2>> errors.txt        # append stderr

ls > out.txt 2> err.txt  # split streams
ls > both.txt 2>&1       # merge stderr into stdout, then to file
ls &> both.txt           # Bash shorthand for the above

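Order matters because the shell processes redirections left to right: > both.txt 2>&1 works; 2>&1 > both.txt does not, because it first points stderr at stdout's current target (the terminal) and only then redirects stdout to the file.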
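
The effect of ordering can be checked with a small experiment. This is a sketch under assumptions: emit is a hypothetical helper, and a second file stands in for the terminal so the result can be inspected.

```shell
#!/usr/bin/env bash
# emit is a hypothetical helper that writes one line to each stream.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

workdir=$(mktemp -d)

# Correct order: stdout goes to the file first, then stderr follows it.
emit > "$workdir/right.txt" 2>&1

# Wrong order: stderr is pointed at wherever stdout is now (terminal.txt,
# standing in for the terminal), and only then does stdout move.
{ emit 2>&1 > "$workdir/wrong.txt"; } > "$workdir/terminal.txt"

right_lines=$(wc -l < "$workdir/right.txt")
wrong_lines=$(wc -l < "$workdir/wrong.txt")
leaked=$(cat "$workdir/terminal.txt")

echo "right.txt lines: $right_lines"
echo "wrong.txt lines: $wrong_lines"
echo "leaked to the stand-in terminal: $leaked"

rm -rf "$workdir"
```

With the correct order both lines land in the file; with the wrong order the stderr line escapes to the stand-in terminal.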

Discard Output

noisy_command > /dev/null           # toss stdout
noisy_command 2> /dev/null          # toss stderr only
noisy_command > /dev/null 2>&1      # toss both

Input Redirection

wc -l < file.txt
sort < input.txt > sorted.txt

# Heredoc
cat <<EOF
Multi-line
text
EOF

# Here-string
grep "error" <<< "$LOG_LINE"
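
A short sketch of both forms feeding stdin, using an invented sample value for LOG_LINE. It relies on the fact that variables expand inside a heredoc whose delimiter is unquoted:

```shell
#!/usr/bin/env bash
# LOG_LINE is sample data invented for this sketch.
LOG_LINE="2024-01-01 12:00:00 error disk full"

# Here-string: the variable's value becomes grep's stdin.
# -c prints the number of matching lines.
match_count=$(grep -c "error" <<< "$LOG_LINE")

# Heredoc: several lines become wc's stdin. The delimiter is unquoted,
# so $LOG_LINE expands to the sample line.
line_count=$(wc -l <<EOF
first line
$LOG_LINE
third line
EOF
)

echo "matches: $match_count"
echo "lines:   $line_count"
```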

Pipes

The pipe (|) wires stdout of one command to stdin of the next.

ps aux | grep nginx
cat access.log | sort | uniq -c | sort -rn | head -n 20

# Top 20 IP addresses by request count:
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -n 20

Each command runs as a separate process, and all of them run concurrently; data flows from one to the next through an in-memory buffer managed by the kernel. There is no temporary file.
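
The same counting pattern can be tried without a log file. In this sketch the IP addresses are invented sample data supplied by printf:

```shell
#!/usr/bin/env bash
# Stages, in order:
#   printf    emits one sample IP address per line
#   sort      groups identical lines next to each other
#   uniq -c   collapses each group and prefixes its count
#   sort -rn  orders by count, highest first
#   head      keeps the top entry
#   awk       drops the count and keeps the address
top_ip=$(printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.1 10.0.0.3 10.0.0.1 10.0.0.2 \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 1 \
  | awk '{print $2}')

echo "most frequent: $top_ip"
```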

Pipefail

By default, the exit code of a pipeline is the exit code of the last command. If cat file | grep error fails because cat couldn't open the file, you would still get the exit code from grep. set -o pipefail changes that:

set -o pipefail
cat /nonexistent | sort
echo "$?"     # non-zero: cat's failure is no longer masked by sort succeeding

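Set pipefail in any script whose pipelines matter. Combined with set -e, the script stops at the first failing command or pipeline, although set -e deliberately ignores failures in some contexts, such as a command tested by if or joined with && or ||.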
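
The difference can be observed directly. This sketch uses false as a stand-in for a failing first stage and also reads PIPESTATUS, a Bash array that the lesson does not cover, to show each stage's status:

```shell
#!/usr/bin/env bash
# Default behavior: the pipeline's status is that of its last command.
# false always fails and cat succeeds on empty input, so the status is 0.
set +o pipefail
false | cat
status_default=$?

# PIPESTATUS is a Bash array holding every stage's status.
false | true | cat
stages="${PIPESTATUS[*]}"

# With pipefail, a failure in any stage makes the whole pipeline fail.
# The || branch records the status without aborting under set -e.
set -o pipefail
status_pipefail=0
false | cat || status_pipefail=$?
set +o pipefail

echo "default: $status_default"
echo "per-stage: $stages"
echo "pipefail: $status_pipefail"
```

When several stages fail, pipefail reports the status of the rightmost one that failed.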

tee

Sometimes you want a stream to go to a file and continue down the pipe.

long_command | tee log.txt | grep error
build.sh 2>&1 | tee build.log
something | sudo tee -a /var/log/something.log   # tee is the process that needs root to append to the log
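
A self-contained sketch of the first pattern, with printf standing in for a long-running command and invented sample lines:

```shell
#!/usr/bin/env bash
log_file=$(mktemp)

# printf stands in for a long-running command (sample data, invented).
# tee saves every line to the log; grep then filters what flows onward.
errors=$(printf '%s\n' "info: start" "error: disk full" "info: done" \
  | tee "$log_file" \
  | grep "error")

logged_lines=$(wc -l < "$log_file")

echo "filtered: $errors"
echo "lines saved to log: $logged_lines"

rm -f "$log_file"
```

The log keeps the complete output even though only the matching line continues down the pipe.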

Process Substitution

Sometimes you want a command's output to look like a file to another command:

diff <(sort file1) <(sort file2)

# Compare list of installed packages on two hosts:
diff <(ssh host1 'dpkg -l') <(ssh host2 'dpkg -l')

# Loop over output without subshell pitfall:
while IFS= read -r line; do
  echo "> $line"
done < <(find . -type f)

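<(...) runs the command and substitutes a path from which its output can be read (typically /dev/fd/N, or a named FIFO on systems without /dev/fd). >(...) works the other way, substituting a path that another command can write to.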
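
The subshell pitfall mentioned above can be made visible with a counter. This sketch assumes Bash with its default options (in particular, lastpipe is not enabled):

```shell
#!/usr/bin/env bash
# Piped loop: the while body runs in a subshell, so its increments
# are lost when the pipeline ends.
piped_count=0
printf '%s\n' a b c | while IFS= read -r line; do
  piped_count=$((piped_count + 1))
done

# Process substitution: the loop runs in the current shell and reads
# from the substituted path, so the variable keeps its value.
direct_count=0
while IFS= read -r line; do
  direct_count=$((direct_count + 1))
done < <(printf '%s\n' a b c)

echo "piped: $piped_count, process substitution: $direct_count"
```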

xargs

Many commands take arguments, not stdin. xargs bridges the gap by turning stdin into arguments.

find . -name "*.bak" | xargs rm
echo "a b c" | xargs -n 1 echo
find . -name "*.log" -print0 | xargs -0 gzip

-print0 with xargs -0 is the safe form: it separates filenames with NUL bytes, so spaces and newlines in names don't break things.
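
To see the difference, this sketch counts how many arguments xargs passes along in each form. The file names are invented, and it assumes the temporary directory path itself contains no whitespace:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Two sample files, one with a space in its name.
touch "$workdir/plain.log" "$workdir/with space.log"

# NUL-separated names survive the space; each file becomes one argument.
# sh -c 'echo "$#"' _ prints how many arguments xargs passed along.
safe_args=$(find "$workdir" -name "*.log" -print0 | xargs -0 sh -c 'echo "$#"' _)

# Whitespace-separated names: "with space.log" is split into two arguments.
unsafe_args=$(find "$workdir" -name "*.log" | xargs sh -c 'echo "$#"' _)

echo "safe: $safe_args arguments, unsafe: $unsafe_args arguments"
rm -rf "$workdir"
```

Two files produce two arguments in the safe form and three in the unsafe one, which is how a command like rm ends up targeting names that do not exist.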

Many xargs uses can be replaced by find -exec:

find . -name "*.bak" -exec rm {} +

Useful Filter Commands

Command	Job
`sort`	sort lines; `-n` numeric, `-r` reverse, `-u` unique
`uniq`	collapse adjacent duplicates; `-c` counts
`cut`	extract columns: `cut -d: -f1 /etc/passwd`
`tr`	character translate: `tr A-Z a-z`
`tac`	reverse line order
`head` / `tail`	first / last N lines
`wc`	count lines, words, bytes
`tee`	write stream to file and stdout
`jq`	JSON processor; often not installed by default, so install it
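
A few of these filters combined, as a sketch over invented passwd-style records (sample_records is a hypothetical helper, not a real file):

```shell
#!/usr/bin/env bash
# sample_records prints invented passwd-style lines for this sketch.
sample_records() {
  printf '%s\n' \
    "root:x:0:0:root:/root:/bin/bash" \
    "alice:x:1000:1000::/home/alice:/bin/bash" \
    "daemon:x:1:1::/usr/sbin:/usr/sbin/nologin"
}

# cut takes field 1 (the user name), tr upper-cases it,
# sort orders the names, and head keeps the first.
first_user=$(sample_records | cut -d: -f1 | tr a-z A-Z | sort | head -n 1)

# cut takes field 7 (the login shell), sort -u removes duplicates,
# and wc -l counts the distinct shells that remain.
shell_kinds=$(sample_records | cut -d: -f7 | sort -u | wc -l)

echo "first user: $first_user"
echo "distinct shells: $shell_kinds"
```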

A Composed Example

# Find the 5 slowest endpoints in a web log
# (assumes field 7 is the request path and the last field is the response time)
awk '{print $7, $NF}' access.log \
  | sort \
  | awk '{sum[$1]+=$2; count[$1]++} END {for (k in sum) print sum[k]/count[k], k}' \
  | sort -rn \
  | head -n 5

That single pipeline would take considerably more code in many general-purpose languages. Composing small tools like this is the shell's superpower.
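
To try the idea without a real server, here is a sketch over an invented four-line log. The log format is an assumption: field 7 holds the request path and the last field holds the response time in seconds. The intermediate sort is left out because the averaging step does not depend on input order.

```shell
#!/usr/bin/env bash
# Sample log invented for this sketch (path in field 7, time in last field).
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /slow HTTP/1.1" 200 512 2.0
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /fast HTTP/1.1" 200 128 0.1
10.0.0.1 - - [01/Jan/2024:00:00:03 +0000] "GET /slow HTTP/1.1" 200 512 4.0
10.0.0.3 - - [01/Jan/2024:00:00:04 +0000] "GET /fast HTTP/1.1" 200 128 0.3
EOF

# Extract path and time, average the time per path, then rank by average.
slowest=$(awk '{print $7, $NF}' "$log_file" \
  | awk '{sum[$1]+=$2; count[$1]++} END {for (k in sum) print sum[k]/count[k], k}' \
  | sort -rn \
  | head -n 1)

echo "slowest endpoint (average seconds, path): $slowest"
rm -f "$log_file"
```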

Cert Mapping

Cert	Scope
RHCSA / LFCS	Heavy use of pipes and redirection on tasks
AWS SAA	Log filtering on EC2, parsing kubectl output

The next lesson goes deep on the three power tools that handle most text processing: grep, sed, and awk.