Reason through shell pipeline problems and surface the right forgotten Linux command. Each section maps a real situation to a specific tool with concrete invocations.
Invoke this skill when the user:
- Wants to monitor a running process, file, or resource without modifying the process
- Needs to send output to both the screen and a file at the same time
- Is running a slow pipeline or batch job and has no visibility into progress
- Wants every log line to carry a timestamp without changing the program being logged
- Needs to transform a file in-place using a pipeline (read and write the same file)
- Wants to format raw text output into a readable aligned table
- Needs to find what's unique to one dataset vs another, or what's shared between two
- Wants to read or process a log or file from the end (newest-first)
- Needs to rename or reorganise a batch of files without writing a script
- Wants to run many independent tasks concurrently without threading code
Also trigger on natural-language descriptions: "how do I watch a file change?", "can I log to a file and see output at the same time?", "I want a progress bar for this pipeline", "how do I timestamp my agent's logs?", "I need to sort a file in-place without a temp file", "how do I line up these columns?", "what's the difference between these two lists?", "how do I read the log bottom-up?", "can I rename a bunch of files at once?", "how do I run these in parallel?".
Ask: Are you watching a file, a command output, or a system resource that changes over time?
Yes — want a refreshing terminal view:
watch -n<seconds> <command>
# examples:
watch -n1 nvidia-smi --query-gpu=memory.used --format=csv
watch -n2 'ls -lh outputs/ | tail -20'
watch -n1 -d free -h # -d highlights what changed
Yes — want to follow a growing log file:
tail -f <logfile> # live stream, no refresh overhead
tail -f <logfile> | ts '[%H:%M:%S]' # with timestamps (see section 4)
watch is for commands you re-run; tail -f is for files being appended to.
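The "file being appended to" case can be tried without a real workload. The sketch below is only an illustration under assumptions: the log is a throwaway temp file, the writer is a made-up background loop, and `timeout` (coreutils) is there only because `tail -f` never exits on its own.

```shell
# Throwaway log file standing in for a real one (hypothetical).
log=$(mktemp)

# Made-up writer: appends five lines, one every 0.2 seconds.
( for i in 1 2 3 4 5; do echo "step $i" >> "$log"; sleep 0.2; done ) &

# Follow the file from its first line; timeout stops tail after 3 seconds.
# "|| true" swallows timeout's non-zero exit status.
followed=$(timeout 3 tail -n +1 -f "$log" || true)
wait

echo "$followed"
rm -f "$log"
```

In real use you would drop `timeout` and stop the stream with Ctrl-C.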
Ask: Do you need the output in a file for later AND visible in the terminal now?
Yes:
<command> | tee <output-file>
# append instead of overwrite:
<command> | tee -a <output-file>
# fan out to multiple files:
<command> | tee file1 file2 > /dev/null
# common AI pattern — log and timestamp simultaneously:
<command> 2>&1 | ts '[%H:%M:%S]' | tee run-$(date +%Y%m%d-%H%M%S).log
Ask: Is there a pipeline step where data flows through and you have no visibility?
Yes — bytes or throughput:
pv <input-file> | <processing-command>
<command> | pv | <next-command>
Yes — line count matters more than bytes:
pv --line-mode <input-file> | <processing-command>
# with known total for accurate ETA:
total=$(wc -l < input.jsonl) # avoid the name LINES: the shell uses it for terminal height
pv --line-mode --size "$total" input.jsonl | <processing-command>
Yes — rate-limit a fast producer for a slow consumer:
pv --rate-limit <bytes-per-second> <input-file> | <slow-consumer>
Ask: Do you need to know when each line of a log or stream was produced?
Yes — wall clock timestamp:
<command> 2>&1 | ts '[%Y-%m-%dT%H:%M:%S]'
# custom strftime format:
<command> | ts '[%H:%M:%S]'
Yes — time between lines (find where time was spent):
<command> | ts -i '%.s' # -i: seconds since the previous line (-s counts from the start of the run)
Yes — timestamp and save:
<command> 2>&1 | ts '[%H:%M:%S]' | tee run.log
Ask: Does your pipeline read from and write to the same file?
Yes:
<transform> <file> | sponge <file>
# examples:
sort data.txt | sponge data.txt
python3 -m json.tool config.json | sponge config.json
grep -v 'DEBUG' app.log | sponge app.log
Do NOT use <transform> file > file — the shell truncates the output file before the input is read.
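The truncation pitfall is easy to reproduce on scratch data. The following is a minimal sketch, assuming made-up file names in a temp directory; the temp-file-plus-`mv` branch is a fallback assumption for machines where `sponge` (moreutils) is not installed, not a replacement for it.

```shell
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$work/data.txt"
cp "$work/data.txt" "$work/broken.txt"

# Pitfall: the shell opens (and truncates) broken.txt before sort reads it.
sort "$work/broken.txt" > "$work/broken.txt"
broken_size=$(wc -c < "$work/broken.txt")

# Safe: sponge soaks up all of its input before opening the output file.
if command -v sponge >/dev/null 2>&1; then
    sort "$work/data.txt" | sponge "$work/data.txt"
else
    # Fallback when moreutils is missing: write a temp file, then rename it.
    sort "$work/data.txt" > "$work/data.txt.tmp" && mv "$work/data.txt.tmp" "$work/data.txt"
fi
sorted=$(cat "$work/data.txt")

echo "broken.txt size in bytes: $broken_size"
echo "$sorted"
rm -rf "$work"
```

The first file ends up empty, while the second holds the sorted lines.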
Ask: Is the input delimited (tabs, commas, spaces, colons) and hard to read unformatted?
Yes — TSV:
column -t -s $'\t' <file>
<command> | column -t -s $'\t'
Yes — CSV:
column -t -s ',' <file> # simple CSV only: quoted fields that contain commas will be split
Yes — colon-separated (like /etc/passwd):
column -t -s ':' /etc/passwd
Yes — wrap a long list into screen-width columns:
<command> | column
Ask: Do you have two sets of items (sorted files, output lists, capability sets) to compare?
Prerequisite: both files must be sorted. If not: sort file | sponge file first, or use process substitution.
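A small sketch of the sorted-input requirement, using made-up lists in a temp directory (the file names and contents are illustrative assumptions). It writes plain sorted copies rather than using sponge or process substitution, so it should run in any POSIX shell.

```shell
work=$(mktemp -d)
printf 'gpu\ncpu\ntpu\n' > "$work/a.txt"   # unsorted on purpose
printf 'cpu\nnpu\ngpu\n' > "$work/b.txt"   # unsorted on purpose

# comm expects sorted input, so sort each list first.
sort "$work/a.txt" > "$work/a.sorted"
sort "$work/b.txt" > "$work/b.sorted"

only_a=$(comm -23 "$work/a.sorted" "$work/b.sorted")   # lines unique to A
only_b=$(comm -13 "$work/a.sorted" "$work/b.sorted")   # lines unique to B
both=$(comm -12 "$work/a.sorted" "$work/b.sorted")     # lines in both

echo "only in A: $only_a"
echo "only in B: $only_b"
echo "in both:"
echo "$both"
rm -rf "$work"
```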
# only in file A:
comm -23 a.txt b.txt
# only in file B:
comm -13 a.txt b.txt
# in both:
comm -12 a.txt b.txt
# on-the-fly sorting:
comm -23 <(sort a.txt) <(sort b.txt)
Ask: Do you want to see the most recent entries first, or process a file in reverse line order?
Yes — read newest-first:
tac <logfile> | head -20
Yes — find last occurrence of a pattern:
tac <logfile> | grep -m1 'ERROR'
Yes — process each line in reverse order through a pipeline:
tac <file> | <pipeline>
Use tail -f for live streaming. Use tac when you want to reverse a complete file for processing.
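To make the newest-first reading concrete, here is a minimal sketch on a made-up five-line log (the timestamps and messages are invented for illustration).

```shell
log=$(mktemp)
printf '%s\n' \
    '10:00 INFO start' \
    '10:01 ERROR disk full' \
    '10:02 INFO retry' \
    '10:03 ERROR timeout' \
    '10:04 INFO done' > "$log"

# Newest entries first: reverse the file, then keep the top two lines.
newest_two=$(tac "$log" | head -2)

# Last occurrence of a pattern: the first match in the reversed stream.
last_error=$(tac "$log" | grep -m1 'ERROR')

echo "$newest_two"
echo "last error: $last_error"
rm -f "$log"
```

Because `grep -m1` stops at the first match, the search ends as soon as the most recent error is found.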
Ask: Do you need to rename, move, or delete many files based on their names?
Yes — interactive rename in your editor:
vidir <directory>/
# or specific files:
ls *.txt | vidir -
# inside your editor:
# - edit any path to rename the file
# - delete a line to delete the file
# - change the directory prefix to move the file
# - use editor's regex substitute for bulk patterns: :%s/old/new/g
Simpler case — rename with a fixed pattern:
rename 's/old-prefix-/new-prefix-/' *.txt # Perl rename (Debian/Ubuntu); util-linux rename takes: rename old-prefix- new-prefix- *.txt
Use vidir when the rename logic is complex or varies per file.
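Where neither vidir nor a Perl-style rename is installed, the fixed-pattern case can also be sketched as a plain shell loop. This is a fallback assumption rather than one of the tools above, and the file names are made up for the demonstration.

```shell
work=$(mktemp -d)
touch "$work/old-prefix-a.txt" "$work/old-prefix-b.txt" "$work/notes.txt"

for f in "$work"/old-prefix-*.txt; do
    [ -e "$f" ] || continue        # the glob matched nothing
    name=${f##*/}                  # strip the directory part
    # Drop the old prefix with parameter expansion and prepend the new one.
    mv -- "$f" "$work/new-prefix-${name#old-prefix-}"
done

renamed=$(ls "$work")
echo "$renamed"
rm -rf "$work"
```

Files that do not match the pattern, such as `notes.txt` here, are left alone.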
Ask: Do you have a list of independent tasks to run and want them concurrent?
Yes — simple parallel execution:
cat tasks.txt | parallel <command> {}
# from a file list:
ls *.sh | parallel bash {}
Yes — limit concurrency:
cat tasks.txt | parallel -j4 <command> {} # max 4 jobs
cat tasks.txt | parallel -j100% <command> {} # one per CPU core (the default); -j0 means as many as possible
Yes — keep output ordered and labeled:
cat tasks.txt | parallel --keep-order --tag <command> {} # --keep-order preserves input order; --tag prefixes each line with its argument
Yes — retry on failure:
cat tasks.txt | parallel --retries 3 <command> {}
Yes — pipe blocks of stdin to each job (batched inference):
cat data.jsonl | parallel -j4 --pipe --block 10k <inference-command>

| Situation | Command |
|---|---|
| Monitor a changing command output | watch -n<s> <cmd> |
| Output to terminal AND a file | cmd \| tee file |
| Progress bar for a pipeline | pv file \| cmd or cmd \| pv \| cmd |
| Timestamp every output line | cmd \| ts '[%H:%M:%S]' |
| In-place file transform | transform file \| sponge file |
| Format delimited text as table | cmd \| column -t -s '<delim>' |
| Find differences between two lists | comm -23 a.txt b.txt |
| Read a log newest-first | tac logfile \| head -N |
| Batch rename files in editor | vidir <dir>/ |
| Run tasks concurrently | cat tasks.txt \| parallel <cmd> {} |

Every scenario above has a working exercise in this repo:
bash setup.sh # create demo/ with sample data
# then open exercises/01-watch.md through exercises/10-parallel.md