Devops / sysadmin notes

From "Unix and Linux system administration handbook - 5th ed"

  • Process & threads: Use strace for deep debugging
  • For risky commands i.e rm, try with -i (interactive) for confirmation first
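
A minimal strace sketch (the PID and paths are placeholders):

    strace -p 1234                          # attach to a running process and print its syscalls
    strace -f -o /tmp/trace.out ls /etc     # trace a fresh command, follow children, log to a file
    strace -c ls /etc                       # summarize syscall counts and timings instead of a full trace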

From sadservers.com

  • Scenario 1 (figure out which process is writing to some files):
    • fuser <filename> (fuser ~ "find user": returns the list of processes using the supplied file)
    • Can also use lsof <filename>, or lsof | grep <filename>
  • Scenario 2 (get the most frequent visitor's IP from an access log): awk '{print $1}' access.log | sort | uniq -c | sort -n | tail -1
  • Scenario 3 (find which port to knock): scan all ports with nmap -p- localhost (-p- means every port, not just the common ones); a knock sketch follows this list
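
A rough port-knocking sketch, assuming the knock sequence is already known (the ports below are made up):

    # Map open/filtered ports first
    nmap -p- localhost
    # Knock the (hypothetical) sequence, then re-check the target service
    for p in 1111 2222 3333; do nmap -Pn --max-retries 0 -p "$p" localhost; done
    nmap -p 8080 localhost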

Misc:

  • uptime: Shows the average load (number of processes waiting for CPU time) over the last 1 / 5 / 15 minutes. Comparing the three values tells you whether load is increasing or decreasing (example invocations after this list).
  • /proc virtual file system: directory of the files the kernel uses to manage a process. Accessible at /proc/<PID>/<file> (see the /proc sketch after this list).
  • page table: Each process has a mapping between its virtual address space and physical memory. This abstraction has many benefits:
    • Enables memory sharing between parent and child or between unrelated processes, similar to how NAT adds a layer of indirection to work around IP address exhaustion.
    • Physical memory access benefits from spatial / temporal locality: the kernel can optimize by mapping a process's virtual addresses to nearby physical addresses.
    • Enables memory swapping: thanks to this mapping, a process does not need to keep its whole address space in physical memory. The kernel keeps only the frequently accessed pages in a fixed amount of RAM; the rest can sit in storage. Memory then acts like a cache: on a "cache miss" a page fault occurs, and the kernel handles it by loading the requested page from storage into memory, similar to lazy loading. This also means the combined memory footprint of all processes can exceed the capacity of RAM.
    • Granular control of memory access: permissions can be set per page in the page table. This abstraction also provides better isolation between the memory spaces of processes.
    • More details: Read The Linux Programming Interface
  • vmstat / iostat / pidstat / sar: monitoring resource usage (example invocations after this list). More here and in Systems Performance: Enterprise and the Cloud
  • netstat / ss: monitoring network connections. netstat is older and more universal, but it is being replaced by ss (socket statistics), which is more performant.
  • Abstraction leads to a (necessary) gap between perception and reality: because system calls and context switches are expensive, the OS may make its own decision about whether / when to actually carry out what a program asked for. For example, when a program frees its allocated memory, the OS may keep that memory intact (if there is still plenty of space) to avoid the syscall that does the actual freeing, or wait until a large enough batch has accumulated and free it all at once ("batching"). This style of "overcommitting credit" is similar to the financial industry, where daily transactions "happen" in real time on the basis of credit, and the expensive work (reconciliation etc.) is done in batch at the end of the day to reduce load on the banks. More here (overcommit check after this list)
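
To poke at the /proc and page-table points above, a quick sketch (the PID 1234 is a placeholder; $$ is the current shell):

    cat /proc/1234/cmdline                     # command line the process was started with
    cat /proc/1234/status                      # state, memory counters, UIDs, etc.
    ls -l /proc/1234/fd                        # open file descriptors
    cat /proc/$$/maps                          # virtual address space layout: ranges, permissions, backing files
    grep -E 'VmSize|VmRSS' /proc/$$/status     # virtual size vs. what is actually resident in physical memory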
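Example invocations for the load / resource / socket checks mentioned above:

    uptime            # load averages over the last 1 / 5 / 15 minutes
    vmstat 1 5        # CPU, memory, swap and run-queue stats: 5 samples, 1 second apart
    iostat -x 1 3     # extended per-device I/O statistics
    pidstat 1 5       # per-process CPU usage over time
    sar -u 1 3        # CPU utilization at 1-second intervals
    ss -tulpn         # listening TCP/UDP sockets with owning processes
    ss -s             # socket summary, counts per state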
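A small check of the related overcommit knobs (not from the book, just the standard Linux interfaces):

    sysctl vm.overcommit_memory                        # 0 = heuristic, 1 = always overcommit, 2 = strict accounting
    grep -E 'CommitLimit|Committed_AS' /proc/meminfo   # how much memory is committed vs. the strict-mode limit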

Tuning:

  • CPU:
    • Use nice to set process priority for CPU scheduling (nice guys get deprioritized: higher niceness = lower priority); examples after this list
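
Example usage (the command and PID are arbitrary):

    nice -n 10 ./batch_job.sh    # start a command at lower priority (higher niceness)
    renice -n 19 -p 1234         # make a running process as nice (low priority) as possible
    renice -n -5 -p 1234         # raise priority; negative niceness usually requires root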

Command line:

  • Working with JSON: pipe to jq for pretty-printing / further filtering. Example: terraform show -json plan.tfplan | jq (a filtering sketch follows)
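
Going one step past pretty-printing, a filtering sketch (the field names follow Terraform's plan JSON format; adjust to the actual structure):

    # Pretty-print the whole plan
    terraform show -json plan.tfplan | jq
    # List each planned resource address with its change actions (create / update / delete)
    terraform show -json plan.tfplan | jq '.resource_changes[] | {address, actions: .change.actions}'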