Created
November 28, 2014 22:51
Bringing linux pipes into the 21st century
Throughput may seem like a funny thing to measure on a Unix pipe. For many, the pv utility may seem like a toy for bored sysadmins to watch their database dumps get backed up. But the truth is that Unix pipes remain one of my most frequently used tools over the course of the day. Why?
* On *nix systems, they are almost always there.
* They are frequently the first tool that experienced people turn to when diagnosing an issue with a system. For example:
  * Web service misbehaving? Try to curl it.
  * Error in a log file? Grep it!
  * Need to fix a config file? Sed it!
  * Need to find out how big some set of data is, an average, etc.? wc + bc are your friends.
* The parts they contain are
  * many (I will give you $0x1 if you can name all the coreutils) and
  * composable (we can send sed to grep to sed to awk to ...)
As a result, a lot of services can be accessed or tested with this small set of (somewhat) standard tools. But because these standard tools are what people reach for on small tasks, they end up being used for big ones, too.
One example is curl'ing the Twitter sample firehose. Such a task can last for days, weeks, months! You might pipe it to split, then use split's --filter option to execute some other pipeline step on batches of it, which could then be re-curled into (S3, Elasticsearch, etc.).
But sometimes I want to see pictures, graphs, etc. showing what my throughput looks like.
Enter something like Graphite and statsd. What would be cool would be Unix pipe passthroughs that do the following:
1) Send line- or byte-based throughput to something like statsd.
2) Kill the entire process if some threshold is not met.
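To make the two wishes above concrete, here is a minimal sketch of such a passthrough in Python. It is not an existing tool: the names `passthrough`, `format_counter`, and `main`, the metric names `pipe.lines` and `pipe.bytes`, and the statsd address `127.0.0.1:8125` are all assumptions made for illustration. It copies stdin to stdout, sends line and byte counts as statsd counters roughly once per interval, and returns a non-zero status when a completed interval falls below a minimum line rate.

```python
import socket
import sys
import time


def format_counter(metric, value):
    """Build a statsd counter datagram, e.g. b'pipe.lines:42|c'."""
    return f"{metric}:{value}|c".encode("ascii")


def passthrough(src, dst, send, min_lines_per_sec=0.0, interval=1.0,
                clock=time.monotonic):
    """Copy lines from src to dst, calling send(lines, nbytes) per interval.

    Returns 0 on end of input, or 1 if a completed interval fell below
    min_lines_per_sec. Limitation of this sketch: the rate is only checked
    when a line arrives, so a completely stalled stream is not noticed
    until it produces another line.
    """
    lines = nbytes = 0
    window_start = clock()
    for line in src:
        dst.write(line)
        lines += 1
        nbytes += len(line)
        elapsed = clock() - window_start
        if elapsed >= interval:
            send(lines, nbytes)
            if lines / elapsed < min_lines_per_sec:
                dst.flush()
                return 1  # threshold missed: caller should exit non-zero
            lines = nbytes = 0
            window_start = clock()
    dst.flush()
    send(lines, nbytes)  # report whatever is left in the last window
    return 0


def main():
    # Assumed address: a statsd daemon listening on its usual UDP port.
    addr = ("127.0.0.1", 8125)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(lines, nbytes):
        sock.sendto(format_counter("pipe.lines", lines), addr)
        sock.sendto(format_counter("pipe.bytes", nbytes), addr)

    # Optional first argument: minimum acceptable lines per second.
    min_rate = float(sys.argv[1]) if len(sys.argv) > 1 else 0.0
    return passthrough(sys.stdin.buffer, sys.stdout.buffer, send, min_rate)

# To use this as a script, end the file with: sys.exit(main())
```

Saved under a hypothetical name such as pipemeter.py, it would sit in the middle of a pipeline, e.g. `curl ... | python3 pipemeter.py 10 | split ...`. When it exits early, the upstream writer will typically receive SIGPIPE on its next write, which is about as close as a single pipeline stage gets to "kill the entire process"; whether the shell then reports the failure depends on settings like bash's `pipefail`.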