Created
October 16, 2012 08:08
Enumerating the contents of a file split on a custom delimiter
| # The Enumerable module provides most of Ruby's list traversal methods, and File picks them up through IO, which mixes it in. | |
| # #each_slice is one such method; #each_line is defined on IO itself. | |
| # What if you need to split on a custom delimiter instead? This is particularly useful when processing reports or logs. | |
| # For instance, one of the reports I receive has this format: | |
| # \e!C\r | |
| # <some header data> | |
| # ----------------------------------------------------------------------- | |
| # Name Account Address DoB | |
| # ----------------------------------------------------------------------- | |
| # Some name Some account Some address Some date of birth | |
| # The \e!C\r is an escape sequence that presumably marks the start of a page for the printer. The pages in the file are not | |
| # uniform: one page might have only 4 lines, whereas the next may have a full set of rows. The pages can be split like so: | |
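| class File | |
|   # Yields each page of the file, where pages are delimited by "\e!C\r". | |
|   # Reads the whole file into memory first. If the file starts with the | |
|   # delimiter, the first page yielded is an empty string. | |
|   def each_page | |
|     return enum_for(:each_page) unless block_given? | |
|     read.split("\e!C\r").each { |page| yield page } | |
|   end | |
| end | |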
| File.open("report.txt") do |file| | |
|   file.each_page do |page| | |
|     # do stuff to that page | |
|   end | |
| end |
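Because `each_page` calls `read`, it loads the whole report into memory before splitting. For large reports, one possible alternative is to pass the delimiter to `each_line` so that pages are read one at a time. The following is only a sketch under stated assumptions: `PAGE_SEPARATOR` and `each_page_streaming` are hypothetical names that are not part of the original snippet, and the sample data is made up so the example runs without a `report.txt` on disk.

```ruby
require "stringio"

# Hypothetical constant name; the delimiter itself is the one from the report format.
PAGE_SEPARATOR = "\e!C\r"

# Hypothetical helper: yields one page at a time from any IO-like object,
# without reading the whole input first.
def each_page_streaming(io, separator = PAGE_SEPARATOR)
  io.each_line(separator) do |record|
    page = record.chomp(separator) # strip the trailing delimiter, if present
    next if page.empty?            # skip the empty record before the first delimiter
    yield page
  end
end

# Made-up sample data standing in for report.txt.
sample = StringIO.new("\e!C\rheader one\nrow a\n\e!C\rheader two\nrow b\nrow c\n")

pages = []
each_page_streaming(sample) { |page| pages << page }
pages.each_with_index { |page, i| puts "page #{i + 1}: #{page.lines.size} lines" }
```

The same helper should accept a `File` object opened with `File.open("report.txt")`, since it only relies on `each_line`. Unlike the `split` version, this sketch skips the empty page that precedes a leading delimiter.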