@scheibo
Created June 24, 2011 03:40
# v commands
## Addressing
All of these addresses depend on the context in which they are used. The default is line granularity, unless they are used in a compound expression or have a literal '#' placed in front of them. For example, .,5 means from the current line to the 5th line. .,#5 means the same as #.,#5: from the current character to the 5th character (the context of the second argument changes the meaning of the first). #.,5 means from the current character to the 5th line. Note this means that (0,$ == #0,$ == 0,#$ == 1,$ == 1,#$) != #1,$.
. - The current address. In ed this is the current line and in sam this is the 'dot' (character granularity). In the command language, this depends on the context.
$ - The null string at the end of the file (sam) or the last line in the buffer (ed), depending on context.
n - the nth line in the buffer, where n is in the range '0,$'. This is the same as the meaning in both sam and ed.
#n - the empty string after character n. This is an address mode specific to sam, but it goes along well with the convention that '#' forces character addressing.
' - the mark. This is the same as in sam, and is similar to ed, except that ed uses the form 'x for some x, since ed has multiple marks available.
/regexp/ - start of a regular expression, same as in ed, which is applied line by line; if it matches, that line is the new address. Like ed, the final / can be omitted if there is nothing else following it. Also like ed, // repeats the search, finding the next match. Unlike ed, instead of a BRE the regexp syntax is defined to be whatever the regex library backend uses, which is usually PCRE. Note this is *unlike* sam: to get the sam behavior of setting the address to the next match by character position, use the #/regex/ form listed next.
#/regex/ - gives sam style regex matches - aka character based as opposed to line based. See discussion above.
?regex? - matches backwards, ed (line) style. See above.
#?regex? - matches backwards, sam style (character). See above. Note that the sam backwards match syntax of -/regex/ is not supported.
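The difference between the line-based `/regexp/` and the character-based `#/regexp/` forms can be sketched as follows. This is only an illustration under assumptions: Python's `re` module stands in for the regex backend (it is not PCRE), the function names are made up for this example, and wrapping around the end of the buffer is assumed to behave as it does in ed and sam.

```python
import re

def line_spans(text):
    """Return (start, end) character offsets of each line, newline excluded."""
    spans, pos = [], 0
    for line in text.split("\n"):
        spans.append((pos, pos + len(line)))
        pos += len(line) + 1
    return spans

def search_lines(text, pattern, dot):
    """ed-style /regexp/: span of the first matching line after the line
    holding offset `dot`, wrapping around the buffer (assumed behavior)."""
    spans = line_spans(text)
    current = next(i for i, (s, e) in enumerate(spans) if s <= dot <= e)
    regex = re.compile(pattern)
    for step in range(1, len(spans) + 1):
        s, e = spans[(current + step) % len(spans)]
        if regex.search(text[s:e]):
            return (s, e)
    return None

def search_chars(text, pattern, dot):
    """sam-style #/regexp/: span of the first match at or after offset
    `dot`, wrapping to the start of the buffer (assumed behavior)."""
    regex = re.compile(pattern, re.MULTILINE)
    match = regex.search(text, dot) or regex.search(text)
    return match.span() if match else None

text = "alpha\nbeta gamma\ndelta"
whole_line = search_lines(text, "gam", 0)   # covers the line "beta gamma"
just_match = search_chars(text, "gam", 0)   # covers only the characters "gam"
```

With the same pattern, the line form yields the whole line as the new address while the character form yields only the matched text.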
### Compound addresses
Same as in sam, except + can't be elided.
a1+a2 - a2 evaluated starting at right of a1
a1-a2 - a2 evaluated in reverse direction started at left of a1
a1,a2 - left of a1 to right of a2
a1;a2 - set a1 first and then calculate a2
a2 defaults to 1 in the + and - forms, and a1 defaults to '.'. In , and ; the defaults are 1 for a1 and $ for a2. If you put a # in front of them, the default for a1 in , and ; changes to 0 (in order to mean the same thing in character mode), i.e. #,10 == ,10
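A minimal sketch of how these defaults and the `a1,a2` range rule might be applied. Everything here is an assumption made for illustration: the function names are hypothetical, the +/- defaults are read the way sam defines them, the character-mode default is written as `#0`, and a line is taken to include its trailing newline (as in sam).

```python
def fill_defaults(a1, op, a2, char_mode=False):
    """Fill in omitted operands of a compound address (empty string = omitted)."""
    if op in ("+", "-"):
        return (a1 or ".", op, a2 or "1")
    if op in (",", ";"):
        first = "#0" if char_mode else "1"   # '#' prefix switches the a1 default
        return (a1 or first, op, a2 or "$")
    raise ValueError("unknown operator: " + op)

def line_bounds(text, n):
    """(left, right) offsets of line n; line 0 is the empty string at offset 0."""
    if n == 0:
        return (0, 0)
    lines = text.splitlines(keepends=True)
    if n > len(lines):
        raise IndexError("no such line: %d" % n)
    left = sum(len(line) for line in lines[:n - 1])
    return (left, left + len(lines[n - 1]))

def comma_range(text, n1, n2):
    """a1,a2 for two line numbers: left of line n1 to right of line n2."""
    return (line_bounds(text, n1)[0], line_bounds(text, n2)[1])

text = "one\ntwo\nthree\n"
whole_a = comma_range(text, 0, 3)   # 0,$ on a three-line buffer
whole_b = comma_range(text, 1, 3)   # 1,$ on the same buffer
```

Because line 0 is the empty string at the start of the buffer, `0,$` and `1,$` select the same range here, which matches the equivalence noted in the Addressing section.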
#### References
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/ed.html
http://www.gnu.org/software/ed/manual/ed_manual.html
http://doc.cat-v.org/plan_9/4th_edition/papers/sam/
@singpolyma

So, if I understand correctly ' is "the mark" and you don't want to allow multiple marks?

BTW, aren't ed marks character granular? So, technically, you could set two marks and then use them as the endpoints of your range to get character addressing?

@singpolyma

Hmm, reading more it seems ed marks are line based. Interesting.

Anyway, I like what I'm seeing so far.

@singpolyma

You might want to specify a set of regexes that will always work or something, maybe have something to put it into "extended RE mode" ? Are BREs a proper subset of PCREs? I expect not...

@singpolyma

You actually probably do need at least one print command, to get data from the server to the client for display...

@scheibo
Author

scheibo commented Jun 24, 2011

Okay, I've actually just sat on this for a second and I think the casting is possibly too complicated. I might be looking for a way to simplify it (having '#,' and ',' have different defaults seems ugly, and mixing line and character mode needs to be better specified - i.e. 1,#84 should mean from the left of line 1 to character 84, and 5,#5 should be from the RIGHT of line 5).

As for the mark - yes, there will only be one mark. I don't feel there is a strong need for multiple marks. The mark is mainly going to be used for implementing things like copy and paste (really, selection), where the selected area is everything from the mark to the dot.

I said I'm going to be using PCREs because I'll probably be using Oniguruma for the regexp engine (though I'll want to stick this detail in a wrapper module so swapping regex engines can happen with a compiler flag), and most regex libs these days have PCREs. I really want to use RE2, bc it's NFA based, but that's a C++ lib. BREs are hideous, everything needs to be escaped - `\{0,5\}` vs. `{0,5}` - whereas EREs aren't that bad, but you might as well use PCRE if you're going to use an ERE anyway. Still, the regex backend should be swappable, so if someone really wants ERE as opposed to PCRE it should be a simple case of writing a small wrapper and linking in that lib as opposed to a PCRE one.
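One way to picture the swappable-backend idea, as a rough sketch only: the interface and class names below are hypothetical, Python's `re` module stands in for a real engine such as Oniguruma, and the choice of backend is made at run time here rather than with a compiler flag.

```python
import re
from typing import Optional, Protocol, Tuple

class RegexBackend(Protocol):
    """Hypothetical wrapper interface: the only thing the editor would call."""
    def search(self, pattern: str, text: str, start: int = 0) -> Optional[Tuple[int, int]]:
        ...

class PythonReBackend:
    """Backend built on Python's re module (a stand-in, not PCRE)."""
    def search(self, pattern, text, start=0):
        match = re.compile(pattern).search(text, start)
        return match.span() if match else None

class LiteralBackend:
    """Trivial backend that treats the pattern as a plain string."""
    def search(self, pattern, text, start=0):
        index = text.find(pattern, start)
        return None if index < 0 else (index, index + len(pattern))

def find_next(backend: RegexBackend, pattern: str, text: str, start: int = 0):
    """Editor-side code depends only on the interface, never on the engine."""
    return backend.search(pattern, text, start)

regex_hit = find_next(PythonReBackend(), r"a{0,5}b", "xxaab")
literal_hit = find_next(LiteralBackend(), "aab", "xxaab")
```

Swapping engines then means supplying another object with the same `search` method, and nothing on the editor side has to change.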

@scheibo
Author

scheibo commented Jun 24, 2011

re: print. Yeah, will be doing the standard p. I just won't be doing the silly 'n' command (iirc that's the one with the $ at the end). I don't know if I need the line number command or if that can be done in the client.
