@blendmaster
Last active May 3, 2023 23:42
optparse-applicative for java
import java.io.*;
import java.nio.file.*;
import java.util.*;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;
/**
* Blarg, a didactic little argument parsing library for java.
*
* Parsing command line args ain't the sexiest problem, but it is
* complex enough that we can use it to scratch some wacky
* functional itches that are otherwise out of reach in the course of
* regular java programming.
*/
class Blarg {
/*
To get you in the mood, here's a simple, boring demo program that
nonetheless requires a non-trivial amount of argument parsing code.
*/
private static void exitUsage(String err) {
if (!err.isEmpty()) {
System.err.println(err);
}
System.err.println(
"Blarg [--greeting GREETING]* [--greeting-file FILE] [--question] SUBJECT\n" +
"\n" +
"SUBJECT will be greeted.\n" +
"\n" +
"--greeting -g GREETING prefix. multiple greetings are concatenated\n" +
"--greeting-file FILE same as above but from a file\n" +
"--question -q asks a question instead.\n" +
"\n" +
"--help -h displays this message\n");
System.exit(255);
}
public static void main1(String[] args) throws IOException {
List<String> greetings = new ArrayList<>();
boolean isQuestion = false;
Path greetingFile = null;
String subject = null;
for (int i = 0; i < args.length; i++) {
switch (args[i]) {
case "-g":
case "--greeting":
if ((i + 1) >= args.length) {
exitUsage("expected argument for --greeting");
}
greetings.add(args[i + 1]);
i++;
break;
case "--greeting-file":
if ((i + 1) >= args.length) {
exitUsage("expected argument for --greeting-file");
}
greetingFile = Paths.get(args[i + 1]);
if (!Files.isReadable(greetingFile)) {
exitUsage("can't read --greeting-file: " + greetingFile);
}
i++;
break;
case "-q":
case "--question":
isQuestion = true;
break;
case "-h":
case "--help":
exitUsage("");
break;
default:
subject = args[i];
}
}
if (subject == null) {
exitUsage("no subject specified!");
}
String greeting = greetingFile != null ?
Files.lines(greetingFile).collect(Collectors.joining()) :
!greetings.isEmpty() ?
greetings.stream().collect(Collectors.joining()) :
"Hello";
System.out.println(greeting + " " + subject + (isQuestion ? "?" : "!"));
}
/*
It is my hope that you already have the feeling that something about this
just ain't right, and could use some abstraction.
There are a lot (a lot!) of ways to go about abstracting away this code,
as evidenced by http://stackoverflow.com/q/367706 . However, I posit that
the core part of any of the existing libraries is first recognizing that
the help text and the parser are both two forms of the same _thing_.
Some sort of "argument specification" thingy.
If we just had that spec thingy in code, somehow, then we could perhaps
generate both the help text and parser from that.
That's essentially what we'll do here: model the domain (CLI arguments) in code,
then "interpret" that model in different ways to do what we want.
So, what is an argument anyway? I thought about it, and came up with this
domain model. There are many like it, but this one's mine.
While we're eventually going to model this in actual java, it's easier to explain
in English first. An argument is one of the following things:
- Positional: the argument value is just juxtaposed after the program,
disregarding other kinds of arguments. A Positional argument has a
metasyntactic variable associated with it.
e.g. the SUBJECT in our example program.
- Flag: a string that starts with a hyphen. The presence of the flag in the arguments
itself is the important part. For simplicity, we won't consider parsing
"clusters" of short flags yet, like tar's "-xzjpf".
e.g. the "--question" or "-q" argument in our example program.
- Option: like a flag, but has a value associated with it, namely the
next positional argument. For simplicity, we won't consider specifying
the argument value with equals, like `--greeting="hi"`.
e.g. the --greeting or the --greeting-file argument for our example program.
These three argument types are sufficient to cover our example program.
There are certainly other styles of CLI arguments, but we'll stick with
these for now.
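To make the three kinds concrete, here is a standalone sketch (not part of Blarg) that
labels each token of a sample command line. The class name `ArgKinds` and the hard-coded
flag/option sets are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ArgKinds {
    // Hypothetical spec for the demo program: which names are flags, which take a value.
    static final Set<String> FLAGS = new HashSet<>(Arrays.asList("-q", "--question"));
    static final Set<String> OPTIONS =
        new HashSet<>(Arrays.asList("-g", "--greeting", "--greeting-file"));

    // Labels every token with the kind of argument it belongs to.
    static List<String> classify(String... args) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (FLAGS.contains(arg)) {
                out.add("Flag(" + arg + ")");
            } else if (OPTIONS.contains(arg) && i + 1 < args.length) {
                // An option consumes the token that follows it as its value.
                out.add("Option(" + arg + " " + args[i + 1] + ")");
                i++;
            } else {
                // Everything else is treated as positional in this sketch,
                // including a trailing option name that has no value after it.
                out.add("Positional(" + arg + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [Option(-g Howdy), Flag(--question), Positional(world)]
        System.out.println(classify("-g", "Howdy", "--question", "world"));
    }
}
```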
All three argument types also have some documentation associated with them,
which we will use for the help text, but not the parser.
As for actually modeling this in java, the classically trained among you are
probably imagining abstract classes and stuff. That would work, but we'll
instead be doing something that's slightly more ergonomic in a java 8 world.
It looks weird at first, but it _is_ basically the same thing in the end.
*/
interface ArgSpec1 {
interface Cases<M> {
M Positional(ArgDoc doc);
M Flag(List<String> flags, ArgDoc doc);
M Option(List<String> options, ArgDoc doc);
}
<M> M match(Cases<M> cases);
// Constructors.
static ArgSpec1 Positional(ArgDoc doc) {
return new ArgSpec1() {
@Override public <T> T match(Cases<T> cases) {
return cases.Positional(doc);
}
};
}
static ArgSpec1 Flag(List<String> flags, ArgDoc doc) {
return new ArgSpec1() {
@Override public <T> T match(Cases<T> cases) {
return cases.Flag(flags, doc);
}
};
}
static ArgSpec1 Option(List<String> options, ArgDoc doc) {
return new ArgSpec1() {
@Override public <T> T match(Cases<T> cases) {
return cases.Option(options, doc);
}
};
}
// sugar for the match method, so you don't have to make an anonymous Cases class.
default <M> M match(Function<ArgDoc, M> Positional,
BiFunction<List<String>, ArgDoc, M> Flag,
BiFunction<List<String>, ArgDoc, M> Option
) {
return match(new Cases<M>() {
public M Positional(ArgDoc doc) { return Positional.apply(doc); }
public M Flag(List<String> flags, ArgDoc doc) { return Flag.apply(flags, doc); }
public M Option(List<String> options, ArgDoc doc) { return Option.apply(options, doc); }
});
}
}
/*
For the documentation stuff, we'll just make a traditional java struct,
with the description, and an optional metasyntactic variable (won't be used for flags).
e.g. for --greeting, the GREETING in the documentation is the metavar
*/
static class ArgDoc {
public final Optional<String> metavar;
public final String description;
public ArgDoc(String metavar, String description) {
this.description = description;
this.metavar = Optional.of(metavar);
}
public ArgDoc(String description) {
this.description = description;
this.metavar = Optional.empty();
}
}
/*
So yes, weird, and boilerplately, but in new, exciting ways compared to the usual
JavaBeans™. What does this encoding actually do that's better than regular ol'
abstract subclasses? Most significantly, it recovers some semblance of pattern matching
a la Scala. Pattern matching is like abstract methods, but all the implementations
are in one place, instead of spread across subclasses. For example:
*/
static String toString(ArgSpec1 argument) {
return argument.match(
(doc) ->
"Positional(" + doc.metavar + ", " + doc.description + ")",
(flags, doc) ->
"Flag(" + flags + ", " + doc.metavar + ", " + doc.description + ")",
(flags, doc) ->
"Option(" + flags + ", " + doc.metavar + ", " + doc.description + ")");
}
/*
Sure, it's sometimes nicer to have separate methods, but this way also
avoids all the `@Override` boilerplate (that was done just once for "match",
and now we can apply it everywhere else).
Well anyway, now we have a domain model, let's write those interpreters!
First up is the help text. A help text for a bunch of arguments is...
Wait, what is "a bunch of arguments"? I guess that can just be a
`List<ArgSpec1>`. That seems to work fine:
*/
public static void printUsage(String programName,
String programDescription,
List<ArgSpec1> arguments) {
System.out.print(programName);
for (ArgSpec1 arg : arguments) {
System.out.print(" ");
System.out.print("[" + arg.<String>match(
/*Positional*/(doc) -> doc.metavar.get(),
/*Flag*/(flags, doc) -> flags.stream().collect(Collectors.joining("|")),
/*Option*/(flags, doc) ->
flags.stream().collect(Collectors.joining("|")) + " " + doc.metavar.orElse("FOO")
) + "]");
}
System.out.println("\n");
System.out.println(programDescription);
System.out.println();
for (ArgSpec1 arg : arguments) {
System.out.printf("%-40s%s\n",
arg.match(
/*Positional*/(doc) -> doc.metavar.get(),
/*Flag*/(flags, doc) -> flags.stream().collect(Collectors.joining("|")),
/*Option*/(options, doc) ->
options.stream().collect(Collectors.joining("|")) + " " + doc.metavar.orElse("FOO")
),
arg.match(
/*Positional*/(doc) -> doc.description,
/*Flag*/(flags, doc) -> doc.description,
/*Option*/(options, doc) -> doc.description));
}
System.out.printf("%-40s%s\n", "-h|--help", "displays this message");
System.exit(255);
}
/*
Not exactly like the manual usage text, but it'll do. Okay, now the parsing.
The parsing interpreter will take in the `String[]` arguments, and the...
bunch of argument... specs, and return, I dunno, a `Map<String, Object>`?
*/
@SuppressWarnings("unchecked")
public static Map<String, Object> parse(String programName,
String programDescription,
String[] args,
List<ArgSpec1> argSpec
) {
Queue<ArgSpec1> positionals = argSpec.stream().filter(
a -> a.match(__ -> true, (__, ___) -> false, (__, ___) -> false))
.collect(Collectors.toCollection(LinkedList::new));
Map<String, Object> res = new HashMap<>();
for (int i = 0; i < args.length; ) {
String arg = args[i];
if ("-h".equals(arg) || "--help".equals(arg)) {
printUsage(programName, programDescription, argSpec);
}
String val = (i + 1) < args.length ? args[i + 1] : null;
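if (arg.startsWith("-")) {
// bail out on unknown flags/options; otherwise nothing below would advance `i`
// and this loop would spin forever.
boolean known = argSpec.stream().anyMatch(s -> s.match(
__ -> false,
(flags, __) -> flags.contains(arg),
(options, __) -> options.contains(arg)));
if (!known) {
throw new RuntimeException("unrecognized argument: " + arg);
}
for (ArgSpec1 spec : argSpec) {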
int parsed = spec.match(
/*Positional*/__ -> 0,
/*Flag*/(flags, __) -> {
if (flags.contains(arg)) {
// use first flag as map key.
res.put(flags.get(0), true);
return 1;
}
return 0;
},
/*Option*/(options, __) -> {
if (!options.contains(arg)) {
return 0;
}
if (val != null) {
// add or change type to list.
res.merge(options.get(0), val, (old, nu) -> {
if (old instanceof List) {
((List<Object>) old).add(nu);
return old;
} else {
return new ArrayList<>(Arrays.asList(old, nu));
}
});
return 2;
} else {
throw new RuntimeException("no value specified for " + arg);
}
});
if (parsed > 0) {
i += parsed;
break;
}
}
} else {
if (positionals.isEmpty()) {
throw new RuntimeException("unrecognized extra argument: " + arg);
}
// just use metavar as map key, seems like it should work
positionals.poll().match(doc -> res.put(doc.metavar.get(), arg),
(__, ___) -> null,
(__, ___) -> null);
i++;
}
}
return res;
}
/*
... Okay, pretty awkward, but enough to actually use for our toy program.
*/
@SuppressWarnings("unchecked")
public static void fakeMain(String[] args) throws IOException {
List<ArgSpec1> argSpec = Arrays.asList(
ArgSpec1.Flag(Arrays.asList("-q", "--question"),
new ArgDoc("asks a question instead")),
ArgSpec1.Option(Arrays.asList("-g", "--greeting"),
new ArgDoc("GREETING", "prefix. multiple GREETINGs are concatenated.")),
ArgSpec1.Option(Arrays.asList("--greeting-file"),
new ArgDoc("FILE", "like --greeting, but read from a FILE.")),
ArgSpec1.Positional(new ArgDoc("SUBJECT", "subject to greet, required")));
String programName = "Blarg";
String programDescription = "Hi, I'm a greeter!";
Map<String, Object> parsed = parse(programName, programDescription, args, argSpec);
List<String> greetings = !parsed.containsKey("-g") ? null :
parsed.get("-g") instanceof List ?
((List<String>) parsed.get("-g")) :
Collections.singletonList(((String) parsed.get("-g")));
boolean isQuestion = parsed.containsKey("-q");
Path greetingFile = null;
if (parsed.containsKey("--greeting-file")) {
greetingFile = Paths.get(((String) parsed.get("--greeting-file")));
}
String subject = ((String) parsed.get("SUBJECT"));
if (subject == null) {
printUsage(programName, programDescription, argSpec);
}
String greeting = greetingFile != null ?
Files.lines(greetingFile).collect(Collectors.joining()) :
greetings != null && !greetings.isEmpty() ?
greetings.stream().collect(Collectors.joining()) :
"Hello";
System.out.println(greeting + " " + subject + (isQuestion ? "?" : "!"));
}
/*
Well, we did it. It's cool that `printUsage` is entirely automatic (as well as
handling the standard -h/--help argument).
But what's this `Map<String, Object>` business?
There's a gross unchecked cast in there, and some of the "parsing" code
from before is actually still there too. Surely our little argument parsing
library can go further than that. What we want out of our parsing function
is something nice and typed: a struct for what the inputs to our program are.
It's time to revisit that bad feeling with List<ArgSpec1>. That makes things tricky.
In fact, let's just ignore multiple arguments for now.
Let's just deal with a single argument.
*/
static Object parseSingle(String args[], ArgSpec1 spec) {
return spec.match(
/*Positional*/(__) -> {
if (args.length == 1) {
return args[0];
}
throw new RuntimeException("can't parse!");
},
/*Flag*/(flags, __) -> {
if (args.length == 0) {
// then the flag definitely isn't there
return false;
} else if (args.length == 1 && flags.contains(args[0])) {
return true;
}
throw new RuntimeException("can't parse!");
},
/*Option*/(options, __) -> {
if (args.length == 2 && options.contains(args[0])) {
return args[1];
}
throw new RuntimeException("can't parse!");
}
);
}
/*
Okay, error handling is pretty lazy, but it's otherwise refreshingly simple.
First off, let's deal with the Object return type. If we actually had this
method, we'd inevitably be casting the return value into
whatever we actually wanted after parsing; either a true/false for flags,
or a String for positional/options.
Let's just roll that into the method itself:
*/
static <T> T parseSingle(String args[], ArgSpec1 spec, Function<Object, T> cast) {
return spec.match(
/*Positional*/(__) -> {
if (args.length == 1) {
return cast.apply(args[0]);
}
throw new RuntimeException("can't parse!");
},
/*Flag*/(flags, __) -> {
if (args.length == 0) {
// then the flag definitely isn't there
return cast.apply(false);
} else if (args.length == 1 && flags.contains(args[0])) {
return cast.apply(true);
}
throw new RuntimeException("can't parse!");
},
/*Option*/(options, doc) -> {
if (args.length == 2 && options.contains(args[0])) {
return cast.apply(args[1]);
}
throw new RuntimeException("can't parse!");
}
);
}
/*
Hmm, kinda nice, but the Object is still in there as the input.
However, we only have a single ArgSpec1, so we (but not the compiler yet)
can already tell what type the Object is supposed to be.
If the spec is Positional or Option, then `cast` is passed a String. However, if the spec is
a Flag, then `cast` is passed either true or false (a Boolean).
Flag is the weird one here. However, if you think about it, there can only be two possible
outputs for a Function<Boolean, T>; one where the input is true, and one where the input is
false. Let's just add those to the overall function.
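Put differently, a Function<Boolean, T> carries no more information than a pair of T values.
A standalone sketch of that equivalence (the names `BoolFunctionDemo` and `fromPair` are
made up for this illustration, not part of Blarg):

```java
import java.util.function.Function;

final class BoolFunctionDemo {
    // Any Function<Boolean, T> is fully described by its two possible outputs.
    static <T> Function<Boolean, T> fromPair(T whenTrue, T whenFalse) {
        return present -> present ? whenTrue : whenFalse;
    }

    public static void main(String[] args) {
        Function<Boolean, String> punctuation = fromPair("?", "!");
        System.out.println(punctuation.apply(true));   // prints ? (the "flag present" value)
        System.out.println(punctuation.apply(false));  // prints ! (the "flag absent" value)
    }
}
```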
*/
static <T> T parseSingle(String args[],
ArgSpec1 spec,
Function<String, T> parseString,
T flagPresent,
T flagAbsent
) {
return spec.match(
/*Positional*/(__) -> {
if (args.length == 1) {
return parseString.apply(args[0]);
}
throw new RuntimeException("can't parse!");
},
/*Flag*/(flags, __) -> {
if (args.length == 0) {
// then the flag definitely isn't there
return flagAbsent;
} else if (args.length == 1 && flags.contains(args[0])) {
return flagPresent;
}
throw new RuntimeException("can't parse!");
},
/*Option*/(options, doc) -> {
if (args.length == 2 && options.contains(args[0])) {
return parseString.apply(args[1]);
}
throw new RuntimeException("can't parse!");
}
);
}
/*
Cool, now the `parseString` function has a real input type of String. This means that we can
actually use all sorts of java builtins as `parseString` functions, like our good friend
Integer::parseInt, File::new, or an enum's own valueOf (e.g. TimeUnit::valueOf). Now that we're back in the world of
functions, we gain a lot of flexibility.
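As a quick standalone check (the class name `ParseStringDemo` and its field names are
assumptions for this sketch), a few existing JDK methods that already fit the
Function<String, T> shape:

```java
import java.io.File;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

final class ParseStringDemo {
    // Each of these existing JDK members can be used as a Function<String, T> directly.
    static final Function<String, Integer> INT = Integer::parseInt;
    static final Function<String, File> FILE = File::new;
    static final Function<String, TimeUnit> UNIT = TimeUnit::valueOf;

    public static void main(String[] args) {
        System.out.println(INT.apply("42") + 1);                   // prints 43
        System.out.println(FILE.apply("greeting.txt").getName());  // prints greeting.txt
        System.out.println(UNIT.apply("SECONDS"));                 // prints SECONDS
    }
}
```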
However, note that while parseSingle takes a parseString function and some flag arguments,
we're only ever going to use one or the other, depending on the actual ArgSpec.
For a Positional or Option, we need a Function<String, T>.
For a Flag, we just need a present value and an absent value.
Well, if we always need these things to parse,
why not just put them in the ArgSpec class itself?
Then, a Positional/Option will always have its parser,
and a Flag will also have its present/absent values.
The help text generator won't use the extra fields, but that's fine, it can ignore them.
*/
interface ArgSpec2<T> {
interface Cases<T, M> {
M Positional(Function<String, T> parser, ArgDoc doc);
M Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc);
M Option(List<String> options, Function<String, T> parser, ArgDoc doc);
}
<M> M match(Cases<T, M> cases);
// Constructors.
static <T> ArgSpec2<T> Positional(Function<String, T> parser, ArgDoc doc) {
return new ArgSpec2<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Positional(parser, doc);
}
};
}
static <T> ArgSpec2<T> Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc) {
return new ArgSpec2<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Flag(flags, flagPresent, flagAbsent, doc);
}
};
}
static <T> ArgSpec2<T> Option(List<String> options, Function<String, T> parser, ArgDoc doc) {
return new ArgSpec2<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Option(options, parser, doc);
}
};
}
// sugar for the match method, so you don't have to make an anonymous Cases class.
default <M> M match(BiFunction<Function<String, T>, ArgDoc, M> Positional,
F4<List<String>, T, T, ArgDoc, M> Flag,
F3<List<String>, Function<String, T>, ArgDoc, M> Option
) {
return match(new Cases<T, M>() {
@Override public M Positional(Function<String, T> parser, ArgDoc doc) {
return Positional.apply(parser, doc);
}
@Override public M Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc) {
return Flag.apply(flags, flagPresent, flagAbsent, doc);
}
@Override public M Option(List<String> options, Function<String, T> parser, ArgDoc doc) {
return Option.apply(options, parser, doc);
}
});
}
}
// we need these, since java.util.function stops at BiFunction.
interface F3<A, B, C, T> { T apply(A a, B b, C c); }
interface F4<A, B, C, D, T> { T apply(A a, B b, C c, D d); }
/*
With the extra info rolled into the ArgSpec, parseSingle is now nice and clean.
*/
static <T> T parseSingle(String args[], ArgSpec2<T> spec) {
return spec.match(
/*Positional*/(parser, __) -> {
// only look at the head of the list; any leftovers are the caller's problem
if (args.length >= 1) {
return parser.apply(args[0]);
}
throw new RuntimeException("can't parse!");
},
/*Flag*/(flags, flagPresent, flagAbsent, __) -> {
if (args.length == 0) {
// then the flag definitely isn't there
return flagAbsent;
} else if (flags.contains(args[0])) {
return flagPresent;
}
throw new RuntimeException("can't parse!");
},
/*Option*/(options, parser, doc) -> {
if (args.length >= 2 && options.contains(args[0])) {
return parser.apply(args[1]);
}
throw new RuntimeException("can't parse!");
});
}
/*
Note that ArgSpec2 gained a type parameter T, kind of like an Optional<T>, or a
List<T>. This actually makes it nicely flexible, check it out.
*/
static <T, R> ArgSpec2<R> map(ArgSpec2<T> arg, Function<T, R> fn) {
return arg.match(
/*Positional*/(parser, doc) -> ArgSpec2.Positional(parser.andThen(fn), doc),
/*Flag*/(flags, flagPresent, flagAbsent, doc) ->
ArgSpec2.Flag(flags, fn.apply(flagPresent), fn.apply(flagAbsent), doc),
/*Option*/(options, parser, doc) -> ArgSpec2.Option(options, parser.andThen(fn), doc));
}
/*
Like mapping a list or Optional.map, we can add another function onto our parser, without
changing the rest of it. Much like the contexts where mapping a list is useful,
if somebody already wrote an ArgSpec<SomeType>, and you want an ArgSpec<OtherType>,
then all you have to do is write the Function<SomeType, OtherType>. Furthermore, this
function you wrote doesn't care about the rest of the argument cruft; you could use it with
list mapping or Optional mapping too.
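To see that reuse in isolation, here's a standalone sketch (the names `MapReuseDemo`,
`SQUARE` and `PARSE_THEN_SQUARE` are made up for this illustration) where one plain
function gets mapped over an Optional, over a list, and onto the end of a String parser:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

final class MapReuseDemo {
    // One plain function, written without any knowledge of its container.
    static final Function<Integer, Integer> SQUARE = i -> i * i;

    static final Function<String, Integer> PARSE = Integer::parseInt;
    // Tacking SQUARE onto a String parser, as `map` does for Positional and Option.
    static final Function<String, Integer> PARSE_THEN_SQUARE = PARSE.andThen(SQUARE);

    public static void main(String[] args) {
        Optional<Integer> maybe = Optional.of(3).map(SQUARE);
        List<Integer> many =
            Arrays.asList(1, 2, 3).stream().map(SQUARE).collect(Collectors.toList());

        System.out.println(maybe.get());                   // prints 9
        System.out.println(many);                          // prints [1, 4, 9]
        System.out.println(PARSE_THEN_SQUARE.apply("4"));  // prints 16
    }
}
```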
*/
static final ArgSpec2<Integer> myIntArg = ArgSpec2.Option(Arrays.asList("--myarg"),
Integer::parseInt,
new ArgDoc("my arg"));
static final ArgSpec2<Integer> intArgSquared = map(myIntArg, i -> i * i);
/*
Anyways, now we have a hot single-argument parser, but what about more than one?
Well, we could do a List<ArgSpec<?>>-style function again, and get a Map<String, ?>,
but there's still nasty casts.
How about we start small. Let's do _two_ arguments.
For the output, we just need a thing that holds the two results without losing their type.
Sadly, java _still_ doesn't have this built in (unless you want to pun Map.Entry<K, V>).
*/
static class Tuple<A, B> {
A a;
B b;
public Tuple(A a, B b) {
this.a = a;
this.b = b;
}
}
static <A, B> Tuple<A, B> parseTwo(String[] args, ArgSpec2<A> specA, ArgSpec2<B> specB) {
// let's reuse parseSingle where possible
try {
A a = parseSingle(args, specA);
// since A got parsed, figure out how many arguments it ate;
// this is awkward, will fix later
String[] leftover = specA.match(
/*Positional*/(__, ___) ->
// definitely ate at _least_ 1 arg off the head
Arrays.copyOfRange(args, 1, args.length),
// could have eaten the head, or eaten nothing if args is completely empty
// and flag was totally missing.
/*Flag*/(_1, _2, _3, _4) -> args.length == 0 ?
args :
Arrays.copyOfRange(args, 1, args.length),
// definitely ate its name and the value after it.
/*Option*/(_1, _2, _3) -> Arrays.copyOfRange(args, 2, args.length)
);
// okay, run specB on the leftovers
try {
B b = parseSingle(leftover, specB);
// cool, they both parsed, we're done
return new Tuple<>(a, b);
} catch (RuntimeException bCantParse) {
// well, we tried.
throw new RuntimeException("A parsed, but B can't!");
}
} catch (RuntimeException assumeItsParseError) {
// well, A couldn't parse, but parseSingle only checks the head of the arg list.
// so, let's try parsing B first.
try {
B b = parseSingle(args, specB);
// cool, that worked. Let's parse A again, again operating on the "leftovers"
// for whatever B ate.
String[] leftover = specB.match(
/*Positional*/(__, ___) ->
// definitely ate at _least_ 1 arg off the head
Arrays.copyOfRange(args, 1, args.length),
// could have eaten the head, or eaten nothing if args is completely empty
// and flag was totally missing.
/*Flag*/(_1, _2, _3, _4) -> args.length == 0 ?
args :
Arrays.copyOfRange(args, 1, args.length),
// definitely ate its name and the value after it.
/*Option*/(_1, _2, _3) -> Arrays.copyOfRange(args, 2, args.length)
);
try {
A a = parseSingle(leftover, specA);
// cool, they both parsed, we're done
return new Tuple<>(a, b);
} catch (RuntimeException aCanNeverParse) {
throw new RuntimeException("B parsed, but A can't!");
}
} catch (RuntimeException yupItsTotallyAParseError) {
// well, neither A nor B can consume the head of the args, so we definitely
// can't parse _Both_ of them.
throw new RuntimeException("can't parse either argument!");
}
}
}
/*
Ok, that was messy, but actually kind of nice. A parser for two arguments
just tries to parse the first argument, and then the second argument.
If that didn't work, try them in reverse order (since arguments can come
in any order). If that _still_ doesn't work, then give up.
Note that this also works for positional arguments, which are kind of special.
If both A and B are positional, we simply consume two arguments off the list.
If A is positional but B isn't, then B could be either before or after A,
so it still works out.
It was also cool we could simply use our parseSingle function to "do the dirty work".
All we had to do is just handle both orders and stuff the result in a tuple.
However, the return type of parseSingle wasn't very well suited to our uses.
We had to catch the RuntimeException (and hope it really was the parser error),
and also guess how many arguments got "consumed" once one parser is done, with
duplicated code.
If we think about it deeper, we can make the different ways a single parse
can operate more explicit.
A single parse can either be:
- Success, and return a value T, and the leftover String[] it didn't use.
- Fail, when the argument was actually found, but malformed.
e.g. an Option without a value associated with it, or if you attached
Integer::parseInt, a NumberFormatException.
- NotFound, where the value wasn't at the head of the argument list,
but could be somewhere later.
Note the difference between Fail and NotFound. Fail means "definitely broken",
while NotFound means "try again later", which is exactly what we want to do
in our parseTwo function.
Here, we _could_ model this as a ParseFailException and a NotFoundException,
but hey, we already have this cool encoding we're using for ArgSpec, let's
just use it again, and have a nice `match` expression for the three cases.
*/
interface ParseResult<T> {
interface Cases<T, M> {
M Success(T value, String[] leftover);
M Fail(Exception e);
M NotFound();
}
<M> M match(Cases<T, M> cases);
static <T> ParseResult<T> Success(T value, String[] leftover) {
return new ParseResult<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Success(value, leftover);
}
};
}
static <T> ParseResult<T> Fail(Exception e) {
return new ParseResult<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Fail(e);
}
};
}
static <T> ParseResult<T> NotFound() {
return new ParseResult<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.NotFound();
}
};
}
default <M> M match(BiFunction<T, String[], M> Success,
Function<Exception, M> Fail,
Supplier<M> NotFound
) {
return match(new Cases<T, M>() {
@Override public M Success(T value, String[] leftover) {
return Success.apply(value, leftover);
}
@Override public M Fail(Exception e) {
return Fail.apply(e);
}
@Override public M NotFound() {
return NotFound.get();
}
});
}
}
/* Many lines of boilerplate later */
static <T> ParseResult<T> parseSingleResult(String[] args, ArgSpec2<T> spec) {
return spec.match(
/*Positional*/(parser, doc) -> {
if (args.length > 0) {
try {
return ParseResult.Success(parser.apply(args[0]),
Arrays.copyOfRange(args, 1, args.length));
} catch (Exception e) {
return ParseResult.Fail(e);
}
}
return ParseResult.Fail(new RuntimeException(doc.metavar.orElse("argument") + " not specified!"));
},
/*Flag*/(flags, flagPresent, flagAbsent, __) -> {
if (args.length == 0) {
return ParseResult.Success(flagAbsent, args);
} else if (flags.contains(args[0])) {
return ParseResult.Success(flagPresent, Arrays.copyOfRange(args, 1, args.length));
} else {
return ParseResult.NotFound();
}
},
/*Option*/(options, parser, doc) -> {
if (args.length == 0) {
return ParseResult.Fail(new RuntimeException(doc.metavar.orElse("option value") + " not specified!"));
}
if (args.length == 1 && options.contains(args[0])) {
return ParseResult.Fail(new RuntimeException(
args[0] + " " + doc.metavar.orElse("VALUE") + " value not specified!"));
}
if (options.contains(args[0])) {
try {
return ParseResult.Success(parser.apply(args[1]),
Arrays.copyOfRange(args, 2, args.length));
} catch (Exception e) {
return ParseResult.Fail(e);
}
} else {
return ParseResult.NotFound();
}
});
}
/*
Much tighter. It's a bit regrettable that we have to catch exceptions around the
parser.apply calls and make them into ParseResult.Fail, but oh well, that's java.
*/
static <A, B> ParseResult<Tuple<A, B>> parseTwoResult(String[] args,
ArgSpec2<A> specA,
ArgSpec2<B> specB
) {
return parseSingleResult(args, specA).match(
/*Success*/(a, leftoverA) ->
parseSingleResult(leftoverA, specB).match(
/*Success*/(b, leftoverB) -> ParseResult.Success(new Tuple<>(a, b), leftoverB),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(new RuntimeException("parsed A, can't parse B!"))),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() ->
parseSingleResult(args, specB).match(
/*Success*/(b, leftoverB) ->
parseSingleResult(leftoverB, specA).<ParseResult<Tuple<A, B>>>match(
/*Success*/(a, leftoverA) -> ParseResult.Success(new Tuple<>(a, b), leftoverA),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(
new RuntimeException("parsed B, can't parse A!"))),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(new RuntimeException("can't parse either!"))));
}
/*
With all three cases clearly delineated, we don't even need if-statements or
try-statements. It's all wrapped up in the ParseResult cases.
Note that exception propagation has to happen manually, but is entirely mechanical.
It would probably be nicer to use traditional java exception propagation overall,
but it's also kind of nice to see exactly how it's working here.
Well, now we have fancy parsers for 1 argument and 2 arguments, but what about more?
If you're familiar with lisps, you may already see where we're going,
but first, let's stare at parseSingleResult and parseTwoResult a bit.
Notice how they both return a ParseResult? It's just that parseTwo has that
tuple in there, to contain both results.
But, what if we had a way to turn the Tuple<A, B> back into a single thing, T? Hmm,
then their types would look awfully similar. Well, if we need a way to turn
Tuple<A, B> into T, let's just add one.
*/
static <A, B, T> ParseResult<T> parseTwoAndThen(String[] args,
ArgSpec2<A> specA,
ArgSpec2<B> specB,
Function<Tuple<A, B>, T> parser
) {
ParseResult<Tuple<A, B>> tupleResult = parseTwoResult(args, specA, specB);
return tupleResult.match(
/*Success*/(tuple, leftover) -> ParseResult.Success(parser.apply(tuple), leftover),
/*propagate failure*/ParseResult::Fail,
/*propagate notFound*/ParseResult::NotFound);
}
/*
Interesting, all we did was apply a function on ParseResult.Success, and now
we have another ParseResult<T>. Remember what happened last time we tagged a function
on to the end of another function?
But actually, before we go on, let's scrutinize a bit.
A Function<Tuple<A, B>, T> is basically the same as a BiFunction<A, B, T>, right?
If we did that, then nobody actually using `parseTwoAndThen` would have to know about
our tuple.
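A standalone sketch of that "basically the same" claim, converting in both directions.
Map.Entry stands in for our Tuple so the snippet compiles on its own, and the names
`TupleFunctionDemo`, `tupled`, `untupled` and `GREET` are assumptions for this illustration:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

final class TupleFunctionDemo {
    // BiFunction -> function of a pair: unpack the pair, then call the BiFunction.
    static <A, B, T> Function<Map.Entry<A, B>, T> tupled(BiFunction<A, B, T> fn) {
        return entry -> fn.apply(entry.getKey(), entry.getValue());
    }

    // Function of a pair -> BiFunction: pack the two arguments, then call the function.
    static <A, B, T> BiFunction<A, B, T> untupled(Function<Map.Entry<A, B>, T> fn) {
        return (a, b) -> fn.apply(new SimpleEntry<>(a, b));
    }

    static final BiFunction<String, String, String> GREET =
        (greeting, subject) -> greeting + " " + subject;

    public static void main(String[] args) {
        Function<Map.Entry<String, String>, String> onTuple = tupled(GREET);
        System.out.println(onTuple.apply(new SimpleEntry<>("Hello", "world")));  // prints Hello world
        System.out.println(untupled(onTuple).apply("Hello", "world"));           // prints Hello world
    }
}
```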
*/
static <A, B, T> ParseResult<T> parseTwoAndThen(String[] args,
ArgSpec2<A> specA,
ArgSpec2<B> specB,
BiFunction<A, B, T> parser
) {
ParseResult<Tuple<A, B>> tupleResult = parseTwoResult(args, specA, specB);
return tupleResult.match(
/*Success*/(tuple, leftover) ->
ParseResult.Success(parser.apply(tuple.a, tuple.b), leftover),
/*propagate failure*/ParseResult::Fail,
/*propagate notFound*/ParseResult::NotFound);
}
/*
Neat. All we did is pull apart the tuple for them, and now it's invisible.
Hmm, if all we're doing is constructing and pulling apart the tuple though,
do we even need that?
Note that an ArgSpec2 has that type parameter T, and we have that `map` function
that can change a T to anything else, as long as we have a function for it.
What if T was a Function? That's still a real type, right?
Check this out:
*/
static <ArgToFn, T> ParseResult<T> parseTwoFancy(String[] args,
ArgSpec2<ArgToFn> fnArgSpec,
ArgSpec2<Function<ArgToFn, T>> fnSpec
) {
return parseSingleResult(args, fnArgSpec).match(
/*Success*/(fnArg, leftoverFnArg) ->
parseSingleResult(leftoverFnArg, fnSpec).match(
/*Success*/(fn, leftoverFn) -> ParseResult.Success(
// magic here
fn.apply(fnArg),
// ^^^^^
leftoverFn),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(
new RuntimeException("parsed fnArg, can't parse fn!"))),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() ->
parseSingleResult(args, fnSpec).match(
/*Success*/(fn, leftoverFn) ->
parseSingleResult(leftoverFn, fnArgSpec).<ParseResult<T>>match(
/*Success*/(fnArg, leftoverFnArg) -> ParseResult.Success(
// magic here
fn.apply(fnArg),
// ^^^^^
leftoverFnArg),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(
new RuntimeException("parsed fn, can't parse fnArg!"))),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(new RuntimeException("can't parse either!"))));
}
/*
Let me explain.
Instead of an A and a B, let's take a regular ol' ArgSpec2 that parses to some
sort of argument to a function, but also, an ArgSpec2 that parses to a function
that takes that argument. Then to combine the two, we just apply the argument,
and get our final T.
This is deep, and weird if you're not used to functions hanging out
where the "real types" are.
For an example, let's recover our old-fashioned tuple-based parser from this.
*/
static <A, B> ParseResult<Tuple<A, B>> parseTwoAsTupleAgain(String[] args,
ArgSpec2<A> specA,
ArgSpec2<B> specB
) {
ArgSpec2<Function<A, Tuple<A, B>>> bAsAFnThatTakesAAndIntoATuple = map(specB, (B b) -> {
return (Function<A, Tuple<A, B>>) (A a) ->
new Tuple<>(a, b);
});
return parseTwoFancy(args, specA, bAsAFnThatTakesAAndIntoATuple);
}
/*
Using our good friend `map` from earlier, we turned a regular ol' B into
a function that takes an A, and tuples it and the B from earlier.
If you think about it, a tuple is an ordered combination of two things. If you make
one of the two things into a function that takes the other thing, then there's only
one way you can combine the thing, and the function. The ordering of A and B is
encoded in the order of function application. You get the function, then you apply
the arguments to the function. You can't do that the other way around.
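To see that idea outside of the parser, here's a minimal standalone sketch. The class
`CurriedPairSketch`, the `Pair` type, and `pairWith` are made-up names for illustration
only and are not part of Blarg. It turns a B into a function that waits for an A, so
applying that function is the only way left to combine the two:

```java
import java.util.function.Function;

// Hypothetical standalone demo; none of these names exist in Blarg.
class CurriedPairSketch {
    // A tiny stand-in for the Tuple class used above.
    static final class Pair<A, B> {
        final A a;
        final B b;
        Pair(A a, B b) { this.a = a; this.b = b; }
    }

    // Turn a B into a function that waits for an A and then builds the pair.
    static <A, B> Function<A, Pair<A, B>> pairWith(B b) {
        return a -> new Pair<>(a, b);
    }

    public static void main(String[] args) {
        // We hold the B first, as a function...
        Function<Integer, Pair<Integer, String>> waitingForA = pairWith("world");
        // ...and the A arrives later, by plain function application.
        Pair<Integer, String> p = waitingForA.apply(42);
        System.out.println(p.a + ", " + p.b);
    }
}
```

The A still ends up in the first slot even though the B showed up first, which is the
same ordering trick parseTwoAsTupleAgain leans on.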
This is neato. An ArgSpec2<ArgToFn> + ArgSpec2<Function<ArgToFn, T>>, when combined,
can act as if they are a single ArgSpec2<T>, with that extra special parsing
function that parses one first, or the other first.
But, if ArgSpec2<ArgToFn> + ArgSpec2<Function<ArgToFn, T>> is an ArgSpec2<T> conceptually,
what if we make it an ArgSpec2<T> for reals? Like, in the interface?
*/
interface ArgSpec3<T> {
interface Cases<T, M> {
M Positional(Function<String, T> parser, ArgDoc doc);
M Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc);
M Option(List<String> options, Function<String, T> parser, ArgDoc doc);
// note that the two `Object`s are the same type, but there's no good way in java
// to actually notate them to be. So unfortunately there will be some unchecked
// casts, but it's okay, they'll be isolated.
M Both(ArgSpec3<Object> first, ArgSpec3<Function<Object, T>> next);
}
<M> M match(Cases<T, M> cases);
// Constructors.
static <T> ArgSpec3<T> Positional(Function<String, T> parser, ArgDoc doc) {
return new ArgSpec3<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Positional(parser, doc);
}
};
}
static <T> ArgSpec3<T> Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc) {
return new ArgSpec3<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Flag(flags, flagPresent, flagAbsent, doc);
}
};
}
static <T> ArgSpec3<T> Option(List<String> options, Function<String, T> parser, ArgDoc doc) {
return new ArgSpec3<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Option(options, parser, doc);
}
};
}
static <A, T> ArgSpec3<T> Both(ArgSpec3<A> first, ArgSpec3<Function<A, T>> next) {
return new ArgSpec3<T>() {
@SuppressWarnings("unchecked") // hush, java. No types, only dreams now.
@Override public <M> M match(Cases<T, M> cases) {
return cases.Both((ArgSpec3<Object>) first, (ArgSpec3<Function<Object, T>>) (Object) next);
}
};
}
default <M> M match(BiFunction<Function<String, T>, ArgDoc, M> Positional,
F4<List<String>, T, T, ArgDoc, M> Flag,
F3<List<String>, Function<String, T>, ArgDoc, M> Option,
BiFunction<ArgSpec3<Object>, ArgSpec3<Function<Object, T>>, M> Both
) {
return match(new Cases<T, M>() {
@Override public M Positional(Function<String, T> parser, ArgDoc doc) {
return Positional.apply(parser, doc);
}
@Override public M Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc) {
return Flag.apply(flags, flagPresent, flagAbsent, doc);
}
@Override public M Option(List<String> options, Function<String, T> parser, ArgDoc doc) {
return Option.apply(options, parser, doc);
}
@Override public M Both(ArgSpec3<Object> first, ArgSpec3<Function<Object, T>> next) {
return Both.apply(first, next);
}
});
}
default <R> ArgSpec3<R> map(Function<T, R> fn) {
return this.match(
/*Positional*/(parser, doc) -> ArgSpec3.Positional(parser.andThen(fn), doc),
/*Flag*/(flags, flagPresent, flagAbsent, doc) ->
ArgSpec3.Flag(flags, fn.apply(flagPresent), fn.apply(flagAbsent), doc),
/*Option*/(options, parser, doc) -> ArgSpec3.Option(options, parser.andThen(fn), doc),
/*Both*/(ArgSpec3<Object> first, ArgSpec3<Function<Object, T>> next) ->
ArgSpec3.Both(first, next.<Function<Object, R>>map(fn::compose)));
}
}
/*
By "tucking" the notion of two arguments into the notion of an argument,
we've collapsed the universe into a single point. All is one, and one is all.
Practically, it means we only need one `parse` function:
*/
static <T> ParseResult<T> unifiedParse(String[] args, ArgSpec3<T> spec) {
return spec.match(
/*Positional*/(parser, doc) -> {
if (args.length > 0) {
try {
return ParseResult.Success(parser.apply(args[0]),
Arrays.copyOfRange(args, 1, args.length));
} catch (Exception e) {
return ParseResult.Fail(e);
}
}
return ParseResult.Fail(new RuntimeException(doc.metavar + " not specified!"));
},
/*Flag*/(flags, flagPresent, flagAbsent, __) -> {
if (args.length == 0) {
return ParseResult.Success(flagAbsent, args);
} else if (flags.contains(args[0])) { // a flag in the last position still counts
return ParseResult.Success(flagPresent, Arrays.copyOfRange(args, 1, args.length));
} else {
return ParseResult.NotFound();
}
},
/*Option*/(options, parser, doc) -> {
if (args.length == 0) {
return ParseResult.Fail(new RuntimeException(doc.metavar + " not specified!"));
}
if (args.length == 1 && options.contains(args[0])) {
return ParseResult.Fail(new RuntimeException(
args[0] + " " + doc.metavar + " value not specified!"));
}
if (options.contains(args[0])) {
try {
return ParseResult.Success(parser.apply(args[1]),
Arrays.copyOfRange(args, 2, args.length));
} catch (Exception e) {
return ParseResult.Fail(e);
}
} else {
return ParseResult.NotFound();
}
},
/*Both*/(first, next) -> unifiedParse(args, first).<ParseResult<T>>match(
/*Success*/(fnArg, firstLeftover) ->
unifiedParse(firstLeftover, next).match(
/*Success*/(fn, leftover) -> ParseResult.Success(fn.apply(fnArg), leftover),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(
new RuntimeException("parsed fnArg, can't parse fn!"))),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() ->
unifiedParse(args, next).match(
/*Success*/(fn, nextLeftover) ->
unifiedParse(nextLeftover, first).<ParseResult<T>>match(
/*Success*/(fnArg, leftover) -> ParseResult.Success(fn.apply(fnArg), leftover),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(
new RuntimeException("parsed fn, can't parse fnArg!"))),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(new RuntimeException("can't parse either!")))));
}
/*
To handle Both(first, next), we just call unifiedParse recursively. Groovy.
How do we actually use this beast? Let's remember way back to the little
toy program we had. Its arguments looked something like this:
*/
static class GreeterArgs1 {
String greeting;
boolean isQuestion;
String subject;
public GreeterArgs1(String greeting, boolean isQuestion, String subject) {
this.greeting = greeting;
this.isQuestion = isQuestion;
this.subject = subject;
}
}
/*
We can then design our parser thus:
*/
static {
ArgSpec3<Boolean> isQuestionSpec =
ArgSpec3.Flag(Arrays.asList("-q", "--question"), true, false,
new ArgDoc("ask a question instead"));
ArgSpec3<String> greetingSpec =
ArgSpec3.Option(Arrays.asList("-g", "--greeting"), Function.identity(),
new ArgDoc("GREETING", "Greeting to use."));
ArgSpec3<String> subjectSpec = ArgSpec3.Positional(
Function.identity(), new ArgDoc("SUBJECT", "subject to greet"));
ArgSpec3<GreeterArgs1> greeterArg =
ArgSpec3.Both(
isQuestionSpec,
ArgSpec3.Both(
greetingSpec,
subjectSpec.map(
(String subject) -> (String greeting) -> (Boolean isQuestion) ->
new GreeterArgs1(greeting, isQuestion, subject))));
}
/*
See what's going on there?
Okay, java isn't great at this. There's just no good syntax for making up lots
of little functions like that.
But, we can smooth over it some with more boilerplate.
*/
static <A, B, C, T> ArgSpec3<T> arg3(F3<A, B, C, T> f, ArgSpec3<A> as, ArgSpec3<B> bs, ArgSpec3<C> cs) {
// generic version of the GreeterArgs1 example.
return ArgSpec3.Both(as, ArgSpec3.Both(bs, cs.map((C c) -> (B b) -> (A a) -> f.apply(a, b, c))));
}
static ArgSpec3<GreeterArgs1> greeterSugar = arg3(
GreeterArgs1::new,
ArgSpec3.Option(Arrays.asList("-g", "--greeting"), Function.identity(),
new ArgDoc("GREETING", "Greeting to use.")),
ArgSpec3.Flag(Arrays.asList("-q", "--question"), true, false,
new ArgDoc("ask a question instead")),
ArgSpec3.Positional(Function.identity(), new ArgDoc("SUBJECT", "subject to greet")));
/*
With application of sufficient squinting, it's just like calling the regular
constructor, but with ArgSpecs as arguments, which are then turned into the actual
values by the parser.
In other circles, you might hear this referred to as "lifting".
GreeterArgs1::new is "lifted" into the realm of command line arguments, by adding
extra information about how each of its mundane java method arguments are parsed.
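If "lifting" sounds mysterious, here's a minimal standalone sketch of the same move with
java.util.Optional standing in for ArgSpec3. Everything in it (`LiftSketch`, `lift3`,
`greet`, and the local `F3` interface, which just mirrors the shape of the F3 used above)
is an assumption made up for the demo, and it uses flatMap purely for brevity:

```java
import java.util.Optional;

// Hypothetical standalone demo of "lifting"; Optional stands in for ArgSpec3 here.
class LiftSketch {
    // Local stand-in for a three-argument function type.
    interface F3<A, B, C, T> { T apply(A a, B b, C c); }

    // Lift a plain three-argument function so it works on Optional-wrapped values.
    // If any input is missing, the whole result is missing.
    static <A, B, C, T> Optional<T> lift3(F3<A, B, C, T> f,
                                          Optional<A> oa, Optional<B> ob, Optional<C> oc) {
        return oa.flatMap(a -> ob.flatMap(b -> oc.map(c -> f.apply(a, b, c))));
    }

    // The mundane function being lifted.
    static String greet(String greeting, Boolean isQuestion, String subject) {
        return greeting + " " + subject + (isQuestion ? "?" : "!");
    }

    public static void main(String[] args) {
        System.out.println(lift3(LiftSketch::greet,
            Optional.of("Hello"), Optional.of(false), Optional.of("World")));
        System.out.println(lift3(LiftSketch::greet,
            Optional.of("Hello"), Optional.<Boolean>empty(), Optional.of("World")));
    }
}
```

The call site reads like a call to `greet`, except each argument carries its own notion of
"might not be there". Swap "might not be there" for "knows how to parse itself" and you
have arg3.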
Just to make sure we still can, let's write the help text generator. Remember that
the whole point of the `Both` abstraction was so we didn't have to deal with
a List<ArgSpec<?>> and its resulting Map<String, ?>. However, the help text generator
was fine with that, so let's turn our beautiful typed AST back into a flat
List<ArgSpec<?>>. The type parameter <T> isn't used in the help text anyway.
*/
static List<ArgSpec3<?>> flatten(ArgSpec3<?> t) {
return t.match(
(_1, _2) -> Collections.singletonList(t),
(_1, _2, _3, _4) -> Collections.singletonList(t),
(_1, _2, _3) -> Collections.singletonList(t),
(first, next) -> {
// concat the two sides together
return Stream.concat(flatten(first).stream(), flatten(next).stream())
.collect(Collectors.toList());
});
}
static void displayHelp(String programName, String programDescription, ArgSpec3<?> spec) {
List<ArgSpec3<?>> flat = flatten(spec);
System.out.print(programName);
for (ArgSpec3<?> arg : flat) {
System.out.print(" ");
System.out.print("[" + arg.<String>match(
/*Positional*/(_1, doc) -> doc.metavar.get(),
/*Flag*/(flags, _2, _3, doc) -> flags.stream().collect(Collectors.joining("|")),
/*Option*/(flags, _2, doc) ->
flags.stream().collect(Collectors.joining("|")) + " " + doc.metavar.orElse("FOO"),
(_1, _2) -> "" // already dealt with.
) + "]");
}
System.out.println("\n");
System.out.println(programDescription);
System.out.println();
for (ArgSpec3<?> arg : flat) {
System.out.printf(
"%-40s%s\n",
arg.match(
/*Positional*/(_1, doc) -> doc.metavar.get(),
/*Flag*/(flags, _1, _2, doc) -> flags.stream().collect(Collectors.joining("|")),
/*Option*/(options, _1, doc) ->
options.stream().collect(Collectors.joining("|")) + " " + doc.metavar.orElse("FOO"),
/*Both*/(_1, _2) -> "") // already dealt with.
,
arg.match(
/*Positional*/(_1, doc) -> doc.description,
/*Flag*/(flags, _1, _2, doc) -> doc.description,
/*Option*/(options, _1, doc) -> doc.description,
/*Both*/(_1, _2) -> ""));
}
System.out.printf("%-40s%s\n", "-h|--help", "displays this message");
System.exit(255);
}
/*
Now, we just handle the three cases of the whole parser, and we've got a program.
Ignore the repetition for now, we'll get to that later.
*/
public static void main2(String[] args) throws Exception {
unifiedParse(args, greeterSugar).match(
/*Success*/(gargs, __) -> {
// All parsed!
System.out.println(
gargs.greeting + " " + gargs.subject + (gargs.isQuestion ? "?" : "!"));
return null;
},
/*Fail*/(exception) -> {
displayHelp("Blarg", "Hi, I'm a greeter!", greeterSugar);
return null;
},
/*NotFound*/() -> {
displayHelp("Blarg", "Hi, I'm a greeter!", greeterSugar);
return null;
});
}
/*
Yep, it totally works. If we can parse 1 argument, and 2 arguments, then we can parse
as many arguments as we want, and turn them into any type we want.
However, this is still missing a few features to fully recreate the original program.
If you actually run this version of our parser, you'll notice that all 3 arguments
are required, but in our original program, you could leave one out
and get a default value.
Also, there was a `--greet-file` argument, which you could specify _instead of_ the
`--greeting` argument. An either/or situation.
Finally, if you specified `--greeting` more than once, you'd get
all of the greetings, concatenated together.
Hmm, optional arguments, either/or arguments, and repeated arguments.
Well, let's just start with an optional argument, parsed explicitly in
its own function (like how we explicitly did parseTwo).
We could parse to java.util.Optional, but let's instead
explicitly provide a default value; it'll be useful later.
*/
static <T> ParseResult<T> parseOptional(String[] args, ArgSpec3<T> spec, T dfault) {
return unifiedParse(args, spec).match(
/*passthrough*/ParseResult::Success,
/*passthrough*/ParseResult::Fail,
/*NotFound*/() ->
args.length == 0 ? // if there are no more args
ParseResult.Success(dfault, args) : // then flip to the default
ParseResult.NotFound()); // otherwise we could still possibly parse later.
}
/*
That wasn't so bad. Note that we specifically only emit the default value if there are no
more args left; this keeps the contract that we only consume the head of the args, which
our implementation of Both relies on.
Similarly, we can do Either.
*/
static <T> ParseResult<T> parseEither(String[] args, ArgSpec3<T> left, ArgSpec3<T> right) {
return unifiedParse(args, left).match(
/*passthrough*/ParseResult::Success, // done, ignore right parser
/*passthrough*/ParseResult::Fail,
// if left couldn't do it, try right and hope it works out.
// unlike parseTwo, we don't have to go back and try `left` again.
// we only want _either_ left _or_ right.
/*NotFound*/() -> unifiedParse(args, right));
}
/*
Note the similarity between parseOptional and parseEither.
Optional has a `T dfault` and Either has another ArgSpec.
Optional's NotFound branch has that special check against args.length,
while Either just runs the right parser.
What if we had a special sort of ArgSpec that has a `T dfault`, and whose parse
implementation did that same special check against args.length?
Then parseOptional would just be a special case of parseEither.
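Here's a minimal standalone sketch of that claim, using toy parsers that are just
functions from an argument list to an Optional. All the names (`EitherDefaultSketch`,
`either`, `dfault`, `flag`, `optional`) are invented for the demo, and leftovers are
dropped to keep it short, so treat it as an illustration rather than the real thing:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical standalone demo; toy parsers that ignore leftovers.
class EitherDefaultSketch {
    // Try left; fall back to right only when left finds nothing.
    static <T> Function<List<String>, Optional<T>> either(
            Function<List<String>, Optional<T>> left,
            Function<List<String>, Optional<T>> right) {
        return args -> {
            Optional<T> l = left.apply(args);
            return l.isPresent() ? l : right.apply(args);
        };
    }

    // Succeeds with a fixed value, but only once the arguments are used up.
    static <T> Function<List<String>, Optional<T>> dfault(T value) {
        return args -> args.isEmpty() ? Optional.of(value) : Optional.empty();
    }

    // Succeeds when the first argument is the given flag.
    static Function<List<String>, Optional<Boolean>> flag(String name) {
        return args -> !args.isEmpty() && args.get(0).equals(name)
            ? Optional.of(true) : Optional.empty();
    }

    // "optional" needs no logic of its own: it is either(parser, dfault(value)).
    static <T> Function<List<String>, Optional<T>> optional(
            Function<List<String>, Optional<T>> parser, T value) {
        return either(parser, dfault(value));
    }

    public static void main(String[] args) {
        Function<List<String>, Optional<Boolean>> q = optional(flag("-q"), false);
        System.out.println(q.apply(Arrays.asList("-q")));
        System.out.println(q.apply(Collections.<String>emptyList()));
        System.out.println(q.apply(Arrays.asList("-x")));
    }
}
```

The args.length check lives in exactly one place (`dfault`), and `optional` falls out as
a one-liner.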
It's kind of a cute trick, but will actually come in handy for dealing
with repeated arguments later. Let's write it out; see you on the other side.
*/
interface ArgSpec4<T> {
interface Cases<T, M> {
// "primitive" argspecs
M Positional(Function<String, T> parser, ArgDoc doc);
M Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc);
M Option(List<String> options, Function<String, T> parser, ArgDoc doc);
// "special" ArgSpecs.
M Default(T value);
M Either(ArgSpec4<T> left, ArgSpec4<T> right);
M Both(ArgSpec4<Object> first, ArgSpec4<Function<Object, T>> next);
}
<M> M match(Cases<T, M> cases);
// Constructors.
static <T> ArgSpec4<T> Positional(Function<String, T> parser, ArgDoc doc) {
return new ArgSpec4<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Positional(parser, doc);
}
};
}
static <T> ArgSpec4<T> Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc) {
return new ArgSpec4<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Flag(flags, flagPresent, flagAbsent, doc);
}
};
}
static <T> ArgSpec4<T> Option(List<String> options, Function<String, T> parser, ArgDoc doc) {
return new ArgSpec4<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Option(options, parser, doc);
}
};
}
static <T> ArgSpec4<T> Default(T value) {
return new ArgSpec4<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Default(value);
}
};
}
static <T> ArgSpec4<T> Either(ArgSpec4<T> left, ArgSpec4<T> right) {
return new ArgSpec4<T>() {
@Override
public <M> M match(Cases<T, M> cases) {
return cases.Either(left, right);
}
};
}
static <A, T> ArgSpec4<T> Both(ArgSpec4<A> first, ArgSpec4<Function<A, T>> next) {
return new ArgSpec4<T>() {
@SuppressWarnings("unchecked") // the Both constructor is checked, so we can drop the types inside.
@Override public <M> M match(Cases<T, M> cases) {
return cases.Both((ArgSpec4<Object>) first, (ArgSpec4<Function<Object, T>>) (Object) next);
}
};
}
default <M> M match(BiFunction<Function<String, T>, ArgDoc, M> Positional,
F4<List<String>, T, T, ArgDoc, M> Flag,
F3<List<String>, Function<String, T>, ArgDoc, M> Option,
Function<T, M> Default,
BiFunction<ArgSpec4<T>, ArgSpec4<T>, M> Either,
BiFunction<ArgSpec4<Object>, ArgSpec4<Function<Object, T>>, M> Both
) {
return match(new Cases<T, M>() {
@Override public M Positional(Function<String, T> parser, ArgDoc doc) {
return Positional.apply(parser, doc);
}
@Override public M Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc) {
return Flag.apply(flags, flagPresent, flagAbsent, doc);
}
@Override public M Option(List<String> options, Function<String, T> parser, ArgDoc doc) {
return Option.apply(options, parser, doc);
}
@Override public M Default(T value) {
return Default.apply(value);
}
@Override public M Either(ArgSpec4<T> left, ArgSpec4<T> right) {
return Either.apply(left, right);
}
@Override public M Both(ArgSpec4<Object> first, ArgSpec4<Function<Object, T>> next) {
return Both.apply(first, next);
}
});
}
default <R> ArgSpec4<R> map(Function<T, R> fn) {
return this.match(
/*Positional*/(parser, doc) -> ArgSpec4.Positional(parser.andThen(fn), doc),
/*Flag*/(flags, flagPresent, flagAbsent, doc) ->
ArgSpec4.Flag(flags, fn.apply(flagPresent), fn.apply(flagAbsent), doc),
/*Option*/(options, parser, doc) -> ArgSpec4.Option(options, parser.andThen(fn), doc),
// mapping is pretty straightforward.
/*Default*/value -> ArgSpec4.Default(fn.apply(value)),
/*Either*/(left, right) -> ArgSpec4.Either(left.map(fn), right.map(fn)),
/*Both*/(ArgSpec4<Object> first, ArgSpec4<Function<Object, T>> next) ->
ArgSpec4.Both(first, next.<Function<Object, R>>map(fn::compose)));
}
}
/*
And we can just roll in our previous parseOptional and parseEither logic:
*/
static <T> ParseResult<T> parse(String[] args, ArgSpec4<T> spec) {
return spec.match(
/*Positional*/(parser, doc) -> {
if (args.length > 0) {
try {
return ParseResult.Success(parser.apply(args[0]),
Arrays.copyOfRange(args, 1, args.length));
} catch (Exception e) {
return ParseResult.Fail(e);
}
}
return ParseResult.Fail(new RuntimeException(doc.metavar + " not specified!"));
},
/*Flag*/(flags, flagPresent, flagAbsent, __) -> {
if (args.length == 0) {
return ParseResult.Success(flagAbsent, args);
} else if (flags.contains(args[0])) { // a flag in the last position still counts
return ParseResult.Success(flagPresent, Arrays.copyOfRange(args, 1, args.length));
} else {
return ParseResult.NotFound();
}
},
/*Option*/(options, parser, doc) -> {
if (args.length == 0) {
return ParseResult.Fail(new RuntimeException(doc.metavar + " not specified!"));
}
if (args.length == 1 && options.contains(args[0])) {
return ParseResult.Fail(new RuntimeException(
args[0] + " " + doc.metavar + " value not specified!"));
}
if (options.contains(args[0])) {
try {
return ParseResult.Success(parser.apply(args[1]),
Arrays.copyOfRange(args, 2, args.length));
} catch (Exception e) {
return ParseResult.Fail(e);
}
} else {
return ParseResult.NotFound();
}
},
// here they are:
/*Default*/value -> args.length == 0 ? ParseResult.Success(value, args) : ParseResult.<T>NotFound(),
/*Either*/(left, right) -> parse(args, left).<ParseResult<T>>match(
ParseResult::Success,
ParseResult::Fail,
/*NotFound*/() -> parse(args, right)),
/*Both*/(first, next) -> parse(args, first).<ParseResult<T>>match(
/*Success*/(fnArg, firstLeftover) ->
parse(firstLeftover, next).match(
/*Success*/(fn, leftover) -> ParseResult.Success(fn.apply(fnArg), leftover),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(
new RuntimeException("parsed fnArg, can't parse fn!"))),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() ->
parse(args, next).match(
/*Success*/(fn, nextLeftover) ->
parse(nextLeftover, first).<ParseResult<T>>match(
/*Success*/(fnArg, leftover) -> ParseResult.Success(fn.apply(fnArg), leftover),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(
new RuntimeException("parsed fn, can't parse fnArg!"))),
/*propagate failure*/ParseResult::Fail,
/*NotFound*/() -> ParseResult.Fail(new RuntimeException("can't parse either!")))));
}
/*
We could go on and write the help text generator for this iteration, but I'll
leave that up to you as an ever-popular Exercise For The Reader™.
One last thing to tackle. Repeated arguments.
They definitely are tricky. If we try the approach that worked before:
*/
static <T> ParseResult<List<T>> parseRepeated(String[] args, ArgSpec4<T> spec) {
return parse(args, spec).match(
// recursively call ourself on the leftover
/*Success*/(head, leftover) -> parseRepeated(leftover, spec).match(
/*Success*/(tail, leftover2) -> ParseResult.Success(
// and prepend ourselves to the result
Stream.concat(Stream.of(head), tail.stream()).collect(Collectors.toList()),
leftover2),
ParseResult::Fail,
// if we recursively couldn't parse any more, return what we have
// (which will recursively bubble back up if we're deeper)
() -> ParseResult.Success(Collections.singletonList(head), leftover)
),
ParseResult::Fail,
ParseResult::NotFound);
}
/*
It kind of works. Tracing the execution on an example case:
parseRepeated(["-a", "-a", "-a"], ArgSpec("-a")) ->
parse(["-a", "-a", "-a"], ArgSpec("-a")) ->
Success("-a", ["-a", "-a"]) ->
parseRepeated(["-a", "-a"], ArgSpec("-a")) ->
parse(["-a", "-a"], ArgSpec("-a")) ->
Success("-a", ["-a"]) ->
parseRepeated(["-a"], ArgSpec("-a")) ->
parse(["-a"], ArgSpec("-a")) ->
Success("-a", []) ->
parseRepeated([], ArgSpec("-a")) ->
parse([], ArgSpec("-a")) ->
NotFound -> Success(["-a"] + (["-a"] + (["-a"])), [])
We'll indeed eat all three "-a"s in a row.
However, if there's a "-b" in the middle,
parseRepeated(["-a", "-b", "-a"], ArgSpec("-a")) ->
parse(["-a", "-b", "-a"], ArgSpec("-a")) ->
Success("-a", ["-b", "-a"]) ->
parseRepeated(["-b", "-a"], ArgSpec("-a")) ->
parse(["-b", "-a"], ArgSpec("-a")) ->
NotFound -> Success(["-a"], ["-b", "-a"])
We "succeed", but we haven't got all the -a's. This is still correct
when we only have an ArgSpec for repeated "-a", since at that point "-b" is
an "unrecognized argument". However, what if we also had a "-b" ArgSpec?
parse(["-a", "-b", "-a"], Both(Repeated("-a"), "-b")) ->
{ # both parser
first = parse(["-a", "-b", "-a"], Repeated("-a")) -> # we did this above
Success(["-a"], ["-b", "-a"])
second = parse(["-b", "-a"], "-b") ->
Success("-b", ["-a"])
} -> Success((["-a"], "-b"), ["-a"])
We're able to parse the first "-a", and the "-b", but the last "-a" is still left over.
The semantics of the Both parser as we have it now don't give us a way to
run the "-a" parser again, to pick up that last "-a".
(A similar problem occurs with the Either parser we just wrote).
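If you want to poke at the "stuck on -b" behaviour without the whole parser, here's a
minimal standalone sketch. `GreedyRepeatSketch` and `consumeLeading` are made-up names,
and plain lists stand in for the real ArgSpec machinery; the only thing it shares with
parseRepeated is that it looks at the head of the args and stops at the first mismatch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone demo; it mimics parseRepeated's head-only behaviour on plain lists.
class GreedyRepeatSketch {
    // Consume matching flags from the front of args, stopping at the first mismatch.
    // Matches are appended to `found`; the return value is how many args were consumed.
    static int consumeLeading(List<String> args, String flag, List<String> found) {
        int i = 0;
        while (i < args.size() && args.get(i).equals(flag)) {
            found.add(args.get(i));
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("-a", "-b", "-a");
        List<String> found = new ArrayList<>();
        int consumed = consumeLeading(input, "-a", found);
        // The trailing "-a" is never seen, because we stopped at "-b".
        System.out.println("found=" + found
            + " leftover=" + input.subList(consumed, input.size()));
    }
}
```

On ["-a", "-b", "-a"] it collects a single "-a" and leaves ["-b", "-a"] behind, same as
the trace above.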
Well, if we really do model repeated arguments as Repeated(ArgSpec<T>), then we
could technically, in our Both/Either parsers, check whether each of the sides
are a `Repeated` parser, retry the Repeated parser more than once,
then do some casting to glue the List<T> results together.
However, that binds us to only using List<T> as the result type, and
while I haven't actually tried playing this strategy out, it seems like it'd get messy.
Instead, we'll do one final fancy "turn of phrase" that models "repeatedness"
in a way more amenable to the structures we've built up.
Looking back at the ["-a", "-b", "-a"] example, the result of
Success(["-a"], ["-b", "-a"]) is not quite what we want.
We want to say something like:
"We found something, but we got stuck.
Here's what we got, plus the leftovers, please try this parser again later".
Something like FoundButNeedMore(value, leftovers).
Also implicit in the "try this parser again later" part is what we do once
we try again. We'll have a:
FoundButNeedMore(value, leftovers)
and a:
Success(value, leftovers)
How do we turn the two values into whatever the actual result should be?
In the case of a List<T>, we need to do (value + value = [value, value]).
Remember the last time we had to glue two values together? With "Both"?
We can actually do kind of the same trick here.
Instead of FoundButNeedMore(value, leftovers), we can return:
FoundButRunMeAgain(ArgSpec<T>, leftovers)
where ArgSpec<T> is a _new_ ArgSpec that has a copy of the found value,
and will "add" it back to whatever the final result is when we're ready.
Like how the second ArgSpec of Both returns a function that--when fed
with the input from the other ArgSpec--will give you the final result.
But wait, look at that result type:
FoundButRunMeAgain(ArgSpec<T>, leftovers)
Isn't that just
Success(T, leftovers)
where T is an ArgSpec<T>? i.e., what if we had an ArgSpec<ArgSpec<T>>?
An ArgSpec that parses into an additional ArgSpec?
For repeated arguments, we said we needed an Argument that could parse,
but needs to be run again. That's a way of interpreting what ArgSpec<ArgSpec<T>> means.
I know this is all very abstract. But, there's one more change we want to make
for this to all work out.
Instead of an actual ArgSpec<ArgSpec<T>>, let's encode it as
Continue(ArgSpec<?> spec, Function<?, ArgSpec<T>> continuation)
In other words, the combination of an ArgSpec that returns something,
and a function that takes the something and returns the next ArgSpec,
the one that actually finishes the parse. It continues the original
argspec into a new one.
You can recover ArgSpec<ArgSpec<T>> from this by doing
Continue(ArgSpec<ArgSpec<T>>, Function<ArgSpec<T>, ArgSpec<T>> identity)
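The same "continue with identity" move shows up with other nested wrappers too. As an
analogy only (Optional is not an ArgSpec, and `FlattenSketch` and `flatten` are names
invented for this demo), here's a minimal standalone sketch where an
Optional<Optional<T>> is collapsed by continuing with the identity function:

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical standalone demo; Optional.flatMap plays the role of Continue here.
class FlattenSketch {
    // Collapse a nested Optional by "continuing" with the identity function:
    // whatever inner Optional comes out is the one that finishes the job.
    static <T> Optional<T> flatten(Optional<Optional<T>> nested) {
        return nested.flatMap(Function.identity());
    }

    public static void main(String[] args) {
        System.out.println(flatten(Optional.of(Optional.of("inner"))));
        System.out.println(flatten(Optional.of(Optional.<String>empty())));
        System.out.println(flatten(Optional.<Optional<String>>empty()));
    }
}
```

Continue plays the part of flatMap: the first piece produces a value, and the function
decides what to run next based on it.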
We've made it through like 200 lines of text. Let's actually play this out,
and I'll show you how this weird "Continue" thing does what we want.
*/
interface ArgSpec<T> {
interface Cases<T, M> {
// "primitive" argspecs
M Positional(Function<String, T> parser, ArgDoc doc);
M Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc);
M Option(List<String> options, Function<String, T> parser, ArgDoc doc);
// "special" ArgSpecs.
M Default(T value);
M Either(ArgSpec<T> left, ArgSpec<T> right);
M Both(ArgSpec<Object> first, ArgSpec<Function<Object, T>> next);
M Continue(ArgSpec<Object> spec, Function<Object, ArgSpec<T>> continuation);
}
<M> M match(Cases<T, M> cases);
// Constructors.
static <T> ArgSpec<T> Positional(Function<String, T> parser, ArgDoc doc) {
return new ArgSpec<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Positional(parser, doc);
}
};
}
static <T> ArgSpec<T> Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc) {
return new ArgSpec<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Flag(flags, flagPresent, flagAbsent, doc);
}
};
}
static <T> ArgSpec<T> Option(List<String> options, Function<String, T> parser, ArgDoc doc) {
return new ArgSpec<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Option(options, parser, doc);
}
};
}
static <T> ArgSpec<T> Default(T value) {
return new ArgSpec<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Default(value);
}
};
}
static <T> ArgSpec<T> Either(ArgSpec<T> left, ArgSpec<T> right) {
return new ArgSpec<T>() {
@Override
public <M> M match(Cases<T, M> cases) {
return cases.Either(left, right);
}
};
}
static <A, T> ArgSpec<T> Both(ArgSpec<A> first, ArgSpec<Function<A, T>> next) {
return new ArgSpec<T>() {
@SuppressWarnings("unchecked") // the Both constructor is checked, so we can drop the types inside.
@Override public <M> M match(Cases<T, M> cases) {
return cases.Both((ArgSpec<Object>) first, (ArgSpec<Function<Object, T>>) (Object) next);
}
};
}
static <A, T> ArgSpec<T> Continue(ArgSpec<A> spec, Function<A, ArgSpec<T>> continuation) {
return new ArgSpec<T>() {
@SuppressWarnings("unchecked") // drop the A type, it's already checked by this call.
@Override public <M> M match(Cases<T, M> cases) {
return cases.Continue(((ArgSpec<Object>) spec), (Function<Object, ArgSpec<T>>) continuation);
}
};
}
default <M> M match(BiFunction<Function<String, T>, ArgDoc, M> Positional,
F4<List<String>, T, T, ArgDoc, M> Flag,
F3<List<String>, Function<String, T>, ArgDoc, M> Option,
Function<T, M> Default,
BiFunction<ArgSpec<T>, ArgSpec<T>, M> Either,
BiFunction<ArgSpec<Object>, ArgSpec<Function<Object, T>>, M> Both,
BiFunction<ArgSpec<Object>, Function<Object, ArgSpec<T>>, M> Continue
) {
return match(new Cases<T, M>() {
@Override public M Positional(Function<String, T> parser, ArgDoc doc) {
return Positional.apply(parser, doc);
}
@Override public M Flag(List<String> flags, T flagPresent, T flagAbsent, ArgDoc doc) {
return Flag.apply(flags, flagPresent, flagAbsent, doc);
}
@Override public M Option(List<String> options, Function<String, T> parser, ArgDoc doc) {
return Option.apply(options, parser, doc);
}
@Override public M Default(T value) {
return Default.apply(value);
}
@Override public M Either(ArgSpec<T> left, ArgSpec<T> right) {
return Either.apply(left, right);
}
@Override public M Both(ArgSpec<Object> first, ArgSpec<Function<Object, T>> next) {
return Both.apply(first, next);
}
@Override
public M Continue(ArgSpec<Object> spec, Function<Object, ArgSpec<T>> continuation) {
return Continue.apply(spec, continuation);
}
});
}
default <R> ArgSpec<R> map(Function<T, R> fn) {
return this.match(
/*Positional*/(parser, doc) -> ArgSpec.Positional(parser.andThen(fn), doc),
/*Flag*/(flags, flagPresent, flagAbsent, doc) ->
ArgSpec.Flag(flags, fn.apply(flagPresent), fn.apply(flagAbsent), doc),
/*Option*/(options, parser, doc) -> ArgSpec.Option(options, parser.andThen(fn), doc),
/*Default*/value -> ArgSpec.Default(fn.apply(value)),
/*Either*/(left, right) -> ArgSpec.Either(left.map(fn), right.map(fn)),
/*Both*/(ArgSpec<Object> first, ArgSpec<Function<Object, T>> next) ->
ArgSpec.Both(first, next.<Function<Object, R>>map(fn::compose)),
// we map against the next spec once it "comes out of" the continuation function.
/*Continue*/(ArgSpec<Object> spec, Function<Object, ArgSpec<T>> continuation) ->
ArgSpec.Continue(spec, continuation.andThen(next -> next.map(fn))));
}
}
/*
We'll need another branch in our result type to handle continuations.
*/
interface ParseRes<T> {
interface Cases<T, M> {
M Success(T value, String[] leftover);
M Continue(ArgSpec<T> spec, String[] leftover);
M Fail(Exception e);
M NotFound();
}
<M> M match(Cases<T, M> cases);
static <T> ParseRes<T> Success(T value, String[] leftover) {
return new ParseRes<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Success(value, leftover);
}
};
}
static <T> ParseRes<T> Continue(ArgSpec<T> spec, String[] leftover) {
return new ParseRes<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Continue(spec, leftover);
}
};
}
static <T> ParseRes<T> Fail(Exception e) {
return new ParseRes<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.Fail(e);
}
};
}
static <T> ParseRes<T> NotFound() {
return new ParseRes<T>() {
@Override public <M> M match(Cases<T, M> cases) {
return cases.NotFound();
}
};
}
default <M> M match(BiFunction<T, String[], M> Success,
BiFunction<ArgSpec<T>, String[], M> Continue,
Function<Exception, M> Fail,
Supplier<M> NotFound
) {
return match(new Cases<T, M>() {
@Override public M Success(T value, String[] leftover) {
return Success.apply(value, leftover);
}
@Override public M Continue(ArgSpec<T> spec, String[] leftover) {
return Continue.apply(spec, leftover);
}
@Override public M Fail(Exception e) {
return Fail.apply(e);
}
@Override public M NotFound() {
return NotFound.get();
}
});
}
}
/*
And now, our magnum opus:
*/
static <T> ParseRes<T> parse(String[] args, ArgSpec<T> spec) {
return spec.<ParseRes<T>>match(
/*Positional*/(parser, doc) -> {
if (args.length > 0) {
try {
return ParseRes.<T>Success(parser.apply(args[0]),
Arrays.copyOfRange(args, 1, args.length));
} catch (Exception e) {
return ParseRes.<T>Fail(e);
}
}
return ParseRes.<T>Fail(new RuntimeException(doc.metavar + " not specified!"));
},
/*Flag*/(flags, flagPresent, flagAbsent, __) -> {
if (args.length == 0) {
return ParseRes.<T>Success(flagAbsent, args);
} else if (flags.contains(args[0])) { // args is non-empty here, so a trailing flag is matched too
return ParseRes.<T>Success(flagPresent, Arrays.copyOfRange(args, 1, args.length));
} else {
return ParseRes.<T>NotFound();
}
},
/*Option*/(options, parser, doc) -> {
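if (args.length == 0) {
// out of arguments: report NotFound rather than Fail, so that an enclosing
// Either (such as the one inside zeroOrMore) can fall back to its other branch.
return ParseRes.<T>NotFound();
}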
if (args.length == 1 && options.contains(args[0])) {
return ParseRes.<T>Fail(new RuntimeException(
args[0] + " " + doc.metavar.orElse("VALUE") + " value not specified!"));
}
if (options.contains(args[0])) {
try {
return ParseRes.<T>Success(parser.apply(args[1]),
Arrays.copyOfRange(args, 2, args.length));
} catch (Exception e) {
return ParseRes.<T>Fail(e);
}
} else {
return ParseRes.<T>NotFound();
}
},
/*Default*/value -> args.length == 0 ? ParseRes.Success(value, args) : ParseRes.<T>NotFound(),
/*Either*/(left, right) -> parse(args, left).<ParseRes<T>>match(
ParseRes::Success,
/*Continue*/(nextLeft, leftover) ->
// if the left value parsed into a continuation, then the
// left branch "wins" and we stop considering the right branch.
parse(leftover, nextLeft),
ParseRes::Fail,
/*NotFound*/() -> parse(args, right).<ParseRes<T>>match(
ParseRes::Success,
/*Continue*/(nextRight, leftover) ->
// similarly, if the right parser continues, the left branch is forgotten.
parse(leftover, nextRight),
ParseRes::Fail,
ParseRes::NotFound)),
/*Both*/(first, next) -> parse(args, first).<ParseRes<T>>match(
/*Success*/(fnArg, firstLeftover) ->
parse(firstLeftover, next).<ParseRes<T>>match(
/*Success*/(fn, leftover) -> ParseRes.Success(fn.apply(fnArg), leftover),
/*Continue*/(continuedNext, leftover) ->
// continue parsing the fn, holding a reference to
// the already parsed fnArg to apply once the continuation finishes.
parse(leftover, continuedNext.map(fn -> fn.apply(fnArg))),
/*propagate failure*/ParseRes::Fail,
/*NotFound*/() -> ParseRes.Fail(
new RuntimeException("parsed fnArg, can't parse fn!"))),
/*Continue*/(continuedFirst, leftover) ->
// parse both the continued first and the actual next.
parse(leftover, ArgSpec.Both(continuedFirst, next)),
/*propagate failure*/ParseRes::Fail,
// if the first spec didn't parse at first
/*NotFound*/() -> parse(args, next).match(
/*Success*/(fn, nextLeftover) -> parse(nextLeftover, first).<ParseRes<T>>match(
/*Success*/(fnArg, leftover) -> ParseRes.Success(fn.apply(fnArg), leftover),
/*Continue*/(continuedFirst, leftover) ->
// continue parsing the fnArg, holding a reference to
// the already parsed fn to apply once the continuation finishes.
parse(leftover, continuedFirst.map(fnArg -> fn.apply(fnArg))),
/*propagate failure*/ParseRes::Fail,
/*NotFound*/() -> ParseRes.Fail(new RuntimeException("parsed fn, can't parse fnArg!"))),
/*Continue*/(continuedNext, leftover) ->
// parse both the original first, and the continued next.
parse(leftover, ArgSpec.Both(first, continuedNext)),
/*propagate failure*/ParseRes::Fail,
/*NotFound*/() -> ParseRes.Fail(new RuntimeException("can't parse either!")))),
/*Continue*/(ArgSpec<Object> inner, Function<Object, ArgSpec<T>> continuation) ->
// first parse the inner spec
parse(args, inner).<ParseRes<T>>match(
// if it succeeds
/*Success*/(Object value, String[] leftover) ->
parse(leftover, continuation.apply(value)), // apply continuation and keep parsing
/*Continue*/(ArgSpec<Object> innerNext, String[] leftover) ->
// continue parsing with the inner next and our own continuation.
parse(leftover, ArgSpec.Continue(innerNext, continuation)),
ParseRes::<T>Fail,
ParseRes::<T>NotFound));
}
/*
It's ... dense, I know.
But each place we had to handle a Continue result is fairly self-contained:
we know what type we have to get out, so there's (roughly) only one way to
combine the stuff we have to get the thing we want.
As for actually parsing a Continue spec, it's pretty easy.
We parse the inner spec, hand its value to the continuation, and parse whatever
spec comes back. If that in turn yields a continuation, we just keep going,
into recursive calls until something _hopefully_ gives.
("Hopefully" because we've now introduced the possibility of loops into the system.
Where before we'd _eventually_ hit one of the primitive ArgSpecs while walking down
Either/Both branches, we could now in theory hit a Continue that keeps going forever.
But, by taking on that danger, we gain the power to do a lot more.)
Okay, the punch line. How does this Continue thing bring us repeated arguments?
Check it out.
*/
static <T> List<T> cons(T head, List<T> tail) {
return Stream.concat(Stream.of(head), tail.stream()).collect(Collectors.toList());
}
static <T> ArgSpec<List<T>> zeroOrMore(ArgSpec<T> spec) {
// either a single T, continued recursively to more, or nothing
return ArgSpec.Either(
ArgSpec.Continue(spec, head -> zeroOrMore(spec).map(tail -> cons(head, tail))),
ArgSpec.Default(Collections.emptyList()));
}
/*
If we parse a head, great, let's keep parsing more. When we finally get to the end of the
arguments, the Default behavior will kick in, collapsing the recursive calls back down,
each time adding the head to the list by way of the "map" call at the end.
The continuations could (theoretically) go on forever, giving us an infinite list.
Without the Continue, we could technically do the same thing:
  Either(
    Default([]),
    Either(
      Both(first, second),
      Either(
        Both(first, Both(second, third)),
        Either(
          Both(first, Both(second, Both(third, fourth))),
          ...
But the ability to recurse tidily does that expansion for us. Just like a Real Language™.
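
To see why the recursive definition does not expand forever up front, here is a
minimal standalone sketch of the same trick. It is not Blarg itself: Parser, Result,
option, andThen and map below are hypothetical stand-ins for ArgSpec, ParseRes,
Option, Continue and ArgSpec.map, and the empty-list fallback is simplified so that
it succeeds at any position rather than only at the end of the arguments. The point
is that the recursive zeroOrMore call sits inside a lambda, so it only runs after a
head has actually been parsed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class RepeatSketch {
    // A parsed value plus the index of the next unconsumed token.
    static final class Result<T> {
        final T value;
        final int next;
        Result(T value, int next) { this.value = value; this.next = next; }
    }

    // Hypothetical parser type: an empty Optional means "did not match here".
    interface Parser<T> {
        Optional<Result<T>> parse(List<String> tokens, int pos);
    }

    // Matches "<flag> VALUE" at the current position and yields VALUE.
    static Parser<String> option(String flag) {
        return (tokens, pos) -> {
            if (pos + 1 < tokens.size() && tokens.get(pos).equals(flag)) {
                Result<String> hit = new Result<String>(tokens.get(pos + 1), pos + 2);
                return Optional.of(hit);
            }
            return Optional.empty();
        };
    }

    // Analogue of Continue: parse with `first`, then let its value pick the next parser.
    static <A, B> Parser<B> andThen(Parser<A> first, Function<A, Parser<B>> continuation) {
        return (tokens, pos) -> first.parse(tokens, pos)
            .flatMap(r -> continuation.apply(r.value).parse(tokens, r.next));
    }

    // Analogue of ArgSpec.map: transform the parsed value, keep the position.
    static <A, B> Parser<B> map(Parser<A> p, Function<A, B> fn) {
        return (tokens, pos) -> p.parse(tokens, pos)
            .map(r -> new Result<B>(fn.apply(r.value), r.next));
    }

    static <T> List<T> cons(T head, List<T> tail) {
        List<T> out = new ArrayList<>();
        out.add(head);
        out.addAll(tail);
        return out;
    }

    static <T> Parser<List<T>> zeroOrMore(Parser<T> p) {
        // The recursive zeroOrMore(p) is inside the lambda, so it is not evaluated
        // until p has matched and produced a head.
        Parser<List<T>> oneThenMore =
            andThen(p, head -> map(zeroOrMore(p), tail -> cons(head, tail)));
        return (tokens, pos) -> {
            Optional<Result<List<T>>> parsed = oneThenMore.parse(tokens, pos);
            if (parsed.isPresent()) {
                return parsed;
            }
            // Fallback, playing the role of Default(emptyList()).
            List<T> none = Collections.emptyList();
            return Optional.of(new Result<List<T>>(none, pos));
        };
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("-g", "Hello", "-g", "there", "World");
        Result<List<String>> r = zeroOrMore(option("-g")).parse(tokens, 0).get();
        System.out.println(r.value + " next=" + r.next); // prints: [Hello, there] next=4
    }
}
```

Each recursive call is only made after one more head has been consumed, so the
expansion stops as soon as the next token is not a -g.
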
Here's the program from the beginning, for reals this time:
*/
static class GreeterArgs {
List<String> greeting;
boolean isQuestion;
String subject;
public GreeterArgs(List<String> greeting, boolean isQuestion, String subject) {
this.greeting = greeting;
this.isQuestion = isQuestion;
this.subject = subject;
}
}
static <A, B, C, T> ArgSpec<T> arg3(F3<A, B, C, T> f, ArgSpec<A> as, ArgSpec<B> bs, ArgSpec<C> cs) {
return ArgSpec.Both(as, ArgSpec.Both(bs, cs.map((C c) -> (B b) -> (A a) -> f.apply(a, b, c))));
}
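/*
arg3 leans on how Both combines its halves: the "next" spec yields a function and the
"first" spec yields its argument. That is why the constructor is curried back to front
(C, then B, then A). Below is a minimal standalone sketch of just that currying. It uses
plain java.util.function.Function values and a hypothetical three-argument describe
method in place of the ArgSpecs and GreeterArgs::new.

```java
import java.util.function.Function;

class CurrySketch {
    // Hypothetical stand-in for a three-argument constructor such as GreeterArgs::new.
    static String describe(String greeting, boolean isQuestion, String subject) {
        return greeting + " " + subject + (isQuestion ? "?" : "!");
    }

    // Curried back to front: takes C, then B, then A, like the lambda given to cs.map in arg3.
    static Function<String, Function<Boolean, Function<String, String>>> curried() {
        return subject -> isQuestion -> greeting -> describe(greeting, isQuestion, subject);
    }

    public static void main(String[] args) {
        // Step 1: the last spec's value (C) is mapped into a function awaiting B.
        Function<Boolean, Function<String, String>> awaitingB = curried().apply("World");
        // Step 2: the inner Both applies that function to the parsed B.
        Function<String, String> awaitingA = awaitingB.apply(true);
        // Step 3: the outer Both applies the result to the parsed A.
        System.out.println(awaitingA.apply("Hello")); // prints: Hello World?
    }
}
```

The innermost value is supplied first and the outermost last, mirroring the order in
which the nested Both specs apply their parsed functions.
*/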
static <T> ArgSpec<T> withDefault(ArgSpec<T> spec, T dfault) {
return ArgSpec.Either(spec, ArgSpec.Default(dfault));
}
static List<String> readLines(String path) {
try {
return Files.readAllLines(Paths.get(path));
} catch (IOException e) {
throw new RuntimeException(e);
}
}
static final ArgSpec<GreeterArgs> greeterArgSpec = arg3(
GreeterArgs::new,
withDefault(
ArgSpec.Either(
zeroOrMore(ArgSpec.Option(
Arrays.asList("-g", "--greeting"), Function.identity(),
new ArgDoc("GREETING", "Greeting to use."))),
ArgSpec.Option(
Collections.singletonList("--greet-file"),
Blarg::readLines,
new ArgDoc("GREET_FILE", "file to load additional greetings from."))),
Collections.singletonList("Hello")),
ArgSpec.Flag(
Arrays.asList("-q", "--question"), true, false,
new ArgDoc("ask a question instead")),
ArgSpec.Positional(
Function.identity(), new ArgDoc("SUBJECT", "subject to greet")));
/*
The types check out. It's for reals.
Let's port the ol' help generator text to our final ArgSpec class.
*/
static List<ArgSpec<?>> flatten(ArgSpec<?> t) {
return t.match(
// collect all the "primitive" types
(_1, _2) -> Collections.singletonList(t),
(_1, _2, _3, _4) -> Collections.singletonList(t),
(_1, _2, _3) -> Collections.singletonList(t),
// default doesn't have a parser, so it's really just an implementation detail.
/*Default*/(_1) -> Collections.emptyList(),
/*Either*/(left, right) -> {
// concat the two sides together
return Stream.concat(flatten(left).stream(), flatten(right).stream())
.collect(Collectors.toList());
},
/*Both*/(first, next) -> {
// concat the two sides together
return Stream.concat(flatten(first).stream(), flatten(next).stream())
.collect(Collectors.toList());
},
// we can't inspect the inside of the continuation, but that's fine;
// we trust that the continued parsers are really the "same" top-level parser,
// just successively changed; Continue is an implementation detail.
/*Continue*/(inner, continuation) -> flatten(inner));
}
static void displayHelp(String programName, String programDescription, ArgSpec<?> spec) {
List<ArgSpec<?>> flat = flatten(spec);
System.out.print(programName);
for (ArgSpec<?> arg : flat) {
System.out.print(" ");
System.out.print("[" + arg.<String>match(
/*Positional*/(_1, doc) -> doc.metavar.get(),
/*Flag*/(flags, _2, _3, doc) -> flags.stream().collect(Collectors.joining("|")),
/*Option*/(flags, _2, doc) ->
flags.stream().collect(Collectors.joining("|")) + " " + doc.metavar.orElse("FOO"),
// special types already dealt with.
(_1) -> "",
(_1, _2) -> "",
(_1, _2) -> "",
(_1, _2) -> ""
) + "]");
}
System.out.println("\n");
System.out.println(programDescription);
System.out.println();
for (ArgSpec<?> arg : flat) {
System.out.printf(
"%-40s%s\n",
arg.match(
/*Positional*/(_1, doc) -> doc.metavar.get(),
/*Flag*/(flags ,_1, _2, doc) -> flags.stream().collect(Collectors.joining("|")),
/*Option*/(options, _1, doc) ->
options.stream().collect(Collectors.joining(" ")) + " " + doc.metavar.orElse("FOO"),
// special types already dealt with.
(_1) -> "",
(_1, _2) -> "",
(_1, _2) -> "",
(_1, _2) -> "")
,
arg.match(
/*Positional*/(_1, doc) -> doc.description,
/*Flag*/(flags, _1, _2, doc) -> doc.description,
/*Option*/(options, _1, doc) -> doc.description,
// special types already dealt with.
(_1) -> "",
(_1, _2) -> "",
(_1, _2) -> "",
(_1, _2) -> ""));
}
System.out.printf("%-40s%s\n", "-h|--help", "displays this message");
System.exit(255);
}
/*
And, just for sugar, we'll create a special `run` function that glues together
parsing an actual T and handling the help text. Here we'll also add special handling for
the standard "-h" and "--help" arguments.
*/
static <T> T run(String[] args, ArgSpec<T> spec, String programName, String programDescription) {
// add the help flag as the "empty" branch of the Optional.
ArgSpec<Optional<T>> withHelp = ArgSpec.Either(
spec.map(Optional::of),
ArgSpec.Flag(Arrays.asList("-h", "--help"), Optional.empty(), Optional.empty(),
new ArgDoc("displays this help text.")));
return parse(args, withHelp).<T>match(
/*Success*/(optT, leftover) -> optT.orElseGet(() -> {
// help flag sent
displayHelp(programName, programDescription, spec);
return null;
}),
/*Continue*/(continuation, leftover) -> {
// this should never happen.
throw new AssertionError("unhandled continue!");
},
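/*Fail*/exception -> {
// report the failure, then show usage; displayHelp exits, so callers never see the null.
System.err.println("Error: " + exception.getMessage());
displayHelp(programName, programDescription, spec);
return null;
},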
/*Not Found*/() -> {
System.err.println("Unrecognized argument: " + (args.length > 0 ? args[0] : "none provided"));
displayHelp(programName, programDescription, spec);
return null;
}
);
}
public static void main(String[] args) {
GreeterArgs gargs = run(args, greeterArgSpec, "Blarg", "What's up?");
System.out.println(
gargs.greeting.stream().collect(Collectors.joining()) + " " +
gargs.subject + (gargs.isQuestion ? "?" : "!"));
}
}