richdougherty · April 12, 2016 04:28
diff --git a/Inhabitation.txt b/Inhabitation.txt
 INHABITATION ALGORITHM

 The idea is to label each state in a type's automaton with a ternary value to
 indicate our knowledge about its inhabitation:

 1. all - inhabited by all values (aka any)
 2. none - inhabited by no values (aka void)
 3. some - may be inhabited by some values

 States with a basic type have an implicit label corresponding to the type:

 - any -> all
 - void -> none
 - int, bool, etc -> some

 The label for states with more complex types is determined by the labels of
 the states' children. For example, a negation type whose child has a label of
 'all' will itself have a label of 'none'. The rules for determining the label
 of a complex type are listed later in this document.

 A graph can be labeled by iteratively applying these rules until a fixpoint is
 reached.

 Even when a fixpoint is reached, some states may still be unlabeled. This can
 happen when there are cycles in the automaton, e.g. when a negation or record
 state has only itself as a child.

 If a state remains unlabeled after applying all our labeling rules then we
 know that its type is uninhabited and we can rewrite the state to void
 ('none'). We know the type is uninhabited since, if it were inhabited, the
 rules would have caused the state to be labeled with either 'some' or 'all'.
 Since the state does not have a label of 'some' or 'all', we know that it is
 uninhabited. This means we can change its type to void ('none').

 INHABITATION RULES

 Here are the rules we can use to propagate inhabitation labels through a
 graph. This assumes that we start with an automaton that has implicit labels
 for all states with basic types and with all other states unlabeled.

 - for states with type of negation
  - if the child is labeled 'all', change self to void ('none')
  - if the child is labeled 'some', label self as 'some'
  - if the child is labeled 'none', change self to any ('all')
  - leave unlabeled otherwise

 - for states with a type of union
  - if all children are 'none' (or there no children), change self to void ('none')
  - if any child is 'all', change self to any ('all')
  - if any child is 'some', label self as 'some'
  - leave unlabeled otherwise

 - for states with a compound type that can be empty (list, set)
  - label self as 'some'

 - for states with a compound type that must be non-empty (list+, set+, record)
  - if any child is 'none', change self to void ('none')
  - if all children are 'some' or 'all', label self as 'some'
  - leave unlabeled otherwise

 Note:
  In the rules above we sometimes change the type of states to void or any
  rather than relabeling the state as as 'none' or 'all'. This works because
  void has an implicit label of 'none' and any has an implicit label of 'all'.
  It makes things simpler if we change the type to void or any, rather than
  just relabeling the state, because then we can rely on the fact that all
  states labeled 'none' have a type of void and all states labeled 'all'
  have a type of any and vice versa.

 COMBINING WITH SIMPLIFICATION

 The inhabitation algorithm above overlaps with the existing Whiley
 simplification algorithm. The existing Whiley simplification algorithm has a
 lot of rules related to simplifying void and any values. These rules about
 void and any values are provided in a more general form in the type
 inhabitation algorithm above.

 The remaining simplication rules (i.e. those that aren't related to type
 inhabitation) are listed below:

 - !!X => X
 - X| => X
 - X|void => X
 - X<T | X> => T
 - (T_1 | T_2) | T_3 => (T_1 | T_2 | T_3)
 - T_1 | T_2 where T_1 :> T_2  => T_1

 These simplification rules are just as important as the type inhabitation
 rules if we want to produce a normal form for a type. When we normalise a type
 we should execute both the simplification rules and the type inhabitation
 rules so that we can get a minimal type.

 The two sets of rules can affect each other so we need to run both
 concurrently to get a minimal type. We can either do this by alternatively
 running the inhabitation and simplification rules until we reach a fixpoint.
 Another way, instead of alternating between each set of rules, is to combine
 the two sets of rules into a single set of rules, effectively allowing both
 algorithms to run concurrently.

 FINAL ALGORITHM

 func rewrite(automaton):
  loop until fixpoint is reached:
    // run local rewrite rules
    rewriteLocal(automaton)
    // strip out unreachable/uninhabited parts of the type
    for each state in the automaton:
      if the state is unlabeled then change its type to void ('none')
  return automaton
  
 func rewriteLocal(automaton)
  loop until fixpoint is reached:
    for each state in automaton:
      apply inhabitation and simplification rewrite rules to state

 EXAMPLE

 Here is an example that illustrates the algorithm. Consider the type
 int|X<X[]+>. This type has the following automaton:

    |
    v
  union
  |   |   +--+
  v   v   v  |
 int   list+ -+

 When we run the rewrite function we attach labels to each state. The state
 with the int type has an implicit label of s=some and all other states are
 _=unlabeled:

     |
     v
  union(_)
  |     |     +---+
  v     v     v   |
 int(s)  list+(_) -+

 Next we run rewriteLocal, which causes the union type to get a label of s=some
 because it has a child with the label s=some.

     |
     v
  union(s)
  |     |     +---+
  v     v     v   |
 int(s)  list+(_) -+

 The next time we run rewriteLocal there are no more changes; we're at a
 fixpoint. We now perform the second part of the loop in the rewrite function
 and find all unlabeled states, rewriting them to void with an implicit label
 of n=none.

     |
     v
  union(s)
  |     |
  v     v
 int(s)  void(n)

 Now we run rewriteLocal again and the void state is removed as a child of the
 union.

     |
     v
  union(s)
  |
  v
 int(s)   void(n)

 When rewriteLocal runs again the single-element union is replaced with its
 child.

     |
     v
  int(s)


 int(s)  void(n)

 At this point we reach a fixpoint in rewriteLocal and so we return to the
 rewrite function. We run the second part of the loop, but there are no
 unlabeled nodes to remove.

 We are at a fixpoint in the rewrite function so we return the following automaton:

     |
     v
  int

 int  void

 The normalisation algorithm will eventually strip out the unreachable nodes,
 giving a final type of 'int'.

 FIN
	INHABITATION ALGORITHM

	The idea is to label each state in a type's automaton with a ternary value to
	indicate our knowledge about its inhabitation:

	1. all - inhabited by all values (aka any)
	2. none - inhabited by no values (aka void)
	3. some - may be inhabited by some values

	States with a basic type have an implicit label corresponding to the type:

	- any -> all
	- void -> none
	- int, bool, etc -> some

	The label for states with more complex types is determined by the labels of
	the states' children. For example, a negation type whose child has a label of
	'all' will itself have a label of 'none'. The rules for determining the label
	of a complex type are listed later in this document.

	A graph can be labeled by iteratively applying these rules until a fixpoint is
	reached.

	Even when a fixpoint is reached, some states may still be unlabeled. This can
	happen when there are cycles in the automaton, e.g. when a negation or record
	state has only itself as a child.

	If a state remains unlabeled after applying all our labeling rules then we
	know that its type is uninhabited and we can rewrite the state to void
	('none'). We know the type is uninhabited since, if it were inhabited, the
	rules would have caused the state to be labeled with either 'some' or 'all'.
	Since the state does not have a label of 'some' or 'all', we know that it is
	uninhabited. This means we can change its type to void ('none').

	INHABITATION RULES

	Here are the rules we can use to propagate inhabitation labels through a
	graph. This assumes that we start with an automaton that has implicit labels
	for all states with basic types and with all other states unlabeled.

	- for states with type of negation
	- if the child is labeled 'all', change self to void ('none')
	- if the child is labeled 'some', label self as 'some'
	- if the child is labeled 'none', change self to any ('all')
	- leave unlabeled otherwise

	- for states with a type of union
	- if all children are 'none' (or there no children), change self to void ('none')
	- if any child is 'all', change self to any ('all')
	- if any child is 'some', label self as 'some'
	- leave unlabeled otherwise

	- for states with a compound type that can be empty (list, set)
	- label self as 'some'

	- for states with a compound type that must be non-empty (list+, set+, record)
	- if any child is 'none', change self to void ('none')
	- if all children are 'some' or 'all', label self as 'some'
	- leave unlabeled otherwise

	Note:
	In the rules above we sometimes change the type of states to void or any
	rather than relabeling the state as as 'none' or 'all'. This works because
	void has an implicit label of 'none' and any has an implicit label of 'all'.
	It makes things simpler if we change the type to void or any, rather than
	just relabeling the state, because then we can rely on the fact that all
	states labeled 'none' have a type of void and all states labeled 'all'
	have a type of any and vice versa.

	COMBINING WITH SIMPLIFICATION

	The inhabitation algorithm above overlaps with the existing Whiley
	simplification algorithm. The existing Whiley simplification algorithm has a
	lot of rules related to simplifying void and any values. These rules about
	void and any values are provided in a more general form in the type
	inhabitation algorithm above.

	The remaining simplication rules (i.e. those that aren't related to type
	inhabitation) are listed below:

	- !!X => X
	- X\| => X
	- X\|void => X
	- X<T \| X> => T
	- (T_1 \| T_2) \| T_3 => (T_1 \| T_2 \| T_3)
	- T_1 \| T_2 where T_1 :> T_2 => T_1

	These simplification rules are just as important as the type inhabitation
	rules if we want to produce a normal form for a type. When we normalise a type
	we should execute both the simplification rules and the type inhabitation
	rules so that we can get a minimal type.

	The two sets of rules can affect each other so we need to run both
	concurrently to get a minimal type. We can either do this by alternatively
	running the inhabitation and simplification rules until we reach a fixpoint.
	Another way, instead of alternating between each set of rules, is to combine
	the two sets of rules into a single set of rules, effectively allowing both
	algorithms to run concurrently.

	FINAL ALGORITHM

	func rewrite(automaton):
	loop until fixpoint is reached:
	// run local rewrite rules
	rewriteLocal(automaton)
	// strip out unreachable/uninhabited parts of the type
	for each state in the automaton:
	if the state is unlabeled then change its type to void ('none')
	return automaton

	func rewriteLocal(automaton)
	loop until fixpoint is reached:
	for each state in automaton:
	apply inhabitation and simplification rewrite rules to state

	EXAMPLE

	Here is an example that illustrates the algorithm. Consider the type
	int\|X<X[]+>. This type has the following automaton:

	\|
	v
	union
	\| \| +--+
	v v v \|
	int list+ -+

	When we run the rewrite function we attach labels to each state. The state
	with the int type has an implicit label of s=some and all other states are
	_=unlabeled:

	\|
	v
	union(_)
	\| \| +---+
	v v v \|
	int(s) list+(_) -+

	Next we run rewriteLocal, which causes the union type to get a label of s=some
	because it has a child with the label s=some.

	\|
	v
	union(s)
	\| \| +---+
	v v v \|
	int(s) list+(_) -+

	The next time we run rewriteLocal there are no more changes; we're at a
	fixpoint. We now perform the second part of the loop in the rewrite function
	and find all unlabeled states, rewriting them to void with an implicit label
	of n=none.

	\|
	v
	union(s)
	\| \|
	v v
	int(s) void(n)

	Now we run rewriteLocal again and the void state is removed as a child of the
	union.

	\|
	v
	union(s)
	\|
	v
	int(s) void(n)

	When rewriteLocal runs again the single-element union is replaced with its
	child.

	\|
	v
	int(s)


	int(s) void(n)

	At this point we reach a fixpoint in rewriteLocal and so we return to the
	rewrite function. We run the second part of the loop, but there are no
	unlabeled nodes to remove.

	We are at a fixpoint in the rewrite function so we return the following automaton:

	\|
	v
	int

	int void

	The normalisation algorithm will eventually strip out the unreachable nodes,
	giving a final type of 'int'.

	FIN