wickman · February 19, 2014 19:18
diff --git a/pystachio1_designdoc.txt b/pystachio1_designdoc.txt
 Pystachio 1.0 redesign



 1. Documents:

 Essentially a set of hierarchical key/value pairs a la a JSON document.

 Documents may contain:
  1. Leaves (string or numeric [integer, long, float, etc])
  2. Iterables (can contain values of 1, 2, 3; will always be coerced to
                tuples, regardless of underlying type e.g. set, so set
                properties will not be preserved when coerced to Document)
  3. Maps (keys /must/ be coercible to 1, values can be any of 1, 2, 3)

 The Document itself is just a Map (#3).  String leaves are always coerced to
 either a Ref or Fragment as described in the next section.  Maps are always
 coerced to Documents (in other words, Documents are recursive data types.)
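
 As a rough illustration of the coercion rules above, here is a minimal
 sketch.  It is an assumption-laden toy, not the real implementation: the
 function name coerce is hypothetical, keys are passed through unchanged, and
 it returns plain dicts where the real design would produce Documents.

```python
def coerce(value):
    """Normalize plain Python data along the lines described above."""
    if isinstance(value, dict):
        # Maps: values are coerced recursively (keys are left as-is here).
        return dict((key, coerce(item)) for key, item in value.items())
    if isinstance(value, (list, tuple, set, frozenset)):
        # Iterables always become tuples, so set semantics are lost.
        return tuple(coerce(item) for item in value)
    return value  # leaves: strings and numbers pass through unchanged


raw = {'nicknames': ['b', 'bibby'], 'tags': set(['x']), 'profile': {'yob': 1981}}
coerced = coerce(raw)
```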

 There are two important methods on documents:
  .bind(*args, **kw)
  .__call__(*args, **kw)

 These used to be different, but in Pystachio 1.0 they are aliased together.

 Each argument in args must be either another document or a dict which is to
 be merged with this document.  **kw should be passed to a Document and
 merged in as well.  Merge is performed by dict .update().

 Documents should provide a .raw_items() iterator analogous to .items() that
 iterates over the raw, unresolved contents of the document.  .bind() and
 .__call__() should merge .raw_items() from Documents, but plain .items()
 from dicts.  .items() should iterate over *resolved* versions of the
 dictionary contents in the case of Documents.

 NOTE: .bind and .__call__ functions return *new* Documents -- the original
 document is immutable and unchanged, so:

  >>> d = Document(hello = 'world')
  >>> d(hello = 'universe')

 Leaves 'd' unchanged, but instead returns a new Document(hello =
 'universe').

 Furthermore, as Documents are like dictionaries, they can be accessed via
 __getitem__.  Use of __getattr__ is discouraged, but it simply delegates to
 __getitem__.
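
 The merge and immutability rules can be sketched roughly as follows.  This
 is a toy stand-in that assumes a flat dict as backing store and skips
 resolution entirely; only .bind, .__call__, .raw_items, __getitem__ and
 __getattr__ come from the text above, everything else is hypothetical.

```python
class Document(object):
    """Toy stand-in for the Document described above (no Ref resolution)."""

    def __init__(self, *args, **kw):
        self._data = {}
        for arg in args:
            # Documents contribute raw (unresolved) items; dicts use .items().
            items = arg.raw_items() if isinstance(arg, Document) else arg.items()
            self._data.update(items)
        self._data.update(kw)

    def raw_items(self):
        return iter(list(self._data.items()))

    def bind(self, *args, **kw):
        # Always build a new Document; self is never mutated.
        return Document(self, *args, **kw)

    __call__ = bind

    def __getitem__(self, key):
        return self._data[key]

    def __getattr__(self, name):
        # Discouraged, but simply delegates to __getitem__.
        if name.startswith('_'):
            raise AttributeError(name)
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


d = Document(hello='world')
d2 = d(hello='universe')
```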



 2. Mustaches:

 We do not implement the Mustache document format (i.e. no loops or
 conditionals).  We just use mustaches '{{}}' to denote a 'pointer'.

 There are two kinds of indirect objects:
  1. Refs:
     A single mustache instance, e.g.
       {{foo}}
       {{foo.bar}}
       {{foo.bar[baz]}}
       {{foo[`baz`]}}

  2. Fragments
     A sequence of [... str, ref, str, ref, str ...] representing a string. 
     A fragment may neither start with a Ref nor end with a Ref.  A fragment
     cannot be [Ref(...)]; instead it will just be a Ref.

 It is possible to have the following case:
  {{foo.bar[{{baz}}]}}

 This will be parsed as:
  Fragment(['{{foo.bar[', Ref('baz'), ']}}'])

 The reason for the distinction between Refs and Fragments is due to how they
 are resolved.

 Ref resolution is simply "find the reference and substitute it in place of
 this value."  The process of finding the associated reference value is
 described in the next section.

 Fragment resolution on the other hand is an iterative process.  If there are
 any Refs within a fragment that cannot be found within the Document, raise
 NotFound as it cannot be resolved.  If all Refs can be resolved,
 ''.join(resulting strings) and reparse.  There are three possible outcomes:

  1. A single Ref: perform ref resolution (i.e., substitution)
  2. Fragment with no Refs: return fragment.components()[0] (its string representation)
  3. Fragment with Refs: repeat iteration
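
 The parse-and-resolve loop can be sketched as below.  This is a
 simplification under stated assumptions: fragments are plain lists, Refs are
 looked up in a flat {path: value} table rather than by walking a Document,
 and the {{&name}} escape is ignored.  The names parse, resolve and INNERMOST
 are hypothetical.

```python
import re

# Matches only innermost mustaches, i.e. ones with no braces inside.
INNERMOST = re.compile(r'\{\{([^{}]+)\}\}')


class NotFound(Exception):
    pass


class Ref(object):
    def __init__(self, path):
        self.path = path

    def __eq__(self, other):
        return isinstance(other, Ref) and self.path == other.path

    def __repr__(self):
        return 'Ref(%r)' % self.path


def parse(text):
    """Split text into [str, Ref, str, ...]; a lone mustache becomes a Ref."""
    components, pos = [], 0
    for match in INNERMOST.finditer(text):
        components.append(text[pos:match.start()])
        components.append(Ref(match.group(1)))
        pos = match.end()
    components.append(text[pos:])
    if len(components) == 3 and components[0] == '' and components[2] == '':
        return components[1]
    return components


def resolve(text, table):
    parsed = parse(text)
    while True:
        if isinstance(parsed, Ref):
            if parsed.path not in table:
                raise NotFound(parsed.path)
            return table[parsed.path]        # outcome 1: substitute the value
        if len(parsed) == 1:
            return parsed[0]                 # outcome 2: no Refs left
        pieces = []
        for part in parsed:
            if isinstance(part, Ref):
                if part.path not in table:
                    raise NotFound(part.path)
                part = str(table[part.path])
            pieces.append(part)
        parsed = parse(''.join(pieces))      # outcome 3: reparse and repeat
```

 Note that a lone Ref returns the looked-up value with its type intact,
 whereas a Fragment always produces a string.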



 3. Evaluating mustache refs

 There are three forms of refs:

  {{name}}
  {{name1.name2}}
  {{name1[name2]}}

 It is possible to escape a mustache using &:
  {{&name}}

 will always result in the string {{name}} rather than Ref('name').  The &
 should be stripped as late as possible.

 Implicitly, {{name}} is equivalent to {{.name}}, which on a document means
 __getitem__('name').  Furthermore, these can be composed together, e.g.
 {{name1.name2[name3]}}.  It should also be possible to escape names using
 back-ticks (this was not possible in Pystachio 0.x):

  {{`foo bar`}}
  {{name1.`foo bar`}}
  {{name1[`foo bar`]}}

 As such, back-ticks are not allowed within keys in Documents.  Names must be
 escaped if they do not conform to C-style variable names, i.e. if they do
 not match [a-zA-Z_][a-zA-Z0-9_]* in full.  This includes common team names
 such as `aurora-team`.
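
 A small sketch of the escaping rule, assuming a hypothetical helper named
 quote_name that decides whether a key can appear bare inside a mustache:

```python
import re

# C-style identifier, anchored so the whole name must match.
BARE_NAME = re.compile(r'[a-zA-Z_][a-zA-Z0-9_]*\Z')


def quote_name(name):
    """Return name as it would have to be written inside a mustache."""
    if '`' in name:
        raise ValueError('back-ticks are not allowed in keys: %r' % name)
    if BARE_NAME.match(name):
        return name
    return '`%s`' % name
```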

 In order to find a reference value, each of the 3 primary Document types
 must understand finding via . and []:

  1. Leaves: Dereference is not supported -- raise NotFound

  2. Iterables
    a) .-dereference: Not supported
    b) []-dereference: The dereference is coerced to integer (if it is not
       coercible, raise a NotFound error) and indexed.

  3. Maps
    a) .-dereference: Equivalent to __getitem__
    b) []-dereference: Equivalent to __getitem__
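
 The three cases can be sketched as a single dispatch function.  This
 assumes plain dicts and tuples in place of Documents, treats an
 out-of-range index as NotFound (the text does not say), and uses a
 hypothetical dereference(value, op, name) signature where op is '.' or '[]'.

```python
class NotFound(Exception):
    pass


def dereference(value, op, name):
    if isinstance(value, dict):
        # Maps: both forms are equivalent to __getitem__.
        try:
            return value[name]
        except KeyError:
            raise NotFound(name)
    if isinstance(value, (tuple, list)):
        if op == '.':
            raise NotFound('.-dereference is not supported on iterables')
        try:
            return value[int(name)]
        except (TypeError, ValueError, IndexError):
            raise NotFound(name)
    # Leaves (strings and numbers) cannot be dereferenced at all.
    raise NotFound(name)
```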


 *scoping rules*

 Since Documents may be contained within Documents, there are scoping rules to
 be aware of.  Consider the following document:

  {
    'name': '{{profile.first}} {{profile.last}}',
    'profile': {
       'title': 'mr.',
       'first': '{{title}} brian',
       'last': 'wickman',
    }
  }

 Resolving {{profile.last}} is unambiguous: dereference .profile, which
 results in the document {'title': 'mr.', 'first': '{{title}} brian', 'last': 'wickman'}. 
 Then dereference .last from said document resulting in 'wickman'.

 Resolving {{profile.first}} is slightly more nuanced.  You begin with
 resolving {{profile}}, then next you must resolve {{.first}}.  In order to
 resolve {{.first}}, we must resolve '{{title}} brian'.  '{{title}}' is
 scoped to the 'profile' document.  In this case, it's simple, as it resolves
 directly to 'mr.'.  However, should {{title}} not be found
 within 'profile', all enclosing documents must be searched *in stack order*.
 In other words, the top level document containing 'name' and 'profile' must be
 used to attempt to resolve {{title}}.  (If it were nested multiple levels,
 this would continue until no more documents are reached, at which point
 NotFound is raised.)
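
 The stack-order search can be sketched with a list of enclosing scopes,
 innermost last.  Plain dicts stand in for Documents and values are returned
 raw (unresolved); the lookup helper is hypothetical.

```python
class NotFound(Exception):
    pass


def lookup(name, scopes):
    """Search scopes from innermost to outermost; scopes[-1] is innermost."""
    for scope in reversed(scopes):
        if name in scope:
            return scope[name]
    raise NotFound(name)


document = {
    'name': '{{profile.first}} {{profile.last}}',
    'profile': {
        'title': 'mr.',
        'first': '{{title}} brian',
        'last': 'wickman',
    },
}
# While resolving inside 'profile', the stack is [top level, profile].
scopes = [document, document['profile']]
```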

 *special names*

 There are two special names in the dereferencing algorithm that alter
 behavior of resolution.

 1. self: Restricts resolution of the variable to within the current document
 and will not delegate to parent documents.  When the name exists locally
 this never changes the resolution result, but it does change error
 handling.  For example, the difference between {{name}} and {{self.name}}
 is that parent documents should never be able to provide the value for
 {{self.name}} should it not be provided within the scope of that document.

 2. super: Restricts resolution to the parent document.  For example, if you
 want explicit inheritance:

  task = Document(name = '{{super.name}}', attribute = 'value')
  job  = Document(name = 'the job', task = task)
  job.task.name will be 'the job'

 This can be used multiple times, e.g. {{super.super.cpu}}.

 This does complicate things slightly as Documents must retain the context in
 which they were evaluated:
  job.task is a Document with the parent document of job

 This is necessary in order for job.task.name to be properly evaluated when
 {{name}} is being resolved from "job".

 There are two approaches to implementing this functionality:
  i) If a Ref returns a Document 'd', yield d(super=self), but make Document
     aware that 'super' ought to be hidden from most introspection e.g.
     items(), raw_items(), and __str__.  This means that you must be very
     careful about doing resolution in that you do not lose the contents of
     self['super'].
  ii) Maintain a hidden _super attribute set by the parent and treat it
      specially.
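
 Approach (ii) might look roughly like the sketch below.  It is only an
 illustration: the Scope class and its child and find methods are
 hypothetical, values are returned raw, and only a single name is supported
 after any self/super prefixes.

```python
class NotFound(Exception):
    pass


class Scope(object):
    def __init__(self, data, parent=None):
        self._data = dict(data)
        self._super = parent  # hidden: never shows up in items()

    def items(self):
        return list(self._data.items())

    def child(self, key):
        # Nested documents remember the document they were reached from.
        return Scope(self._data[key], parent=self)

    def find(self, path):
        names = path.split('.')
        scope, restricted = self, False
        while names and names[0] in ('self', 'super'):
            if names.pop(0) == 'super':
                scope = scope._super
                if scope is None:
                    raise NotFound(path)
            restricted = True  # self/super never fall back to enclosing scopes
        if len(names) != 1:
            raise NotFound(path)
        while scope is not None:
            if names[0] in scope._data:
                return scope._data[names[0]]
            if restricted:
                break
            scope = scope._super
        raise NotFound(path)


job = Scope({'name': 'the job', 'cpu': 4,
             'task': {'name': '{{super.name}}', 'attribute': 'value'}})
task = job.child('task')
```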

 **Illustration 1**

 d = Document({
      'name': '{{profile.first}} "{{nicknames[{{profile.nick_index}}]}}" {{profile.last}}',
      'yob': '{{profile.yob}}',
      'nicknames': ['b', 'bibby', 'wickyman'],
      'profile': {
         'title': 'mr.',
         'first': '{{title}} brian',
         'last': 'wickman',
         'yob': 1981,
         'occupation': 'engineer',
         'nick_index': '1',
       },
    })

 assert d['name'] == d.name == 'mr. brian "bibby" wickman'
 assert d['yob'] == d.yob == 1981
 assert d['nicknames'][0] == d.nicknames[0] == 'b'
 assert dict(d.items()) == {
   'name': 'mr. brian "bibby" wickman',
   'yob': 1981,
   'nicknames': ('b', 'bibby', 'wickyman'),
   'profile': {
     'title': 'mr.',
     'first': 'mr. brian',
     'last': 'wickman',
     'yob': 1981,
     'occupation': 'engineer',
     'nick_index': '1',
   },
 }

 N.B. These should always return copies of the underlying structure, so that
 "d['nicknames'][0] = 23" should be a no-op, as d['nicknames'] is merely a
 copy of the original.



 4. Traits

 Once a document has been acquired, information must be extracted from that
 document.  Traits are effectively schemas that dictate how to extract and
 optionally serialize content from documents.

 In an ideal world, they would be completely separate from documents, but in
 order to maintain backwards compatibility with Pystachio 0.x, they must be
 slightly conflated with documents in that they must subclass documents.

 In other words, ideally trait expression and extraction would appear like:

  class Process(Trait):
    name         = Required(String)
    cmdline      = Required(String)
    daemon       = Default(Boolean, False)
    ephemeral    = Default(Boolean, False)
    max_failures = Default(Integer, 1)

  d = Document.from_json('process.json')
  process = d.extract_trait(Process)  # extracts trait and type checks
  subprocess.call(process.cmdline.split())

 Instead, Trait's metaclass must inject Document as a parent class and
 behave like so in order to maintain backwards compatibility:

  process = Process.from_json('process.json')
  process.check()
  subprocess.call(process.cmdline.split())

 However, they should not require _any_ shared methods with Documents, so
 they can be tested in isolation.

 This separation of concerns between Documents and Traits should make it
 simpler to extract Traits from other IDLs such as Thrift, for example using
 a library like ptsd:

  Process = ThriftTrait('thermos.thrift', 'Process')
  process = Process.from_json('process.json')
  process.check()
  process.serialize()  # serialize to thrift byte stream

 where thermos.thrift may look like:

  struct Process {
    1: required string name
    2: required string cmdline
    3: optional bool daemon = false
    4: optional bool ephemeral = false
    5: optional i16 max_failures = 1
  }

 In practice, we'll alias Struct = Trait to maintain backwards compatibility,
 but implicitly you can think of Structs as Documents with an implied Trait.


 5. Trait representation and extraction

 XXX(finishme)

 TBD.  Represent base types:
  Boolean  .coerce
  Integer  .coerce
  Float    .coerce
  String   .coerce
  Enum     .coerce

 Container types:
  List     .coerce
  Map      .coerce

 Trait.coerce(document) ?  Seems reasonable -- should also be compatible with
 ThriftTrait too.

 Requirements:
  Required
  Default

 Then Trait has a class attribute:

  _TYPE_MAP { name => ??? }
  _TRAIT_MAP { name => Value }
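
 Since this section is still open, the following is only one possible shape
 for .coerce, offered as a sketch.  CoercionError and the exact acceptance
 rules (e.g. which strings count as booleans) are assumptions, not decisions
 made by this document.

```python
class CoercionError(ValueError):
    """Hypothetical error type for values that do not fit a base type."""


class Integer(object):
    @classmethod
    def coerce(cls, value):
        if isinstance(value, bool):
            raise CoercionError('%r is a boolean, not an integer' % (value,))
        try:
            return int(value)
        except (TypeError, ValueError):
            raise CoercionError('%r is not an integer' % (value,))


class Boolean(object):
    @classmethod
    def coerce(cls, value):
        if isinstance(value, bool):
            return value
        if isinstance(value, str) and value.lower() in ('true', 'false'):
            return value.lower() == 'true'
        raise CoercionError('%r is not a boolean' % (value,))


class List(object):
    """Container type parameterized by the type of its elements."""

    def __init__(self, element_type):
        self.element_type = element_type

    def coerce(self, values):
        return tuple(self.element_type.coerce(value) for value in values)
```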


 6. Trait merging

 It should be possible to merge certain traits together and/or monkeypatch
 existing traits.  Traits should support an .extend() method that accepts a
 new trait and merges the two together to produce a new one.

 For example,

  class Job(Trait):
    name = Required(String)
    task = Required(Task)
  
  class Announceable(Trait):
    announce = Announce

  AnnounceableJob = Job.extend(Announceable)

 Then the Pystachio Loader (which should remain essentially unchanged from
 version 0.x to version 1.x) can do things like:

  Job = Job.extend(AnnounceableJob)

 so that it is possible to create organizational-specific configuration on
 top of jobs and tasks.

 Unfortunately this only works elegantly for top-most declarations, so it
 may be worth considering something along the lines of:

  Job = Job.extend_attribute('task', Task.extend(HealthCheckable))

 which could correspondingly be chained:

  Job = Job.extend_attribute('task',
      Task.extend_attribute('process', Process.extend(HealthCheckable)))
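
 One way .extend and .extend_attribute could behave is sketched below.  It
 assumes that a trait's fields are class attributes wrapped in a marker such
 as Required, that the extending trait wins on name conflicts, and that
 merged traits are fresh classes rather than subclasses; Trait.fields is a
 hypothetical helper and str stands in for String.

```python
class Required(object):
    """Hypothetical field marker; only the wrapped type is tracked here."""

    def __init__(self, type_):
        self.type = type_


class Trait(object):
    @classmethod
    def fields(cls):
        found = {}
        for klass in reversed(cls.__mro__):
            for key, value in vars(klass).items():
                if isinstance(value, Required):
                    found[key] = value
        return found

    @classmethod
    def extend(cls, other):
        attrs = cls.fields()
        attrs.update(other.fields())  # the extending trait wins on conflicts
        return type(cls.__name__, (Trait,), attrs)

    @classmethod
    def extend_attribute(cls, name, new_type):
        attrs = cls.fields()
        attrs[name] = Required(new_type)
        return type(cls.__name__, (Trait,), attrs)


class Task(Trait):
    name = Required(str)


class HealthCheckable(Trait):
    health_check = Required(str)


class Job(Trait):
    name = Required(str)
    task = Required(Task)


PatchedJob = Job.extend_attribute('task', Task.extend(HealthCheckable))
```

 Because each call builds a new class, the original Job and Task are left
 untouched.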


 7. Lambdas

 Just joking, I'm not proposing implementing Lambdas for Pystachio.

 However, there are certain use-cases where it makes sense to provide a
 smarter document.  Consider a typical schema:


 class Resources(Trait):
  cpu = Default(Float, 1.0)
  ram = Default(Integer, 1 * GB)
  disk = Default(Integer, 1 * GB)

 class Process(Trait):
  name = Required(String)
  cmdline = Required(String)

 class Task(Trait):
  name = Default(String, '{{processes[0].name}}')
  processes = Required(List(Process))
  resources = Default(Resources, Resources())


 Now you may construct a task in the following manner:
  task = Task(
    processes = [Process(name = 'hello_world', cmdline = 'echo hello world')],
    resources = Resources(ram = 8 * GB),
  )


 Consider the case where we want to run a JVM but must set things like -Xmx
 properly.  We may not want to set -Xmx blindly to -Xmx{{super.resources.ram}}
 but instead perform arithmetic on the values provided.

 Documents accept both dicts and Documents as Mappings, but explicitly only
 coerce dicts to Documents.  Therefore, it is perfectly acceptable to
 subclass Document to do smarter things.


 import math

 class JavaProcess(Document):
   def __init__(self, jar_name, args, **kw):
     self._jar_name = jar_name
     self._args = args
     super(JavaProcess, self).__init__(**kw)

   def __produce_cmdline(self):
     # The enclosing Task provides the resources, hence 'super'.
     cpu = self._resolve('{{super.resources.cpu}}')
     assert cpu >= 1, 'cpu cannot be less than 1'
     if cpu <= 8:
       # The JVM flag needs an integer, even if cpu is a Float.
       gc_threads = int(cpu)
     else:
       # 5.0 forces float division, so integer cpu values are not truncated.
       gc_threads = int(math.ceil(8 + (cpu - 8) * 5.0 / 8))
     return 'java -jar %s -XX:ParallelGCThreads=%s %s' % (
         self._jar_name, gc_threads, ' '.join(self._args))

   def __getitem__(self, name):
     # Only cmdline is computed; everything else resolves normally.
     if name == 'cmdline':
       return self.__produce_cmdline()
     return super(JavaProcess, self).__getitem__(name)


 This way it is possible to do:

  task = Task(
    processes = [JavaProcess('foo.jar', ['-httpPort', '80'], name='foo')],
    resources = '{{profile.resources}}',
  )(profile = {'resources': {'cpu': 16}})

 and have the following hold true:

  assert task.processes[0].cmdline == 'java -jar foo.jar -XX:ParallelGCThreads=13 -httpPort 80'


 whereas before it was not possible to add logic to template evaluation.
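
 To make the arithmetic concrete, here is the thread-count rule from
 __produce_cmdline pulled out as a standalone function (the name gc_threads
 is just for illustration).  It uses float division so that the result is
 the same whether cpu arrives as an int or a float.

```python
import math


def gc_threads(cpu):
    assert cpu >= 1, 'cpu cannot be less than 1'
    if cpu <= 8:
        return int(cpu)
    # One thread per core up to 8 cores, then 5/8 of a thread per extra core.
    return int(math.ceil(8 + (cpu - 8) * 5.0 / 8))


# cpu = 16: 8 + (16 - 8) * 5 / 8 = 8 + 5 = 13
threads = gc_threads(16)
```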

 The downside of course is that it is no longer possible to do d =
 Document.from_json('task.json') and have any way to express that a process
 should be evaluated as a JavaProcess.  However, task.to_json() would work
 correctly in the above situation.


 8. Dynamic Documents

 In the same spirit as section 7, it is possible to consider dynamic
 documents.  There are specific use-cases where we have need for this at
 Twitter:

  1) Resolving package locations
  2) Resolving jenkins artifacts
  3) Resolving build artifacts

 For example,
 {{artifactory[`wickman-cache`][`org.apache.aurora.scheduler`][`0.5.0`]}}

 This might dynamically resolve this artifact and replace it with an https
 URL that can be curled.
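
 A dynamic document of this kind could be sketched as an object whose
 []-dereference narrows the lookup one step at a time.  Everything here is
 hypothetical: the Artifactory class, the placeholder host, and the URL
 layout are assumptions, and a real version would query a service instead of
 formatting a string.

```python
class Artifactory(object):
    """Hypothetical dynamic document: each [] dereference narrows the lookup."""

    BASE_URL = 'https://artifactory.example.com'  # placeholder host

    def __init__(self, path=()):
        self._path = tuple(path)

    def __getitem__(self, name):
        # Return a new, more specific node rather than mutating this one.
        return Artifactory(self._path + (name,))

    def resolve(self):
        # Assumes exactly repository / package / version have been supplied.
        if len(self._path) != 3:
            raise KeyError('expected repository, package and version')
        return '/'.join((self.BASE_URL,) + self._path)


artifactory = Artifactory()
url = artifactory['wickman-cache']['org.apache.aurora.scheduler']['0.5.0'].resolve()
```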