Markov Jr. is an open source C# application that generates procedural content, primarily by applying Markov rewrite rules to a 2D or 3D grid. A rewrite rule has an input and output pattern, which essentially specifies what pattern to look for in the existing grid, and what to replace it with.
For example, given a 2D grid, this would replace any white dot with a white cross:
***/*W*/*** :: *W*/WWW/*W*
The left hand side is the rule input, and the right hand side is the output. The / character is used to delimit rows, and space is used to delimit Z-layers (in 3D grids). The input rule above translates to the 2D pattern:
***
*W*
***
Note that the input and output of a rule should be the same size (i.e. character length).
Markov Jr. uses a limited set of characters that are associated with colors; in this case W represents white. Wildcards are represented by *. A wildcard in a rule's input matches any color; a wildcard in the rule's output means the corresponding grid cell is not modified by the rule.
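To make the pattern semantics concrete, here is a minimal sketch of matching and applying a single rule at one grid position. This is not Markov Jr.'s actual C# internals; the function names are invented for illustration.

```python
def parse_pattern(s):
    """Parse a pattern like '***/*W*/***' into a list of rows."""
    return s.split("/")

def matches(grid, pattern, top, left):
    """True if the input pattern matches the grid at (top, left).
    '*' in the input matches any color."""
    for r, row in enumerate(pattern):
        for c, ch in enumerate(row):
            if ch != "*" and grid[top + r][left + c] != ch:
                return False
    return True

def apply_rule(grid, out_pattern, top, left):
    """Write the output pattern; '*' leaves the cell unchanged."""
    for r, row in enumerate(out_pattern):
        for c, ch in enumerate(row):
            if ch != "*":
                grid[top + r][left + c] = ch

# A 3x3 black grid with a white dot in the middle:
grid = [list("BBB"), list("BWB"), list("BBB")]
inp = parse_pattern("***/*W*/***")
out = parse_pattern("*W*/WWW/*W*")
if matches(grid, inp, 0, 0):
    apply_rule(grid, out, 0, 0)
# grid now contains the white cross: BWB / WWW / BWB
```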
Rules can specify a combination of rotations/reflections allowed. Internally this creates duplicates of the rule rotated/reflected as specified, so is primarily a convenience for authoring.
Markov Jr. specifies Rule Nodes, which form a tree structure. Different rule nodes perform different types of operations and/or influence which of their child nodes are executed at any point.
In general, the runtime steps the simulation forward by asking the currently executing node to apply its transformation; the node returns false once it has no more work to do. Markov Jr. programs generally have a top-level Sequence node, which executes each child node in order, moving on to the next node in its list as each one finishes. Once all nodes have finished executing, execution ends.
There are three main types of Rule Nodes that execute rewrite rules:
- The <one> node takes a set of rewrite rules. Each step, it finds all rules that have a match somewhere in the grid and applies a single, randomly chosen match. The node finishes when no remaining rule has any matches.
- The <all> node takes a set of rewrite rules and applies all possible rewrites of all its rules in sequence, all within one step. The sequence in which the rewrites are performed is random, and an earlier rewrite in the sequence might prevent later rewrites from taking place.
- The <prl> (parallel) node applies all possible rewrites of all its rules each step, or returns false if no rules match. Note that since it performs all rewrites in parallel (against a static input state), it might rewrite an individual grid cell multiple times if rules overlap. It seems intended primarily as an optimization for cases where the guarantees of <all> are not required.
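The step semantics of the three node types can be sketched as follows. The find_matches/apply_match/still_matches helpers are hypothetical stand-ins, not Markov Jr.'s actual API:

```python
import random

def step_one(grid, rules, find_matches, apply_match):
    """<one>: pick a single random match and apply it."""
    matches = find_matches(grid, rules)
    if not matches:
        return False              # node is done
    apply_match(grid, random.choice(matches))
    return True

def step_all(grid, rules, find_matches, apply_match, still_matches):
    """<all>: apply every match in random sequence within one step;
    earlier rewrites may invalidate later ones."""
    matches = find_matches(grid, rules)
    if not matches:
        return False
    random.shuffle(matches)
    for m in matches:
        if still_matches(grid, m):
            apply_match(grid, m)
    return True

def step_prl(grid, rules, find_matches, apply_match):
    """<prl>: matches are found against a static snapshot, then all are
    applied; overlapping matches may rewrite a cell more than once."""
    matches = find_matches(grid, rules)
    if not matches:
        return False
    for m in matches:
        apply_match(grid, m)
    return True
```

For example, with a toy 1D "rule" that turns every B cell white, step_prl whitens the whole grid in one step, while step_one whitens one randomly chosen cell per step.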
Here is an example of a simple rule program that makes a maze:
1. <one values="BRW" origin="True">
2. <rule in="RBB" out="WWR"/>
3. <rule in="R*W" out="W*R"/>
4. </one>
- Line 1: defines a One node, and the list of all possible values (colors) that can exist in the grid. The first listed color (B = Black) is the starting color for all grid cells. The "origin" attribute specifies that a single pixel of the second listed color (R = Red) should be placed in the center of the grid.
- Line 2: defines the first rule for the One node. This searches for any occurrence of RBB in the grid and replaces it with WWR. By default, rule patterns are fully rotatable/mirrorable, so this will also match BBR, R/B/B, etc.
- Line 3: defines the second rule for the One node. A * wildcard is used in both the rule input and output, meaning that any color matches at that position, and the cell will not be modified when the rule is applied.
Each step, the runtime finds all potential matches for both rules, then (by default) picks a random one to apply. On the first step, a match for the first rule RBB will be found at the center of the grid (which holds a single red pixel surrounded by black). The rule replaces those three grid cells with WWR, effectively moving the red "cursor" forward. On the second step, there are matches for both rules: RBB matches the area in front of the cursor, while WWR, the area behind the cursor, matches the mirrored form of R*W. The runtime randomly picks which rule to apply this step, adjusts colors based on the rule output, and continues to the next step. Effectively this moves the red "cursor" randomly around the grid, leaving white pixels behind. If the red cursor reaches a dead end (no neighboring black pixels), no match for RBB will be found, so it is forced to backtrack via the second rule. Once there are no more matches for either rule, the program terminates.
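As a toy illustration (not taken from the Markov Jr. codebase), the cursor behavior can be simulated in one dimension with the rule RBB -> WWR and its mirror BBR -> RWW. Note this sketch omits the backtracking rule R*W -> W*R, so the cursor simply walks until blocked:

```python
import random

def step(grid):
    """One <one>-style step: collect all matches of RBB / BBR, apply one
    at random. Returns False when no matches remain."""
    matches = []
    s = "".join(grid)
    for i in range(len(grid) - 2):
        if s[i:i + 3] == "RBB":
            matches.append((i, "WWR"))
        if s[i:i + 3] == "BBR":
            matches.append((i, "RWW"))
    if not matches:
        return False
    i, out = random.choice(matches)
    grid[i:i + 3] = list(out)
    return True

grid = list("BBBBRBBBB")   # black row with a red cursor in the middle
while step(grid):
    pass
# The single red cursor has walked, leaving white behind, until no two
# black cells remain adjacent to it in either direction.
```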
Potential fields are a generalization of Dijkstra fields. One can think of a Dijkstra field as an iterative flood-fill of a graph, originating from a point or set of points. The origin cells are initialized to 0. Each cell neighboring an origin cell is set to 1, then untouched neighbors of those cells are set to 2, and so on. Once the Dijkstra field has been built, the value of each cell represents its distance from the nearest origin cell. A simple path-finding algorithm from any point on the field back to a source is to repeatedly move to the neighboring cell with the lowest distance value.
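A minimal sketch of building such a field with a breadth-first flood fill (names are illustrative; unreachable cells keep -1):

```python
from collections import deque

def dijkstra_field(grid, origins, passable):
    """grid: 2D list of color chars; origins: list of (row, col);
    passable: colors the fill may spread through."""
    h, w = len(grid), len(grid[0])
    dist = [[-1] * w for _ in range(h)]
    q = deque()
    for (r, c) in origins:
        dist[r][c] = 0
        q.append((r, c))
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w
                    and dist[nr][nc] == -1
                    and grid[nr][nc] in passable):
                dist[nr][nc] = dist[r][c] + 1
                q.append((nr, nc))
    return dist
```

On a 3x3 white grid with a blue wall cell in the center and the origin at a corner, the opposite corner gets distance 4 (the path must go around), and the wall cell stays -1.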
Potential fields in Markov Jr. specify a color to create the field for, the color that serves as the "origin points", and a set of colors specifying the fillable area (i.e. the "substrate"). For example:
<field for="G" from="R" on="WY" />
This creates a potential field for the color Green, where all Red pixels serve as the source points for the "flood fill", and the flood fill can touch only White or Yellow cells. Any cells that cannot be reached (e.g. a Blue wall dividing the grid in half) would have a negative potential.
Potential fields must be specified within a Rule Node. The user can specify whether a field is created on each step, or only the first time a rule node is (re)visited. When the rule node makes a choice of which potential rewrite to apply, it consults the potential fields: if the rewrite would change a color on the grid, it consults the potential field for that color at that cell. If the potential field indicates that cell is unreachable (a negative value in the potential field's cell), that rule application is disallowed. Otherwise, a score is calculated, which is the sum of all values in the potential field that the rewrite would change. The rule application that has the lowest score, representing the quickest approach to the field origin, is chosen.
Some additional tuning parameters are available that allow some randomized influence, so that the lowest score is not necessarily always picked. In particular, the "temperature" attribute on a rule node adds some randomization to each score, allowing you to control how quickly the field is followed.
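The selection logic can be sketched as follows. The scoring (sum of field values over changed cells, unreachable cells disallowed) follows the description above, but the uniform noise model used for temperature here is an assumption; the exact randomization in Markov Jr. may differ:

```python
import random

def choose_match(candidates, field, temperature=0.0):
    """candidates: list of dicts with a 'changed_cells' list of (r, c).
    Picks the lowest-scoring allowed rewrite; temperature > 0 perturbs
    scores so the minimum is not always chosen."""
    best, best_score = None, None
    for match in candidates:
        cells = match["changed_cells"]
        if any(field[r][c] < 0 for (r, c) in cells):
            continue                  # unreachable: rewrite disallowed
        score = sum(field[r][c] for (r, c) in cells)
        if temperature > 0:
            score += temperature * random.random()   # assumed noise model
        if best_score is None or score < best_score:
            best, best_score = match, score
    return best
```

With temperature 0, the candidate whose changed cells sum to the lowest field value always wins, and any candidate touching an unreachable cell is skipped.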
Potential fields are primarily useful for guiding or biasing rule transformations toward points on the map. Here is a simple example from BiasedMazeGrowth.xml:
<sequence values="NWAUB" origin="True">
<prl in="N/*" out="U/*" symmetry="()"/>
<prl in="*/U" out="*/B" symmetry="()"/>
<one temperature="10.0">
<rule in="WBB" out="WAW"/>
<field for="W" to="N" on="B" recompute="False"/>
</one>
</sequence>
This first initializes the grid to Brown (N), with a single white pixel in the center. Then the first parallel (prl) node replaces all 1x2 vertical blocks that have Brown on top with Blue (U) on top. Symmetry is disabled via "()", so the rule applies only to that exact configuration. This leaves a single horizontal Brown line on the bottom of the grid, with everything else Blue.
The next parallel node replaces all 1x2 vertical blocks that have Blue on the bottom with Black on the bottom. This replaces everything except the bottom and top rows with Black. In effect, the two <prl> nodes create a black background with a horizontal blue line at the top of the grid, a horizontal brown line at the bottom, and a white pixel in the center.
The next One node has a single rule that replaces WBB with WAW (A is grey). Applied without a potential field, this rule would simply grow the white/grey structure in random directions. However, when the One node is first visited, it creates a potential field for White that "points toward" Brown (with the potential field flood-filling all Black cells). This biases the selection of potential rule applications toward those that place white cells nearer the bottom of the grid (where the brown line is). The temperature attribute controls how strongly the node biases toward rewrites that move white toward the bottom of the grid versus rewrites that move in other directions.
While potential fields are useful for guiding results toward what is desired, Markov Jr. supports a stronger means of constraining outputs via what it calls observations. These essentially state a desired goal state that must be met, and bias the random selections toward meeting that state.
Like potential fields, observations are added to Rule Nodes. An observation defines a source color and one or more desired colors as the goal, as well as a "search color", which specifies which grid cells this observation should apply to. For example:
<observe value="G" from="B" to="R"/>
This specifies that "for every Green grid cell, initialize it to Black and ensure it eventually becomes Red."
Every step, all observations in the rule node are consulted to create a future-state, via the following:
For each cell of the current grid state:
If the cell is an observation's "search color",
Set the future-state for this cell to the observation's desired color ("to").
Set the current state for this cell to its source color ("from").
Otherwise,
Set the future-state for this cell to its current state.
Thus, the future-state is initialized to the current state, except all observed values are replaced with their desired values. In practice, the set of observations for a rule should be used to specify the desired final state. Implicit here is that anything unobserved should not change. Note that since an observation can specify multiple "to" values, there may be multiple acceptable final states. For example, a "to" value of WR on an observation indicates that the final value could be White or Red.
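The future-state construction above can be sketched directly from the pseudocode. The observation representation here (a dict keyed by search color) is hypothetical:

```python
def build_future_state(grid, observations):
    """grid: 2D list of color chars.
    observations: {search_color: (source_color, goal_colors)}, where
    goal_colors may contain multiple acceptable final colors (e.g. "WR").
    Returns (new_current, future) per the pseudocode: observed cells are
    reset to their source color now, and set to their goal in the future."""
    current, future = [], []
    for row in grid:
        crow, frow = [], []
        for cell in row:
            if cell in observations:
                src, goals = observations[cell]
                crow.append(src)     # current state <- source color
                frow.append(goals)   # future state <- desired color(s)
            else:
                crow.append(cell)    # unobserved cells must not change
                frow.append(cell)
        current.append(crow)
        future.append(frow)
    return current, future
```

For the <observe value="G" from="B" to="R"/> example, a Green cell becomes Black in the current state and Red in the future-state, while other cells carry over unchanged.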
From here, there are two methods of inference: unidirectional inference and bidirectional inference. The latter is more powerful but much slower.
Given a future-state, a potential field is created for each possible color in the grid (which is known in advance). If there are four possible colors on the grid, four potential fields are created.
Each cell in a color's potential field is initialized to a distance of 0 if that is the future-state's color at that cell, or -1 otherwise. All potential field cells set to 0 are added to a queue.
One by one, cells are removed from the queue; each entry carries the location of the cell, the color of its potential field, and its distance value. The distance of a potential field cell corresponds to the number of rule applications remaining to reach the desired color. Since each color's potential field was initialized from the desired future-state, these initial distances are zero.
For each cell pulled from the queue, each rule in the observation's parent Rule Node is considered, to see whether the rule's output could be applied there. Multiple offsets of the rule may need to be considered if the rule has multiple occurrences of the same color, e.g. WAW. To determine whether the application is allowed, the potential fields are checked to see whether each cell in the rule's output corresponds with a change at a lesser distance.
If a rule can be applied, the distance of any previously unvisited cell (where the potential is -1) is set to the distance of the cell popped from the queue, plus one. All of these cells are then added to the queue. The process continues until the queue has been fully emptied. Once complete, each cell of each color's potential field represents how far away the desired future-state would be if that cell were changed to that color.
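A greatly simplified sketch of backward propagation, assuming only single-cell rules of the form in_color -> out_color. Markov Jr. handles multi-cell patterns and offsets, but the queue-and-levels structure is the same idea:

```python
from collections import deque

def backward_potentials(future, colors, rules):
    """future: 2D grid of goal colors; rules: [(in_color, out_color), ...].
    Returns {color: 2D field}, where field[r][c] is the number of rule
    applications separating that color at (r, c) from the future-state,
    or -1 if holding that color there makes the goal unreachable."""
    h, w = len(future), len(future[0])
    fields = {col: [[-1] * w for _ in range(h)] for col in colors}
    q = deque()
    for r in range(h):
        for c in range(w):
            fields[future[r][c]][r][c] = 0    # goal colors start at 0
            q.append((future[r][c], r, c, 0))
    while q:
        color, r, c, d = q.popleft()
        # A rule in_color -> color means in_color sits one step further
        # back from the goal at this cell.
        for in_color, out_color in rules:
            if out_color == color and fields[in_color][r][c] == -1:
                fields[in_color][r][c] = d + 1
                q.append((in_color, r, c, d + 1))
    return fields
```

With goal color R at a cell and rules B -> W and W -> R, the fields record R at distance 0, W at 1, and B at 2: exactly the "rule applications remaining" reading described above.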
Note that, currently in Markov Jr., the potential fields are recomputed each step. While it is possible to see where this might be desirable, it is unclear whether doing so is strictly necessary.
Once the potential fields have been built, selection of which potential rule application to apply is done similarly to what was described in the "Potential Fields" section: the rule application that moves toward the goal the quickest (on average) is chosen (with some user-controlled biasing). In particular, rule applications that would lead to states where the desired outcome is impossible are avoided, because they will appear in a color's potential field as -1, meaning that if a cell was changed to that color, the desired future-state would be impossible to achieve.
Unidirectional inference can thus be summarized as looking backward from the desired future-state and finding all transformations that would lead from the current state to that future-state, with a distance value acting as a heuristic to guide toward the future-state. It is important to note that this does not guarantee that the future-state will be found, given an arbitrary set of rules. Since it essentially follows an average of multiple potential fields, it remains possible to get "stuck" in a dead end of decisions. Still, this can be worthwhile: it is fairly efficient to calculate and can be used to encode "weak" constraints; i.e. constraints that are desirable but not required, or fuzzy.
For hard constraints, the more powerful but less efficient Bidirectional Inference method is needed.
Unidirectional inference employs backward propagation: it starts with the desired future-state and essentially applies rules in reverse to figure out paths from the current state. As its name implies, bidirectional inference also performs forward propagation.
Like backward propagation, forward propagation calculates a separate potential field for every possible color. Instead of initializing from the future-state, the potential field cells are initialized to 0 for the current state of the grid. Propagation proceeds the same way as backward propagation: cells are dequeued, rule applications are considered, and the potential fields are "grown" outward until the queue is empty. Whereas backward propagation creates fields representing the transformations needed to reach the future-state, forward propagation creates fields representing all states reachable from the current state (and how many rule applications it takes to reach each).
Once a forward and backward potential set has been computed, forward and backward estimates are computed. The forward estimate is determined by inspecting the future-state: for each cell, given the color(s) the future-state specifies there, the forward potential field value for that color/cell is gathered and summed. The backward estimate is the mirror image: given the current state, the backward potential field value for each cell's current color is gathered and summed.
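The two estimates can be sketched as follows, assuming per-color potential fields stored as a dict of 2D arrays with -1 meaning unreachable (a representation chosen here for illustration):

```python
def forward_estimate(forward_fields, future):
    """Sum, over all cells, the forward potential of the color the
    future-state wants at that cell. None signals an unreachable goal."""
    total = 0
    for r, row in enumerate(future):
        for c, color in enumerate(row):
            v = forward_fields[color][r][c]
            if v < 0:
                return None
            total += v
    return total

def backward_estimate(backward_fields, current):
    """Sum, over all cells, the backward potential of each cell's
    current color."""
    total = 0
    for r, row in enumerate(current):
        for c, color in enumerate(row):
            v = backward_fields[color][r][c]
            if v < 0:
                return None
            total += v
    return total
```

A forward estimate of zero means every cell already holds its desired color, i.e. the goal has been reached.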
From here, the root node of a state-tree is constructed. This tree represents all potential future actions: each node in the tree contains a copy of the grid at a given state, and the children of each node represent all possible choices that could be applied. Since this state-tree would quickly become combinatorially massive, a search heuristic is employed (as well as an optional, user-specified search depth).
State-tree nodes are added to a priority-queue, beginning with the root node. The priority of a node is calculated at the time of insertion based on the forward/backward estimates, a depth coefficient, and a small amount of randomization. The depth coefficient influences whether to prefer nodes that are deeper in the search tree (which might be closer to the desired state) or higher (which might be faster by avoiding deep or difficult-to-solve subtrees).
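One plausible shape for such a priority is sketched below; the exact formula and coefficients in Markov Jr. are assumptions here, but it combines the same ingredients: the forward/backward estimates, a depth coefficient, and a little randomization for tie-breaking. Lower values pop first:

```python
import heapq
import random

def node_priority(forward_est, backward_est, depth, depth_coefficient):
    """Smaller is better. The estimates approximate remaining/elapsed work;
    a positive depth_coefficient favors deeper nodes, a negative one favors
    shallower nodes; the noise term breaks ties randomly."""
    noise = 1e-4 * random.random()
    return forward_est + backward_est - depth_coefficient * depth + noise

frontier = []
heapq.heappush(frontier, (node_priority(10, 8, 0, 0.5), "root"))
heapq.heappush(frontier, (node_priority(4, 3, 2, 0.5), "child"))
priority, best = heapq.heappop(frontier)   # the deeper, cheaper node pops first
```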
Each time a node is removed from the priority-queue, its range of potential child states is computed. For <one> nodes, this creates a child node for each possible rule application, given the state of the parent node. For <all> nodes, this creates a child node for each permutation of rule-application order. Recall that <all> nodes execute as many of their rules as possible in sequence within a single step -- this by itself can result in a combinatoric explosion if the author is not careful.
The created child nodes are then checked to see whether their state already exists as a node in the state-tree. This can happen: different combinations or orders of rule transformations might lead to the same state. If it occurs, and the current node's depth is less than the existing node's depth, the existing node is reparented and reinserted into the queue with a new priority. This ensures that the smallest number of transformations is used to reach that state (if it is determined desirable).
Otherwise, if the state is new, a new state node is created, its forward/backward potentials are computed, and it is added to the priority-queue. A node is determined to have reached the goal when its forward estimate equals zero, i.e., the future-state and the node's state match. Upon finding a desired state, a trajectory is calculated by walking each node's parent in the state-tree, starting from the desired-state node and storing the grid state at each point. This yields a vector of grid states representing the rule transformations that lead from the current state to the desired state. Once a trajectory has been computed, the search-tree is disposed.
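Trajectory reconstruction by walking parent links is straightforward to sketch; the node representation (dicts with "state" and "parent" keys) is hypothetical:

```python
def compute_trajectory(goal_node):
    """Walk parent links from the goal node back to the root, then reverse,
    yielding the grid states in the order they should be replayed."""
    states = []
    node = goal_node
    while node is not None:
        states.append(node["state"])
        node = node["parent"]
    states.reverse()
    return states
```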
Once a node has a computed trajectory, it simply applies the next state in the trajectory each time the runtime steps, until it reaches the end. At that point, the Rule Node returns false to indicate that the next Rule Node should begin execution (or the program should terminate if no rule nodes are left). This multi-step application of the trajectory is not strictly necessary - execution could skip directly to the trajectory's destination, since it has been proven reachable - but following the trajectory step by step aids visualization and animation (see the Sokoban puzzles).
This is brilliant, and the inference discussion reminds me of Andrew Wuensche's and Mike Lesser's gorgeous coffee table book entitled "The Global Dynamics of Cellular Automata", and the algorithm for directly computing the multiple merging trajectories of the systems past, and running CA backwards in time to compute a state's predecessors with a direct reverse algorithm.
I've written about it on Hacker News:
https://news.ycombinator.com/item?id=14468707
DonHopkins on June 2, 2017 | on: Oh My Gosh, It’s Covered in Rule 30s
There's a thing called a "Garden of Eden" configuration that has no predecessors, which is impossible to get to from any other possible state.
For a rule like Life, there are many possible configurations that must have been created by God or somebody with a bitmap editor (or somebody who thinks he's God and uses Mathematica as a bitmap editor, like Stephen Wolfram ;), because it would have been impossible for the Life rule to evolve into those states. For example, with the "Life" rule, no possible configuration of cells could ever evolve into all cells with the value "1".
https://en.wikipedia.org/wiki/Garden_of_Eden_(cellular_automaton)
For a rule that simply sets the cell value to zero, all configurations other than pure zeros are garden of eden states, and they all lead directly into a one step attractor of all zeros which always evolves back into itself, all zeros again and again (the shortest possible attractor loop that leads directly to itself).
There is a way of graphically visualizing that global rule state space, which gives insight into the behavior of the rule and the texture and complexity of its state space!
Andrew Wuensche and Mike Lesser published a gorgeous coffee table book entitled "The Global Dynamics of Cellular Automata" that plots out the possible "Garden of Eden" states and the "Basins of Attraction" they lead into of many different one-dimensional cellular automata like rule 30.
http://uncomp.uwe.ac.uk/wuensche/gdca.html
The beautiful color plates begin on page 79 in the free pdf file:
https://donhopkins.com/home/global_dynamics_of_CA.pdf
I've uploaded the money shots to imgur:
http://imgur.com/gallery/s3dhz
Those are not pictures of 1-d cellular automata rule cell states on a grid themselves, but they are actually graphs of the abstract global state space, showing merging and looping trajectories (but not branching since the rules are deterministic -- time flows from the garden of eden leaf tips around the perimeter into (then around) the basin of attractor loops in the center, merging like springs (GOE) into tributaries into rivers into the ocean (BOA)).
The rest of the book is an atlas of all possible 1-d rules of a particular rule numbering system (like rule 30, etc), and the last image is the legend.
He developed a technique of computing and plotting the topology network of all possible states a CA can get into -- tips are "garden of eden" states that no other states can lead to, and loops are "basins of attraction".
Here is the illustration of "rule 30" from page 144 (the legend explaining it is the last photo in the above album). [I am presuming it's using the same rule numbering system as Wolfram but I'm not sure -- EDIT: I visually checked the "space time pattern from a singleton seed" thumbnail against the illustration in the article, and yes it matches rule 30!]
http://imgur.com/a/lKAbP
"The Global Dynamics of Cellular Automata introduces a powerful new perspective for the study of discrete dynamical systems. After first looking at the unique trajectory of a system's future, an algorithm is presented that directly computes the multiple merging trajectories of the systems past. A given cellular automaton will "crystallize" state space into a set of basins of attraction that typically have a topology of trees rooted on attractor cycles. Portraits of these objects are made accessible through computer generated graphics. The "Atlas" presents a complete class of such objects, and is intended, with the accompanying software, as an aid to navigation into the vast reaches of rule behavior space. The book will appeal to students and researchers interested in cellular automata, complex systems, computational theory, artificial life, neural networks, and aspects of genetics."
https://en.wikipedia.org/wiki/Attractor
"Basins of attraction in cellular automata", by Andrew Wuensche:
http://onlinelibrary.wiley.com/doi/10.1002/1099-0526(200007/08)5:6%3C19::AID-CPLX5%3E3.0.CO;2-J/full
"To achieve the global perspective. I devised a general method for running CA backwards in time to compute a state's predecessors with a direct reverse algorithm. So the predecessors of predecessors, and so on, can be computed, revealing the complete subtree including the "leaves," states without predecessors, the so-called “garden-of-Eden" states.
Trajectories must lead to attractors in a finite CA, so a basin of attraction is composed of merging trajectories, trees, rooted on the states making up the attractor cycle with a period of one or more. State-space is organized by the "physics" underlying the dynamic behavior into a number of these basins of attraction, making up the basin of attraction field."
scrumper on June 2, 2017
DonHopkins on June 2, 2017
If you like the book, you'll love the code!
http://www.ddlab.com/
http://www.ddlab.com/screensave3.png
http://uncomp.uwe.ac.uk/wuensche/2006_ddlab_slides1.pdf
http://uncomp.uwe.ac.uk/wuensche/meta.html
http://uncomp.uwe.ac.uk/wuensche/boa_idea.html
http://uncomp.uwe.ac.uk/wuensche/downloads/papers/2008_dd_overview_preprint.pdf