Summary of performance optimizations found

#Summary of optimizations in Ember-Graph

This combines 3 branches worth of optimization work into one PR

TL;DR version by branch:

First branch: feature/addingHash

Adds a hash to store/relationship.js to allow quickly locating a relationship based on the id values of the endpoints

Second branch: feature/moreOptimizations

Adds a "metaMap" member to models to track actual relationships w/o needing to iterate through all attributes
Requires manually climbing through the prototype hierarchy in eachRelationship to maintain EG inheritance

Third branch: feature/evenMoreOptimizations

Switches pendingRelationships in store/relationship to also use a relationshipHash
Greatly refactors updateRelationshipsWithNewId to avoid performance bottlenecks

Summary of changes by file:

store/relationship.js

Most operations require some form of "find targeted relationships" which runs basically like this:

get set of ALL defined relationships in EG,
filter these down based on matches to type1/id1 or type2/id2 (occasionally name is mixed in as well)

This is changed to:

Maintain the set of all relationships as indexed by id1 and id2 (future revision will likely make this type+':'+id

This is done through a hash where the keys are ids, and the elements are linked list elements containing references to relationships linking the given id.

Now, most "find target relationships" work like this:

get set of ALL relationships based on id1/id2 in relationship (or id/type) we want to match
filter these down based on matches to type1/id1

feature/addingHash introduces the relationship_hash component as described above
feature/evenMoreOptimizations changes queuedRelationships to also use a relationshipHash

relationship/relationship_hash.js

Implementation of the above described open hash table

added in feature/addingHash

ember-graph/shim

Added new relationship_hash

-modified in feature/addingHash

model/relationship.js

When performing a "eachRelationship", EG is depending on the underlying Ember implementation to iterate through all attributes of a given object and then is selecting only the ones with a defined "meta"; however, this is extremely inefficient.

This is changed to:

When a relationship is declared on a model, the associated meta is tracked in the "metaMap" member of the model, the name of the relationship is also added to the "_all" field in the metaMap.

Now when we need to iterate through all relationships for a given model, we can look at the prototype (which is the model data structure) and directly query it's metaMap to find all of the relationships and to easily fetch and iterate across the associated metas.

However, since we aren't iterating across all members (including prototype members), we are breaking the effective inheritance scheme for EG. This is easily restored by explicitly iterating through the prototype hierarchy until we fall off of the EG model inheritance chain (no metaMap is defined).

Specifically, we want include the _all member to avoid any form of hasOwnProperty and for/in here as the number of fields we are forced to iterate through again provides a performance drain.

added in feature/evenMoreOptimiations

General observations

Most of the performance issues we were seeing were with low level commonly repeated operations, especially searching for relationships. While the original code was probably written to maximize modularity (e.g. by allowing for the possibility of filtering relationships in multiple ways), and seems to assume that load operations are infrequent and therefore should not cause any severe performance bottlenecks, we found through profiling that the biggest performance bottlenecks were in the following areas of the original code:

store/relationship.js: relationshipsForRecord()

linear search of all relationships currently loaded in EG

model/relationship.js: eachRelationship()

linear search of all attributes and inherited attributes of an EmberObject

store/relationship.js: connectQueuedRelationships()

linear search of all queuedRelationships

The central issue in each of these was that we were doing a brute force linear search across a potentially very large search space (all attributes and inherited attributes of an Ember Object, all relationships currently defined in EG).

The most damaging performance issue of all was the relationshipsForRecord() function, which basically was causing a linear degradation in performance of any single operation relative to the size of the number of relationships currently loaded into EG.

This "effectively" caused an exponential slowdown because as the complexity/size of the data increases, the number of required operations (for example the number of object loads) increases linearly, and the cost of each of these operations individually increases linearly as well (linear x linear = squared (exponential)).

mtheoryx/eg-perf.md