Skip to content

Instantly share code, notes, and snippets.

@arpitbbhayani
Last active January 29, 2020 04:47
Show Gist options
  • Save arpitbbhayani/60e508d6067a56865bb0b473e8218d0d to your computer and use it in GitHub Desktop.
Save arpitbbhayani/60e508d6067a56865bb0b473e8218d0d to your computer and use it in GitHub Desktop.
def construct_tree(X, current_height, max_height):
"""The function constructs a tree/sub-tree on points X.
current_height: represents the height of the current tree to
the root of the decision tree.
max_height: the max height of the tree that should be constructed.
The current_height and max_height only exists to make the algorithm efficient
as we assume that no anomalies exist at depth >= max_height.
"""
if current_height >= max_height:
# here we are sure that no anomalies exist hence we
# directly construct the external node.
return new_external_node(X)
# pick any attribute at random.
attribute = get_random_attribute(X)
# for set of inputs X, for the tree we get a random value
# for the chosen attribute. preferably around the median.
split_value = get_random_value(max_value, min_value)
# split X instances based on `split_values` into Xl and Xr
Xl = filter(X, lambda x: X[attribute] < split_value)
Xr = filter(X, lambda x: X[attribute] >= split_value)
# build an internal node with its left subtree created from Xl
# and right subtree created from Xr, recursively.
return new_internal_node(
left=construct_tree(Xl, current_height + 1, max_height),
right=construct_tree(Xr, current_height + 1, max_height),
split_attribute=attribute,
split_value=split_value,
)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment