@Eligijus112
Last active June 2, 2022 03:47
Data for feature importance calculation
# Number of entries in each node
n_entries = {
    "node 1": 15480,
    "node 2": 12163,
    "node 3": 3317,
    "node 4": 5869,
    "node 5": 6294,
    "node 6": 2317,
    "node 7": 1000,
    "node 8": 2454,
    "node 9": 3415,
    "node 10": 1372,
    "node 11": 4922,
    "node 12": 958,
    "node 13": 1359,
    "node 14": 423,
    "node 15": 577
}
# Squared error of each node
i_sq = {
    "node 1": 1.335,
    "node 2": 0.832,
    "node 3": 1.214,
    "node 4": 0.546,
    "node 5": 0.834,
    "node 6": 0.893,
    "node 7": 0.776,
    "node 8": 0.648,
    "node 9": 0.385,
    "node 10": 1.287,
    "node 11": 0.516,
    "node 12": 0.989,
    "node 13": 0.536,
    "node 14": 0.773,
    "node 15": 0.432
}
# Defining the features used in the nodes that have a splitting rule
feature_in_node = {
    "node 1": "MedInc",
    "node 2": "MedInc",
    "node 3": "MedInc",
    "node 4": "AveRooms",
    "node 5": "AveOccup",
    "node 6": "AveOccup",
    "node 7": "MedInc"
}
# The first node's count is the total number of entries in the tree
_n = n_entries["node 1"]
# Calculating each node's weight: the share of all entries that reach it
n_entries_weight = {}
for node in n_entries:
    n_entries_weight[node] = n_entries[node] / _n
# Defining the relationship between nodes;
# The key is the parent node, and the value is a list of
# [left child node, right child node]
node_pairs = {
    "node 1": ["node 2", "node 3"],
    "node 2": ["node 4", "node 5"],
    "node 3": ["node 6", "node 7"],
    "node 4": ["node 8", "node 9"],
    "node 5": ["node 10", "node 11"],
    "node 6": ["node 12", "node 13"],
    "node 7": ["node 14", "node 15"]
}
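
The dictionaries above can be combined into per-feature importances. The following is a minimal sketch, assuming the usual weighted impurity decrease rule: a node's importance is its weighted squared error minus the weighted squared errors of its two children, and a feature's importance is the sum over the nodes that split on it, normalized so all features sum to 1. The gist itself does not include this step, and the function names `node_importance` and `feature_importance` are hypothetical. The data is copied in compact form so the block runs on its own.

```python
# Compact copy of the gist's data so this sketch is self-contained
names = [f"node {i}" for i in range(1, 16)]
n_entries = dict(zip(names, [15480, 12163, 3317, 5869, 6294, 2317, 1000,
                             2454, 3415, 1372, 4922, 958, 1359, 423, 577]))
i_sq = dict(zip(names, [1.335, 0.832, 1.214, 0.546, 0.834, 0.893, 0.776,
                        0.648, 0.385, 1.287, 0.516, 0.989, 0.536, 0.773, 0.432]))
feature_in_node = dict(zip(names[:7], ["MedInc", "MedInc", "MedInc", "AveRooms",
                                       "AveOccup", "AveOccup", "MedInc"]))
# Parent i has children 2i and 2i + 1, which reproduces node_pairs from the gist
node_pairs = {f"node {i}": [f"node {2 * i}", f"node {2 * i + 1}"]
              for i in range(1, 8)}

# Share of all entries that reach each node
_n = n_entries["node 1"]
n_entries_weight = {node: count / _n for node, count in n_entries.items()}


def node_importance(node):
    """Assumed rule: weighted squared error of a node minus that of its children."""
    left, right = node_pairs[node]
    return (n_entries_weight[node] * i_sq[node]
            - n_entries_weight[left] * i_sq[left]
            - n_entries_weight[right] * i_sq[right])


def feature_importance():
    """Sum node importances per splitting feature, then normalize to sum to 1."""
    raw = {}
    for node, feature in feature_in_node.items():
        raw[feature] = raw.get(feature, 0.0) + node_importance(node)
    total = sum(raw.values())
    return {feature: value / total for feature, value in raw.items()}


importances = feature_importance()
for feature, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {value:.3f}")
```

Under these assumptions, MedInc ranks first, AveOccup second, and AveRooms last. Only the seven nodes listed in `feature_in_node` contribute, since the leaves (nodes 8 to 15) have no splitting rule.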