Just tried the notebook (xgboost version 0.90). Unfortunately, the call to xgb.train (for one-shot learning) raises the error:
XGBoostError: [13:44:55] src/gbm/gbtree.cc:278: Check failed: model_.trees.size() < model_.trees_to_update.size() (0 vs. 0) :
I ran the gist on the master branch and it works fine. This should be fixed by the new model IO routines.
I also got the same error as Karelin, and I think it is the same one Venkatesh hit:
Check failed: model_.trees.size() < model_.trees_to_update.size() (0 vs. 0) :
I saw somewhere that you need to pass in the number of trees created in the first iteration; however, I cannot get that number, and it is never passed in the code above.
Same issue on XGBoost 1.4.0. Has anyone figured this out yet?
Hi,
I have found the solution. Per the xgboost documentation, the parameter 'update' should be 'updater'; this is a mistake in the notebook above. If you fix this, you will see the right results.
model = xgb.train({
    'learning_rate': 0.007,
    'updater': 'refresh',
    'process_type': 'update',
    'refresh_leaf': True,
    # 'reg_lambda': 3,  # L2
    'reg_alpha': 3,  # L1
    'silent': False,
}, dtrain=xgb.DMatrix(x_tr[start:start+batch_size], label=y_tr[start:start+batch_size]), xgb_model=model)
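For anyone piecing this together, here is a minimal sketch of the full loop the snippet above belongs to, assuming x_tr, y_tr, and batch_size are defined as in the notebook; the round counts and parameter values here are illustrative, not the notebook's:

import xgboost as xgb

update_params = {
    'learning_rate': 0.007,
    'updater': 'refresh',      # refresh statistics/leaves of existing trees
    'process_type': 'update',  # update an existing model instead of growing it
    'refresh_leaf': True,
    'reg_alpha': 3,            # L1
}

# First batch: ordinary training builds the initial trees.
dfirst = xgb.DMatrix(x_tr[:batch_size], label=y_tr[:batch_size])
model = xgb.train({'learning_rate': 0.007}, dfirst, num_boost_round=50)

# Later batches: refresh those trees on the new data. In 'update' mode the
# number of boosting rounds cannot exceed the trees already in the model.
for start in range(batch_size, len(x_tr), batch_size):
    dbatch = xgb.DMatrix(x_tr[start:start + batch_size],
                         label=y_tr[start:start + batch_size])
    model = xgb.train(update_params, dbatch, num_boost_round=50, xgb_model=model)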
Disregard, I figured it out. I was using handle_unknown='ignore' in OneHotEncoder, but one of the features has too few samples of a particular category, hence the column mismatch.
Thank you for this gist. How can we implement this in a pipeline?
I am unable to test on the Boston dataset as it has been removed from sklearn, but on a different dataset I get a mismatch in the number of columns. Even though I use the same pipeline, the saved model seems to have one fewer feature than the new training data, and I cannot figure out why.
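One way to narrow this down is to compare the feature count stored in the booster with the width of the transformed matrix. A minimal diagnostic sketch; pipeline, X_raw, and the model path are placeholders for your own objects:

import xgboost as xgb

# Width of the data after your preprocessing (e.g. OneHotEncoder).
X_new = pipeline.transform(X_raw)
print('columns in new data:', X_new.shape[1])

# Feature count the saved booster was trained with.
booster = xgb.Booster()
booster.load_model('model.json')
print('features in saved model:', booster.num_features())

# With sklearn >= 1.0, list the generated column names to spot which
# dummy column appeared or disappeared between fits.
print(pipeline.get_feature_names_out())

If a rare category shows up in one fit but not the other, OneHotEncoder will emit a different number of dummy columns, which would match the off-by-one you are seeing.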
Great example!
Few people know that xgboost is able to perform incremental learning by adding boosting rounds.
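For anyone who wants a concrete starting point, here is a minimal sketch of that flavor of continuation: with the default process_type, passing xgb_model to xgb.train appends new trees fitted on the new data (all variable names are placeholders):

import xgboost as xgb

dtrain_old = xgb.DMatrix(X_old, label=y_old)
dtrain_new = xgb.DMatrix(X_new, label=y_new)

# Initial model with 100 trees.
model = xgb.train({'learning_rate': 0.1}, dtrain_old, num_boost_round=100)

# Continued training: 20 fresh trees are appended, fitted on the new data.
model = xgb.train({'learning_rate': 0.1}, dtrain_new,
                  num_boost_round=20, xgb_model=model)

print(model.num_boosted_rounds())  # 120 (available in recent xgboost versions)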
Hi,
I am also unable to adapt this approach to my use case, for example telecom churn prediction: when a new customer gets added, how do I reuse the old model to continue training instead of retraining on the complete data?
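In case it helps, a hedged sketch of that workflow: persist the booster after the first fit, then when new customers arrive, continue training on just the new rows by passing the saved model to xgb.train. All names, parameters, and the file path are placeholders:

import xgboost as xgb

# Initial fit on the historical customer data, saved to disk.
model = xgb.train(params, xgb.DMatrix(X_hist, label=y_hist),
                  num_boost_round=200)
model.save_model('churn_model.json')

# Later, when a batch of new customers arrives: load the old model and
# add a few boosting rounds fitted only on the new rows.
model = xgb.train(params, xgb.DMatrix(X_new_customers, label=y_new_customers),
                  num_boost_round=20, xgb_model='churn_model.json')

One caveat: the appended trees see only the new batch, so if its distribution differs much from the historical data, periodic full retraining may still be worthwhile.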