假设:
- 不考虑无交互的行为(即用户突然听新歌的行为)
- 不考虑用户规模的问题
- 若考虑实际用户规模大于user表规模,则做二级的回归:第一级首先对用户-歌曲的每日听歌量进行回归,第二级是根据user表中的听歌量预测总用户的听歌量
// Use Gists to store code you would like to remember later on | |
console.log(window); // log the "window" object to the console |
from autogluon.tabular import TabularPredictor, TabularDataset | |
if __name__ == '__main__': | |
path_prefix = 'https://autogluon.s3.amazonaws.com/datasets/CoverTypeMulticlassClassification/' | |
path_train = path_prefix + 'train_data.csv' | |
path_test = path_prefix + 'test_data.csv' | |
label = 'Cover_Type' | |
sample = 20000 # Number of rows to use to train |