Makoto YUI myui

set ytics nomirror
set y2tics
set yr[-20:20]
plot \
"cf2d.dat" using 1:2 with lines title "x", \
"cf2d.dat" using 1:3 with lines title "outlier" axes x1y2, \
"cf2d.dat" using 1:4 with lines title "change" axes x1y2
myui / gnuplot.plt
Created September 2, 2016 02:24
gnuplot
set ytics nomirror
unset y2tics
set y2r[-200:300]
set yr[0:60]
set datafile separator " "
plot \
"hivemall_twitter2d.dat" using 1:2 with lines title "x" axes x1y2, \
"hivemall_twitter2d.dat" using 1:3 with lines title "outlier" axes x1y1
/*
* Hivemall: Hive scalable Machine Learning Library
*
* Copyright (C) 2015 Makoto YUI
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0

Item-based collaborative filtering

1. Prepare transaction table

Prepare the following transaction table. We generate a feature_vector for each item_id based on the co-occurrence of purchased items, a form of market basket analysis.

| userid | item_id | purchase_at timestamp |
|--------|---------|-----------------------|
| 1      | 31231   | 2015-04-9 00:29:02    |
| 1      | 13212   | 2016-05-24 16:29:02   |
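
To make the co-occurrence idea concrete, here is a minimal sketch in plain Python rather than Hive. It counts, for each item, how many users also bought each other item, and renders the result as "other_item:count" strings. The function name cooccurrence_features, the "item:count" rendering, and every sample row beyond the two shown in the table above are assumptions made for illustration; this is not the query the tutorial itself uses.

```python
from collections import defaultdict
from itertools import permutations


def cooccurrence_features(transactions):
    """Build a per-item co-occurrence feature vector.

    transactions: iterable of (userid, item_id) pairs.
    Returns {item_id: ["other_item:count", ...]} (hypothetical format).
    """
    # Collect the distinct items each user purchased.
    baskets = defaultdict(set)
    for userid, item_id in transactions:
        baskets[userid].add(item_id)

    # For each ordered pair (item, other), count the users who bought both.
    counts = defaultdict(lambda: defaultdict(int))
    for items in baskets.values():
        for item, other in permutations(sorted(items), 2):
            counts[item][other] += 1

    # Render each item's neighbours as sorted "other_item:count" strings.
    return {
        item: [f"{other}:{cnt}" for other, cnt in sorted(others.items())]
        for item, others in counts.items()
    }


# The first two rows come from the table above; the rest are made-up samples.
rows = [(1, 31231), (1, 13212), (2, 31231), (2, 13212), (2, 40000)]
features = cooccurrence_features(rows)
print(features[31231])
```

Purchase timestamps are ignored in this sketch: two items co-occur whenever the same user bought both, regardless of when.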
run: +main
_export:
  td:
    apikey: ${TD_API_KEY}
    database: criteo
    engine: hive
+main:
  +prepare:
/*
* Hivemall: Hive scalable Machine Learning Library
*
* Copyright (C) 2015 Makoto YUI
* Copyright (C) 2013-2015 National Institute of Advanced Industrial Science and Technology (AIST)
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
CREATE TABLE bprmf_model
as
WITH t as (
  select
    bpr_sampling(userid, pos_items, "-sampling_rate 5.0")
      as (userid, pos_item, neg_item)
  from
    ranking_train
)