Makoto YUI myui

q = calibrated probability 
  = p / (p + (1-p) / w)

https://pdfs.semanticscholar.org/daf9/ed5dc6c6bad5367d7fd8561527da30e9b8dd.pdf

where
    p = predicted probability
    w = negative down-sampling rate
      = (Neg / (Neg + Pos*k)) / (Neg / (Neg + Pos))
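
The re-calibration formula above can be sketched in a few lines of Python. This is a minimal illustration, not code from the gist: the function name `calibrate` and the input checks are assumptions, and `w` is taken to be the fraction of negatives kept after down-sampling, so that 0 < w <= 1.

```python
def calibrate(p: float, w: float) -> float:
    """Map a probability predicted on negatively down-sampled data
    back to the original scale: q = p / (p + (1 - p) / w).

    p: predicted probability from the model trained on down-sampled data.
    w: negative down-sampling rate (assumed here to lie in (0, 1]).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not 0.0 < w <= 1.0:
        raise ValueError("w must be in (0, 1]")
    return p / (p + (1.0 - p) / w)


if __name__ == "__main__":
    # With w = 1 (no down-sampling) the prediction is unchanged.
    print(calibrate(0.5, 1.0))
    # Keeping only 10% of negatives inflates p, so q shrinks accordingly.
    print(calibrate(0.5, 0.1))
```

With w = 1 the function returns p unchanged. A smaller w pulls q below p, which is the intended correction for a model that saw too few negatives.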
69.613 129.070 52.111
70.670 128.161 52.446
72.303 128.450 52.853
73.759 127.522 51.786
74.085 129.067 53.352
74.561 134.031 50.992
74.911 134.944 50.744
75.205 129.162 52.800
75.395 129.711 52.844
75.554 132.642 51.427
myui / auc.py
Last active June 8, 2018 08:08
def auc(num_positives, num_negatives, predicted):
    l_sorted = sorted(range(len(predicted)), key=lambda i: predicted[i],
                      reverse=True)
    fp_cur = 0.0
    tp_cur = 0.0
    fp_prev = 0.0
    tp_prev = 0.0
    fp_sum = 0.0
    auc_tmp = 0.0
    last_score = float("nan")
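
The preview of auc.py stops before the main loop. For orientation, here is a self-contained sketch of one common way to compute ROC AUC with the same ingredients: a descending sort by score, running true/false positive counts, and trapezoids closed whenever the score changes. It is not the remainder of the gist. The name `roc_auc` and its `labels`/`scores` parameters are hypothetical, and labels are assumed to be 0 or 1.

```python
def roc_auc(labels, scores):
    """Trapezoidal ROC AUC; tied scores are treated as a single ROC point.

    labels: sequence of 0/1 ground-truth labels (assumed encoding).
    scores: sequence of predicted scores, higher means more positive.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0.0            # running true/false positive counts
    tp_prev = fp_prev = 0.0  # counts at the previous distinct score
    area = 0.0
    last_score = float("nan")  # NaN compares unequal to every score
    for i in order:
        if scores[i] != last_score:
            # Close the trapezoid spanned by the previous group of ties.
            area += (fp - fp_prev) * (tp + tp_prev) / 2.0
            tp_prev, fp_prev = tp, fp
            last_score = scores[i]
        if labels[i] == 1:
            tp += 1.0
        else:
            fp += 1.0
    # Close the final trapezoid.
    area += (fp - fp_prev) * (tp + tp_prev) / 2.0
    if tp == 0.0 or fp == 0.0:
        raise ValueError("AUC is undefined without both classes")
    # Normalize the raw area by (#positives * #negatives).
    return area / (tp * fp)


if __name__ == "__main__":
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Initializing `last_score` to NaN, as the gist preview also does, guarantees that the first comparison fails and so opens the first group of ties without a special case.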
# -*- coding: utf-8 -*-
# sort list.txt | uniq | grep -v '#' | grep -v 'noreply' | grep -v 'local' | grep -e '\.' | grep -v 'internal' | grep -v 'contact'
import os
import sys
import requests
import time
from github3 import login
from tqdm import tqdm
create table page (
  docid int,
  contents string
);
INSERT OVERWRITE TABLE page_exploded
select
  d.docid,
  normalize_unicode(t.word) as word
from
WITH term_frequency as (
  select
    docid,
    word,
    freq
  from (
    select
      docid,
      tf(word) as word2freq
    from
--------------------
Hivemall
Hivemall is a library for machine learning implemented as Hive
UDFs/UDAFs/UDTFs.
Hivemall has been incubating since 2016-09-13.
Three most important issues to address in the move towards graduation:
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0