Makoto YUI myui

16/02/08 11:28:19 INFO ApplicationMasterService: Starting Application Master
16/02/08 11:28:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/02/08 11:28:19 INFO RMProxy: Connecting to ResourceManager at dm01/10.14.1.38:8030
16/02/08 11:28:20 INFO ApplicationMasterService: Registered ApplicationMaster: maximumCapability { memory: 15360 virtual_cores: 32 } queue: "default" scheduler_resource_types: MEMORY
16/02/08 11:28:20 INFO NMClientAsyncImpl: Upper bound of the thread pool size is 500
16/02/08 11:28:20 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
16/02/08 11:28:22 INFO AMRMClientImpl: Received new token for : ip-10-14-130-102.ec2.internal:60563
16/02/08 11:28:22 INFO AMRMClientImpl: Received new token for : ip-10-14-129-101.ec2.internal:48201
16/02/08 11:28:22 INFO AMRMClientImpl: Received new token for : ip-10-14-130-107.ec2.internal:54722
16/02/08 11:28:22 INFO ApplicationMasterServi
myui / csr_matrix.py
Last active December 17, 2015 13:21
>>> d.toarray()
array([[2, 1, 0, 0],
       [0, 1, 1, 1]])
>>> c.toarray()
array([[2, 1, 0, 0],
       [0, 1, 1, 1]])
>>> d
<2x4 sparse matrix of type '<type 'numpy.int64'>'
        with 5 stored elements in Compressed Sparse Row format>
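The "5 stored elements in Compressed Sparse Row format" reported above can be made concrete with a small sketch. It spells out one plausible set of CSR arrays for the 2x4 matrix and expands them back to the dense form. The helper `csr_to_dense` and the variable names are illustrative assumptions; they are not taken from the gist, and the sketch deliberately uses plain Python rather than SciPy.

```python
# CSR layout for the 2x4 matrix shown above (pure Python, no SciPy needed).
# These arrays are an assumed illustration, not the gist's own code.
data = [2, 1, 1, 1, 1]      # the 5 stored (non-zero) values, row by row
indices = [0, 1, 1, 2, 3]   # column index of each stored value
indptr = [0, 2, 5]          # row i occupies data[indptr[i]:indptr[i + 1]]


def csr_to_dense(data, indices, indptr, n_cols):
    """Expand CSR arrays into a dense list of rows."""
    dense = []
    for row in range(len(indptr) - 1):
        values = [0] * n_cols
        # Copy each stored value of this row into its column.
        for k in range(indptr[row], indptr[row + 1]):
            values[indices[k]] = data[k]
        dense.append(values)
    return dense


dense = csr_to_dense(data, indices, indptr, n_cols=4)
print(dense)  # [[2, 1, 0, 0], [0, 1, 1, 1]]
```

Only the non-zero entries and their column positions are kept, which is why the repr reports 5 stored elements for a matrix with 8 cells.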
#!/bin/bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
Caused by: java.lang.NumberFormatException: empty String
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1020)
    at java.lang.Double.parseDouble(Double.java:540)
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:744)
    at hivemall.ftvec.trans.QuantitativeFeaturesUDF.evaluate(QuantitativeFeaturesUDF.java:91)
    at hivemall.ftvec.trans.QuantitativeFeaturesUDF.evaluate(QuantitativeFeaturesUDF.java:41)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77)
    at hivemall.tools.array.ConcatArrayUDF.evaluate(ConcatArrayUDF.java:86)
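The trace above shows an empty string reaching `Double.parseDouble` by way of `QuantitativeFeaturesUDF.evaluate`. Python's `float("")` fails in an analogous way, so the failure mode and one possible guard can be sketched as follows. The function name `parse_quantitative` is hypothetical, and this is not Hivemall's code or its actual fix; it only illustrates treating blank fields as missing before numeric parsing.

```python
def parse_quantitative(value):
    """Return float(value), or None when the field is missing or blank.

    Hypothetical helper: it mirrors the idea of filtering empty strings
    before they reach a numeric parser, nothing more.
    """
    if value is None or value.strip() == "":
        return None
    return float(value)


# An unguarded parse of an empty string raises, much like the Java trace.
try:
    float("")
    raised = False
except ValueError:
    raised = True

parsed = [parse_quantitative(v) for v in ["1.5", "", None, " 2 "]]
print(raised, parsed)
```

Whether a blank value should become missing, zero, or an error depends on the dataset; the sketch picks "missing" purely for illustration.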
Hive history file=/mnt/hive/tmp/1/hive_job_log_eb8582f1-0cc1-4fdc-99fc-52b4896e5f74_1231952821.txt
**
** WARNING: time index filtering is not set!
** This query could be very slow as a result.
** If you used 'unix_timestmap' please modify your query to use TD_SCHEDULED_TIME instead
** or rewrite the condition using TD_TIME_RANGE
** Please see http://docs.treasure-data.com/articles/performance-tuning#leveraging-time-based-partitioning
**
Unable to execute method public boolean hivemall.fm.FMPredictUDAF$Evaluator.iterate(org.apache.hadoop.hive.serde2.io.DoubleWritable,java.util.List,org.apache.hadoop.hive.serde2.io.DoubleWritable) throws org.apache.hadoop.hive.ql.metadata.HiveException on object hivemall.fm.FMPredictUDAF$Evaluator@3272e5d6 of class hivemall.fm.FMPredictUDAF$Evaluator with arguments {1.3495872020721436:org.apache.hadoop.hive.serde2.io.DoubleWritable, []:java.util.ArrayList, 1.0:org.apache.hadoop.hive.serde2.io.DoubleWritable} of size 3
    at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:1253)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.iterate(GenericUDAFBridge.java:170)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:184)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
myui / kaggle_dataget.sh
Last active October 28, 2015 01:54
Kaggle data get script
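# pup is required
# https://github.com/ericchiang/pup
# Collect the data file links from the competition page.
curl -s https://www.kaggle.com/c/titanic/data | pup 'tbody tr td a attr{href}' | awk '{print "https://kaggle.com" $1}' > urls.txt
# Download each file through the login redirect; KAGGLE_USER and KAGGLE_PASSWD must be set.
while read -r f; do
  echo "downloading file: ${f}"
  # Quote the URL and file name so that '?', '&', and spaces are not interpreted by the shell.
  wget -O "$(basename "${f}")" "https://www.kaggle.com/account/login?ReturnUrl=${f}" --post-data "username=${KAGGLE_USER}&password=${KAGGLE_PASSWD}"
  echo
done < urls.txt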
/*
* Hivemall: Hive scalable Machine Learning Library
*
* Copyright (C) 2015 Makoto YUI
* Copyright (C) 2013-2015 National Institute of Advanced Industrial Science and Technology (AIST)
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*