🎯
Focusing

YI FU fuyi

  • Stockholm, Sweden
'--include', 'users*.avro'
@fuyi
fuyi / dag.yaml
Created September 15, 2020 19:26
dag.yaml
_global: &globals
  owner: test_owner
  schedule_interval: "* * * * *"
  retries: 3
  retry_delay: 30
  connection: some_connection
  operator: BigQueryOperator
  slack:
    enabled: true
    slack_connection_id: slack
@fuyi
fuyi / bq_hackday.md
Last active October 21, 2020 09:31
BQ + Data Studio hack day

Questions to answer

  • Heartbeat event distribution over the last 24 hours, at hourly resolution.
  • Distribution of all events from Oxford production over the last 48 hours.
  • Total vs. actual number of categorization events on Rugby production, per hour, over the last 24 hours.

Learning:

  • Data Studio doesn't support alerts yet.
@fuyi
fuyi / airflow-mysql.md
Created December 10, 2020 14:18
Airflow MySQL metastore performance tuning

Log in to MySQL with this shell script

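#!/bin/bash
# Fetch the Kubernetes secret that holds the MySQL root password.
SECRET_VALUE=$(tink-kubectl -n airflow get secrets airflow-credentials -o json)
# jq -r prints the raw string without quotes; secret values are base64-encoded.
CORRECT_PASSWORD=$(echo "$SECRET_VALUE" | jq -r '.data.rootPassword' | base64 -d)
# Start mysql inside the pod and select the airflow database directly.
# No space after -p: with a space, mysql treats the next argument as a database name.
tink-kubectl -n airflow exec -it airflow-mysql-0 -- mysql -h 127.0.0.1 -P 3306 -u root -p"$CORRECT_PASSWORD" airflow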
@fuyi
fuyi / cp_archi.md
Last active May 26, 2021 07:56
CP architecture questions

General

  • What do AEM and Asset Link stand for?

Diagram 3.2

  • How do we manage authentication between the API gateway and the web browser?
  • Should "Retrieves Asset Link" and "Sends Reward Info" be switched? They seem to be in the wrong order.
  • Where can I find the table schema definition for the prediction-to-reward matching DB?
@fuyi
fuyi / qingmaio.py
Created July 5, 2021 13:56
qingmiao_zhang_python
INPUT = [
1,
2,
3,
4,
5,
11111,
9,
8,
7,
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<command-2690650308556142> in <module>
5 primary_key=["customer_id"])
6
----> 7 add_to_cart_last_7days_fg.save(empty_add_to_cart_last_7days)
/databricks/python/lib/python3.8/site-packages/hsfs/feature_group.py in save(self, features, write_options)
642
643 # fg_job is used only if the hive engine is used
@fuyi
fuyi / hopsworks_debug-log.md
Last active August 23, 2021 14:44
hopsworks_debug-log
tail -400 /srv/hops/onlinefs/logs/onlinefs.log
Enable succeeded: 
[stdout]
-b083-4198-9ae5-d966137a8076 sending LeaveGroup request to coordinator 10.0.0.9:9091 (id: 2147483646 rack: null)
2021-08-23 14:30:52 INFO  AbstractCoordinator:879 - [Consumer clientId=consumer-3, groupId=0] Member consumer-3-5d5705ab-e091-4f8a-a331-974f410e4273 sending LeaveGroup request to coordinator 10.0.0.9:9091 (id: 2147483646 rack: null)
@fuyi
fuyi / pdp_message.json
Created January 13, 2022 10:09
pdd_message_example
{
"parameters": [
{
"name": "touchpoint",
"value_string": "DESKTOP"
},
{
"name": "market",
"value_string": "SE"
},
@fuyi
fuyi / window.md
Created March 1, 2022 15:06
Create custom window strategy to fulfil sequencer model feature requirement

Background

To train our neural sequencer model properly, we need the ordered sequence of a user's interactions with articles over a certain period of time. Formally, the input dataset for model training can be described as:

The last X articles the user has interacted with in the last Y days in chronological order.

X: an integer variable in the range [0, 20], inclusive
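
As a rough illustration of the requirement above, here is a minimal pure-Python sketch. It is not the windowing strategy itself: the function name `last_interactions`, the `(timestamp, article_id)` event shape, and treating Y as a plain number of days are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def last_interactions(events, now, max_articles, max_age_days):
    """Return up to `max_articles` article ids from the last `max_age_days` days, oldest first.

    `events` is an iterable of (timestamp, article_id) tuples for a single user
    (a hypothetical shape chosen for this sketch).
    """
    cutoff = now - timedelta(days=max_age_days)
    # Keep only interactions inside the look-back period, in chronological order.
    recent = sorted((e for e in events if cutoff <= e[0] <= now), key=lambda e: e[0])
    # Keep the most recent X entries. X == 0 is handled explicitly because
    # recent[-0:] would return the whole list.
    kept = recent[-max_articles:] if max_articles > 0 else []
    return [article_id for _, article_id in kept]


now = datetime(2022, 3, 1)
events = [(now - timedelta(days=d), f"a{d}") for d in (10, 3, 1, 2)]
print(last_interactions(events, now, max_articles=2, max_age_days=7))
```

With X = 2 and Y = 7, the interaction from 10 days ago falls outside the period, and only the two most recent of the remaining three are kept, in chronological order.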