Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Select an option

  • Save audhiaprilliant/5ef7ff356263b761c543221d4b9c3758 to your computer and use it in GitHub Desktop.

Select an option

Save audhiaprilliant/5ef7ff356263b761c543221d4b9c3758 to your computer and use it in GitHub Desktop.
Apache Airflow as Job Orchestration
# Function to save the daily province's data
def provinces_save_csv(**context):
value = context['task_instance'].xcom_pull(task_ids = 'provinces_scraping_data')
with open(dir_path+'/data/covid19/daily_update_covid.csv','r') as f:
lines = f.read().splitlines()
last_line = lines[-1]
if re.findall(r'^(.+?),',last_line)[0] == value['date'].unique().tolist()[0]:
notif = 'Last update:',re.findall(r'^(.+?),',last_line)[0]
else:
with open(dir_path+'/data/covid19/daily_update_covid.csv','a') as ff:
value.to_csv(ff,header=False,index=False)
ff.close()
notif = 'Up to date data:', value['date'].unique().tolist()[0]
return(notif)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment