raybuhr raybuhr

fruit_name	tastiness	coolness
Açaí	97	0.03422379581559809
Akee	62	0.061128096593225156
Apple	84	0.7736677658268445
Apricot	90	0.44466537189702615
Avocado	67	0.4503439195930089
Banana	53	0.5896125958014631
Bilberry	30	0.18371360990823216
Blackberry	54	0.33187359972401076
Blackcurrant	49	0.9812347721124494

the tl;dr of https://medium.com/dunder-data/minimally-sufficient-pandas-a8e67f2a2428

to select a column of data, use brackets: df['column_name']

to select rows of data, use .loc, or slice a datetime index: df['2019-01-01':'2019-02-28']

if performance is the primary concern, use NumPy arrays instead of pandas

use read_csv and its many arguments for reading files

use the .isna method to filter NaN rows
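The tips above can be sketched in one short script. This is only an illustration under stated assumptions: the inlined data, the `date` column, and all variable names are hypothetical and do not come from the linked article.

```python
import io

import pandas as pd

# Hypothetical tab-separated data, inlined so the sketch is self-contained.
# The Banana row has no tastiness value, so it becomes NaN when parsed.
raw = (
    "date\tfruit_name\ttastiness\n"
    "2019-01-15\tApple\t84\n"
    "2019-02-10\tBanana\t\n"
    "2019-03-05\tApricot\t90\n"
)

# read_csv: sep, parse_dates and index_col are a few of its many arguments.
df = pd.read_csv(io.StringIO(raw), sep="\t", parse_dates=["date"], index_col="date")

# Select a column with brackets.
tastiness = df["tastiness"]

# Select rows with .loc, here by a date range on the DatetimeIndex.
jan_feb = df.loc["2019-01-01":"2019-02-28"]

# Use .isna to find rows with missing values, or negate it to keep complete rows.
missing = df[df["tastiness"].isna()]
complete = df[~df["tastiness"].isna()]

# Drop down to a NumPy array when raw performance matters most.
values = complete["tastiness"].to_numpy()
mean_tastiness = values.mean()
```

Reading from `io.StringIO` stands in for a file path here; with a real file, the same `read_csv` arguments would apply.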

Loops

Loops are a way to repeat an action on each item in a collection. In day-to-day life, you might think of this like brushing your teeth: for each tooth in your mouth, scrub with a toothbrush and a little bit of toothpaste.

For Loops

The basic format for looping in Python is usually taught like this:

>>> for number in range(5):
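The loop body is cut off in the preview above, so here is a minimal complete sketch of the same pattern. The loop bodies and the `teeth` list are assumptions made for illustration, not the original author's code.

```python
# range(5) yields the integers 0, 1, 2, 3, 4 one at a time.
squares = []
for number in range(5):
    squares.append(number * number)

print(squares)  # [0, 1, 4, 9, 16]

# The same pattern works for any collection, such as a (hypothetical) list of teeth.
teeth = ["incisor", "canine", "molar"]
brushed = []
for tooth in teeth:
    brushed.append(f"brushed {tooth}")

print(brushed)  # ['brushed incisor', 'brushed canine', 'brushed molar']
```

In both loops the variable after `for` takes each item in turn, and the indented body runs once per item.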

re: dbt-labs/dbt-core#559

New Adapter Information Sheet

The following questions are intended to provide information about a warehouse, for use in building a new dbt adapter. Different data warehouses have pretty different semantics, so some of these questions may be more applicable than others. Where possible, please provide as much detail as you can!

Please copy this questionnaire to a GitHub gist, then fill out answers to the questions to the best of your ability. When you're done, please post a link to the public gist in the relevant issue. If the issue does not already exist, please create it first.

	[sqlfluff]
	verbose = 0
	nocolor = False
	dialect = postgres
	templater = jinja
	rules = None
	exclude_rules = None
	recurse = 0
	output_line_length = 80
	runaway_limit = 10

	import argparse
	import os
	import pathlib
	import sys
	from time import sleep
	from bs4 import BeautifulSoup
	import pandas as pd
	import requests

	## This is my variation on Chase's first pass
	## https://gist.github.com/chasemc/1a6978d500aa47268ecaf9f99719af61

	library(htmltab)
	library(reshape2)
	library(ggplot2)

	data_url <- "https://web.archive.org/web/20200513204922/http://covidtracking.com/data/state/illinois"
	a <- htmltab(doc = data_url, which = 4)
	a$date <- as.Date(a$Date, "%a %b %d %Y")

	#!/bin/bash

	wget --mirror \
	--recursive \
	--convert-links \
	--no-parent \
	--domains learning.oreilly.com \
	--random-wait \
	--adjust-extension \
	--header='cookie: cd_user_id=...; BrowserCookie=...; timezoneoffset=21600; csrfsafari=...; sign-on-email="..."; sessionid=...; logged_in=y;' \

	# coding: utf-8
	from slacker import Slacker
	import pandas as pd

	def pull_messages(slack_api_token, channel_name, from_time=None):
	    """
	    slack_api_token can be registered at https://api.slack.com/tokens.
	    from_time assumes unix_timestamp, such as 1456709957.004145
	    -- mysql can convert between unix and posix time easily
	    with the function FROM_UNIXTIME so best to just keep that

	// Use Gists to store code you would like to remember later on
	console.log(window); // log the "window" object to the console
