- If values are integers in [0, 255], Parquet will automatically compress them to 1-byte unsigned integers, decreasing the size of the saved DataFrame by a factor of 8 (relative to 8-byte values).
- Partition DataFrames to have evenly-distributed, ~128MB partition sizes (empirical finding). Always err on the higher side w.r.t. number of partitions.
- Pay particular attention to the number of partitions when using `flatMap`, especially if the following operation will result in high memory usage. The `flatMap` op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions will remain the same. Thus, if a subsequent op causes a large expansion of memory usage (e.g. converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output of `flatMap` to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
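
The partition-sizing advice above can be turned into a small calculation. The following is a minimal sketch, not part of the original tips: the function names and the per-row size estimate are hypothetical, and it assumes you can roughly estimate how many bytes each row will occupy after the memory-heavy step. It rounds up, so it errs on the side of more partitions.

```python
# ~128MB per partition, the empirical target mentioned in the tips above.
TARGET_PARTITION_BYTES = 128 * 1024 ** 2


def estimated_expanded_bytes(num_rows, bytes_per_row):
    """Rough total size once every row has grown to bytes_per_row bytes.

    Both arguments are estimates supplied by the caller (an assumption of
    this sketch); Spark does not provide them directly.
    """
    return num_rows * bytes_per_row


def suggested_num_partitions(total_bytes, target_bytes=TARGET_PARTITION_BYTES):
    """Smallest partition count that keeps each partition at or below target_bytes.

    Uses integer ceiling division so the result always rounds up.
    """
    if total_bytes <= 0:
        return 1
    return max(1, (total_bytes + target_bytes - 1) // target_bytes)


# Hypothetical example: 10 million index rows, each later converted to a
# dense vector of 10,000 doubles (8 bytes per double).
total = estimated_expanded_bytes(10 * 1000 * 1000, 10000 * 8)
num_partitions = suggested_num_partitions(total)
```

The resulting `num_partitions` could then be passed to Spark's `repartition` on the output of `flatMap`, before the memory-heavy operation runs.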
| # Count total EBS based storage in AWS | |
| aws ec2 describe-volumes | jq "[.Volumes[].Size] | add" | |
| # Count total EBS storage with a tag filter | |
| aws ec2 describe-volumes --filters "Name=tag:Name,Values=CloudEndure Volume qjenc" | jq "[.Volumes[].Size] | add" | |
| # Describe instances concisely | |
| aws ec2 describe-instances | jq '[.Reservations | .[] | .Instances | .[] | {InstanceId: .InstanceId, State: .State, SubnetId: .SubnetId, VpcId: .VpcId, Name: (.Tags[]|select(.Key=="Name")|.Value)}]' | |
| # Wait until $instance_id is running and then immediately stop it again | |
| aws ec2 wait instance-running --instance-ids "$instance_id" && aws ec2 stop-instances --instance-ids "$instance_id" | |
| # Get 10th instance in the account |
| # A simple cheat sheet of Spark Dataframe syntax | |
| # Current for Spark 1.6.1 | |
| # import statements | |
| from pyspark.sql import SQLContext | |
| from pyspark.sql.types import * | |
| from pyspark.sql.functions import * | |
| #creating dataframes | |
| df = sqlContext.createDataFrame([(1, 4), (2, 5), (3, 6)], ["A", "B"]) # from manual data |
| #!/bin/bash | |
| # | |
| # /etc/kernel/postinst.d script to sign akmods kmods after kernel upgrade | |
| # | |
| # Author: Michael Goodwin Date: 2016-09-21 | |
| # 1. Copy this script to /etc/kernel/postinst.d/ and `chmod +x` it | |
| # | |
| # 2. Create signing keys (store these somewhere useful and safe): | |
| # $ mkdir -p /etc/pki/tls/private/mok |
| #!/bin/bash | |
| # Fetch 24-hour AWS STS session token and set appropriate environment variables. | |
| # See http://docs.aws.amazon.com/cli/latest/reference/sts/get-session-token.html . | |
| # You must have jq installed and in your PATH https://stedolan.github.io/jq/ . | |
| # Add this function to your .bashrc or save it to a file and source that file from .bashrc . | |
| # https://gist.github.com/ddgenome/f13f15dd01fb88538dd6fac8c7e73f8c | |
| # | |
| # usage: aws-creds MFA_TOKEN [OTHER_AWS_STS_GET-SESSION-TOKEN_OPTIONS...] | |
| function aws-creds () { | |
| local pkg=aws-creds |
A curated list of AWS resources to prepare for the AWS Certifications
A curated list of awesome AWS resources you need to prepare for all 5 AWS Certifications. This gist will include: open source repos, blogs & blog posts, ebooks, PDFs, whitepapers, video courses, free lectures, slides, sample tests and many other resources.
| module.exports = function (context, data) { | |
| context.res = { | |
| body: parseQuery(data.form) | |
| }; | |
| context.done(); | |
| }; | |
| function parseQuery(qstr) { |
Tested with Apache Spark 2.1.0, Python 2.7.13 and Java 1.8.0_112
For older versions of Spark and IPython, please see the previous version of this text.
EMOJI CHEAT SHEET
Emoji emoticons listed on this page are supported on Campfire, GitHub, Basecamp, Redbooth, Trac, Flowdock, Sprint.ly, Kandan, Textbox.io, Kippt, Redmine, JabbR, Trello, Hall, plug.dj, Qiita, Zendesk, Ruby China, Grove, Idobata, NodeBB Forums, Slack, Streamup, OrganisedMinds, Hackpad, Cryptbin, Kato, Reportedly, Cheerful Ghost, IRCCloud, Dashcube, MyVideoGameList, Subrosa, Sococo, Quip, And Bang, Bonusly, Discourse, Ello, and Twemoji Awesome. However, some of the emoji codes are not super easy to remember, so here is a little cheat sheet. Got flash enabled? Click the emoji code and it will be copied to your clipboard.
People
