- Don’t
SELECT *
, Specify explicit column names (columnar store) - Avoid large JOINs (filter each table first)
- In PRESTO tables are joined in the order they are listed!!
- Join small tables earlier in the plan and leave larger fact tables to the end
- Avoid cross joins or 1 to many joins as these can degrade performance
- Order by and group by take time
- only use order by in subqueries if it is really necessary
- When using GROUP BY, order the columns by the highest cardinality (that is, most number of unique values) to the lowest.
I had a bit of trouble trying to configure permissions to upload files from my Google Compute Engine instance to my Google Cloud Storage bucket. The process isn't as intuitive as you think. There are a few permissions issues that need to be configured before this can happen. Here are the steps I took to get things working.
Let's say you want to upload yourfile.txt
to a GCS bucket from your virtual machine.
You can use the gsutil
command line tool that comes installed on all GCE instances.
If you've never used the gcloud
or gsutil
command line tools on this machine before, you will need to initialize them with a service account.
Download Boost Library: http://www.boost.org (Choose the expected version)
wget https://cfhcable.dl.sourceforge.net/project/boost/boost/1.54.0/boost_1_54_0.tar.gz
wget https://phoenixnap.dl.sourceforge.net/project/boost/boost/1.58.0/boost_1_58_0.tar.gz
wget https://dl.bintray.com/boostorg/release/1.64.0/source/boost_1_64_0.tar.gz
wget https://dl.bintray.com/boostorg/release/1.65.1/source/boost_1_65_1.tar.gz
wget https://dl.bintray.com/boostorg/release/1.67.0/source/boost_1_67_0.tar.gz
wget https://dl.bintray.com/boostorg/release/1.68.0/source/boost_1_68_0.tar.gz
wget https://dl.bintray.com/boostorg/release/1.69.0/source/boost_1_69_0.tar.gz
- Install
osxfuse
:
brew cask install osxfuse
-
Reboot your Mac.
-
Install
ntfs-3g
:
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo | |
sudo sed -i s/\$releasever/6/g /etc/yum.repos.d/epel-apache-maven.repo | |
sudo yum install -y apache-maven | |
mvn --version |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
"""Kernel K-means""" | |
# Author: Mathieu Blondel <[email protected]> | |
# License: BSD 3 clause | |
import numpy as np | |
from sklearn.base import BaseEstimator, ClusterMixin | |
from sklearn.metrics.pairwise import pairwise_kernels | |
from sklearn.utils import check_random_state |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- Remove the history from | |
rm -rf .git | |
-- recreate the repos from the current content only | |
git init | |
git add . | |
git commit -m "Initial commit" | |
-- push to the github remote repos ensuring you overwrite history | |
git remote add origin [email protected]:<YOUR ACCOUNT>/<YOUR REPOS>.git |
NewerOlder