- If values are integers in [0, 255], Parquet will automatically compress them to 1-byte unsigned integers, thus decreasing the size of the saved DataFrame by a factor of 8 relative to 64-bit integers.
- Partition DataFrames to have evenly-distributed, ~128MB partition sizes (empirical finding). Always err on the higher side w.r.t. number of partitions.
- Pay particular attention to the number of partitions when using `flatMap`, especially if the following operation will result in high memory usage. The `flatMap` op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions will remain the same. Thus, if a subsequent op causes a large expansion of memory usage (e.g., converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output of `flatMap` to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
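
As a rough illustration of the partition-sizing advice above, the sketch below estimates a partition count from an approximate data size. The helper name `suggest_num_partitions`, its parameters, and the row and byte estimates are hypothetical; only the ~128MB target comes from the notes above, and rounding up reflects the advice to err on the higher side.

```python
import math

TARGET_PARTITION_BYTES = 128 * 1024 * 1024  # ~128MB target from the notes above


def suggest_num_partitions(estimated_bytes, target_bytes=TARGET_PARTITION_BYTES):
    """Hypothetical helper: partitions needed to keep each one near target_bytes."""
    if estimated_bytes <= 0:
        return 1
    # Round up so partitions end up at or below the target size.
    return max(1, math.ceil(estimated_bytes / target_bytes))


# Assumed example: a flatMap output of 50 million rows that a later op
# expands to roughly 4 KB per row (both numbers are made up for illustration).
rows_after_flatmap = 50_000_000
bytes_per_row_after_expansion = 4 * 1024
estimated = rows_after_flatmap * bytes_per_row_after_expansion
num_partitions = suggest_num_partitions(estimated)
print(num_partitions)
```

In PySpark, the resulting number could then be passed to `df.repartition(num_partitions)` before the memory-heavy operation; how accurately the size can be estimated depends on the data.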
# Modern secure (OpenSSH Server 7+) SSHd config by HacKan
# Refer to the manual for more info: https://www.freebsd.org/cgi/man.cgi?sshd_config(5)
# Server fingerprint
# Regenerate with: ssh-keygen -f /etc/ssh/ssh_host_rsa_key -N '' -t rsa -b 4096
HostKey /etc/ssh/ssh_host_rsa_key
# Regenerate with: ssh-keygen -f /etc/ssh/ssh_host_ed25519_key -N '' -t ed25519
HostKey /etc/ssh/ssh_host_ed25519_key
# Log for audit, even users' key fingerprint
/*
A single regex to parse and break up a full URL including query parameters and anchors e.g.
https://www.google.com/dir/1/2/search.html?arg=0-a&arg1=1-b&arg3-c#hash
*/
Url.regex = /^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$/;
url: RegExp['$&'],
protocol: RegExp.$2,
host: RegExp.$3,
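
To make the capture groups easier to follow, here is a minimal sketch that ports the same pattern to Python's `re` module. The function name `parse_url` and the named groups (`protocol`, `host`, `directory`, `file`, `rest`) are assumptions added for readability; they are not part of the original snippet, which reads the groups through JavaScript's `RegExp.$n` properties.

```python
import re

# Python port of the pattern above. The named groups are added here for
# readability and are not part of the original JavaScript regex.
URL_RE = re.compile(
    r"^((?P<protocol>https?|ftp):/)?/?"   # optional scheme, e.g. "https:/" + "/"
    r"(?P<host>[^:/\s]+)"                 # host name up to the first "/" or ":"
    r"(?P<directory>(/\w+)*/)"            # directory portion, ending in "/"
    r"(?P<file>[\w\-.]+[^#?\s]+)"         # file name up to "?" or "#"
    r"(?P<rest>.*)?"                      # everything after the file name
    r"(#[\w\-]+)?$"
)


def parse_url(url):
    """Hypothetical helper: return the named parts as a dict, or None if no match."""
    match = URL_RE.match(url)
    if match is None:
        return None
    return match.groupdict()


parts = parse_url(
    "https://www.google.com/dir/1/2/search.html?arg=0-a&arg1=1-b&arg3-c#hash"
)
print(parts["protocol"], parts["host"], parts["directory"], parts["file"])
```

For production use, the standard library's `urllib.parse.urlsplit` is usually a safer choice than a hand-written pattern; the port here is only meant to show what each group captures.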
In this article, I will share some of my experience installing the NVIDIA driver and CUDA on Linux. I mainly use Ubuntu as the example, with comments for CentOS/Fedora provided where I can.
This was tested on a ThinkPad P70 laptop with Intel integrated graphics and an NVIDIA GPU:
lspci | egrep 'VGA|3D'
00:02.0 VGA compatible controller: Intel Corporation Device 191b (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GM204GLM [Quadro M3000M] (rev a1)
One reason to use the integrated graphics for display is that installing the NVIDIA drivers can cause the display to stop working properly.
In my case, Ubuntu would get stuck in a login loop after installing the NVIDIA drivers.
This happened regardless of whether I installed the drivers from the "Additional Drivers" tab in "System Settings" or from ppa:graphics-drivers/ppa on the command line.
" show hidden characters in Vim
:set list
" settings for hidden chars
" what particular chars they are displayed with
:set lcs=tab:▒░,trail:▓
" or
:set listchars=tab:▒░,trail:▓
" used \u2592\u2591 for tab and \u2593 for trailing spaces in line.
" In Vim help they suggest using ">-" for tab and "-" for trail.
import pandas as pd


def _map_to_pandas(rdds):
    """ Needs to be here due to pickling issues """
    return [pd.DataFrame(list(rdds))]


def toPandas(df, n_partitions=None):
    """
    Returns the contents of `df` as a local `pandas.DataFrame` in a speedy fashion. The DataFrame is
    repartitioned if `n_partitions` is passed.
ssh-keygen -t rsa -b 4096 -N '' -C "[email protected]" -f ~/.ssh/id_rsa
ssh-keygen -t rsa -b 4096 -N '' -C "[email protected]" -f ~/.ssh/github_rsa
ssh-keygen -t rsa -b 4096 -N '' -C "[email protected]" -f ~/.ssh/mozilla_rsa