### Tested with:
- Spark 2.0.0 pre-built for Hadoop 2.7
- Mac OS X 10.11
- Python 3.5.2
Use S3 from PySpark with minimal hassle.
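As a minimal sketch of the idea (the bucket name and credential variables below are placeholders, and s3a access also requires the matching `hadoop-aws` jar on the classpath):

```python
# Sketch: the standard fs.s3a.* Hadoop properties PySpark needs for S3 access.
# Key names are the real property names; the values are placeholders.
def s3a_conf(access_key, secret_key):
    """Return the Hadoop property/value pairs for s3a credentials."""
    return {
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
    }

# In a live session you would apply them to the SparkContext, e.g.:
#   for k, v in s3a_conf(ACCESS_KEY, SECRET_KEY).items():
#       sc._jsc.hadoopConfiguration().set(k, v)
#   rdd = sc.textFile("s3a://my-bucket/some/key.txt")
```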
```shell
# Install R + RStudio on Ubuntu 14.04
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
# Ubuntu 12.04: precise
# Ubuntu 14.04: trusty
# Ubuntu 16.04: xenial
# Basic format of next line: deb https://<my.favorite.cran.mirror>/bin/linux/ubuntu <your ubuntu codename>/
sudo add-apt-repository 'deb https://ftp.ussg.iu.edu/CRAN/bin/linux/ubuntu trusty/'
sudo apt-get update
```
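The repository setup above covers R itself; a plausible completion of the install (the RStudio `.deb` filename is a placeholder — check rstudio.com for the current version):

```shell
# Install R from the CRAN repository added above
sudo apt-get install r-base r-base-dev

# Install RStudio Desktop from a downloaded .deb (filename is a placeholder)
sudo apt-get install gdebi-core
sudo gdebi rstudio-<version>-amd64.deb
```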
1. What is the difference between the Secondary NameNode, Checkpoint NameNode, and Backup Node? (The Secondary NameNode is a poorly named component of Hadoop.)
2. What are the side data distribution techniques?
3. What is shuffling in MapReduce?
4. What is partitioning?
5. Can we change the file cached by the DistributedCache?
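Question 4 can be illustrated with a sketch: a partitioner decides which reducer receives each key, typically by hashing, which mirrors Hadoop's default `HashPartitioner` (the function names here are illustrative, not Hadoop API):

```python
# Sketch of the default hash-partitioning rule used in MapReduce:
# every record with the same key is routed to the same reducer.
def partition(key, num_reducers):
    """Return the reducer index for a given key (mirrors HashPartitioner)."""
    return hash(key) % num_reducers

def group_by_reducer(pairs, num_reducers):
    """Assign (key, value) pairs to reducer buckets."""
    buckets = {r: [] for r in range(num_reducers)}
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return buckets
```

Because all values for a key land in the same bucket, each reducer can aggregate per key independently — that is the property the shuffle phase (question 3) exists to guarantee.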
```java
// Find the minimum path sum (from root to leaf)
public static int minPathSum(TreeNode root) {
    if (root == null) return 0;
    int sum = root.val;
    int leftSum = minPathSum(root.left);
    int rightSum = minPathSum(root.right);
    if (leftSum < rightSum) {
        sum += leftSum;
    } else {
        sum += rightSum;
    }
    return sum;
}
```
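The same recursion can be sketched in Python for a quick check (`TreeNode` here is a minimal stand-in, not from the original gist). Note that, as in the Java version, a node with a single missing child is treated as a zero-cost branch, so this logic is only reliable on full trees:

```python
# Minimal stand-in tree node for exercising the recursion above.
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def min_path_sum(root):
    """Minimum root-to-leaf path sum (same logic as the Java version)."""
    if root is None:
        return 0
    return root.val + min(min_path_sum(root.left), min_path_sum(root.right))
```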
```python
# MWS API docs at http://docs.developer.amazonservices.com/en_US/orders-2013-09-01/Orders_Datatypes.html#Order
# MWS Scratchpad at https://mws.amazonservices.com/scratchpad/index.html
# Boto docs at http://docs.pythonboto.org/en/latest/ref/mws.html?#module-boto.mws
from boto.mws.connection import MWSConnection

...

# Provide your credentials.
conn = MWSConnection(
    aws_access_key_id=AWS_ACCESS_KEY_ID,        # placeholder variables --
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,  # fill in your own values
    Merchant=MERCHANT_ID)
```
Picking the right architecture = Picking the right battles + Managing trade-offs
People

| | | |
|---|---|---|
| :bowtie: | :smile: | :laughing: |
| :blush: | :smiley: | :relaxed: |
| :smirk: | :heart_eyes: | :kissing_heart: |
| :kissing_closed_eyes: | :flushed: | :relieved: |
| :satisfied: | :grin: | :wink: |
| :stuck_out_tongue_winking_eye: | :stuck_out_tongue_closed_eyes: | :grinning: |
| :kissing: | :kissing_smiling_eyes: | :stuck_out_tongue: |
/!\ Be very careful with your setup: any misconfiguration makes the whole git config fail silently! Go through this guide step by step and it should be fine.
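As an illustration (the host aliases, key filenames, and identities below are hypothetical), mapping one SSH key per repository host in `~/.ssh/config` looks like:

```
# ~/.ssh/config -- one Host alias per identity/repository
Host github-work
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_rsa_work
    IdentitiesOnly yes

Host github-personal
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_rsa_personal
    IdentitiesOnly yes
```

You then clone with the alias, e.g. `git clone git@github-work:org/repo.git`, so ssh picks the matching key; `IdentitiesOnly yes` stops ssh from offering every loaded key.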
In `~/.ssh/config`, set a separate SSH key for each repository.

```python
import multiprocessing

def do_this(number):
    print(number)
    return number * 2

# Create a list to iterate over.
# (Note: multiprocessing workers receive one item at a time.)
some_list = range(0, 10)
```
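The snippet above stops before the worker pool itself; a minimal completion (the `parallel_double` wrapper and pool size are added here for illustration, not from the gist):

```python
import multiprocessing

def do_this(number):
    return number * 2

def parallel_double(items, workers=2):
    """Fan items out to a worker pool and collect results in input order."""
    with multiprocessing.Pool(workers) as pool:
        return pool.map(do_this, items)

if __name__ == "__main__":
    print(parallel_double(range(0, 10)))
```

`Pool.map` handles the one-item-at-a-time constraint noted above: it pickles each element, sends it to a worker, and reassembles the results in order.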
```python
import copy

# write to a path using the Hudi format
def hudi_write(df, schema, table, path, mode, hudi_options):
    # NOTE: the values below are placeholder strings from the source;
    # replace them with your actual column names and settings.
    hudi_options = {
        "hoodie.datasource.write.recordkey.field": "recordkey",
        "hoodie.datasource.write.precombine.field": "precombine_field",
        "hoodie.datasource.write.partitionpath.field": "partitionpath_field",
        "hoodie.datasource.write.operation": "write_operation",
        "hoodie.datasource.write.table.type": "table_type",
    }
    # (snippet truncated in the source; the save call below is a
    # reconstruction of the usual Hudi datasource write pattern)
    df.write.format("hudi").options(**hudi_options).mode(mode).save(path)
```