Confirmed working with:
- Homebrew
In a classic Hadoop job, you've got mappers and reducers. The "things" being mapped and reduced are key-value pairs, for some arbitrary pair of types. Most of your parallelism comes from the mappers, since they can (ideally) split the data and transform it without any coordination with other processes.
By contrast, parallelism in the reduction phase has an important limitation: you may have many reducers, but all the values for any given key are guaranteed to land on a single reducer.
So if there are a HUGE number of values for some particular key, you're going to have a bottleneck because they're all going to be processed by a single reducer.
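To make that grouping guarantee concrete, here's a toy, purely local analogy of the map → shuffle → reduce pipeline. This is plain shell rather than Hadoop: `sort` stands in for the shuffle and `uniq -c` for a reducer.

```bash
# Toy word count: the "map" step emits one key (word) per line, the shuffle
# groups identical keys together, and the "reducer" counts the values per key.
printf '%s\n' the quick brown fox jumps over the lazy dog |
  sort |     # shuffle: all occurrences of a key end up adjacent (same "reducer")
  uniq -c |  # reduce: one output line per key, with its count
  sort -rn   # hot keys (here, "the") float to the top
```

If one key dominated the input, the `uniq -c` "reducer" would spend nearly all its time chewing through that single run of identical lines; that's the same single-reducer bottleneck described above.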
However, there is another way! Certain types of data fit into a pattern:
```yaml
name: CI
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    services:
```

```bash
#!/usr/bin/env bash
# Find the latest Amazon-created "Deep Learning AMI (Ubuntu 18.04)" AMI image ID
#
# args explanation:
#   --region us-east-1
#       Specifies the AWS region (you can also specify it in your
#       ~/.aws/config or via the `AWS_REGION` or `AWS_DEFAULT_REGION`
#       env vars)
#
```
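
The script is cut off above before it gets to the actual lookup. As a sketch of where it's presumably headed, this kind of lookup is typically an `aws ec2 describe-images` call; the owner, name filter, and JMESPath query below are assumptions based on the header comment, not taken from the original script.

```bash
# Sketch only: the --owners, --filters, and --query values are guesses at what
# the truncated script does, based on its header comment.
aws ec2 describe-images \
  --region us-east-1 \
  --owners amazon \
  --filters 'Name=name,Values=Deep Learning AMI (Ubuntu 18.04)*' \
  --query 'sort_by(Images, &CreationDate)[-1].ImageId' \
  --output text
```

`sort_by` orders the matching images by creation date, so `[-1]` picks the most recently published one, and `--output text` prints just the bare AMI ID.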