- If values are integers in [0, 255], Parquet will automatically compress them to 1-byte unsigned integers, decreasing the size of the saved DataFrame by a factor of 8.
- Partition DataFrames to have evenly-distributed, ~128MB partition sizes (empirical finding). Always err on the higher side w.r.t. number of partitions.
- Pay particular attention to the number of partitions when using `flatMap`, especially if the following operation will result in high memory usage. The `flatMap` op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions will remain the same. Thus, if a subsequent op causes a large expansion of memory usage (e.g. converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output of `flatMap` to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
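The partition-sizing advice above can be turned into a quick back-of-the-envelope calculation. The following is a minimal sketch, not part of the original notes: the helper name `suggest_num_partitions` and the row and vector sizes are hypothetical, and the only number taken from the notes is the ~128MB target.

```python
# ~128MB per partition, the empirical target mentioned in the notes above.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def suggest_num_partitions(estimated_total_bytes, target_bytes=TARGET_PARTITION_BYTES):
    """Return a partition count that keeps each partition at or below target_bytes.

    Hypothetical helper: the size estimate must come from the caller.
    """
    if estimated_total_bytes <= 0:
        return 1
    # Integer ceiling division, so we err on the side of more partitions.
    return (estimated_total_bytes + target_bytes - 1) // target_bytes


# Assumed workload: 50 million rows after flatMap, each later expanded
# into a vector of 1,000 8-byte doubles.
rows_after_flatmap = 50_000_000
bytes_per_expanded_row = 1_000 * 8
num_partitions = suggest_num_partitions(rows_after_flatmap * bytes_per_expanded_row)
print(num_partitions)
```

In a Spark job, a value computed this way could be passed to `repartition` on the output of `flatMap`, before the memory-heavy op runs.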
// Async/Await requirements: Latest Chrome/FF browser or Babel: https://babeljs.io/docs/plugins/transform-async-to-generator/
// Fetch requirements: Latest Chrome/FF browser or Github fetch polyfill: https://github.com/github/fetch
// async function
async function fetchAsync () {
  // await response of fetch call
  let response = await fetch('https://api.github.com');
  // only proceed once promise is resolved
  let data = await response.json();
  // only proceed once second promise is resolved
  return data;
}
- act2vec, trace2vec, log2vec, model2vec https://link.springer.com/chapter/10.1007/978-3-319-98648-7_18
- apk2vec https://arxiv.org/abs/1809.05693
- app2vec http://paul.rutgers.edu/~qma/research/ma_app2vec.pdf
- ast2vec https://arxiv.org/abs/2103.11614
- attribute2vec https://arxiv.org/abs/2004.01375
- author2vec http://dl.acm.org/citation.cfm?id=2889382
- baller2vec https://arxiv.org/abs/2102.03291
- bb2vec https://arxiv.org/abs/1809.09621
from numpy.linalg import solve

class ExplicitMF():
    def __init__(self,
                 ratings,
                 n_factors=40,
                 item_reg=0.0,
                 user_reg=0.0,
                 verbose=False):
        """
# Add field
echo '{"hello": "world"}' | jq --arg foo bar '. + {foo: $foo}'
# {
#   "hello": "world",
#   "foo": "bar"
# }
# Override field value
echo '{"hello": "world"}' | jq --arg foo bar '. + {hello: $foo}'
{
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"log"
	"net"
	"os"
import com.thinkaurelius.titan.core.TitanFactory;
import com.thinkaurelius.titan.core.TitanGraph;
import com.thinkaurelius.titan.core.TitanKey;
import com.thinkaurelius.titan.core.attribute.Geoshape;
import com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration;
import com.tinkerpop.blueprints.Edge;
import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.util.ElementHelper;
import org.apache.commons.configuration.BaseConfiguration;
import org.apache.commons.configuration.Configuration;
linux.img
.lock
record
.gdbinit
# Coverage targets
if HAVE_GCOV
.PHONY: clean-gcda
clean-gcda:
	@echo Removing old coverage results
	-find -name '*.gcda' -print | xargs -r rm
.PHONY: coverage-html generate-coverage-html clean-coverage-html
Each of these commands will run an ad hoc http static server in your current (or specified) directory, available at http://localhost:8000. Use this power wisely.
$ python -m SimpleHTTPServer 8000
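
`SimpleHTTPServer` is the Python 2 module name; in Python 3 the same functionality lives in `http.server`. As a rough sketch of what such a one-liner does, the following builds an equivalent static file server programmatically. The helper name `make_static_server` and its defaults are illustrative assumptions, not part of the original list.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_static_server(directory=".", port=8000, host="127.0.0.1"):
    """Build (but do not start) a static file server rooted at `directory`.

    Hypothetical helper; port=0 asks the OS for any free port.
    """
    # SimpleHTTPRequestHandler accepts a `directory` keyword (Python 3.7+).
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)


# Usage (blocks until interrupted with Ctrl+C):
#   make_static_server(".", 8000).serve_forever()
```

Binding to `127.0.0.1` keeps the server reachable only from the local machine, which is a cautious default for an ad hoc server.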