Tested with Apache Spark 2.1.0, Python 2.7.13 and Java 1.8.0_112
For older versions of Spark and IPython, please see the previous version of this text.
from sre_parse import Pattern, SubPattern, parse as sre_parse
from sre_compile import compile as sre_compile
from sre_constants import BRANCH, SUBPATTERN

class Scanner(object):
    # Regex-based tokenizer built on the sre internals.
    # NOTE: everything after pat = Pattern() is an assumed completion that follows
    # the standard stdlib re.Scanner recipe; the original snippet is truncated there.
    def __init__(self, tokens, flags=0):
        subpatterns = []
        pat = Pattern()
        pat.flags = flags
        for phrase, action in tokens:  # each (pattern, action) pair gets its own group
            subpatterns.append(SubPattern(pat, [(SUBPATTERN, (len(subpatterns) + 1, sre_parse(phrase, flags)))]))
        pat.groups = len(subpatterns) + 1
        self.scanner = sre_compile(SubPattern(pat, [(BRANCH, (None, subpatterns))]))
The point of this is to use cheap machines with small/slow storage to coordinate client requests while dedicating the machines with the big and fast storage to doing what they do best. I found that request coordination was contributing to about half the CPU usage on our Cassandra nodes, on average. Solid state storage is quite expensive, nearly doubling the cost of typical hardware. It also means that if people have control over hardware placement within the network, they can place proxy nodes closer to the client without impacting their storage footprint or fault tolerance characteristics.
This is accomplished in Cassandra by passing the -Dcassandra.join_ring=false option when the process is started. These nodes will connect to the seeds, cache the gossip data, load the schema, and begin listening for client requests. Messages like "/x.x.x.x is now UP!" will appear on the other nodes.
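For example (a minimal sketch, assuming the stock startup script and cassandra-env.sh; adjust paths for your install), the option can be passed on the command line or set permanently in the environment file:

cassandra -Dcassandra.join_ring=false
# or, equivalently, add this line to cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -Dcassandra.join_ring=false"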
There are also some more practical benefits to this. Handling client requests caused us to push the NewSize of the heap up
Short version: I strongly recommend against using any of these providers. You are, of course, free to use whatever you like. My TL;DR advice: roll your own and use Algo or Streisand. For messaging & voice, use Signal. For increased anonymity, use Tor for desktop (though recognize that doing so may actually put you at greater risk), and Onion Browser for mobile.
This mini-rant came on the heels of an interesting Twitter discussion: https://twitter.com/kennwhite/status/591074055018582016
2015-01-29 Unofficial Relay FAQ
Compilation of questions and answers about Relay from React.js Conf.
Disclaimer: I work on Relay at Facebook. Relay is a complex system on which we're iterating aggressively. I'll do my best here to provide accurate, useful answers, but the details are subject to change. I may also be wrong. Feedback and additional questions are welcome.
Relay is a new framework from Facebook that provides data-fetching functionality for React applications. It was announced at React.js Conf (January 2015).
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import subprocess
import time

VERBOSE = False


def wait(seconds):
    # Assumed implementation: simple sleep helper used between the script's steps.
    time.sleep(seconds)
# -*- coding: utf-8 -*-
from __future__ import absolute_import
import os

from scrapy.contrib.httpcache import FilesystemCacheStorage

from .dupefilter import splash_request_fingerprint


class SplashAwareFSCacheStorage(FilesystemCacheStorage):
    def _get_request_path(self, spider, request):
        # Same layout as the stock storage, but keyed on the Splash-aware fingerprint
        # so requests that differ only in their Splash arguments get distinct cache entries.
        key = splash_request_fingerprint(request)
        return os.path.join(self.cachedir, spider.name, key[0:2], key)
"""XPath extension functions for lxml, inspired by: | |
https://gist.github.com/shirk3y/458224083ce5464627bc | |
Usage: | |
import xpathfuncs; xpathfuncs.setup() | |
""" | |
import string |
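The module above is truncated. As a rough, hypothetical sketch of how such a setup() typically registers functions, the snippet below uses lxml's real FunctionNamespace API but with an invented lowercase() example rather than the gist's actual functions:

from lxml import etree

def lower_case(context, values):
    # lxml passes string arguments as plain (smart) strings and node-sets as lists;
    # this only handles the common string-ish cases.
    if isinstance(values, list):
        values = values[0] if values else u''
    return values.lower()

def setup():
    # Register in the default (un-prefixed) XPath function namespace, so the
    # function can be called from XPath expressions, e.g. //a[lowercase(@href)].
    ns = etree.FunctionNamespace(None)
    ns['lowercase'] = lower_case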
(by @andrestaltz)
If you prefer to watch video tutorials with live-coding, then check out this series I recorded with the same contents as in this article: Egghead.io - Introduction to Reactive Programming.
Magic words:
psql -U postgres
Some interesting flags (to see all, use -h or --help depending on your psql version):
-E: will describe the underlying queries of the backslash (\) commands (cool for learning!)
-l: psql will list all databases and then exit (useful if the user you connect with doesn't have a default database, like at AWS RDS)
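For example (assuming a local server and the default postgres superuser), start an interactive session with -E and run any backslash command; psql prints the hidden catalog query it executes before showing the result:

psql -E -U postgres
postgres=# \l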