- Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
- Models and Issues in Data Stream Systems
- Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the hostorical perspectives and how it all began with Flajolet
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from flask import Flask | |
from flask import request | |
app = Flask(__name__) | |
#Normally this would be an external file like object, but here | |
#it's inlined | |
FORM_PAGE = """ | |
<html> | |
<head> | |
<title>Flask Form</title> |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Latency Comparison Numbers (~2012) | |
---------------------------------- | |
L1 cache reference 0.5 ns | |
Branch mispredict 5 ns | |
L2 cache reference 7 ns 14x L1 cache | |
Mutex lock/unlock 25 ns | |
Main memory reference 100 ns 20x L2 cache, 200x L1 cache | |
Compress 1K bytes with Zippy 3,000 ns 3 us | |
Send 1K bytes over 1 Gbps network 10,000 ns 10 us | |
Read 4K randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def nth_percentile(dataset, percentile = 90): | |
sorted_dataset = sorted(dataset) | |
new_length = len(sorted_dataset) * percentile / 100 | |
return sorted_dataset[0:new_length] | |
def mean(dataset): | |
return sum(dataset)/float(len(dataset)) | |
dataset = [5, 9, 7, 101, 4, 8, 109, 104, 6, 1, 110, 106, 3, 107, 105, 2, 102, 10, 103, 108] | |
percentile_90 = nth_percentile(dataset) |
mathclub是最近做的一个个人项目,帮助考SAT的同学通过在线做题、回顾、问答提高成绩。用户功能有:计次/计时做题、成绩单、错题分布、错题回顾、提问、汇总以及注册登录。管理后台主要是题库管理、学员管理、成绩单管理、问题回复。怎么看都像学校里的课设,的确项目本身并不出奇,开发上选用的一些方案或许更有意思。
整个项目一个人从产品需求、原型设计、前后端开发到部署历时2周左右。可以从截图上感受一下:
技术选型上服务端是Node.js,应用框架选了老牌的Express(4.x变化挺大不少中间件都废了),数据服务用的是MongoLab(MongoDB的云服务平台),图片上传用的是又拍云,程序部署在Nodejitsu上。模板引擎没选主流的Jade或ejs,而是用Express React Views它实现了在服务端渲染React组件。前端框架是用React,这次有意想追求前后端全部组件化的组织。之前是用Webpack实现CommonJS模块打包,这次用Browserify配置更简单,它有丰富的transform很赞,其中的reactify转换React的JSX很完美。CSS用Sass+autoprefixer让人省心。将这一切串起来的自动构建工具是Gulp。我其实崇尚用最精简的工具组合开发,上述组合在我看来比较精简了。(帖纸留念)
 into a PNG:
{
"cmd": [ "dot", "-Tpng", "-o", "$file_base_name.png", "$file"],
"selector": "source.dot"
}
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Note that this is sandbox. If you have not registered profiles in sandbox you | |
will need to do so prior to making these calls. You will also need at least | |
one campaign, ad group, and keyword before you see anything in the report. | |
While in sandbox, we will return “dummy” data so that you can see how an actual | |
report would look. | |
To make it look nicer, I always export my access token prior to making calls. | |
You could do the same for API-Scope (profile Id) if you wish. Make sure to use | |
quotes around the access token if it isn’t URL encoded. It has a | (pipe) | |
symbol and your shell won’t like it. |
OlderNewer