💭
Big Data Science

Mohammad Heydari MohammadHeydari

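import requests

# Fetch the landing page of the scraping sandbox site
main_url = "http://books.toscrape.com/index.html"
result = requests.get(main_url)
# Preview the first 1000 characters of the returned HTML
result.text[:1000]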
import tensorflow as tf
import numpy as np
corpus_raw = 'He is the king . The king is royal . She is the royal queen '
# convert to lower case
corpus_raw = corpus_raw.lower()
words = []
for word in corpus_raw.split():
@wagnerjgoncalves
wagnerjgoncalves / week_1.md
Created April 27, 2016 23:36
Introduction Big Data

What's in Big Data Applications and Systems?

Introduction

We start by introducing where big data comes from and what kinds of things you can do with it.

We'll also give an overview of the key characteristics of big data and a short summary of the data science process used to extract value from it.

@PurpleBooth
PurpleBooth / README-Template.md
Last active November 17, 2024 18:07
A template to make good README.md

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

@rbnvrw
rbnvrw / community_detection.py
Last active May 7, 2022 09:34
python-igraph example
from igraph import *
import numpy as np
# Create the graph
vertices = [i for i in range(7)]
edges = [(0,2),(0,1),(0,3),(1,0),(1,2),(1,3),(2,0),(2,1),(2,3),(3,0),(3,1),(3,2),(2,4),(4,5),(4,6),(5,4),(5,6),(6,4),(6,5)]
g = Graph(vertex_attrs={"label":vertices}, edges=edges, directed=True)
visual_style = {}
@ceteri
ceteri / log.scala
Last active May 14, 2020 13:12
Intro to Apache Spark: code example for RDD animation
// load error messages from a log into memory
// then interactively search for various patterns
// base RDD
val lines = sc.textFile("log.txt")
// transformed RDDs
val errors = lines.filter(_.startsWith("ERROR"))
val messages = errors.map(_.split("\t")).map(r => r(1))
messages.cache()
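For readers without a Spark shell at hand, the filter-then-map shape of the pipeline above can be sketched in plain Python over an in-memory list. This is only an analogy, not Spark: there are no RDDs, no lazy evaluation, and no caching. The function name `error_messages` and the sample log lines are hypothetical, invented for illustration.

```python
# Hypothetical sample lines; the Spark example reads them from log.txt.
sample_lines = [
    "ERROR\tdisk full",
    "INFO\tjob started",
    "ERROR\ttimeout contacting worker",
]


def error_messages(lines):
    """Keep lines that start with "ERROR" and return their second tab-separated field."""
    # Equivalent in spirit to lines.filter(_.startsWith("ERROR"))
    errors = (line for line in lines if line.startswith("ERROR"))
    # Equivalent in spirit to errors.map(_.split("\t")).map(r => r(1))
    return [line.split("\t")[1] for line in errors]


messages = error_messages(sample_lines)
print(messages)  # ['disk full', 'timeout contacting worker']
```

In Spark the result would stay distributed and be reused across queries after `cache()`; here the list simply lives in local memory.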
@ceteri
ceteri / clk.tsv
Last active May 14, 2020 13:13
Intro to Apache Spark: code example for (K,V), join, operator graph
2014-03-04 15dfb8e6cc4111e3a5bb600308919594 11
2014-03-06 81da510acc4111e387f3600308919594 61
@ceteri
ceteri / 01.repl.txt
Last active April 17, 2022 18:46
Intro to Apache Spark: general code examples
$ ./bin/spark-shell
14/04/18 15:23:49 INFO spark.HttpServer: Starting HTTP Server
14/04/18 15:23:49 INFO server.Server: jetty-7.x.y-SNAPSHOT
14/04/18 15:23:49 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:49861
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 0.9.1
      /_/
@clayadavis
clayadavis / nx_to_d3.py
Last active May 29, 2019 20:47
Convert networkx graph to d3 graph
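import json
# Assumes G is a networkx graph whose nodes carry 'question_id' and 'count' and whose edges carry 'weight'
node_list = list(G)
index = {n: i for i, n in enumerate(node_list)}  # d3 links refer to nodes by position
nodes = [{'name': n, 'group': G.nodes[n]['question_id'], 'size': G.nodes[n]['count']} for n in node_list]
edges = [{'source': index[s], 'target': index[t], 'value': d['weight']} for s, t, d in G.edges(data=True)]
with open('filename.json', 'w') as f:
    json.dump({'nodes': nodes, 'links': edges}, f)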
@steveburkett
steveburkett / gist:4583542
Last active October 28, 2019 16:31
Given an input N, print 'hello world' N times.
#include <iostream>
using namespace std;

int main() {
    int i, n;
    cin >> n;
    for (i = 0; i < n; i++) {
        cout << "hello world\n";
    }
    return 0;
}