#!/usr/bin/python
# version 0.1 Wenming Ye 2/25/2012
# Extract English, text-only content from the Gutenberg DVD (2010).
# If you have questions, please contact me for the latest version.
# Feel free to modify the scripts to suit your needs.
# STEP 1: Run this in the Cygwin environment. If you don't want to use Cygwin, you can modify the "cp" command embedded in the script.
# This file parses the HTML index pages (TITLES) and finds English-language books and their ZIP resource URLs.
# Run this in the main Gutenberg INDEXES dir, "www.gutenberg.org/INDEXES".
# It removes PDF, HTML, image, and non-English items. All the zip files will be copied into INDEXES/zips.
# STEP 2: Then you can extract all the zip files by running >>>>find ./ -name "*.zip" -exec unzip -o {} \;<<<<
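For anyone who would rather not shell out to find and unzip, STEP 2 can be approximated in Python. This is only a sketch under assumptions: the function name `extract_all_zips` and the choice to skip corrupt archives are illustrative and are not part of the original script.

```python
import zipfile
from pathlib import Path


def extract_all_zips(root, dest):
    """Extract every .zip found under root into dest, overwriting existing files.

    Returns the list of archives that were extracted successfully.
    """
    root, dest = Path(root), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    # sorted() materializes the listing first, so archives that appear
    # during extraction are not picked up in the same pass.
    for zip_path in sorted(root.rglob("*.zip")):
        try:
            with zipfile.ZipFile(zip_path) as archive:
                archive.extractall(dest)
            extracted.append(zip_path)
        except zipfile.BadZipFile:
            # Assumption: a corrupt archive is reported and skipped, not fatal.
            print("skipping unreadable archive: %s" % zip_path)
    return extracted
```

Calling `extract_all_zips("zips", "text")` would unpack everything under `zips` into `text`; both directory names are placeholders.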
from azure.storage import *
import base64
import os

def upload(blob_service, container_name, blob_name, file_path):
    blob_service.create_container(container_name, None, None, False)
    blob_service.put_blob(container_name, blob_name, '', 'BlockBlob')
    chunk_size = 65536
    block_ids = []
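The rest of the upload function is not shown here. As a rough sketch of the chunking step it appears to be setting up, the helper below splits a file into 64 KiB pieces and pairs each with a Base64-encoded block ID. The name `iter_blocks` and the zero-padded counter scheme are assumptions for illustration; the actual calls that send blocks to blob storage are deliberately left out.

```python
import base64

CHUNK_SIZE = 65536  # same chunk size as the snippet above


def iter_blocks(file_path, chunk_size=CHUNK_SIZE):
    """Yield (block_id, data) pairs for a file, one pair per chunk."""
    with open(file_path, "rb") as handle:
        index = 0
        while True:
            data = handle.read(chunk_size)
            if not data:
                break
            # A zero-padded counter keeps every encoded ID the same length,
            # which block blobs require for the IDs within one blob.
            raw_id = ("%08d" % index).encode("ascii")
            block_id = base64.b64encode(raw_id).decode("ascii")
            yield block_id, data
            index += 1
```

A caller would upload each `data` chunk under its `block_id`, collect the IDs in order (the role `block_ids` seems to play above), and then commit the list.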
REM install the redist so that openssl will not complain w/vc9 -- could use %ROLEROOT% here
vcredist_x86.exe /q
REM set a path for openssl and freetds
SET PATH=%PATH%;E:\lib\freedts;E:\lib\openssl
REM Download directly from rubyinstaller.org
E:
powershell -c "(new-object System.Net.WebClient).DownloadFile('http://rubyforge.org/frs/download.php/75894/railsinstaller-2.1.0.exe', 'railsinstaller.exe')"
REM install silently
railsinstaller.exe /verysilent /dir="%RUBY_PATH%" /tasks="assocfiles,modpath"
REM remove any tiny tds copies
Microsoft-Web/PRESENTATION-Keynote
Microsoft-Web/PRESENTATION-BuildingServiceLayerWithASPNETWebAPI
Microsoft-Web/DEMO-BuildingForTheMobileWeb
Microsoft-Web/DEMO-BuildingServiceLayerWithASPNETWebAPI
Microsoft-Web/PRESENTATION-BuildingForTheMobileWeb
Microsoft-Web/PRESENTATION-BuildingSocialWebApps
Microsoft-Web/DEMO-BuildingSocialWebApps
Microsoft-Web/PRESENTATION-UsingCloudApplicationServices
Microsoft-Web/PRESENTATION-HTML5andjQuery
Microsoft-Web/PRESENTATION-RealtimeCommunicationsWithSignalR
Raw notes for downloading 6.6+ million files from blob storage within hours, using a few simple tools on a single machine.
1. Get a list of files from blob storage. A few lines of C# code will do.
// In app config.
<configuration>
  <appSettings>
    <add key="StorageConnectionString"
         value="DefaultEndpointsProtocol=https;AccountName=storagename;AccountKey=yourkey" />
  </appSettings>
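The connection string in the config above is a semicolon-separated list of key=value pairs. As a small illustration of its layout (not the C# listing code the notes refer to), the hypothetical helper below splits it into a dictionary. It splits each pair on the first "=" only, on the assumption that a real account key is Base64 text that may itself end in "=".

```python
def parse_connection_string(value):
    """Split a 'Key=Value;Key=Value' connection string into a dict."""
    settings = {}
    for part in value.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        # partition() splits on the first '=' only, so '=' padding
        # inside the value is preserved.
        key, _, setting = part.partition("=")
        settings[key.strip()] = setting
    return settings
```

With the placeholder string from the config, `parse_connection_string(...)["AccountName"]` would give `storagename`.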
#!/usr/bin/python
import os
import sys
import json
import pprint
file = open("twitter_stream_seq2.txt", 'r')
lines = file.readlines()
i = 0
str = ""
Hadooponazure.com is strictly a private CTP for Microsoft's Hadoop distribution. It supports Hive, Pig, a JavaScript console, and a web portal. You can also use Terminal Services to connect to the actual clusters as needed. The training kit includes a deck and a number of tutorials.
You should also be able to find content on windowsazure.com:
http://www.windowsazure.com/en-us/develop/net/scenarios/big-data/
http://www.windowsazure.com/en-us/develop/net/how-to-guides/hadoop/
I recommend going through at least one of these tutorials:
http://www.windowsazure.com/en-us/develop/net/tutorials/hadoop-marketplace/
Perhaps also look at this deck, in addition to the one included in the training kit: http://view.officeapps.live.com/op/view.aspx?src=http%3a%2f%2fvideo.ch9.ms%2fteched%2f2012%2fna%2fAZR325.pptx
Definition of HPC:
High Performance Computing (HPC) is the use of servers, clusters, and supercomputers – plus associated software, tools, components, storage, and services – for scientific, engineering, or analytical tasks that are particularly intensive in computation, memory usage, or data management. HPC is used by scientists and engineers both in research and in production across industry, government, and academia. Within industry, HPC can frequently be distinguished from general business computing in that companies generally will use HPC applications to gain advantage in their core endeavors – e.g., finding oil, designing automobile parts, or protecting clients’ investments – as opposed to non-core endeavors such as payroll management or resource planning.
The Azure HPC Scheduler is a great way to run batch workloads, including but not limited to HPC.
The Azure HPC Scheduler includes three programming models:
MPI, SOA, and parametric sweep.
MPI is a traditional HPC programming model which you can look up on
{
  "coordinates": null,
  "created_at": "Thu Oct 21 16:02:46 +0000 2010",
  "favorited": false,
  "truncated": false,
  "id_str": "28039652140",
  "entities": {
    "urls": [
      {
        "expanded_url": null,
      program convertsongdat
c
      integer nrows, ncols, tnnz
      integer nrow(384546), matrix(384546)
      integer rowe, matrixe, rowi, coli, ncol, nonzero
c
      nrows=384546
      ncols=1019318
      tnnz=48373586