Skip to content

Instantly share code, notes, and snippets.

View tsycho's full-sized avatar

Asif tsycho

View GitHub Profile
@dannguyen
dannguyen / README.md
Last active September 10, 2024 19:41
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.

On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regards to its physical coordinates. I've included two examples:

####### 1. A low-resolution photo of road signs

import Foundation
public func getASTString() -> String {
// get the file path for the file "test.json" in the playground bundle
// let filePath = NSBundle.mainBundle().pathForResource("FirstTtest", ofType: "ast")
// get the contentData
let contentData = NSFileManager.defaultManager().contentsAtPath("a.txt")
@AlamofireSoftwareFoundation
AlamofireSoftwareFoundation / alamofire-04-27-2015-statement.md
Last active January 10, 2020 06:25
A response to recent concerns about security vulnerabilities in AFNetworking

Last week, a number of publications ran a story about 1,000's of apps allegedly being vulnerable due to an SSL bug in AFNetworking. These articles contain a number of inaccurate and misleading statements on this matter.

We are publishing this response to clarify and correct these statements.

Background Information

For those not familiar with AFNetworking, here are some relevant details about the library for this story:

  • AFNetworking is an open source, third-party library that provides convenience functionality on top of Apple's built-in frameworks.
  • One component of AFNetworking is AFSecurityPolicy, which handles authentication challenges according to a policy configured by the application. This includes the evaluation of X.509 certificates which servers send back when connecting over HTTPS.
@jspahrsummers
jspahrsummers / GHRunLoopWatchdog.h
Created January 28, 2015 20:50
A class for logging excessive blocking on the main thread
/// Observes a run loop to detect any stalling or blocking that occurs.
///
/// This class is thread-safe.
@interface GHRunLoopWatchdog : NSObject
/// Initializes the receiver to watch the specified run loop, using a default
/// stalling threshold.
- (id)initWithRunLoop:(CFRunLoopRef)runLoop;
/// Initializes the receiver to detect when the specified run loop blocks for
@christiangenco
christiangenco / hn_impersonator.rb
Created October 7, 2014 18:46
Impersonate your favorite HN commenter
require 'http'
require 'json'
require 'peach'
require 'gabbler'
require 'pry'
USERNAME = "patio11"
unless File.exists?("comments.txt")
def get_json(url)
@staltz
staltz / introrx.md
Last active November 17, 2024 01:08
The introduction to Reactive Programming you've been missing
@gruber
gruber / Liberal Regex Pattern for Web URLs
Last active November 15, 2024 16:37
Liberal, Accurate Regex Pattern for Matching Web URLs
The regex patterns in this gist are intended only to match web URLs -- http,
https, and naked domains like "example.com". For a pattern that attempts to
match all URLs, regardless of protocol, see: https://gist.github.com/gruber/249502
# Single-line version:
(?i)\b((?:https?:(?:/{1,3}|[a-z0-9%])|[a-z0-9.\-]+[.](?:com|net|org|edu|gov|mil|aero|asia|biz|cat|coop|info|int|jobs|mobi|museum|name|post|pro|tel|travel|xxx|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cs|cu|cv|cx|cy|cz|dd|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|s
@abeppu
abeppu / feelbetter.js
Last active February 6, 2021 19:06
Implementation of @FeelBetterBot
var Twit = require("twit");
var config = require('./oauthconfig');
console.log("config:");
console.log(config);
var T = new Twit({
consumer_key: config.consumer_key,
consumer_secret: config.consumer_secret,
access_token: config.access_token,
@anthonywu
anthonywu / osx_pdf_join.sh
Created April 18, 2013 02:35
Mac OS X – bash function to join pdfs on the command line
function pdf_join {
join_py="/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py"
read -p "Name of output file > " output_file && "$join_py" -o $output_file $@ && open $output_file
}
@ghalimi
ghalimi / XIRR.js
Created January 30, 2013 01:11
XIRR Function
// Copyright (c) 2012 Sutoiku, Inc. (MIT License)
// Some algorithms have been ported from Apache OpenOffice:
/**************************************************************
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file