
ganesh-srinivas

  • India
@ganesh-srinivas
ganesh-srinivas / gsoc_redhenlab_laughter_categorization.md
Last active September 29, 2017 06:53
GSoC 2017 - Red Hen Lab - Learning Embeddings for Laughter Categorization - Work Product Submission

Learning Embeddings for Laughter Categorization

https://github.com/ganesh-srinivas/laughter/

UPDATE: This project was deemed successful, and I received a very positive evaluation from my mentors! :-) (you can view it at http://ganesh-srinivas.github.io/gsoc_final_evaluation.pdf)

The main deliverables from this project are machine learning classifiers that can perform laughter detection and categorization: identify if an audio clip contains laughter or not, and categorize the laughter (giggle, baby laugh, chuckle/chortle, snicker, belly laugh).

Model Architecture | Input Feature | Output pooling | Test set metrics
ganesh-srinivas / proposal-dark-data-extraction-research.md
Last active November 24, 2022 15:12
Proposal for Dark Data Extraction Research

This document records progress, ideas and source code for dark data extraction systems. These systems use statistical inference to perform data extraction, integration and cleaning from unstructured/"dark" sources (forum posts, webpages, etc.). Data programming is the predominant paradigm for dark data extraction: noisy, possibly conflicting user-defined labelling functions are supplied to a generative model, which recovers the parameters of the labelling process. Wherever possible, my projects are based on Snorkel/DeepDive.
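To make the data programming idea concrete, here is a minimal sketch in plain Python. It does not use the Snorkel or DeepDive APIs. The task (flagging posts that mention a price), the three labelling functions and the sample posts are all hypothetical, and a simple majority vote stands in for the generative model, which would instead estimate each function's accuracy and weight its votes accordingly.

```python
from collections import Counter

# Label values; ABSTAIN means a labelling function has no opinion.
ABSTAIN, NEG, POS = -1, 0, 1


# Hypothetical labelling functions for the toy task
# "does this forum post mention a price?".
def lf_has_dollar_sign(text):
    return POS if "$" in text else ABSTAIN


def lf_mentions_price_word(text):
    words = text.lower().replace("?", " ").split()
    return POS if "price" in words or "cost" in words else ABSTAIN


def lf_question_without_digits(text):
    is_question = text.strip().endswith("?")
    has_digit = any(ch.isdigit() for ch in text)
    return NEG if is_question and not has_digit else ABSTAIN


LABELING_FUNCTIONS = [
    lf_has_dollar_sign,
    lf_mentions_price_word,
    lf_question_without_digits,
]


def apply_lfs(texts, lfs=LABELING_FUNCTIONS):
    """Build the label matrix: one row per text, one column per function."""
    return [[lf(text) for lf in lfs] for text in texts]


def majority_vote(row):
    """Combine one row of votes; ties and all-abstain rows stay unlabelled.

    This is a simplification: a generative label model would learn the
    accuracy of each function instead of counting every vote equally.
    """
    votes = [v for v in row if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


posts = [
    "Selling for $40",
    "What is this?",
    "nice weather today",
    "Is the price negotiable?",
]
label_matrix = apply_lfs(posts)
labels = [majority_vote(row) for row in label_matrix]
```

The last post shows why conflicts matter: one function votes positive (it contains "price") and another votes negative (it is a question with no digits), so the majority vote abstains. Resolving such disagreements by learned accuracy, rather than by counting, is the job of the generative model.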

Ideas (Extensions for the system):

  • There isn't any work on domain-specific primitives (DSPs) for audio data. Pre-trained audio models (VGGish) can serve as feature extractors for high-level concepts such as emotion, accent and personality in speech data (the WaveNet paper mentions that these are possible), musical genre (Sander Dieleman's Spotify CNN blog post), etc.
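
One way an audio primitive could feed a labelling function is sketched below, as an illustration only. It assumes each clip has already been turned into an embedding vector by some pre-trained model; the three-dimensional vectors, the prototype, the 0.9 threshold and the `make_prototype_lf` helper are all hypothetical, and no real VGGish call is made.

```python
import math

ABSTAIN = -1
CONCEPT = 1  # hypothetical label, e.g. "clip expresses the target emotion"


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def make_prototype_lf(prototype, label, threshold=0.9):
    """Return a labelling function that votes `label` when a clip embedding
    lies close to `prototype`, and abstains otherwise."""
    def lf(embedding):
        if cosine_similarity(embedding, prototype) >= threshold:
            return label
        return ABSTAIN
    return lf


# Toy embeddings standing in for the output of a pre-trained audio model.
prototype = [1.0, 0.0, 0.0]
lf_near_prototype = make_prototype_lf(prototype, CONCEPT)

similar_clip = [0.9, 0.1, 0.0]
unrelated_clip = [0.0, 1.0, 0.0]
votes = [lf_near_prototype(similar_clip), lf_near_prototype(unrelated_clip)]
```

The prototype could be, for instance, the mean embedding of a handful of hand-picked clips; the function's votes would then be combined with other noisy sources in the usual data programming fashion.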

Ideas (Applications):

  • Ecological/Environmental monitoring: use audio DSPs for building models of migration, logging/poaching, etc.
ganesh-srinivas / proposal-usable-privacy-infra.md
Last active December 10, 2017 20:52
Proposal for Usable Privacy Infrastructure

This gist will document progress, ideas and source code for my work on usable privacy (and security) infrastructure.

Inspiration

  • I loved Jeff Huang's essay Ph.D. 2.0: Adopting the Startup Culture for Research. Take-aways:
    • "wow" products/demos > papers
    • reliable code and documentation (e.g., Colin Raffel's PhD thesis) > flimsy academic code that is unusable to anyone else
    • risky attempts at creating "magical" results > safe incremental work
    • feedback loops with users (DAWN project)
    • share work with the public ("how will this look in the paper") through expository writing (Michael Nielsen's essays or Zachary Lipton's The Mythos of Model Interpretability)
  • create compelling demos on the web or in real life (examples: https://clickclickclick.click, my style transfer mirror at SNU's Convocation 2017)

Keybase proof

I hereby claim:

  • I am ganesh-srinivas on github.
  • I am gsrinivas (https://keybase.io/gsrinivas) on keybase.
  • I have a public key ASBDA9vJrkbic6qLa_R93POWmmLCcuxmF_pKgkeo32ZGXwo

To claim this, I am signing this object: