@Kriyszig
Last active March 27, 2019 05:47
Google Summer of Code Project Proposal

Project Proposal

Google Summer of Code 2019

Adding Linear Algebra ops to
the TensorFlow.js CPU back-end

TensorFlow
[TensorFlow.js]


Prateek Nayak
Email: [email protected]
Phone: +91 897 100 5339

Index

  1. Introduction
  2. Synopsis
  3. Project Goals
  4. Timeline
  5. Deliverables

Introduction

Personal Information

Full Name: Prateek Nayak
Institute: 1st Year B.Tech Student, Computer Science and Engineering, PES University, Bengaluru, India
Email: [email protected]
Phone: +91 897 100 5339
GitHub: https://github.com/Kriyszig
Timezone: Indian Standard Time (GMT+5:30)

About Me

I am Prateek Nayak, a first-year undergraduate Computer Science and Engineering student at PES University. I have programming experience in several languages, including Python, JavaScript, TypeScript and C. I am passionate about Web Development, Algorithms, Machine Learning and Distributed Computing. I am especially interested in Open Source Software and actively try to contribute to OSS, from fixing simple documentation issues to submitting small bug fixes and writing unit tests.

I have used TensorFlow in the past (mostly in Python) for university projects and was surprised by how robust the whole framework is. Most of the cutting-edge components one would need while building a model are already implemented in TensorFlow, which allows us to shift our focus to more important optimization tasks rather than reinventing the wheel. Among all the TensorFlow products, TensorFlow.js has especially piqued my interest, as it brings Machine Learning to the browser, making it more accessible than ever.

When I'm not on my laptop, you can find me reading fantasy novels, listening to podcasts, solving Physics problems, playing badminton or cycling through nature.

Why Google Summer of Code with TensorFlow?

Machine Learning has fascinated me right from the moment I discovered the topic. Every time I wanted to try out something new in the field, TensorFlow came to the rescue. TensorFlow has helped not just me but millions of people: people who are trying something new, people who are carrying out ground-breaking research, people who are trying to make this world a better place. I would like to give back to the community and help people translate their ideas into code faster and more easily.

TensorFlow.js lags behind its Python counterpart in the number of features it offers. Adding more linear algebra operations will make certain data pre-processing tasks easier and make Machine Learning in the browser more attractive.

Previous contributions to TensorFlow

TensorFlow.js [tfjs-core]

  • #1430 - Allow squeeze to use negative index
  • #1454 - Fixed squeeze() mutating the arguments passed

A list of my contributions can be found here

How much time will I be able to dedicate to contribute to this project?

I will be able to work 6 to 8 hours per day throughout the duration of the project.

Other commitments during summer

I have my end-semester examinations in the second and third weeks of May, during which I will be a little busy. Other than that, I will be free throughout the summer.

Synopsis

TensorFlow.js is an open-source, WebGL-accelerated Machine Learning library for training and deploying ML models. Compared to its Python counterpart, TensorFlow.js has been around for a much shorter time, yet it has matured considerably. Given the obvious restrictions of a browser, TensorFlow.js pushes what is possible in that environment.

Machine Learning is a fast-moving field. While TensorFlow.js catered to recent feature demands to keep up with cutting-edge technology, development of less exciting features such as linear algebra operations fell behind. To stay compatible with the browser and maintain a small footprint, implementing these features natively is more appealing than adding a new dependency that would either break compatibility or increase the footprint of the library more than desired.

Benefits to Community

Linear algebra operations can be of great help in the data pre-processing stage. Implementing them in the TensorFlow.js core API can help many developers: it will help bridge the gap between TensorFlow.js and its Python counterpart and will greatly improve tasks like image processing in the browser.

Project Goals

Objective

  • Implement significant linear algebra operations, namely Singular Value Decomposition, Principal Component Analysis and Matrix Inversion, in the TensorFlow.js core API.
  • Document the additions in the official documentation of TensorFlow.js
  • Optimize the solution

Project Tasks

Implement Matrix Inversion
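Gauss-Jordan elimination is a natural way to compute the inverse of a matrix, with O(n^3) complexity. However, a variation of the Method of Four Russians might be a better option because of its attractive tight bound of Θ(n^3 / log n). The algorithm is described in this paper by Gregory V. Bard. Note that this method is formulated for matrices over GF(2), so its applicability to floating-point matrices would have to be confirmed with the mentors.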
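
To make the Gauss-Jordan option concrete, here is a minimal sketch on plain nested arrays. It is only an illustration: `invertMatrix` is a hypothetical name, the partial pivoting and the `eps` singularity threshold are my own assumptions, and the final op would work on tensors and follow whatever prototype is agreed with the mentors.

```typescript
// Hypothetical helper for illustration; not part of the TensorFlow.js API.
// Inverts a square matrix with Gauss-Jordan elimination and partial pivoting.
function invertMatrix(a: number[][], eps: number = 1e-12): number[][] {
  const n = a.length;
  if (a.some(row => row.length !== n)) {
    throw new Error('Matrix must be square');
  }
  // Build the augmented matrix [A | I] so the input is left untouched.
  const m: number[][] = a.map((row, i) => {
    const identityRow: number[] = new Array<number>(n).fill(0);
    identityRow[i] = 1;
    return [...row, ...identityRow];
  });
  for (let col = 0; col < n; col++) {
    // Partial pivoting: pick the row with the largest entry in this column.
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) {
        pivot = r;
      }
    }
    if (Math.abs(m[pivot][col]) < eps) {
      throw new Error('Matrix is singular or nearly singular');
    }
    const tmp = m[col];
    m[col] = m[pivot];
    m[pivot] = tmp;
    // Scale the pivot row so the pivot becomes 1.
    const p = m[col][col];
    for (let j = 0; j < 2 * n; j++) {
      m[col][j] /= p;
    }
    // Eliminate this column from every other row.
    for (let r = 0; r < n; r++) {
      if (r === col) continue;
      const f = m[r][col];
      if (f === 0) continue;
      for (let j = 0; j < 2 * n; j++) {
        m[r][j] -= f * m[col][j];
      }
    }
  }
  // The right half of the augmented matrix now holds the inverse.
  return m.map(row => row.slice(n));
}
```

For example, `invertMatrix([[4, 7], [2, 6]])` yields approximately `[[0.6, -0.7], [-0.2, 0.4]]`, and a singular input such as `[[1, 2], [2, 4]]` throws.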

Implement Singular Value Decomposition
This PDF from the University of Texas covers most of the algorithms that can be used to compute the SVD of a matrix. I have looked into some implementations of these algorithms, and the ones I found promising were:

  • SVD by Jacobi Rotation (Implementation: Planeshifter/SVD)
    The implementation uses Householder Reduction and QR factorization to compute the SVD.
  • SVD by Least Squares Method (Implementation: danilosalvati/svd-js)
    Implements the solution described in this paper by G. H. Golub and C. Reinsch
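
As a rough illustration of what a rotation-based SVD involves, the sketch below uses the one-sided (Hestenes) Jacobi method on plain nested arrays. This is a deliberately simpler technique than the Householder/QR and Golub-Reinsch procedures used by the implementations listed above, and it is not taken from either of them. The name `jacobiSvd`, the tolerance, the sweep limit and the restriction to matrices with at least as many rows as columns are all assumptions made for the example.

```typescript
interface SvdResult {
  u: number[][];  // m x n, orthonormal columns (for non-zero singular values)
  s: number[];    // n singular values, not sorted
  v: number[][];  // n x n, orthogonal
}

// Hypothetical helper for illustration; not part of the TensorFlow.js API.
// One-sided Jacobi: rotate pairs of columns until all columns are orthogonal.
function jacobiSvd(a: number[][], tol: number = 1e-12,
                   maxSweeps: number = 60): SvdResult {
  const m = a.length;
  const n = a[0].length;
  if (m < n) {
    throw new Error('This sketch assumes rows >= columns');
  }
  const u: number[][] = a.map(row => [...row]);  // becomes U * diag(s)
  const v: number[][] = [];
  for (let i = 0; i < n; i++) {
    const row: number[] = new Array<number>(n).fill(0);
    row[i] = 1;
    v.push(row);
  }
  for (let sweep = 0; sweep < maxSweeps; sweep++) {
    let rotated = false;
    for (let p = 0; p < n - 1; p++) {
      for (let q = p + 1; q < n; q++) {
        // Squared norms and inner product of columns p and q.
        let alpha = 0;
        let beta = 0;
        let gamma = 0;
        for (let i = 0; i < m; i++) {
          alpha += u[i][p] * u[i][p];
          beta += u[i][q] * u[i][q];
          gamma += u[i][p] * u[i][q];
        }
        if (Math.abs(gamma) <= tol * Math.sqrt(alpha * beta)) continue;
        rotated = true;
        // Rotation angle that makes the two columns orthogonal.
        const zeta = (beta - alpha) / (2 * gamma);
        const t = (zeta >= 0 ? 1 : -1) /
            (Math.abs(zeta) + Math.sqrt(1 + zeta * zeta));
        const cosT = 1 / Math.sqrt(1 + t * t);
        const sinT = cosT * t;
        // Apply the same rotation to the columns of u and v.
        for (const mat of [u, v]) {
          for (const row of mat) {
            const x = row[p];
            const y = row[q];
            row[p] = cosT * x - sinT * y;
            row[q] = sinT * x + cosT * y;
          }
        }
      }
    }
    if (!rotated) break;  // all column pairs are orthogonal
  }
  // Singular values are the column norms; normalise the columns to get U.
  const s: number[] = [];
  for (let j = 0; j < n; j++) {
    s.push(Math.sqrt(u.reduce((acc, row) => acc + row[j] * row[j], 0)));
  }
  for (const row of u) {
    for (let j = 0; j < n; j++) {
      if (s[j] > 0) row[j] /= s[j];
    }
  }
  return {u, s, v};
}
```

For the 3 x 2 matrix `[[3, 2], [2, 3], [2, -2]]` the singular values are 5 and 3, and multiplying `u`, `diag(s)` and the transpose of `v` reproduces the input up to rounding error.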

Implement Principal Component Analysis
I propose an implementation similar to the one described by Tomáš Bouda in this blog post. It requires eigenvectors and eigenvalues, which gives us an additional opportunity to implement an eigendecomposition routine and expose it in the core API of TensorFlow.js.
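
The link between PCA and eigenvectors can be sketched as follows: centre the data, build the covariance matrix, and take its leading eigenvector as the first principal component. The sketch uses power iteration for the eigenvector, which is my own choice for brevity and not necessarily the method in the blog post. `covarianceMatrix` and `leadingEigenpair` are hypothetical names, and the fixed starting vector and iteration count are assumptions.

```typescript
// Hypothetical helpers for illustration; not part of the TensorFlow.js API.

// Sample covariance matrix of data given as rows of observations.
function covarianceMatrix(data: number[][]): number[][] {
  const n = data.length;
  const d = data[0].length;
  const mean: number[] = [];
  for (let j = 0; j < d; j++) {
    mean.push(data.reduce((acc, row) => acc + row[j], 0) / n);
  }
  const cov: number[][] = [];
  for (let i = 0; i < d; i++) {
    cov.push(new Array<number>(d).fill(0));
  }
  for (const row of data) {
    for (let i = 0; i < d; i++) {
      for (let j = 0; j < d; j++) {
        cov[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]) / (n - 1);
      }
    }
  }
  return cov;
}

// Largest eigenvalue and its eigenvector of a symmetric matrix,
// found by power iteration from a fixed (assumed) starting vector.
function leadingEigenpair(c: number[][], iterations: number = 500):
    {value: number, vector: number[]} {
  const norm = (x: number[]) =>
      Math.sqrt(x.reduce((acc, xi) => acc + xi * xi, 0));
  const multiply = (x: number[]) =>
      c.map(row => row.reduce((acc, cij, j) => acc + cij * x[j], 0));
  let vec: number[] = c.map((_, i) => 1 / (i + 1));
  const startNorm = norm(vec);
  vec = vec.map(x => x / startNorm);
  for (let it = 0; it < iterations; it++) {
    const next = multiply(vec);
    const len = norm(next);
    if (len === 0) break;  // start vector lies in the null space
    vec = next.map(x => x / len);
  }
  // Rayleigh quotient gives the eigenvalue for the unit vector vec.
  const value = multiply(vec).reduce((acc, x, i) => acc + x * vec[i], 0);
  return {value, vector: vec};
}
```

For the points `[[1, 1], [2, 2], [3, 3]]`, which lie on the line y = x, the covariance matrix is `[[1, 1], [1, 1]]`, the leading eigenvalue is 2 and the first principal component is approximately `[0.7071, 0.7071]`. Projecting the centred data onto that vector gives the one-dimensional PCA representation.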

The algorithms discussed above are the result of limited research. Further discussions with the mentors will help us select the most efficient algorithm for each of the tasks described above.

Timeline

Duration Task
April 9 Deadline for submitting Project Proposal
April 9 - May 6
  • Learn more about Linear Algebra Operations
  • Learn about the implementation of popular linear algebra operations in libraries like Eigen3 (the TensorFlow Python API depends on this library for SVD) and NumPy
  • Learn about the codebase structure of TensorFlow.js
May 6 - May 27 Commencement of Community Bonding Period
  • Discuss the algorithmic implementation with mentors
  • Discuss Code Placement
Begin Implementation of Matrix Inversion
  • Figure out prototype of function
  • Setup development environment
May 27 - June 19 Official Coding Period Commences
  • Finish implementation of Matrix Inversion
  • Write tests to verify the behavior of the matrix inversion function
Begin Implementation of Singular Value Decomposition
  • Figure out prototype of SVD operation
  • Begin with implementation (if the implementation involves operations like the Householder transformation, expose a function to calculate the given transformation)
June 19 - June 24
  • Time period for any unexpected delays
  • If on track: micro-optimize the implemented operation and update the documentation
June 24 - June 28 Phase I Evaluation
  • Goal: Merge Matrix Inversion into Master along with the documentation for the function
June 28 - July 18 Phase II
  • Start work on core SVD Algorithm
  • Write tests to check for consistent and correct behavior
Begin Implementation of Principal Component Analysis
  • Expose a function to get eigenvectors and eigenvalues
July 18 - July 22
  • Time period for any unexpected delays
  • If on track: micro-optimize the implemented operation and update the documentation
July 22 - July 26 Phase II Evaluation
  • Goal: Merge the implementation of SVD and the Householder transformation into Master along with the documentation for the functions
July 26 - August 16 Phase III - Final Phase
  • Finish implementation of PCA
  • Write tests to verify consistent and correct behavior
  • Document the implementation and improve the documentation of the other implementations by adding more examples
August 16 - August 19
  • Time period for any unexpected delays
  • If on track: micro-optimize the implemented operations
August 19 Final Evaluation
  • Goal: Merge the implementation of PCA and the eigenvector and eigenvalue functions into Master along with the documentation for the functions

Note

  • The timeline above assumes the implementation of the algorithms discussed under Project Tasks. The timeline may change if the discussion with mentors yields a better approach to solving the problem; however, the goals mentioned for each phase will remain mostly unchanged.
  • The implementations discussed are not consistent with the implementations of these operations in the TensorFlow Python API. If necessary, the algorithms involved can be changed to make the function prototypes in TensorFlow.js consistent with the TensorFlow Python API.

Deliverables

  • Implement popular linear algebra operations - Matrix Inversion, Singular Value Decomposition, and Principal Component Analysis - in the core API of TensorFlow.js
  • Implement and expose operations such as the Householder transformation, which act as helper functions to the main operations discussed above.

Future Goals

  • Implement more linear algebra operations in the core API of TensorFlow.js
  • Get involved with the TF.js community and help in developing features outside Linear Algebra operations.

:octocat: Thank you for going through the proposal
