Large-scale computing backend for Jupyter notebooks - HTCondor batch job submission and monitoring using the Ganga toolkit
Mentors:
- Ulrik Egede
- Jakub Moscicki
- Ben Jones
- Enric Tejedor
- Diogo Castro
Aman Pratap Singh
Email: [email protected]
Phone: +91-8266928969
Full Name | Aman Pratap Singh |
Institute | 2nd Year B.Tech Student, Computer Science and Engineering, Indian Institute of Technology Bhubaneswar |
Email | [email protected] [email protected] |
Phone | +91-8266928969 |
Blog | https://blog.amanpratapsingh.in |
Github | https://github.com/apsknight |
IRC Nick | apsknight (Freenode/OFTC) |
Timezone | Indian Standard Time (GMT +0530) |
Address | 54, Transit Hostel-2, NISER Campus, Indian Institute of Technology Bhubaneswar Jatni, Odisha, India 752050 |
Reference Contact | Tatvam Dadheech Email: [email protected] Phone: +91-8107407676 |
I am Aman Pratap Singh, a second-year undergraduate student of Computer Science and Engineering at the Indian Institute of Technology Bhubaneswar. I have programming experience in several languages, including Python, C/C++, Java and JavaScript. I frequently use Jupyter Notebooks for lab assignments and similar work, because they provide a simple, user-friendly programming interface and let me document explanations and output alongside the code. I enjoy coding for fun and have worked on various small projects, which can be found on my GitHub profile.
Recently I have been involved in the JupyterLab community. JupyterLab is the next-generation user interface for Project Jupyter and provides all the features of the classic Jupyter Notebook. I fixed a few documentation and UI bugs in the JupyterLab repository and created a JupyterLab extension that scrapes comics from XKCD and displays them in JupyterLab.
I also enjoy competitive programming and actively participate in coding challenges on sites such as Codeforces and CodeChef. I also have a deep interest in physics, Indian history and cricket.
I have always been passionate about projects that link the basic sciences with programming, which is my main inspiration for working with CERN. I am eager to work on this project because it will help many scientists and researchers in their work. As a regular user of Jupyter Notebook, I strongly believe that interactive programming greatly reduces the effort required to perform complex experiments and interpret their output. This project integrates powerful backends for large-scale computation with the interactive programming environment of notebooks, so I believe I am the perfect match for it.
I will be working 6-8 hours per day for the entire duration of the project.
My end-semester exams run from April 28 to May 5, during which I will be a little busy; other than that, I have no commitments for the summer.
I am perfectly fine with IRC, Email, Skype or any other similar medium of communication. My preferred language for communication is English.
Jupyter Notebook is an interactive computing environment for creating notebooks that contain computer code as well as rich text elements such as equations, figures, plots, widgets and explanatory text. These notebooks are easy to understand and can be executed to perform interactive data analysis, scientific computing and code prototyping.
In experiments such as those at the LHC (Large Hadron Collider), a very large amount of data (on the order of petabytes) is generated. This data is processed by powerful computers at multiple computing sites: the data is split into small chunks, each chunk is processed individually on a remote distributed computing network, and the results are finally collected. These sites are interconnected by a grid. Such grids can be accessed with a toolkit called Ganga.
Ganga is an open-source, IPython-based interface to the computing grid. It leverages the power of the distributed computing grid and gives scientists an interface, backed by a powerful backend, through which they can submit computation-intensive programs as batch jobs. After a job is submitted, Ganga runs the program somewhere on the grid, keeps track of the job's status and, once the job completes, returns the output to the user. It can also report job statistics and any job errors.
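The submit, track and collect cycle described above can be pictured as a small state machine. The sketch below is only a toy model written for this proposal: `ToyJob`, `Status` and the `TRANSITIONS` table are hypothetical names, and the set of states and allowed transitions is an assumption, not Ganga's actual classes or internals.

```python
from enum import Enum


class Status(Enum):
    NEW = "new"
    SUBMITTED = "submitted"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    KILLED = "killed"


# Allowed transitions in this toy model (an assumption, not Ganga's internals).
TRANSITIONS = {
    Status.NEW: {Status.SUBMITTED},
    Status.SUBMITTED: {Status.RUNNING, Status.FAILED, Status.KILLED},
    Status.RUNNING: {Status.COMPLETED, Status.FAILED, Status.KILLED},
    Status.COMPLETED: set(),
    Status.FAILED: set(),
    Status.KILLED: set(),
}


class ToyJob:
    """Hypothetical stand-in for a batch job; not a Ganga class."""

    def __init__(self, command):
        self.command = command
        self.status = Status.NEW
        self.history = [Status.NEW]

    def advance(self, new_status):
        """Move to new_status, rejecting transitions the model does not allow."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"illegal transition {self.status.value} -> {new_status.value}"
            )
        self.status = new_status
        self.history.append(new_status)

    @property
    def finished(self):
        # A job is finished once it reaches a state with no outgoing transitions.
        return not TRANSITIONS[self.status]


job = ToyJob("echo hello")
for step in (Status.SUBMITTED, Status.RUNNING, Status.COMPLETED):
    job.advance(step)
print([s.value for s in job.history])
```

A monitoring front end only needs to know the current state and whether it is terminal, which is why the model exposes `status` and `finished` and nothing else.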
HTCondor is a workload management system created at the University of Wisconsin-Madison. It is built around High-Throughput Computing: it makes effective use of the computing power of idle computers on a network or computing grid by offloading compute-intensive tasks to them. It provides features such as job queueing, job prioritization, and resource monitoring and management. HTCondor manages resources by matchmaking: it matches the resources available on different machines against the resources required by each program.
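To make the matchmaking idea concrete, here is a deliberately simplified sketch. It is not HTCondor's algorithm or its ClassAd language; the `Machine` and `JobRequest` classes, the `match` function and the greedy "smallest machine that fits" policy are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    cpus: int
    memory_mb: int
    idle: bool


@dataclass
class JobRequest:
    name: str
    cpus: int
    memory_mb: int


def match(jobs, machines):
    """Greedily assign each job to the smallest idle machine that satisfies it.

    Returns a dict mapping job name to machine name, or to None when no idle
    machine fits and the job has to stay queued.
    """
    free = [m for m in machines if m.idle]
    assignment = {}
    for job in jobs:
        candidates = [
            m for m in free
            if m.cpus >= job.cpus and m.memory_mb >= job.memory_mb
        ]
        if not candidates:
            assignment[job.name] = None  # nothing suitable: job stays queued
            continue
        chosen = min(candidates, key=lambda m: (m.cpus, m.memory_mb))
        free.remove(chosen)  # one job per machine in this toy model
        assignment[job.name] = chosen.name
    return assignment


machines = [
    Machine("node1", cpus=4, memory_mb=8000, idle=True),
    Machine("node2", cpus=16, memory_mb=64000, idle=True),
    Machine("node3", cpus=8, memory_mb=16000, idle=False),
]
jobs = [
    JobRequest("fit", cpus=2, memory_mb=4000),
    JobRequest("sim", cpus=8, memory_mb=32000),
    JobRequest("merge", cpus=8, memory_mb=16000),
]
print(match(jobs, machines))
```

In this example `fit` lands on `node1`, `sim` on `node2`, and `merge` stays queued because the only machine large enough and still free, `node3`, is busy.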
This project aims to create a plugin for Jupyter Notebook and to integrate it into SWAN, the cloud data analysis notebook service developed and run by CERN. The plugin will submit batch computation jobs to HTCondor through the Ganga toolkit and monitor them. It will display the status of ongoing jobs, a progress bar, job statistics and errors in the notebook itself, and will also allow ongoing jobs to be terminated. The plugin will provide a user-friendly notebook interface for computation-intensive tasks by using the notebook's cell-based structure: a job is submitted from a cell, and its progress and statistics can be inspected from that cell.
This project streamlines large-scale computation by integrating a powerful backend with Jupyter Notebook, an interactive web application that is easily deployed in the cloud and remotely accessible. It will give scientists and researchers a single application in which to write an interactive, compute-intensive program, execute it, monitor its progress and run-time statistics, and retrieve its output on successful execution. The project will improve large-scale batch computing at CERN and similar organizations.
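As a rough illustration of the monitoring side, the sketch below turns a list of per-job status strings into counts and a text progress bar. The status names, the `TERMINAL` set and the helper functions `summarize` and `render_bar` are assumptions made for this example; the real plugin would obtain statuses from Ganga and would most likely render them with notebook widgets rather than plain text.

```python
from collections import Counter

# Statuses treated as "no longer running" in this sketch (an assumption).
TERMINAL = {"completed", "failed", "killed"}


def summarize(statuses):
    """Return per-status counts and the fraction of jobs in a terminal state."""
    counts = Counter(statuses)
    total = len(statuses)
    done = sum(counts[s] for s in TERMINAL)
    fraction = done / total if total else 0.0
    return counts, fraction


def render_bar(fraction, width=20):
    """Render a fraction between 0 and 1 as a fixed-width text progress bar."""
    filled = int(round(fraction * width))
    return "[" + "#" * filled + "." * (width - filled) + f"] {fraction:.0%}"


statuses = ["completed", "completed", "running", "submitted"]
counts, fraction = summarize(statuses)
print(render_bar(fraction, width=10))  # [#####.....] 50%
print(dict(counts))
```

Keeping the summary separate from the rendering means the same counts could feed a widget-based progress bar, a statistics table, or an error report inside the cell output.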
- Create a plugin for Jupyter Notebook that can offload batch jobs from a notebook.
- Using HTCondor, apply the plugin to real batch jobs at CERN.
- Test the plugin on CERN’s batch infrastructure.
- Integrate the plugin into CERN’s notebook service, SWAN.
- Task 1: Create a plugin to submit and monitor batch computation jobs from a notebook
- Design a prototype of plugin for submitting and monitoring jobs from Jupyter Notebook.
- Design the user interface and kernel side module prototype.
- Determine an architecture for the plugin.
- Explore all possible widgets and features of Jupyter Notebook that can be applied to the plugin.
- Determine how plugin will interact with Ganga Toolkit.
- Design an interface to display progress bar, job statistics and output of the job.
- Implement Kernel side of the plugin.
- Integrate the designed user interface with kernel side module.
- Test the plugin on local backend server
- Test the plugin by running small jobs on local backend server.
- Perform tests for various corner cases that can arise.
- Implement error handling mechanism of plugin
- Intentionally create errors to test various event listeners.
- Implement how plugin should respond in case of any unexpected request/error.
- Write comprehensive documentation of the code written for Task 1.
- Task 2: Apply the plugin to real batch jobs at CERN using HTCondor
- Apply the plugin to real and small batch jobs at CERN on local backend.
- Test the plugin for complex but low computation real batch jobs at CERN.
- Use HTCondor instead of local backend.
- Change backend server from local to one provided by HTCondor.
- Test the plugin for complex and relatively large computation batch jobs at CERN.
- Implement some sample notebooks illustrating the process.
- Ask for feedback from users and implement the suggestions.
- Write comprehensive documentation of the code written for Task 2.
- Task 3: Deploy and test the plugin on CERN IT Infrastructure
- Test the plugin on CERN IT Infrastructure.
- Integrate the plugin with SWAN notebook service.
- Ask for feedback from users and implement the suggestions.
- Write comprehensive documentation of the code written for Task 3.
Duration | Task |
---|---|
March 27 | Deadline for submitting Project Proposal |
March 27 - April 23 | |
April 23 - May 14 | Official Community Bonding Period |
May 14 - June 6 | Official Coding Period Start |
June 6 - June 11 | |
June 11 - June 15 | Phase 1 evaluation |
June 15 - July 4 | Begin Task 2: Integrate plugin with HTCondor |
July 4 - July 9 | |
July 9 - July 13 | Phase 2 evaluation |
July 13 - August 10 | Begin Task 3: Deploy plugin to CERN IT Infrastructure |
August 10 - August 14 | Finish Task 3, Final Submission |
- Working Jupyter Notebook plugin with following features.
- Submitting batch jobs from Jupyter Notebook
- Displaying progress bar and job statistics of the ongoing jobs.
- Cancellation of ongoing jobs.
- SWAN Notebook service integration of the plugin.
- Detailed documentation for the plugin.
- Sample Notebooks demonstrating application of plugin using Ganga toolkit and HTCondor.
- Explore possible improvements to the plugin and implement a similar plugin for JupyterLab, the next-generation user interface of Project Jupyter.