# A [Cloud Service][sumatra-cloud] to Record Simulation Metadata

### [Daniel Wheeler][daniel-wheeler] and Yannick Congo

Thermodynamics and Kinetics Group <br>
Materials Science and Engineering Division <br>
National Institute of Standards and Technology <br>
Gaithersburg, MD

### Brief Description

The notion of capturing each execution of a script or workflow and its associated metadata is enormously appealing and should be at the heart of any attempt to make scientific simulations reproducible. In view of this, we are developing a [backend and frontend service][sumatra-cloud] to both store and view simulation metadata records using a robust, data-scheme-agnostic approach.

### Extended Description

Event control is orthogonal to version control: it captures every execution of a workflow or script rather than the changes to that script. Research projects often use version control as a poor man's event control, which is a symptom of the lack of a good event control tool. Given the general focus on reproducing computational experiments, it is surprising that the scientific computing community has not been more active in developing and promoting a good tool for event control. [Sumatra][sumatra] is a lightweight Python-based event control tool that is suited to scientists who engage in both research and development. Since its inception, [Sumatra][sumatra] has not seen wide use in the scientific computing community, as evidenced by the lack of activity on its mailing list. In the authors' experience, one of the main difficulties with using [Sumatra][sumatra] is the requirement to maintain and communicate with the backend databases used to store the generated metadata. Furthermore, the tool offers no effective public view and no automated way to share the data store. In light of these drawbacks, we are developing a [backend and frontend service][sumatra-cloud] to both store and view Sumatra records.

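
To make the distinction between event control and version control concrete, the following is a minimal sketch of what capturing a single execution might look like. It is not Sumatra's API or record format: the function `record_execution`, the field names in the record, and the toy `simulate` function are all hypothetical and chosen only for illustration.

```python
import hashlib
import json
import platform
import sys
import time
from datetime import datetime, timezone


def record_execution(func, parameters, store, source_text=""):
    """Run func(**parameters) and append one metadata record to store.

    Hypothetical helper for illustration only; not part of Sumatra.
    """
    started = datetime.now(timezone.utc)
    t0 = time.perf_counter()
    outcome, result = "success", None
    try:
        result = func(**parameters)
    except Exception as exc:
        # A failed run is still an event worth recording.
        outcome = f"failed: {exc!r}"
    record = {
        "label": started.strftime("%Y%m%d-%H%M%S-%f"),
        "timestamp": started.isoformat(),
        "duration_s": time.perf_counter() - t0,
        "parameters": parameters,
        # Fingerprint of the code that ran (assumed to be passed in as text).
        "code_sha1": hashlib.sha1(source_text.encode("utf-8")).hexdigest(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "outcome": outcome,
        "result": result,
    }
    store.append(record)
    return record


def simulate(steps, dt):
    """Toy stand-in for a simulation: total simulated time."""
    return steps * dt


# Every execution yields a new record, even though the code never changes.
records = []
record_execution(simulate, {"steps": 10, "dt": 0.5}, records)
record_execution(simulate, {"steps": 20, "dt": 0.5}, records)
print(json.dumps(records[-1], indent=2))
```

Version control would register nothing between these two runs because the script is identical, whereas the list of records grows by one entry per execution. In practice the in-memory list would be replaced by a persistent store.
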
Our intent is for the web service to be as robust and data scheme agnostic as possible, so that changes to the Sumatra client, backend model, API and frontend do not break the interactions between these elements or backwards compatibility with existing record sets. The Python web framework Flask is used to generate the backend endpoints. The frontend is entirely JavaScript-based and generates the data object model based solely on the data available. The MVC paradigm and data-scheme-agnostic approach are vital for Sumatra to effectively leverage the web and become established across the wider scientific computing community.

## [Daniel Wheeler][daniel-wheeler] Profile

I am interested in the development and deployment of software for applied scientific applications. I have a comprehensive knowledge of numerical algorithms for solving partial differential equations as well as extensive experience in using and developing more general scientific computing tools. I am currently working on developing web services for scientific data discovery. I am one of the lead developers of the [FiPy][fipy] open-source PDE solver and the [PyMKS][pymks] materials informatics toolkit.

## [Yannick Congo][yannick-congo] Profile

I am currently a PhD student working on the development of both standards and tools for data reproducibility. I have a background in software engineering and distributed systems. My interests involve data manipulation in distributed architectures across a variety of platforms. I am experienced in software design and implementation with a variety of modern scripting languages. I have recently contributed to [OOF3D][oof], an object-oriented finite element tool for materials science.

[sumatra]: http://neuralensemble.org/sumatra/
[sumatra-cloud]: https://github.com/materialsinnovation/sumatra-cloud
[daniel-wheeler]: http://wd15.github.io/about.html
[yannick-congo]: https://www.linkedin.com/pub/yannick-congo/60/16/411
[fipy]: http://www.ctcms.nist.gov/fipy
[pymks]: http://pymks.org
[oof]: http://www.ctcms.nist.gov/oof/