raymcdermott/depgen.md

Last active November 25, 2019 19:20

Adhoc CLJS dependency generation

Introduction

A conversation started at the Heart of Clojure conference in Belgium on Friday August 2nd 2019.

The group represented project owners from Maria.Cloud, Next.Journal, Klipse and Replete-Web. The projects make significant use of self-hosted CLJS.

This is a proposal to have a service that generates and caches JS files for a specific CLJS dependency (name & version).

Background

The ClojureScript compiler generates JS files for each of an app's stated CLJS dependencies. The runtime environment then loads each of the JS files needed to satisfy those dependencies.

Problem statement

When a dependency is added at runtime:

- it adds significant delays to the user experience
- we don't yet have a shared, generic solution for calculating / generating the correct dependency tree per dep

Suggestions

API / Cache

Call a hosted API to generate a lib given an argument such as

{:deps {mvngrp/mvnartefact {:mvn/version "0.6.2"}}}

{:deps {gitname {:git/url "https://github.com/org-name/repo-name.git"
                 :sha     "ad5bcac0c2d771f09f69de4edab183b0c2fe437b"}}}

The API would return

- a URL to the location of the generated files
- a list of the JS files and maybe others (e.g. source maps)
- metadata such as a SHA / MD5 for each of the generated files
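To make the shape of such a response concrete, here is a minimal sketch of what it might look like as EDN. Every key name (`:base-url`, `:files`, `:name`, `:type`, `:checksum`), the example host and the checksum placeholders are assumptions for illustration; nothing here was agreed in the conversation.

```clojure
;; Hypothetical response shape; key names and values are illustrative only.
(def example-response
  {:base-url "https://example.invalid/generated/abc123/"
   :files    [{:name "lib/util.js"     :type :js         :checksum "<sha-or-md5>"}
              {:name "lib/util.js.map" :type :source-map :checksum "<sha-or-md5>"}]})

(defn file-urls
  "Joins the base URL with each file name so a client knows what to fetch."
  [response]
  (mapv #(str (:base-url response) (:name %)) (:files response)))

(defn js-files
  "Keeps only the entries marked as compiled JS."
  [response]
  (filterv #(= :js (:type %)) (:files response)))
```

A client could call `(file-urls example-response)` to get the full download list, and check each downloaded file against its `:checksum` before evaluating it.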

API / Cache - Implementation

- Normalise the EDN
- Look up the normalised dep (or a hash of it) in the cache
  - when cache-hit: return the link to previously generated data
  - when cache-miss: run CLJS to produce the deps, add the data to the cache and return the data
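The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the cache is an in-memory atom rather than a real store, normalisation simply means "sort every nested map by key and print it", and `generate-fn` is a hypothetical stand-in for whatever actually runs the CLJS compiler.

```clojure
(require '[clojure.walk :as walk])

(defn normalise
  "Returns a canonical string for a deps map. Every nested map is rebuilt
  with its keys sorted (by printed form), so equal maps print identically
  regardless of the order in which their keys were written."
  [deps-edn]
  (pr-str
   (walk/postwalk
    (fn [x]
      (if (map? x)
        (into (sorted-map-by #(compare (pr-str %1) (pr-str %2))) x)
        x))
    deps-edn)))

(defn lookup-or-generate!
  "Looks up the normalised dep in `cache` (an atom holding a map).
  On a hit, returns the stored data. On a miss, calls `generate-fn`
  (hypothetical: runs CLJS and returns links/metadata), stores the result
  and returns it. Two concurrent misses may both generate; a real service
  would need to guard against that."
  [cache generate-fn deps-edn]
  (let [k (normalise deps-edn)]
    (if-let [hit (get @cache k)]
      {:status :cache-hit :result hit}
      (let [result (generate-fn deps-edn)]
        (swap! cache assoc k result)
        {:status :cache-miss :result result}))))
```

Keying the cache on the normalised string (or a hash of it) means that two requests for the same dependency hit the same entry even if their maps were written with keys in a different order.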

Web workers

Make the call via a web worker rather than the mainline code.

This could improve the perceived performance in the UX.

END

This is my recollection.

Please feel free to edit / comment.

mhuebert commented Aug 7, 2019 (edited)

Good start. Quite a bit of work was put into implementing dependency bundling for Maria.cloud, so that we could specify arbitrary “entry” namespaces to be made available to the self-hosted compiler (in userland), distinct from the “host” build which wraps/runs the compiler. Initially I wrote my own lib for this purpose, & then later worked with @thheller on the more principled implementation he wrote for shadow-cljs.

Main parts of that implementation:

1. A “bootstrap” build is passed a list of entry namespaces, compiles them (mostly a straightforward cljs compile, but I believe it requires some special handling of macro namespaces), and generates an index of the files + per-file metadata (including what each file :provides and :requires). Later, in the client, this index will let us build a mapping from namespaces to resources (so that we know what file some.namespace refers to). Because we have the complete dependency tree structure in memory, when given a namespace, we can download all the transitive dependencies in parallel.
2. In the “host” build, we need a runtime :load fn that can be passed to the selfhost compiler and that knows how to read the indexes we generated. It may be useful to have a look at load and load-namespaces in this file. The selfhost environment also needs to know about the namespaces already loaded by the “host” build, to avoid clobbering them. Shadow’s browser-bootstrap code currently assumes a single index, but could easily be modified to support loading multiple indexes (e.g. from multiple calls to an API).
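As a rough illustration of how such an index can be used on the client, the sketch below builds the namespace-to-resource mapping and collects everything a namespace transitively needs, skipping namespaces the host build has already loaded. The index layout, the `:resource` key and the file names are assumptions made for this example; this is not the shadow-cljs index format or its loader code.

```clojure
(def example-index
  "Hypothetical index: one entry per generated file."
  [{:resource "cljs/core.js" :provides '[cljs.core] :requires '[]}
   {:resource "lib/util.js"  :provides '[lib.util]  :requires '[cljs.core]}
   {:resource "lib/main.js"  :provides '[lib.main]  :requires '[lib.util cljs.core]}])

(defn ns->resource
  "Builds a lookup from each provided namespace to its index entry."
  [index]
  (into {}
        (for [entry index, ns-sym (:provides entry)]
          [ns-sym entry])))

(defn transitive-resources
  "Returns the set of resource names needed to load `ns-sym`, walking
  :requires transitively. Namespaces in `loaded` (those already provided
  by the host build) are skipped so they are not fetched again."
  [index loaded ns-sym]
  (let [lookup (ns->resource index)]
    (loop [todo [ns-sym], seen #{}, out #{}]
      (if-let [n (first todo)]
        (if (or (seen n) (loaded n))
          (recur (rest todo) seen out)
          (let [entry (lookup n)]
            (recur (into (vec (rest todo)) (:requires entry))
                   (conj seen n)
                   (cond-> out entry (conj (:resource entry))))))
        out))))
```

Because the whole set is known up front, a client could start fetching every returned resource at once and only then evaluate them in dependency order.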

Our API service itself only needs to concern itself with Part 1, compiling + indexing. One difference I see from the shadow-cljs implementation is that we won’t have a notion of “entry namespaces” - so the index we generate for a deps/maven artifact would cover all of its cljs/js sources.

Part 2 will need a standalone implementation if people want to use it without shadow-cljs; the only weird (non-library-ish) part of that I see is storing the list of provide'd things for the host build in memory, which is currently a special step during compile.

Other notes

- Need to remember to account for analysis cache files (in addition to compiled JS)
- Sometimes need a way to manually exclude a macros namespace (eg cljs.js) which is not selfhost-compatible but which is in the dependency tree of something that is selfhost-compatible. One can support JVM macros for transitive dependencies which are precompiled, but obviously not for direct usage by the selfhost compiler.

mfikes commented Aug 8, 2019

@mhuebert FWIW, it is possible to AOT compile macros namespaces for use with self-host. (See https://blog.fikesfarm.com/posts/2016-02-03-planck-macros-aot.html)

mhuebert commented Aug 8, 2019

@mfikes Thx, I used to take a similar approach. Now we AOT compile macro namespaces at the same time as the rest of the sources. This is performed by the shadow-cljs :bootstrap target (here, roughly). After sources are resolved for the given entries, a separate pass finds all their dependent macros, before the compile step.

awb99 commented Nov 25, 2019

Gents! I modified mhuebert's demos a little bit and am now generating dynamic bundles via shadow-cljs that I serve via HTTP.
https://github.com/pink-gorilla/kernel-cljs-shadowdeps

The self-hosted ClojureScript is this project: https://github.com/pink-gorilla/kernel-cljs-shadow
