Last active April 2, 2022 18:28
Accessing JupyterLite contents from pyolite
{"metadata":{"language_info":{"codemirror_mode":{"name":"python","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8"},"kernelspec":{"name":"python","display_name":"Pyolite","language":"python"}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"markdown","source":"# Using JupyterLite IndexedDB Storage\n\n> Big thanks to `@konwiddak` who [unraveled the first part](https://github.com/jupyterlite/jupyterlite/discussions/91#discussioncomment-1135504) of this!\n\nIf available, the JupyterLite \"Server\" will store its contents in the browser's [IndexedDB](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API#see_also). This API is available to WebWorkers, where `pyolite` kernels run.","metadata":{}},{"cell_type":"code","source":"import asyncio, js, io, pandas, IPython","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"The name of the database is hard-coded.","metadata":{}},{"cell_type":"code","source":"DB_NAME = \"JupyterLite Storage\"","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"As the various APIs are event-driven, we use the `async` and `await` keywords with a `Queue` to unwrap the lifecycle.","metadata":{}},{"cell_type":"code","source":"async def get_contents(path):\n    \"\"\"use the IndexedDB API to access JupyterLite's in-browser (for now) storage\n    \n    for documentation purposes, the full names of the JS API objects are used.\n    \n    see https://developer.mozilla.org/en-US/docs/Web/API/IDBRequest\n    \"\"\"\n    # we only ever expect one result, either an error _or_ success\n    queue = asyncio.Queue(1)\n    \n    IDBOpenDBRequest = js.self.indexedDB.open(DB_NAME)\n    IDBOpenDBRequest.onsuccess = IDBOpenDBRequest.onerror = queue.put_nowait\n    \n    await queue.get()\n    \n    if IDBOpenDBRequest.result is None:\n        return None\n    \n    IDBTransaction = IDBOpenDBRequest.result.transaction(\"files\", \"readonly\")\n    IDBObjectStore = IDBTransaction.objectStore(\"files\")\n    IDBRequest = IDBObjectStore.get(path, \"key\")\n    IDBRequest.onsuccess = IDBRequest.onerror = queue.put_nowait\n    \n    await queue.get()\n    \n    return IDBRequest.result.to_py() if IDBRequest.result else None","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"With this function, we can now access files that have been saved to IndexedDB, for example, this notebook.","metadata":{}},{"cell_type":"code","source":"IPython.display.JSON(await get_contents(\"pyolite - contents.ipynb\"))","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"For many purposes, only the `content` field will be interesting.\n\n> For this example, _Open With..._ the `iris.csv` file, and make a change and save it, as it is initially only available from the _actual_ HTTP server. Future work may allow hiding this implementation detail.","metadata":{}},{"cell_type":"code","source":"pandas.read_csv(io.StringIO((await get_contents(\"iris.csv\"))[\"content\"]), sep = \"\\t\")","metadata":{"trusted":true},"execution_count":null,"outputs":[]}]}
Is a complementary put_contents(path) function also possible?
Yes, see this discussion comment.
It needs to be polished and wrapped for users, I'd say.
This is so super helpful! Thank you, @bollwyvl. I would like to add a note for anyone else who runs into this issue: if you have a binary file, the 'content' will be a string that is encoded using latin-1. That is, if you want the raw bytes (say, to call pickle.loads()), you will need to do:
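The snippet that followed this comment is not preserved here. As a minimal sketch, assuming the 'content' string maps each byte to the latin-1 character with the same code point, encoding it back with latin-1 should recover the original bytes. The helper name `content_to_bytes` and the sample data are hypothetical, and the `content` value below is only a stand-in for what `get_contents` would return in a pyolite kernel.

```python
import pickle


def content_to_bytes(content: str) -> bytes:
    """Hypothetical helper: turn a latin-1 decoded string back into raw bytes."""
    # latin-1 maps code points 0-255 one-to-one onto byte values 0-255,
    # so this round trip is lossless for any binary payload.
    return content.encode("latin-1")


# Stand-in for (await get_contents("some-file.pkl"))["content"];
# the file name and the pickled object are made up for illustration.
original = {"species": "setosa", "petal_length": 1.4}
content = pickle.dumps(original).decode("latin-1")

restored = pickle.loads(content_to_bytes(content))
```

In a notebook, the same helper would be applied to the `content` field of the dictionary returned by `await get_contents(path)`. As usual with pickle, only load data you trust.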