JS in Colab

Saturday 7 May 2022

A Jupyter or Colab notebook has two sides: the Python kernel, which may be running on a remote machine, and the front-end running in one's browser. As a result, the JavaScript in the browser and the Python kernel may be on separate machines, yet it is possible to make them talk to each other. However, how this is done differs between Jupyter and Colab, the latter being more restrictive. I have found this difference problematic, and even though I may not be fully versed in Colab functionality, I want to share some pointers, discussed below. The main ones:

  • Colab diverges greatly from Jupyter in terms of JS operations.
  • JS code injected into Colab is sandboxed within each cell.
  • There is no requireJS in Colab cells or window.
  • Imported modules have to be external to Colab/Drive.

Jupyter notebooks and Jupyter Lab

Colab is derived from Jupyter notebooks, but shares some similarities with JupyterLab, such as the file navigation panel, though not the tabbed layout. Dialogue with JavaScript works differently in JupyterLab: most importantly, there is no IPython/Jupyter object in JS (cf. bug discussion), so no JS to Python communication. Colab does allow the latter, but differently.

As a result, here I am talking solely about the classic Jupyter notebook, not JupyterLab.

Applications

For proper applications one makes widgets, which have their own complex system, but for simple things, like an SVG where clicking on an item is registered in Python, this seems overkill.
Recently I tried making a widget out of an existing library (JSME), but did not manage, so I had to use a workaround, resulting in a module that works but is not elegant.

Output basics

The output under a Jupyter cell shows the standard-out and standard-error streams, so one can see the Python output of print and warnings.warn. Exception tracebacks are a special case, as they are formatted and output by the IPython shell, and each flavour of shell does it differently, as I discovered in my weekend project for reporting errors from shared notebooks (cf. notes).

With a Python kernel one can have custom outputs shown thanks to the function IPython.display.display, which will render the passed object based on its _repr_html_ (or __repr__ if _repr_html_ is absent). Many libraries show plots and molecules thanks to this. The value returned by the last command (_) is displayed this way. Additionally, in IPython.display there are several classes to display particular formats, from Audio to YouTubeVideo. In particular, HTML, Javascript, SVG and FileLink are very useful. Therefore, one can inject JavaScript dynamically from Python like so: display(Javascript(js_codeblock)).
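As a minimal sketch of the _repr_html_ mechanism described above: the class below is entirely hypothetical (ColourSwatch and its attributes are made up for illustration) and only shows the shape of an object that the notebook front-end would render as HTML, with __repr__ as the plain-text fallback.

```python
from html import escape


class ColourSwatch:
    """Hypothetical example class: a named colour rendered as a small coloured box."""

    def __init__(self, name: str, css_colour: str):
        self.name = name
        self.css_colour = css_colour

    def __repr__(self) -> str:
        # Plain-text fallback, used by print() and by plain terminals.
        return f"ColourSwatch({self.name!r}, {self.css_colour!r})"

    def _repr_html_(self) -> str:
        # Rich representation that a notebook front-end picks up when available.
        colour = escape(self.css_colour, quote=True)
        return (
            f'<span style="background:{colour};padding:0 1em">&nbsp;</span> '
            f"{escape(self.name)}"
        )


swatch = ColourSwatch("teal", "#008080")
# In a notebook one would call `from IPython.display import display; display(swatch)`
# or simply leave `swatch` as the last expression of a cell.
```

Outside a notebook the object simply falls back to its __repr__, which is why a library can ship such a method without depending on IPython.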

A cell can be run as JavaScript itself with the cell magic %%javascript or %%js (or %%typescript in future).

JS import

In Jupyter, injected JS code runs in the normal JS space, while in Colab it runs sandboxed, i.e. in an iframe. This means that in Colab the namespace will be isolated.

There are two ways to import a JS library: one is a script element in HTML, the other is RequireJS in JS. Both will need an address whence to source the script/module.

To import a JS library in HTML, the attribute src of the element script will do the job (with type=module as a special case), so one can do the same in a notebook cell: display(HTML('<script src="some_url"></script>')). Two attributes worth reading up on are crossorigin="anonymous" and async.
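A small sketch of how such a script element could be assembled in Python before being passed to display(HTML(...)). The helper name script_tag and the example URL are hypothetical; only the attributes mentioned above (src, type=module, crossorigin, async) are used.

```python
from html import escape
from typing import Optional


def script_tag(url: str,
               crossorigin: Optional[str] = "anonymous",
               is_async: bool = False,
               module: bool = False) -> str:
    """Build a <script> element as a string (hypothetical helper, not an IPython API)."""
    attributes = [f'src="{escape(url, quote=True)}"']
    if module:
        attributes.append('type="module"')
    if crossorigin is not None:
        attributes.append(f'crossorigin="{escape(crossorigin, quote=True)}"')
    if is_async:
        attributes.append("async")
    return f"<script {' '.join(attributes)}></script>"


# Placeholder address: swap in a real CDN URL.
tag = script_tag("https://cdn.example.org/some-library.min.js")
# In a notebook: `from IPython.display import HTML, display; display(HTML(tag))`
```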

RequireJS allows one's code to run more smoothly when things load asynchronously (the second-biggest pain in JS is code blocks running before everything is loaded). To do so, RequireJS allows one to run code or define a module only once the required module is loaded, e.g. require(['resource'], (something) => {...}) or define('somefun', ['resource'], (something) => {...}). It normally runs off a preset configuration assigning a nice name to a longer URL, but it can somewhat work with a URL directly.

RequireJS in Colab

In Colab, RequireJS is somewhat available in the page's JS namespace, but not in the cells' JS namespaces. Running:

display(HTML('''
<script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.4/require.min.js" integrity="sha256-Ae2Vz/4ePdIu6ZyI/5ZGsYnb+m0JlOmKPjt6XZ9JJkA=" crossorigin="anonymous"></script>
'''))

fails for me due to a bad underscore.js package dependency, so I did not look further into it, as it seems like it would be a nightmare given that there is nothing about this online.

I said RequireJS is "somewhat" available because it is not actually there: instead, a similar system is present via the monaco_editor (Monaco being the editor that powers VS Code), so I am utterly lost. Enabling nbextensions in Python must do something, but not through this.

URLs

Both approaches require a URL. Before talking about URLs, it may be constructive to quickly mention the different path types:

  • a URL starting with two slashes has the full address, minus the protocol (http:/https:/etc.),
  • a URL starting with a single slash (absolute path) is from the root of the domain, just like in a filesystem,
  • a URL starting with any other valid character is relative to the referring file, i.e. a file in the same folder, just like in a filesystem.
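The three path types above can be checked with the standard library's urljoin, which resolves a reference against the address of the referring page much as a browser would. The addresses are placeholders.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that refers to the script.
referring_page = "https://example.com/notebooks/demo/analysis.ipynb"

# Two slashes: full address, protocol inherited from the referring page.
protocol_relative = urljoin(referring_page, "//cdn.example.org/lib.js")
# Single slash: resolved from the root of the same domain.
root_relative = urljoin(referring_page, "/static/lib.js")
# Anything else: resolved relative to the folder of the referring file.
file_relative = urljoin(referring_page, "lib.js")

print(protocol_relative)  # https://cdn.example.org/lib.js
print(root_relative)      # https://example.com/static/lib.js
print(file_relative)      # https://example.com/notebooks/demo/lib.js
```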

Jupyter routes

In a Jupyter notebook everything within the base directory of the Jupyter server is accessible. You can navigate through this folder (dashboard view) in the /tree/{path} route, you can edit any file in the /edit/{path} route or get served a notebook in the /notebooks/{path} route. The latter also serves raw files if the target is not a notebook, because it redirects to /files/{path}. There is also an additional route that serves files, /static/{path}, which serves files that are in the site-packages/notebook/static folder in the Python path. For completeness, there are also the /api/{several}/{stuff} routes, which are well documented and actually do all the heavy lifting, including session management. As a result, in RequireJS or in a script tag one can use a path relative to the notebook file, and it will work. One can also dump files in site-packages/notebook/static and they will be served by /static, without authentication and without having to restart the server.

In Colab things are different: the URL has a UUID in it and none of the non-API routes exist. One therefore has to rely on an externally hosted address.

CDNs

A CDN (content delivery network) hosts libraries and is generally fast. A shameful amount of internet traffic is actually the same libraries being served over and over: jQuery, Bootstrap, React, Google Fonts etc. In some cases there may not be a working CDN for a given JS library, in which case one can host it oneself, with the caveat that the directive Header add Access-Control-Allow-Origin "*" needs to be set in the Apache config or in the .htaccess file in that folder's parent folder (or the Nginx etc. equivalent), because the request is cross-origin and would otherwise be refused (I apologise for stating what may be obvious, but this is a classic tripping hazard when starting out). Rawgit was a handy CDN for serving files from a GitHub repo, but is no longer active. Dropbox and others used to allow it, but no longer do, due to excessive requests and websites with illegal content. My university provides user file storage for staff and students that can be used as a CDN, but it is far from a well-known service, so it is worth a Google search if one is an academic.
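To illustrate what the Access-Control-Allow-Origin header does, here is a sketch of a throwaway file server in Python that adds it to every response. This is not the Apache setup described above and not a replacement for proper hosting; the class and function names are made up, and the server only listens on localhost.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that allows cross-origin requests from any site."""

    def end_headers(self):
        # Same effect as the Apache directive: Header add Access-Control-Allow-Origin "*"
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def make_server(directory: str, port: int = 8000) -> ThreadingHTTPServer:
    """Create (but do not start) a server for the files in `directory`."""
    handler = partial(CORSRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted); the path is a placeholder:
# make_server("path/to/js/files").serve_forever()
```

Without the extra header the browser would still download the file but refuse to hand it to a page served from another origin, which is the tripping hazard mentioned above.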

Dialogue with Python

The JS can talk to Python via functions in Jupyter.notebook.kernel in Jupyter (or IPython.notebook.kernel), which is available both in the JS console and in injected or run JS code. For example, Jupyter.notebook.kernel.execute will run a Python code block when the current Python execution finishes. There are also comms, which allow a smoother data exchange. And there are widgets, which build on these, allowing user interactions to affect Python: sliders are the demo example, but one can do a lot more.
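A sketch of how Jupyter.notebook.kernel.execute could be used to copy a value from the browser into a Python variable in the classic notebook. The helper js_send_to_python is hypothetical; it only builds the JS source, which would then be injected with display(Javascript(...)). It assumes the value can be serialised with JSON.stringify.

```python
def js_send_to_python(python_name: str, js_expression: str) -> str:
    """Return JS source that stores the value of `js_expression` in the
    Python variable `python_name` (hypothetical helper, classic notebook only)."""
    if not python_name.isidentifier():
        raise ValueError(f"{python_name!r} is not a valid Python variable name")
    return (
        "(function () {\n"
        # Double stringify: the inner call serialises the value to JSON, the outer
        # call wraps that JSON in a quoted, escaped string literal for Python.
        f"  const payload = JSON.stringify(JSON.stringify({js_expression}));\n"
        "  Jupyter.notebook.kernel.execute(\n"
        f"    'import json; {python_name} = json.loads(' + payload + ')'\n"
        "  );\n"
        "})();"
    )


js_code = js_send_to_python("window_width", "window.innerWidth")
# In a classic Jupyter notebook:
# from IPython.display import Javascript, display
# display(Javascript(js_code))
# `window_width` becomes available once the current cell has finished running.
```

Because execute is queued behind the running cell, the variable cannot be read in the same cell that injects the JS.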

In Colab's documentation page advanced_outputs there is something, but it is nowhere near as well documented as Jupyter's. The kernel interaction machinery in JS is in google.colab.kernel, but is rather different. This is available in the cells' outputs, but not in the page's namespace, which makes for a fun time debugging. I have not figured out where to get DOMWidgetView, for example: this is part of the @jupyter-widgets/base library (for nbextensions).
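For the clickable SVG case mentioned earlier, a sketch along the lines of Colab's advanced outputs notebook could look as follows. It relies on google.colab.output.register_callback on the Python side and google.colab.kernel.invokeFunction on the JS side; the callback name, the SVG and the function on_item_clicked are made up for illustration, and the registration is skipped when not running in Colab.

```python
clicked_items = []


def on_item_clicked(item_id: str) -> None:
    """Python side: record which SVG element was clicked."""
    clicked_items.append(item_id)


# Hypothetical name under which the callback is exposed to JS.
CALLBACK_NAME = "notebook.on_item_clicked"

# JS side: each circle calls back into Python with its own id.
CLICKABLE_SVG = """
<svg width="120" height="60">
  <circle id="left" cx="30" cy="30" r="20" fill="teal"
          onclick="google.colab.kernel.invokeFunction('CALLBACK', [this.id], {})"/>
  <circle id="right" cx="90" cy="30" r="20" fill="coral"
          onclick="google.colab.kernel.invokeFunction('CALLBACK', [this.id], {})"/>
</svg>
""".replace("CALLBACK", CALLBACK_NAME)

try:
    from google.colab import output  # only importable inside Colab
except ImportError:
    output = None

if output is not None:
    from IPython.display import HTML, display

    output.register_callback(CALLBACK_NAME, on_item_clicked)
    display(HTML(CLICKABLE_SVG))
```

After clicking a circle in Colab, clicked_items in a later cell should contain its id. The HTML has to be displayed from the notebook, since google.colab.kernel only exists inside cell outputs.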

Proper way

As in JupyterLab, the sanest way to make a JS<->Python interaction is to create a proper widget extension (which will involve NPM and @jupyter-widgets/base): this will work in the classic Jupyter notebook, JupyterLab and Colab. But, depending on the task, this can be very much overkill.

Concluding thoughts

JS operations in Jupyter are not straightforward despite the great documentation. They are not important only to people who make Python modules with widgets: they are a useful thing to know how to use in general. For example, I worked on a port-forwarded notebook served by a cluster that did not have access to the internet, but as my local browser obviously had access to the web, I could download the data I needed via JS on my machine and feed it to the Python kernel for it to crunch.

Colab is great for demoing a feature: it runs without the user having to do anything. It visually diverges from Jupyter, as the latter uses the Bootstrap 3 framework while Colab uses LitElement (Google is disconcertingly inconsistent with its frontend frameworks). This means that nicely formatted _repr_html_ functionality using BS3 will not work in Colab. Parenthetically, I feel like the BS3 functionality is under-used in the Jupyter ecosystem, because we are on Bootstrap 5 now and I assume everyone is waiting for the switch. I was tempted to make a nice modal a few times, but I thought I'd be tempting fate, as BS4 diverges from BS3 a lot (and for the better) and JupyterLab does not use Bootstrap, but backbone.js.

However, the divergence from Jupyter notebook is painful, as discussed, especially as things are already tricky in Jupyter. Custom widgets (which work differently) appear to be the sole solution. This route is a lot more laborious and is not overly flexible: to end with an XKCD reference, it is like bringing a gun to a knife fight, you win in two cases, but cannot easily put out fires with it.
