A Jupyter or Colab notebook has two sides: one is the Python kernel, which may be running on a remote machine, and the other is the front-end running in one's browser. The JavaScript in the browser and the Python kernel may therefore be on separate machines, yet it is possible to make them talk to each other. However, how this works differs between Jupyter and Colab, the latter being more restrictive. I have found this difference problematic, and even though I may not be fully versed in Colab functionality I want to share some pointers, discussed below. Chiefly:
- Colab diverges greatly from Jupyter in terms of JS operations.
- JS code injected into Colab is sandboxed within each cell.
- There is no requireJS in Colab cells or window.
- Imported modules have to be external to Colab/Drive.
Jupyter notebooks and Jupyter Lab
Colab is derived from Jupyter notebooks, but shares some similarities with JupyterLab, such as the file navigation panel, though not the tabbed layout.
Dialogue with JavaScript works differently in JupyterLab: chiefly, there is no IPython/Jupyter object in JS (cf. bug discussion), so there is no JS-to-Python communication through it. Colab does allow the latter, but differently.
As a result, here I am talking solely about the classic Jupyter notebook, not JupyterLab.
Applications
For proper applications one makes widgets, which have their own complex system, but for simple things, like an SVG where clicking on an item is registered in Python, this seems overkill.
Recently I tried making a widget out of an existing library (JSME), but did not manage, so I had to use a workaround, resulting in a module that works but is not elegant.
Output basics
The output under a Jupyter cell shows the standard-out and standard-error streams, so one can see the Python output of print and warnings.warn. Exception tracebacks are a special case, as they are formatted and output by the IPython shell, and each flavour of shell does it differently, as I discovered in my weekend project for reporting errors from shared notebooks (cf. notes).
With a Python kernel one can have custom outputs shown thanks to the function IPython.display.display, which will render the passed object based on its _repr_html_ method (or __repr__ if _repr_html_ is absent).
Many libraries show plots and molecules thanks to this.
The value returned by the last command (_) is displayed this way.
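As a minimal sketch of the mechanism just described, the hypothetical Swatch class below (not from any library) defines both _repr_html_ and __repr__. In a notebook the HTML form should be picked up; elsewhere the plain-text form is used.

```python
import html


class Swatch:
    """Hypothetical example object that renders as a coloured box in a notebook."""

    def __init__(self, colour: str, label: str):
        self.colour = colour
        self.label = label

    def _repr_html_(self) -> str:
        # Rich front-ends (Jupyter, Colab) call this method when it exists.
        return (f'<div style="background:{html.escape(self.colour)};padding:4px">'
                f'{html.escape(self.label)}</div>')

    def __repr__(self) -> str:
        # Plain-text fallback, used e.g. in a terminal.
        return f'Swatch({self.colour!r}, {self.label!r})'


swatch = Swatch('gold', 'hello')

try:
    from IPython.display import display
    display(swatch)       # in a notebook this shows the coloured box
except ImportError:
    print(repr(swatch))   # outside IPython only the text form is available
```

Leaving swatch as the last expression of a cell would have the same effect as calling display on it.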
Additionally, IPython.display has several classes to display particular formats, from Audio to YouTubeVideo. In particular, HTML, Javascript, SVG and FileLink are very useful.
Therefore, one can inject JavaScript dynamically from Python like so: display(Javascript(js_codeblock)).
A cell can be run as JavaScript itself with the cell magic %%javascript or %%js (or %%typescript in future).
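A small sketch of dynamic injection, assuming one wants to hand a Python object to the browser: the helper name js_log_payload is made up for illustration, and json.dumps is used because its output is also a valid JavaScript literal for the basic types.

```python
import json


def js_log_payload(data) -> str:
    """Return JS source that logs a Python object in the browser console
    (hypothetical helper, not part of IPython)."""
    # The IIFE keeps `payload` out of whatever scope the code ends up in.
    return (
        "(function () {\n"
        f"  const payload = {json.dumps(data)};\n"
        "  console.log('from Python:', payload);\n"
        "})();"
    )


js_codeblock = js_log_payload({'answer': 42, 'items': ['a', 'b']})

try:
    from IPython.display import display, Javascript
    display(Javascript(js_codeblock))   # runs in the browser when in a notebook
except ImportError:
    print(js_codeblock)                 # outside IPython, just show the source
```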
JS import
In Jupyter, injected JS code runs in the normal JS space, while in Colab it runs sandboxed, i.e. in an iframe. This means that in Colab the namespace will be isolated.
There are two ways to import a JS library: one with a script element in HTML, the other in JS with RequireJS. Both need an address from which to source the script/module.
To import a JS library in HTML, the attribute src of the element script will do the job (with a special case for type=module), so one can do the same in a notebook cell: display(HTML('<script src="some_url"></script>')). Two attributes worth reading up on are crossorigin="anonymous" and async.
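The script-element route can be sketched as follows. The script_tag helper and the cdn.example.org URL are placeholders of my own; only the src, type, crossorigin and async attributes come from the text above.

```python
from html import escape


def script_tag(url, crossorigin=None, is_async=False, is_module=False):
    """Build a <script> element as a string (hypothetical helper)."""
    attrs = [f'src="{escape(url, quote=True)}"']
    if is_module:
        attrs.append('type="module"')          # the special case for ES modules
    if crossorigin is not None:
        attrs.append(f'crossorigin="{escape(crossorigin, quote=True)}"')
    if is_async:
        attrs.append('async')                  # boolean attribute, no value needed
    return f'<script {" ".join(attrs)}></script>'


tag = script_tag('https://cdn.example.org/lib.min.js',
                 crossorigin='anonymous', is_async=True)

try:
    from IPython.display import display, HTML
    display(HTML(tag))    # in a notebook the browser fetches and runs the script
except ImportError:
    print(tag)
```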
RequireJS allows one's code to run more smoothly asynchronously: the second-biggest pain in JS is code blocks running before everything is loaded.
To do so, RequireJS allows one to use a module or define a function only once the required module is loaded,
e.g. require(['resource'], (something) => {...}) or define('somefun', ['resource'], (something) => {...}).
It normally runs off a preset configuration assigning a nice name to a longer URL,
but it can somewhat work with a URL directly.
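A sketch of the preset-configuration approach, intended for the classic Jupyter notebook where require is available globally. The helper name, the module name mylib and the URL are illustrative assumptions; require.config with a paths mapping and require([...], callback) are the standard RequireJS calls.

```python
import json


def requirejs_snippet(name, url, body):
    """Return JS that maps `name` to `url` and runs `body` once it is loaded
    (hypothetical helper)."""
    # RequireJS appends ".js" to configured paths itself, so drop the suffix.
    path = url[:-3] if url.endswith('.js') else url
    config = json.dumps({'paths': {name: path}})
    return (
        f"require.config({config});\n"
        f"require([{json.dumps(name)}], function (lib) {{\n"
        f"    {body}\n"
        "});"
    )


js = requirejs_snippet('mylib',
                       'https://cdn.example.org/mylib.min.js',
                       "console.log('loaded', lib);")
print(js)
```

The resulting string can be passed to display(Javascript(...)); the callback body only runs after the module has been fetched.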
RequireJS in Colab
In Colab, requireJS is somewhat available in the page's JS namespace, but not in the cells' JS namespaces. Running:
display(HTML('''
<script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.4/require.min.js" integrity="sha256-Ae2Vz/4ePdIu6ZyI/5ZGsYnb+m0JlOmKPjt6XZ9JJkA=" crossorigin="anonymous"></script>
'''))
fails for me due to a bad underscore.js package dependency. I did not look further into it, as it seems like it would be a nightmare: there is nothing about this online.
I said requireJS is "somewhat" available because it is not actually there; instead a similar system is present via the monaco_editor (a feature of VSCode), so I am utterly lost.
Enabling nbextensions in Python must do something, but not through this.
URLs
Both approaches require a URL.
Before talking about URLs, it may be constructive to quickly mention the different path types:
* a URL starting with two slashes has the full address, minus the protocol (http:/https:/etc.),
* a URL starting with a single slash (absolute path) is from the root of the domain, just like in a filesystem,
* a URL starting with any other valid character is relative to the referring file, i.e. it points to a file in the same folder, just like in a filesystem.
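The three path types can be checked with the standard library's urljoin, which resolves a reference against a base the way a browser does. The example.com and cdn.example.org addresses are placeholders.

```python
from urllib.parse import urljoin

# Pretend this is the page doing the referring.
base = 'https://example.com/notebooks/demo/analysis.ipynb'

# Two slashes: full address, protocol inherited from the base.
protocol_relative = urljoin(base, '//cdn.example.org/lib.js')
# Single slash: from the root of the domain.
root_relative = urljoin(base, '/static/lib.js')
# Anything else: relative to the folder of the referring file.
file_relative = urljoin(base, 'lib.js')

print(protocol_relative)  # https://cdn.example.org/lib.js
print(root_relative)      # https://example.com/static/lib.js
print(file_relative)      # https://example.com/notebooks/demo/lib.js
```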
Jupyter routes
In a Jupyter notebook everything within the base directory of the Jupyter server is accessible.
You can navigate through this folder (dashboard view) in the /tree/{path} route, you can edit any file in the /edit/{path} route, or get served a notebook in the /notebooks/{path} route. The latter also serves raw files if the target is not a notebook, because it redirects to /files/{path}.
There is also an additional route that serves files, /static/{path}, which serves files that are in the site-packages/notebook/static folder in the Python path.
For completeness, there are also the /api/{several}/{stuff} routes, which are well documented and actually do all the heavy lifting, including session management.
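To make the route patterns concrete, here is a small sketch that builds the four non-API URLs for one file. The notebook_routes helper and the localhost:8888 address are assumptions for illustration; the route names are the ones listed above.

```python
from urllib.parse import quote


def notebook_routes(base_url, path):
    """Return the classic-notebook URLs for a path inside the server's
    base directory (hypothetical helper)."""
    base = base_url.rstrip('/')
    quoted = quote(path.lstrip('/'))   # keeps "/" but escapes spaces etc.
    return {route: f'{base}/{route}/{quoted}'
            for route in ('tree', 'edit', 'notebooks', 'files')}


routes = notebook_routes('http://localhost:8888/', 'demo/my notebook.ipynb')
for name, url in routes.items():
    print(f'{name:10s} {url}')
```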
As a result, in requireJS or a script tag one can use a path relative to the notebook file, and it will work.
One can also dump files in site-packages/notebook/static and they will be served by /static, without authentication and without having to restart the server.
In Colab things are different: the URL has a UUID in it and none of the non-API routes exist. One therefore has to rely on an externally hosted address.
CDNs
A CDN (content delivery network) acts as a repository for libraries and is generally fast. A shameful amount of internet traffic is actually the same libraries being served over and over: JQuery, Bootstrap, React, Google fonts etc.
In some cases there may not be a working CDN for a given JS library, in which case one can host it oneself, with the caveat that the server needs the Header add Access-Control-Allow-Origin "*" directive set in the Apache config or in the .htaccess file in that folder's parent folder (or the Nginx etc. equivalents), because the request is cross-origin and would otherwise be refused
(I apologise for stating what may be obvious, but this is a classic tripping hazard when starting out).
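For quick local testing without Apache or Nginx, the same header can be added with Python's built-in HTTP server. This is only a sketch under that assumption, not a production setup; the class and function names are my own.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory with a permissive CORS header."""

    def end_headers(self):
        # Equivalent in spirit to: Header add Access-Control-Allow-Origin "*"
        self.send_header('Access-Control-Allow-Origin', '*')
        super().end_headers()


def make_server(port=8000, host='127.0.0.1'):
    """Create (but do not start) the server; port=0 picks a free port."""
    return ThreadingHTTPServer((host, port), CORSRequestHandler)


# Usage (blocks until interrupted):
#     make_server(8000).serve_forever()
```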
Rawgit was a handy CDN for serving files from a GitHub repo, but is no longer active.
Dropbox and others used to allow this, but no longer do, due to excessive requests and websites with illegal content.
My university provides user file storage for staff and students that can be used as a CDN, but it is far from a well-known service, so it is worth a Google search if one is an academic.
Dialogue with Python
The JS can talk to Python via functions in Jupyter.notebook.kernel in Jupyter (or IPython.notebook.kernel), which is available in both the JS console and in injected or run JS code. For example, Jupyter.notebook.kernel.execute will run a Python code block when the current Python execution finishes.
There are also comms that allow a smoother data exchange. And there are widgets that build on these, allowing user interactions to affect Python (sliders are the demo example, but one can do a lot more).
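As a sketch of the simple case mentioned earlier (an SVG where a click is registered in Python), the hypothetical helper below builds a circle whose onclick handler calls Jupyter.notebook.kernel.execute. It assumes the classic Jupyter notebook; the variable name clicked_item is arbitrary.

```python
import html
import json


def clickable_circle(element_id, variable='clicked_item'):
    """Return SVG markup whose circle sets a Python variable when clicked
    (hypothetical helper, classic Jupyter notebook only)."""
    python_code = f'{variable} = {element_id!r}'
    # json.dumps yields a properly quoted JS string literal.
    js = f'Jupyter.notebook.kernel.execute({json.dumps(python_code)})'
    # Escape the JS so it can sit inside a double-quoted HTML attribute.
    onclick = html.escape(js, quote=True)
    return (
        '<svg width="80" height="80">'
        f'<circle id="{html.escape(element_id, quote=True)}" cx="40" cy="40" r="30" '
        f'fill="teal" onclick="{onclick}"/>'
        '</svg>'
    )


markup = clickable_circle('dot')

try:
    from IPython.display import display, HTML
    display(HTML(markup))
except ImportError:
    print(markup)
```

After clicking the circle in a notebook, the kernel should have a variable clicked_item equal to 'dot'.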
In Colab's documentation page advanced_output there is something, but it is in no way as well documented as Jupyter's.
The kernel interaction machinery in JS is in google.colab.kernel, but is rather different. This is available in the cells' outputs, but not in the page's own namespace (fun time debugging).
I have not figured out where to get DOMWidgetView, for example; this is part of the @jupyter-widgets/base library (for nbextensions).
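For the Colab side, the pattern I understand from that documentation page is to register a Python callback with google.colab.output.register_callback and call it from a cell's output with google.colab.kernel.invokeFunction. A minimal sketch follows; the callback name notebook.on_click and the on_click function are my own choices.

```python
clicked = []


def on_click(item_id):
    """Python-side callback: remember what was clicked, return the count."""
    clicked.append(item_id)
    return len(clicked)


# JS to run inside a Colab cell output; 'notebook.on_click' is an arbitrary name.
colab_js = "google.colab.kernel.invokeFunction('notebook.on_click', ['dot'], {});"

try:
    # Only importable inside Colab.
    from google.colab import output
    from IPython.display import display, Javascript
    output.register_callback('notebook.on_click', on_click)
    display(Javascript(colab_js))
except ImportError:
    print('Not running in Colab; the JS would have been:', colab_js)
```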
Proper way
As in JupyterLab, the sanest way to make a JS<->Python interaction is to create a proper widget extension (which will run NPM and have @jupyter-widgets/base): this will work in classic Jupyter notebook, JupyterLab and Colab. But depending on the task required, it can be very much overkill.
Concluding thoughts
JS operations in Jupyter are not straightforward despite the great documentation. They are not only relevant to people who make Python modules with widgets; they are a useful thing to know how to use in general. For example, I worked on a port-forwarded notebook served by a cluster that did not have access to the internet, but as my local browser obviously had access to the web, I could download the data I needed via JS on my machine and feed it to the Python kernel for it to crunch.
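The browser-side download trick can be sketched roughly as below, assuming the classic Jupyter notebook, a text resource small enough to pass as a string, and a remote server that allows cross-origin requests. The helper name, variable name and URL are placeholders.

```python
import json


def fetch_into_kernel(url, variable='fetched_text'):
    """Return JS that downloads `url` in the browser and assigns the text to a
    Python variable in the kernel (hypothetical helper)."""
    return (
        f"fetch({json.dumps(url)})\n"
        "  .then((response) => response.text())\n"
        "  .then((text) => {\n"
        "    // JSON.stringify gives a quoted, escaped literal that Python\n"
        "    // can normally read as a str.\n"
        f"    Jupyter.notebook.kernel.execute({json.dumps(variable + ' = ')}"
        " + JSON.stringify(text));\n"
        "  });"
    )


js = fetch_into_kernel('https://example.org/data.csv')
print(js)   # pass this to display(Javascript(js)) inside a notebook
```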
Colab is great for demoing a feature: it runs without the user having to do anything.
It visually diverges from Jupyter, as Jupyter uses the Bootstrap 3 framework while Colab uses LitElement (Google is disconcertingly inconsistent with its frontend frameworks).
This means that nicely formatted _repr_html_ functionality using BS3 will not work.
Parenthetically, I feel like the BS3 functionality is under-represented in the Jupyter ecosystem, because we are on Bootstrap 5 now, and I assume everyone is waiting for the switch. I was tempted to make a nice modal a few times, but I thought I'd be tempting fate, as BS4 diverges from BS3 a lot (and for the better) and JupyterLab does not use Bootstrap, but backbone.js.
However, the divergence from Jupyter notebook is painful, as discussed, especially as JS is already tricky in Jupyter. Custom widgets (which work differently) appear to be the sole solution. They are a lot more laborious and not overly flexible: to end with an XKCD reference, it is like bringing a gun to a knife fight, you win in two cases, but cannot easily put out fires with it.