2021-11-08 Weeknotes: datasette-jupyterlite, test and publish an open source Python library, pytest-recording

datasette-jupyterlite

JupyterLite is absolutely incredible: it’s a full, working distribution of Jupyter that runs entirely in a browser, thanks to a Python interpreter (and various other parts of the scientific Python stack) that has been compiled to WebAssembly by the Pyodide project.

Since it’s just static JavaScript (and WASM modules) it’s possible to host it anywhere that can run a web server.

Datasette runs a web server…

So I built datasette-jupyterlite ( https://datasette.io/plugins/datasette-jupyterlite ), a Datasette plugin that bundles JupyterLite and serves it up as part of the Datasette instance.

Here’s some Python code that will retrieve data from the associated Datasette instance and pull it into a Pandas DataFrame:

import pandas, pyodide

# Open the CSV export over HTTP and load it into a DataFrame
pandas.read_csv(pyodide.open_url(
    "https://latest-with-plugins.datasette.io/github/stars.csv"
))

(I haven’t yet found a way to do this with a relative rather than absolute URL.)

The best part of this is that it works in Datasette Desktop! You can install the plugin using the “Install and manage plugins” menu item to get a version of Jupyter running in Python running in WebAssembly running in V8 running in Chromium running in Electron.

The plugin implementation is just 30 lines of code—it uses the jupyterlite Python package which bundles a .tgz file containing all of the required static assets, then serves files directly out of that tarfile.
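To illustrate the idea of serving files straight out of a tarfile, here is a minimal sketch using only the standard library. It is not the plugin's actual code: the build_demo_archive and read_asset helpers, the "package/" prefix and the file names are all hypothetical, and the demo archive is built in memory to stand in for the real bundled .tgz.

```python
import io
import mimetypes
import tarfile


def build_demo_archive() -> bytes:
    """Create a small in-memory .tgz standing in for the bundled assets."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        files = {
            "package/index.html": b"<h1>JupyterLite</h1>",
            "package/app.js": b"console.log('hello');",
        }
        for name, body in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(body)
            archive.addfile(info, io.BytesIO(body))
    return buffer.getvalue()


def read_asset(archive_bytes: bytes, path: str, prefix: str = "package/"):
    """Return (status, content_type, body) for a path inside the archive.

    The "package/" prefix is an assumption made for this sketch.
    """
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        try:
            member = archive.getmember(prefix + path)
        except KeyError:
            # No such file in the archive: answer like a 404
            return 404, "text/plain", b"Not found"
        body = archive.extractfile(member).read()
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    return 200, content_type, body


archive_bytes = build_demo_archive()
status, content_type, body = read_asset(archive_bytes, "index.html")
print(status, content_type, len(body))
```

A web handler would map the request path to read_asset and return the resulting status, content type and body, so nothing ever needs to be unpacked to disk.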

How to build, test and publish an open source Python library

My other projects from this week are already written about on the blog:

How to build, test and publish an open source Python library is a detailed write-up of the 10-minute workshop I presented at PyGotham this year, showing how to create a Python library, bundle it up as a package using setup.py, publish it to PyPI and then set up GitHub Actions to test and publish future releases.

Using VCR and pytest with pytest-recording

pytest-recording is a neat pytest plugin that makes it easy to use the VCR library ( https://vcrpy.readthedocs.io/en/latest/ ), which helps write tests against HTTP resources by automatically capturing responses and baking them into a YAML file to be replayed during the tests.

It even works with boto3!

To use it, first install it with pip install pytest-recording and then add the @pytest.mark.vcr decorator to a test that makes HTTP calls:

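import pytest
from click.testing import CliRunner

# `cli` is the Click command under test, imported from the project's own package

@pytest.mark.vcr
def test_create():
    runner = CliRunner()
    with runner.isolated_filesystem():
        result = runner.invoke(cli, ["create", "pytest-bucket-simonw-1", "-c"])
        assert result.exit_code == 0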

The first time you run the tests, use the --record-mode=once option:

pytest -k test_create --record-mode=once

By default this creates a YAML file at tests/cassettes/test_s3_credentials/test_create.yaml.

Subsequent runs of pytest -k test_create will replay those recorded HTTP responses and will not make any network requests. I confirmed this by turning off my laptop’s WiFi.
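The record-once-then-replay behaviour can be sketched in a few lines of plain Python. This is a conceptual illustration only, not how VCR or pytest-recording are implemented (they patch HTTP client libraries and store YAML): the replay_or_record helper, the fake_fetch stand-in and the JSON cassette format are all made up for this example.

```python
import json
import tempfile
from pathlib import Path


def replay_or_record(url, cassette_path, fetch):
    """Return the response for url, calling fetch() only if nothing is recorded yet."""
    cassette_path = Path(cassette_path)
    recorded = {}
    if cassette_path.exists():
        recorded = json.loads(cassette_path.read_text())
    if url in recorded:
        # Replay: no call to fetch(), so no network access
        return recorded[url]
    # Record: make the real call once and save the result for later runs
    response = fetch(url)
    recorded[url] = response
    cassette_path.parent.mkdir(parents=True, exist_ok=True)
    cassette_path.write_text(json.dumps(recorded, indent=2))
    return response


calls = []


def fake_fetch(url):
    """Hypothetical stand-in for a real HTTP request; counts how often it is called."""
    calls.append(url)
    return {"status": 200, "body": "created"}


with tempfile.TemporaryDirectory() as tmp:
    cassette = Path(tmp) / "cassettes" / "test_create.json"
    first = replay_or_record("https://example.com/bucket", cassette, fake_fetch)
    second = replay_or_record("https://example.com/bucket", cassette, fake_fetch)

print(len(calls))  # fake_fetch ran once; the second call was served from the cassette
```

Turning off the WiFi is the real-world equivalent of checking that fetch is never called on the second run.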