uv+marimo

Alonso Silva

Introduction

“We shape our tools and thereafter our tools shape us.”

Marshall McLuhan

Most of these reflections come from Trevor Manz's "juv: Reproducible Jupyter Notebooks".

I must admit it: I like notebooks

Notebooks do have reproducibility problems

Reproducibility of Jupyter Notebooks

  • 1.16M notebooks
  • Most repositories don’t declare their dependencies
  • Those that do don’t declare all of them
  • Able to execute: 24%
  • Able to produce same results: 4%

In Falling Into The Pit of Success, Jeff Atwood quotes:

“I often think of C++ as my own personal Pit of Despair Programming Language. Unmanaged C++ makes it so easy to fall into traps. Think buffer overruns, memory leaks, double frees, mismatch between allocator and deallocator, using freed memory, umpteen dozen ways to trash the stack or heap – and those are just some of the memory issues. There are lots more ‘gotchas’ in C++. C++ often throws you into the Pit of Despair and you have to climb your way up the Hill of Quality.”

Eric Lippert

  • Wouldn’t it be nice to use a tool designed to keep you from falling into The Pit of Despair?
  • Wouldn’t it be even better if you used a tool that let you effortlessly fall into The Pit of Success?
  • We want users to simply fall into winning practices by using our tools. To the extent that we make it easy to get into trouble we fail.

“A well-designed system makes it easy to do the right things and annoying (but not impossible) to do the wrong things.”

Jeff Atwood

GitHub blog: AI leads Python to top language

Rethinking getting started

python -m venv venv
source venv/bin/activate
python -m pip install -r requirements.txt # or just pip install numpy pandas..
jupyter lab notebook.ipynb
Each of the four commands raises a question:

  1. Which Python?
  2. What happens if you forget to activate it?
  3. Which versions of the packages? Which versions of their dependencies?
  4. Which JupyterLab? In which order were the cells run?
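
A minimal sketch of how one could answer some of these questions from inside a running session, using only the standard library. The helper name `environment_report` and the package list are illustrative assumptions, not part of the original workflow.

```python
import sys
from importlib import metadata


def environment_report(packages):
    """Collect interpreter and package details for the current session."""
    report = {
        # Which Python?
        "python": sys.version.split()[0],
        "executable": sys.executable,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Which versions of the packages? None means "not installed".
        "packages": {},
    }
    for name in packages:
        try:
            report["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report["packages"][name] = None
    return report


report = environment_report(["numpy", "pandas", "jupyterlab"])
for key, value in report.items():
    print(f"{key}: {value}")
```

Nothing in the notebook file itself records this information, which is why the same notebook can behave differently on another machine.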

The gold standard for getting started is a single command (no further guidance required)

<magic tool> notebook.ipynb

uv + marimo help us take a huge leap toward that goal.

uv: An extremely fast Python package and project manager

uv

An extremely fast Python package and project manager, written in Rust.

Introducing uv: 10-100x faster than pip

uv

  • Announced on February 15, 2024
  • GitHub stars: 25.4k
  • Downloads per month: 28.1M (October 2024)
  • PyPI share: 13.3% (October 2024)

Rust 🦀 is supercharging Python

  • Polars
  • Pydantic
  • Tantivy
  • Qdrant
  • LanceDB
  • Ruff
  • uv

Python package management is a mess

There is a risk…

Why is it a difficult problem?

  • No multi-version support -> SAT problem -> NP-hard -> CDCL-based SAT solver
  • Rich syntax for filtering requirements by platform -> SAT problem -> NP-hard -> Algebraic decision diagrams
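
To make the first bullet concrete, here is a toy brute-force resolver. The package names, versions, and constraints in `INDEX` are made up for illustration, and exhaustive search is used only to show the shape of the problem; real resolvers rely on much smarter techniques such as the ones named above.

```python
from itertools import product

# Hypothetical index: package -> version -> {dependency: allowed versions}
INDEX = {
    "app": {1: {"lib-a": {1, 2}, "lib-b": {2}}},
    "lib-a": {1: {"core": {2}}, 2: {"core": {3}}},
    "lib-b": {2: {"core": {2}}},
    "core": {1: {}, 2: {}, 3: {}},
}


def resolve(index):
    """Try every assignment of exactly one version per package, newest first."""
    names = sorted(index)
    candidates = (sorted(index[name], reverse=True) for name in names)
    for combo in product(*candidates):
        chosen = dict(zip(names, combo))
        satisfied = all(
            chosen[dep] in allowed
            for name, version in chosen.items()
            for dep, allowed in index[name][version].items()
        )
        if satisfied:
            return chosen
    return None  # no consistent set of versions exists


print(resolve(INDEX))
# {'app': 1, 'core': 2, 'lib-a': 1, 'lib-b': 2}
```

Because only one version of `core` can be installed, the newest `lib-a` has to be given up. The number of combinations to check is the product of the version counts, which grows exponentially with the number of packages.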

Installation

  • Standalone installer
curl -LsSf https://astral.sh/uv/install.sh | sh
  • PyPI
pip install uv

Common workflow

Following the Python Packaging User Guide

Tip

It is recommended to use a virtual environment when working with third party packages.

python -m venv .venv
source .venv/bin/activate
python -m pip install numpy pandas

Drop-in replacement for venv

  • Create a virtual environment at .venv
uv venv

Drop-in replacement for venv

  • Activate a virtual environment
uv venv
source .venv/bin/activate

Drop-in replacement for pip

  • Install a package in the new virtual environment
uv venv
source .venv/bin/activate
uv pip install numpy pandas

Workflow

Terminal (venv + pip)
python -m venv .venv
source .venv/bin/activate
python -m pip install numpy pandas

Terminal (uv)
uv venv
source .venv/bin/activate
uv pip install numpy pandas

Drop-in replacement for virtualenv

  • Create a virtual environment with, e.g., Python 3.11:
uv venv --python 3.11

Drop-in replacement for pip-tools

  • Given a requirements.in
uv pip compile requirements.in > requirements.txt
  • Given a requirements.txt
uv pip sync requirements.txt

Drop-in replacement for poetry

  • Initialize a project
mkdir hello-world
cd hello-world
uv init
uv run hello.py
  • Add dependencies
uv add numpy polars
  • Build distributions
uv build
  • Publish a package
uv publish

Running scripts with no dependencies

example.py
print("Hello World")
Terminal
uv run example.py
# Hello World

Running scripts with dependencies

example.py
import tqdm
from rich import print

# __version__ lives on the tqdm module, not on the tqdm class
print(tqdm.__version__)
print("hi :vampire:")
Terminal
uv run --with tqdm --with rich example.py

Creating scripts with inline script metadata (PEP 723)

Terminal
uv init --script example.py --python 3.12
uv add --script example.py 'tqdm' 'rich'
uv run example.py

Improving reproducibility

Terminal
uvx juv stamp example.py # to add exclude-newer
example.py
# /// script
# dependencies = [
#   "numpy",
# ]
# [tool.uv]
# exclude-newer = "2024-11-07T00:00:00Z"
# ///

import numpy

print(numpy.__version__)

Improving reproducibility

To make it a script:

  • Add ‘#!/usr/bin/env -S uv run’ at the beginning of the file
Terminal
sed -i '1i\#!/usr/bin/env -S uv run' example.py
  • Make it executable:
Terminal
chmod +x example.py
  • You can run it as a script:
Terminal
./example.py

Using tools

Terminal
uvx ruff check example.py

marimo: Rethinking the notebook to create reproducible notebooks

Getting started

To install:

Terminal
uv pip install marimo

To start a marimo notebook:

Terminal
marimo edit notebook.py

To start a marimo tutorial:

Terminal
marimo tutorial intro
# marimo tutorial --help

Marimo notebooks are reactive

cell 1:

x = 1

cell 2:

y = 1
print(x + y)

Marimo notebooks are reactive: changing x in cell 1 automatically re-runs cell 2

cell 1:

x = 2

cell 2:

y = 1
print(x + y)
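
A toy sketch of the idea behind reactivity: work out which variables each cell defines and reads, then mark every cell that reads a changed variable as stale. The helper names `defs_and_refs` and `cells_to_rerun` are hypothetical and this is far simpler than marimo's actual dataflow analysis; it only illustrates the principle.

```python
import ast


def defs_and_refs(source):
    """Return (names assigned, names read but not assigned) for one cell."""
    defs, refs = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)
            else:
                refs.add(node.id)
    return defs, refs - defs


def cells_to_rerun(cells, changed):
    """List the cells that depend, directly or transitively, on `changed`."""
    info = {name: defs_and_refs(source) for name, source in cells.items()}
    stale, frontier = [], [changed]
    while frontier:
        produced = info[frontier.pop()][0]
        for name, (_, refs) in info.items():
            if name != changed and name not in stale and refs & produced:
                stale.append(name)
                frontier.append(name)
    return stale


cells = {
    "cell 1": "x = 2",
    "cell 2": "y = 1\nprint(x + y)",
}
print(cells_to_rerun(cells, "cell 1"))
# ['cell 2']
```

Editing cell 2 would mark nothing as stale, because no other cell reads `y`.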

Marimo notebooks are executable as a script

Marimo notebook
name = "World"
print(f"Hello {name}!")
Terminal
python notebook.py

Marimo notebooks are executable as a script, with command-line arguments

Marimo notebook
import marimo as mo

name = mo.cli_args().get("name") or ""
print(f"Hello {name}!")
Terminal
python notebook.py --name="LINCS"

Marimo notebooks are git-friendly

import numpy as np
import matplotlib.pyplot as plt

def plot(α):
    fig = plt.figure(figsize=(4, 5))
    xlim, ylim = [-np.pi, np.pi], [0, 10]
    x = np.linspace(*xlim)
    plt.plot(x, np.exp(x * α), label=r'$e^{\alpha x}$')
    plt.xlabel('$x$')
    plt.xlim(xlim)
    plt.ylim(ylim)
    plt.legend()
    plt.show()

α = 1
plot(α)

Marimo notebooks are git-friendly: changing α is a one-line diff

import numpy as np
import matplotlib.pyplot as plt

def plot(α):
    fig = plt.figure(figsize=(4, 5))
    xlim, ylim = [-np.pi, np.pi], [0, 10]
    x = np.linspace(*xlim)
    plt.plot(x, np.exp(x * α), label=r'$e^{\alpha x}$')
    plt.xlabel('$x$')
    plt.xlim(xlim)
    plt.ylim(ylim)
    plt.legend()
    plt.show()

α = 2
plot(α)
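
Because the notebook is stored as plain Python, the change between the two versions above is small and readable. A minimal sketch with the standard library's `difflib`, using an abbreviated copy of the notebook source (the file name is illustrative):

```python
import difflib

before = [
    "import numpy as np",
    "import matplotlib.pyplot as plt",
    "",
    "α = 1",
    "plot(α)",
]
# The second version differs only in the value of α.
after = [line.replace("α = 1", "α = 2") for line in before]

diff = list(
    difflib.unified_diff(
        before, after, fromfile="notebook.py", tofile="notebook.py", lineterm=""
    )
)
print("\n".join(diff))

# Keep only real additions and removals, not the file header lines.
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
```

Here `changed` holds exactly one removed and one added line. An .ipynb file, by contrast, is JSON that also stores outputs and execution counts, so the same edit typically touches more of the file.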

Jupyter notebooks can be git-friendly but require more effort

Reactive UI

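import marimo as mo

# cell: a text input; a bare name on the last line renders the widget
text = mo.ui.text("Hello")
text

# cell: re-runs whenever the text value changes
print(f"You entered: {text.value}")

# cell: a numeric input from 0 to 1 in steps of 0.1
number = mo.ui.number(0, 1, 0.1)
number

# cell: the same range as a slider (new name: marimo rejects redefinitions)
slider = mo.ui.slider(0, 1, 0.1)
slider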

Persistent cache

import time

def my_expensive_function(number):
    time.sleep(9)
    return 42

import marimo as mo

with mo.persistent_cache(name="my_cache"):
    x = my_expensive_function(10)
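
To illustrate the idea of a cache that survives restarts, here is a small standard-library sketch that stores results on disk with `pickle`. The helper `cached_call` and the function `slow_square` are hypothetical; this is not marimo's implementation of `mo.persistent_cache`, only the general principle.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cached_call(cache_dir, func, *args):
    """Return func(*args), reusing a pickled result from cache_dir if present."""
    key = hashlib.sha256(repr((func.__name__, args)).encode()).hexdigest()
    path = Path(cache_dir) / f"{key}.pickle"
    if path.exists():
        return pickle.loads(path.read_bytes())
    result = func(*args)
    path.write_bytes(pickle.dumps(result))
    return result


calls = []  # records every real execution of the function


def slow_square(n):
    calls.append(n)
    return n * n


with tempfile.TemporaryDirectory() as tmp:
    first = cached_call(tmp, slow_square, 12)   # computed and written to disk
    second = cached_call(tmp, slow_square, 12)  # read back from disk

print(first, second, len(calls))
# 144 144 1
```

A real cache also has to decide when stored results are out of date, for example when the function body changes; this sketch keys only on the function name and arguments.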

WebAssembly

Deployment in HuggingFace spaces

Chat is an abstraction

Reproducible notebooks

Terminal
uvx marimo edit --sandbox notebook.py

Conclusions

  • uv is an extremely fast Python package and project manager
  • marimo is a reactive notebook with many nice features
  • Both tools help us move toward more reproducible programs
