Introduction

Welcome to the Data Visualization Techniques Demo!

This project showcases a comprehensive set of data visualization techniques using popular Python libraries:

  • Matplotlib — static plotting and customization
  • Seaborn — statistical visualization with pandas integration
  • Plotly — interactive web-based charts
  • Datashader — rendering very large datasets efficiently

What's Included

  • A Jupyter Notebook (data-visualization-demo.ipynb) with live, runnable examples across all major visualization types.
  • Interactive HTML exports for Plotly charts that work without a running Python kernel.
  • Automated helpers to export the notebook to PDF using nbconvert or headless Chromium.
  • Examples of exporting animations to MP4/GIF and saving static plots as PNG.
  • Comprehensive documentation (this site) for quick reference.

Key Features

  1. Multi-library coverage — Matplotlib for low-level control, Seaborn for quick statistical plots, Plotly for interactivity.
  2. Large dataset handling — Datashader aggregation (with a Matplotlib hexbin fallback) for rendering hundreds of thousands of points.
  3. Export flexibility — Three approaches to PDF generation: LaTeX-based nbconvert, browser print, or headless Chromium.
  4. Animation support — FuncAnimation examples with MP4/GIF export options.
  5. Reproducible environment — Pinned dependencies in requirements.txt for exact reproduction.

Quick Start

  1. Clone or download this repository.
  2. Set up a Python virtual environment and install dependencies (see Installation).
  3. Launch Jupyter and open the notebook (see Running the Notebook).
  4. Run cells to explore visualization techniques and export examples.

For detailed instructions, see Getting Started.

Happy visualizing!

Getting Started

This section covers how to set up your environment and run the notebook.

Prerequisites

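  • Python 3.9 or later (3.10+ recommended). Note that the pinned versions listed under Installation may need a newer interpreter: NumPy 2.3.x, for example, requires Python 3.11+.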
  • pip (usually comes with Python; verify with pip --version)
  • A terminal or command prompt
  • ~2GB of disk space (for venv + dependencies)

System Requirements

The notebook can run on any modern OS (Linux, macOS, Windows). Some optional features require extra system packages (see below).

What You'll Need

  1. Python packages — listed in requirements.txt (matplotlib, seaborn, plotly, jupyter, datashader, etc.).
  2. Optional: LaTeX — for direct PDF export via nbconvert --to pdf. Install TeX Live (Linux) or MacTeX (macOS).
  3. Optional: Chromium/Chrome — for headless HTML->PDF conversion (alternative to LaTeX).
  4. Optional: ffmpeg — for exporting animations to MP4 format.

Installation

Python Packages (Required)

Create a fresh Python environment to isolate this project's dependencies:

cd /path/to/data-visualization-demo
python -m venv .venv

Activate the virtual environment:

  • Linux/macOS:

    source .venv/bin/activate
    
  • Windows (PowerShell):

    .\.venv\Scripts\Activate.ps1
    
  • Windows (Command Prompt):

    .venv\Scripts\activate.bat
    

Install Dependencies

With the virtual environment activated, install all required Python packages:

pip install -r requirements.txt

This installs ~75 packages including:

  • Jupyter Lab / Notebook (latest)
  • Matplotlib 3.10.7 — static plotting
  • Seaborn 0.13.2 — statistical visualization
  • Plotly 6.4.0 — interactive web charts
  • Pandas 2.3.3 — data manipulation
  • NumPy 2.3.4 — numerical computing
  • Datashader 0.18.2 — large dataset aggregation
  • Statsmodels 0.14.5 — regression and statistics
  • And 50+ supporting libraries (see requirements.txt for the full list).

System Packages (Optional)

Depending on which features you want to use, you may need to install additional system-level packages.

LaTeX (for Direct PDF Export)

Required for jupyter nbconvert --to pdf.

Debian/Ubuntu:

sudo apt update
sudo apt install -y texlive-xetex texlive-fonts-recommended texlive-latex-recommended

macOS (with Homebrew):

brew install mactex

Windows: Download and install MiKTeX or TeX Live from their official sites.

Chromium / Google Chrome (for Headless HTML→PDF)

Alternative to LaTeX; recommended for environments where TeX is difficult to install.

Debian/Ubuntu:

sudo apt update
sudo apt install -y chromium-browser

macOS (with Homebrew):

brew install chromium

Windows: Download from Google Chrome or Chromium.

ffmpeg (for Animation Export to MP4)

Debian/Ubuntu:

sudo apt install -y ffmpeg

macOS (with Homebrew):

brew install ffmpeg

Windows: Download from ffmpeg.org or install via Chocolatey:

choco install ffmpeg

Verification

To verify your installation, activate your virtual environment and run:

python -c "import jupyter, matplotlib, seaborn, plotly; print('All core packages imported successfully!')"
jupyter --version

You should see version information for Jupyter and no errors.
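
For a more detailed check, the sketch below reports the installed version of each core package using only the standard library. The package list is an assumption based on the versions named above; adjust it to match your requirements.txt.

```python
from importlib import metadata

# Distribution names as they appear on PyPI (assumed; edit to match requirements.txt).
CORE_PACKAGES = ["matplotlib", "seaborn", "plotly", "pandas", "numpy", "datashader"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(CORE_PACKAGES).items():
    print(f"{name:<12} {version or 'NOT INSTALLED'}")
```

Any line that reads NOT INSTALLED points at a package to reinstall with pip install -r requirements.txt.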

Next: Running the Notebook

Once installed, see Running the Notebook to get started.

Running the Notebook

Prerequisites

Ensure you have:

  1. Installed Python packages (see Installation).
  2. Activated your virtual environment.

Starting Jupyter

From the project directory, activate your environment and launch Jupyter:

jupyter lab

Your browser should open automatically to http://localhost:8888 (or a similar URL with a token).

Using Jupyter Notebook

If you prefer the classic Notebook interface, launch it instead:

jupyter notebook

Opening the Notebook

  1. In the Jupyter file browser, navigate to and click on data-visualization-demo.ipynb.
  2. The notebook will open in a new tab.

Running Cells

  • Run a single cell: Click the cell and press Ctrl+Enter (or Cmd+Enter on macOS).
  • Run all cells: From the menu, select Run → Run All Cells.
  • Run from a specific point: Click a cell and select Run → Run All Below.

Key Sections

The notebook is organized as follows:

  1. Setup — Import libraries and install missing dependencies (run this first).
  2. Load Datasets — Load iris, tips, and flights datasets from Seaborn.
  3. Visualization Examples — Various chart types (line, scatter, histogram, heatmap, etc.).
  4. Interactive Plots — Plotly examples with HTML export fallbacks.
  5. Animations — Matplotlib FuncAnimation with MP4/GIF export.
  6. Exporting to PDF — Multiple approaches for notebook→PDF conversion.

Troubleshooting

"Kernel died" or "Import error"

  • Ensure you activated the virtual environment before launching Jupyter.
  • Try restarting Jupyter and running the Setup cell again.

Plots not showing inline

  • Confirm %matplotlib inline is in the setup cell and has been executed.

Memory or performance issues

  • The datashader example creates a large synthetic dataset. Reduce the mult variable in that cell if your system is resource-constrained.

For more help, see Troubleshooting.

Visualization Techniques

This section describes each visualization technique covered in the notebook.

Overview

The notebook includes examples for:

  1. Line Plots & Time Series — temporal data with rolling averages and confidence bands.
  2. Scatter Plots & Regression — bivariate relationships with regression overlays.
  3. Histograms & KDE — univariate distributions and density estimation.
  4. Box & Violin Plots — comparing distributions across categories.
  5. Heatmaps & Correlations — showing correlation matrices and multivariate patterns.
  6. Interactive Plotly Charts — web-based interactive exploration.
  7. Datashader — efficient rendering of large point clouds.
  8. Animations — time-based or parameter-sweep animations.

Each subsection below provides details on the purpose, use cases, and customization options.

Line Plots & Time Series

Purpose

Line plots are ideal for showing trends over time or continuous sequences. They're particularly effective for:

  • Time series data (stock prices, temperature, passenger counts).
  • Showing changes and trends across ordered categories.
  • Comparing multiple time series on the same axes.

Example in the Notebook

The notebook uses the flights dataset to demonstrate:

  1. Raw monthly passenger counts as a line.
  2. A 12-month rolling average (smoothed trend).
  3. Confidence bands (rolling mean ± standard deviation).

Key Code Snippet

ts = fl.set_index('date')['passengers'].sort_index()
# Use .to_numpy() for compatibility with modern matplotlib type hints
ax.plot(ts.index.to_numpy(), ts.to_numpy(), label='passengers', color='C0')
rolling = ts.rolling(window=12).mean()
ax.plot(rolling.index.to_numpy(), rolling.to_numpy(), label='12-month rolling mean', color='C1')
rolling_std = ts.rolling(window=12).std()
ax.fill_between(ts.index.to_numpy(), (rolling - rolling_std).to_numpy(), (rolling + rolling_std).to_numpy(),
                color='C1', alpha=0.2)

Customization Tips

  • Adjust rolling window: Change window=12 to a different period (e.g., 6 for half-yearly).
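  • Band width: Scale the band as needed (e.g., ±2 rolling standard deviations covers roughly 95% of values if they are approximately normal; this is a spread band, not a confidence interval for the mean).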
  • Multiple series: Plot multiple lines on the same axes with different colors and labels.
  • Markers: Add marker='o' to ax.plot() to show data points.

When to Use

  • Always: for time-indexed data.
  • Consider: when you have many repeated measurements and want to show a trend.
  • Avoid: for categorical data without a natural order (use bar plots instead).

Scatter Plots & Regression

Purpose

Scatter plots reveal relationships between two continuous variables. Adding a regression line or smoothed curve helps identify trends and fit patterns.

Example in the Notebook

The notebook uses the iris dataset to plot:

  1. Sepal length vs. sepal width as a scatter plot, colored and styled by species.
  2. A LOWESS (Locally Weighted Scatterplot Smoothing) regression curve overlay.

Key Code Snippet

sns.scatterplot(data=iris, x='sepal_length', y='sepal_width', hue='species', style='species', ax=ax, s=60)
sns.regplot(data=iris, x='sepal_length', y='sepal_width', scatter=False, ax=ax, 
            color='gray', lowess=True)

Customization Tips

  • Point size: Change s=60 to adjust marker size.
  • Regression method: Use lowess=True for non-parametric smoothing or omit for linear regression.
  • Color palettes: Customize the hue palette with palette='Set2' or others.
  • Transparency: Add alpha=0.6 to scatter points for overlapping data visibility.
  • Faceting: Use sns.relplot(..., col='species') to create subplots per category.

When to Use

  • Always: for exploring correlations between two continuous variables.
  • Consider: adding transparency or jitter when points overlap heavily.
  • Avoid: when you have too many points (>10,000) without aggregation; use hexbin or datashader instead.

Histograms & KDE

Purpose

Histograms and kernel density estimation (KDE) show the distribution of a single continuous variable. They help identify:

  • Central tendency (mode, median).
  • Spread and skewness.
  • Presence of multiple modes or outliers.

Example in the Notebook

The notebook plots the distribution of total bill from the tips dataset:

  • Histogram with 30 bins.
  • Overlaid KDE curve.
  • Normalized to density (area under curve = 1).
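
To see what density normalization means, the sketch below checks it numerically with NumPy alone. The gamma-distributed values are a synthetic stand-in for total_bill, not the real data.

```python
import numpy as np

rng = np.random.default_rng(42)
values = rng.gamma(shape=2.0, scale=10.0, size=244)  # synthetic stand-in for total_bill

# density=True rescales bar heights so that sum(height * width) equals 1.
heights, edges = np.histogram(values, bins=30, density=True)
widths = np.diff(edges)
area = float(np.sum(heights * widths))
print(round(area, 6))
```

This is the same scaling that stat='density' applies, which is why a KDE curve can be overlaid on the bars at a matching scale.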

Key Code Snippet

# Use data= and x= parameters for modern seaborn API
sns.histplot(data=tips, x='total_bill', bins=30, kde=True, stat='density', ax=ax, color='C2')
ax.set_title('Distribution of total bill (tips dataset)')

Customization Tips

  • Bin count: Increase bins=30 for more detail or decrease for a smoother view.
  • Stat type: Use stat='count' for raw counts, stat='density' for normalized density, or stat='percent'.
  • KDE bandwidth: Enable the curve with kde=True and tune its smoothing through kde_kws={'bw_adjust': 0.5} (values below 1 follow the data more closely; values above 1 smooth more).
  • Multiple groups: Use hue='day' to overlay distributions by category.
  • Stacking: Set multiple='stack' or multiple='dodge' for grouped histograms.

When to Use

  • Always: for exploring univariate distributions.
  • Consider: KDE for smooth, continuous estimates.
  • Avoid: histograms for very small samples (< 20 observations).

Box & Violin Plots

Purpose

Box plots and violin plots compare distributions across groups. They highlight:

  • Median and quartiles (box plot).
  • Full distribution shape (violin plot).
  • Outliers and range.

Example in the Notebook

The notebook displays total bill by day using:

  1. A box plot (left panel) — shows median, IQR, whiskers, and outliers.
  2. A violin plot (right panel) — shows the full distribution shape.
  3. An overlaid swarm plot (right panel) — individual data points.

Key Code Snippet

sns.boxplot(data=tips, x='day', y='total_bill', hue='day', dodge=False, ax=axes[0], palette='pastel')
sns.violinplot(data=tips, x='day', y='total_bill', hue='day', split=False, inner=None, ax=axes[1], palette='muted')
sns.swarmplot(data=tips, x='day', y='total_bill', color='k', size=3, ax=axes[1])

Customization Tips

  • Boxplot parts: Show quartiles, mean, or whiskers with showmeans=True, showfliers=True.
  • Violin plot shape: Use split=True to compare two hue groups side-by-side (for 2-level categorical).
  • Inner representation: Set inner='box', inner='quartile', or inner=None (violin only).
  • Point overlay: Add sns.swarmplot() or sns.stripplot() for individual observations.
  • Orient: Use orient='h' for horizontal layout.

When to Use

  • Box plots: quick comparison of medians and spreads across many groups.
  • Violin plots: when distribution shape matters more than individual points.
  • Swarm overlay: when sample size is small (< 50 per group) and you want to see all points.

Heatmaps & Correlations

Purpose

Heatmaps visualize two-dimensional data (matrices) using color intensity. Common uses:

  • Correlation matrices (how variables relate).
  • Confusion matrices (classification performance).
  • Time-series aggregations (e.g., activity by hour and day).

Example in the Notebook

The notebook computes and displays the correlation matrix of iris features:

  • Sepal length, sepal width, petal length, petal width.
  • Uses a diverging color palette (coolwarm) centered at 0.
  • Annotations show exact correlation values.

Key Code Snippet

corr = iris.select_dtypes(include=np.number).corr()
sns.heatmap(corr, annot=True, cmap='coolwarm', center=0, ax=ax)

Customization Tips

  • Color map: Use cmap='viridis' (perceptually uniform), cmap='RdBu_r' (diverging), or cmap='Greens' (sequential).
  • Annotation: Set annot=True to show values, or annot=False to hide.
  • Centering: Use center=0 for diverging colormaps to anchor at a meaningful value.
  • Normalization: Use vmin and vmax to control the color scale range.
  • Clustermap: Use sns.clustermap() to hierarchically cluster rows/columns.

When to Use

  • Always: for visualizing correlations and matrices.
  • Consider: clustering when you want to identify related variables/samples.
  • Avoid: annotated heatmaps larger than about 20×20 cells, where the annotations become unreadable.

Interactive Plots with Plotly

Purpose

Interactive plots enable users to explore data dynamically via hover tooltips, zooming, panning, and legend toggling. Plotly is a popular library for building web-based interactive charts.

Example in the Notebook

The notebook creates two Plotly Express charts:

  1. Scatter plot: total bill vs. tip, colored by day, with hover data (sex, party size).
  2. Line plot: monthly passenger counts over time.

Both are exported to HTML files that can be opened in any web browser.

Key Code Snippet

import plotly.express as px
fig = px.scatter(tips, x='total_bill', y='tip', color='day', 
                 hover_data=['sex', 'size'], title='Tips: total bill vs tip')
fig.write_html('interactive_tips.html', include_plotlyjs='cdn')

Features

  • Hover tooltips: Show data values and custom labels.
  • Zoom & pan: Click and drag to zoom; double-click to reset.
  • Legend toggling: Click legend items to show/hide traces.
  • Download: Camera icon to save as PNG.
  • Export: Save as HTML for sharing or embedding in reports.

Customization Tips

  • Markers: Map marker size to a column with size='size', or set a fixed size with fig.update_traces(marker_size=10).
  • Colors: Use color='day' for categorical or color_continuous_scale='Viridis' for continuous.
  • Faceting: Add facet_row='day' or facet_col='day' for subplots.
  • Annotations: Use fig.add_annotation() to add text or arrows.
  • Custom styling: Modify fig.update_layout() and fig.update_traces().

When to Use

  • Always: for exploratory data analysis and interactive dashboards.
  • Consider: exporting to HTML for stakeholder sharing (no Python environment needed).
  • Avoid: Plotly for very large datasets (>50k points) without aggregation; consider datashader instead.

Datashader for Large Datasets

Purpose

When you have hundreds of thousands or millions of data points, traditional scatter plots become too slow and visually cluttered. Datashader rasterizes and aggregates points into a grid for efficient, artifact-free rendering.

Use Cases

  • Stock tick data (millions of prices per second).
  • Sensor readings from IoT devices.
  • Geographic data points (billions of GPS coordinates).
  • Scientific simulations with dense output.

Example in the Notebook

The notebook demonstrates datashader by:

  1. Creating a synthetic dataset (tips repeated 2000 times with jitter) — ~500k rows.
  2. Aggregating points into a canvas grid.
  3. Shading the grid by point count using a colormap.
  4. Falling back to Matplotlib hexbin if datashader is unavailable.
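
Step 1 can be sketched as follows. The helper name make_big, its jitter parameter, and the stand-in data are assumptions for illustration; the notebook's own cell may differ in detail.

```python
import numpy as np
import pandas as pd


def make_big(df, x_col, y_col, mult=2000, jitter=0.5, seed=0):
    """Repeat the rows `mult` times and add Gaussian jitter to both columns."""
    rng = np.random.default_rng(seed)
    n = len(df) * mult
    x = np.tile(df[x_col].to_numpy(dtype=float), mult) + rng.normal(0, jitter, n)
    y = np.tile(df[y_col].to_numpy(dtype=float), mult) + rng.normal(0, jitter, n)
    return pd.DataFrame({"x": x, "y": y})


# Hypothetical stand-in for the tips dataset (244 rows, like the real one).
rng = np.random.default_rng(1)
base = pd.DataFrame({"total_bill": rng.gamma(2.0, 10.0, 244), "tip": rng.gamma(2.0, 1.5, 244)})

big = make_big(base, "total_bill", "tip", mult=200)  # smaller mult for a quick run
print(len(big))
```

With 244 rows, mult=2000 gives 488,000 points; lowering mult is the simplest way to cut memory use on a constrained machine.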

Key Code Snippet

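import colorcet
import datashader as ds
import datashader.transfer_functions as tf

# big: DataFrame with numeric 'x' and 'y' columns (the jittered synthetic dataset)
cvs = ds.Canvas(plot_width=800, plot_height=400)
agg = cvs.points(big, 'x', 'y', ds.count())  # count points per pixel
img = tf.shade(agg, cmap=colorcet.m_fire, how='eq_hist')  # histogram-equalized shading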

Customization Tips

  • Canvas size: Adjust plot_width and plot_height for resolution.
  • Aggregation: Use ds.count() (default), ds.mean(), ds.sum(), or custom reductions.
  • Colormaps: Choose from the colorcet library (perceptually uniform): m_fire, m_bmy, etc.
  • Normalization: Use how='eq_hist' for histogram equalization or how='linear' for simple scaling.
  • Fallback: The notebook includes a Matplotlib ax.hexbin() fallback if datashader is not installed.
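
The fallback pattern can be sketched with a guarded import. This is an illustration under assumptions (synthetic normal data, arbitrary gridsize and colormap), not the notebook's exact cell.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
big = pd.DataFrame({"x": rng.normal(size=100_000), "y": rng.normal(size=100_000)})

try:
    import datashader as ds
    import datashader.transfer_functions as tf

    cvs = ds.Canvas(plot_width=800, plot_height=400)
    agg = cvs.points(big, "x", "y", ds.count())
    img = tf.shade(agg, how="eq_hist")  # default colormap; pass cmap= to change it
    backend = "datashader"
except ImportError:
    import matplotlib
    matplotlib.use("Agg")  # headless backend; drop this line inside Jupyter
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 4))
    # bins="log" puts the hexagon counts on a logarithmic color scale.
    hb = ax.hexbin(big["x"], big["y"], gridsize=80, bins="log", cmap="inferno")
    fig.colorbar(hb, ax=ax, label="count (log scale)")
    backend = "hexbin"

print(backend)
```

Both branches aggregate points into bins before drawing, so rendering cost depends on the grid size rather than on the number of rows.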

Installation

If datashader is not in your environment:

pip install datashader colorcet

When to Use

  • Always: for datasets with >100k points.
  • Consider: for exploratory analysis of massive datasets.
  • Avoid: if you need individual point interactivity (use aggregated tooltips instead).

Animations

Purpose

Animations bring data to life by showing changes over time or through parameter space. They're useful for:

  • Illustrating temporal evolution.
  • Demonstrating algorithm convergence.
  • Creating engaging presentations or educational content.

Example in the Notebook

The notebook uses Matplotlib's FuncAnimation to create a simple animated sine wave:

  1. The wave travels horizontally as its phase shifts on each frame.
  2. The animation runs for 200 frames at 30 ms per frame.
  3. It's displayed inline as HTML/JavaScript in the notebook.
  4. It's also exported to MP4 (with ffmpeg) or GIF (with Pillow).

Key Code Snippet

import matplotlib.pyplot as plt
import numpy as np
from matplotlib import animation

fig, ax = plt.subplots()
x = np.linspace(0, 2*np.pi, 400)
line, = ax.plot(x, np.sin(x))
ax.set_ylim(-1.5, 1.5)

def init():
    line.set_ydata(np.sin(x))
    return (line,)

def animate(i):
    line.set_ydata(np.sin(x + i/10.0))
    return (line,)

anim = animation.FuncAnimation(fig, animate, init_func=init, frames=200, interval=30, blit=True)

Export Options

  • Inline (Jupyter): Display directly with HTML(anim.to_jshtml()) (no external files).
  • MP4: Requires ffmpeg. Use anim.save('out.mp4', writer='ffmpeg', fps=30).
  • GIF: Requires Pillow. Use anim.save('out.gif', writer='pillow', fps=30).
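
Choosing between MP4 and GIF can be automated by asking Matplotlib which writers are usable. The helper names pick_writer and save_animation below are hypothetical; the sketch assumes an existing FuncAnimation object.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside Jupyter
from matplotlib import animation


def pick_writer():
    """Prefer ffmpeg/MP4 when available, otherwise fall back to Pillow/GIF."""
    if animation.writers.is_available("ffmpeg"):
        return "ffmpeg", ".mp4"
    if animation.writers.is_available("pillow"):
        return "pillow", ".gif"
    raise RuntimeError("No supported animation writer found (need ffmpeg or Pillow).")


def save_animation(anim, stem, fps=30):
    """Save `anim` as <stem>.mp4 or <stem>.gif and return the filename used."""
    writer, suffix = pick_writer()
    filename = f"{stem}{suffix}"
    anim.save(filename, writer=writer, fps=fps)
    return filename
```

Calling save_animation(anim, 'out') then writes out.mp4 when ffmpeg is on PATH and out.gif otherwise.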

Customization Tips

  • Frame count: Increase frames=200 for longer animations.
  • Speed: Adjust interval=30 (milliseconds between frames; smaller = faster).
  • Blit mode: Set blit=True for faster rendering (only redraws changed elements).
  • Complex animations: Write a custom update function that modifies several artists (lines, patches) and returns all of them when blit=True.
  • Interactivity: Combine with Jupyter widgets (ipywidgets) for interactive parameter control.

When to Use

  • Always: for time-series visualizations and pedagogical content.
  • Consider: GIF for social media sharing (MP4 may have compatibility issues on some platforms).
  • Avoid: animations for static reports (use stills or summaries instead).

Exporting & PDF Generation

Overview

Converting a Jupyter notebook to a shareable PDF is a common task. This documentation covers three approaches:

  1. Direct PDF via nbconvert (simplest, requires LaTeX).
  2. HTML export + browser print (universal, manual step).
  3. Automated headless Chromium (modern, no LaTeX).

Comparison

Method         | Ease   | Speed  | Requirements | Notes
nbconvert PDF  | Easy   | Fast   | LaTeX        | Fails if LaTeX missing
HTML + Print   | Medium | Medium | Browser      | Manual step; good UI control
Chromium       | Medium | Fast   | Chromium     | Fully automated; modern alternative to LaTeX

Choosing the Right Method

  • If LaTeX is installed: Use nbconvert for simplicity.
  • If you prefer a UI: Export to HTML and use browser Print dialog.
  • If you want full automation without LaTeX: Use headless Chromium.

All three methods are implemented as helper cells in the notebook for easy access.
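
The decision rules above can be expressed as a short function. This is a sketch only: the function name, the returned labels, and the list of browser executable names are assumptions and may need adjusting for your platform.

```python
import shutil


def choose_export_method(which=shutil.which):
    """Pick a PDF export route based on which executables are on PATH."""
    if which("xelatex"):
        return "nbconvert-pdf"
    for browser in ("chromium", "chromium-browser", "google-chrome"):
        if which(browser):
            return "headless-chromium"
    return "html-browser-print"


print(choose_export_method())
```

The which parameter exists so the logic can be exercised with a fake lookup, without installing LaTeX or a browser.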

Direct PDF via nbconvert

Overview

nbconvert (version 7.16 or later in this project) is Jupyter's official tool for converting notebooks to other formats. The --to pdf option generates a PDF through a LaTeX intermediate, which requires Pandoc and LaTeX.

Prerequisites

  1. Jupyter and nbconvert installed (both in requirements.txt).
  2. LaTeX installed on your system (see Installation).

CLI Usage

From the project directory:

jupyter nbconvert --to pdf data-visualization-demo.ipynb

This creates data-visualization-demo.pdf in the same directory.

Options

  • --output-dir=./output — specify output directory.
  • --no-input — exclude input cells (show only output).
  • --template=classic — use different templates.
  • --execute — re-run all cells before export (slow for large notebooks).

Example:

jupyter nbconvert --to pdf --no-input --output-dir=./pdf-output data-visualization-demo.ipynb
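
When the export runs from a script, the same options can be assembled in Python. The helper names nbconvert_cmd and export are hypothetical; the flags are the ones listed above.

```python
import subprocess


def nbconvert_cmd(notebook, to="pdf", no_input=False, output_dir=None, execute=False):
    """Build the argument list for `jupyter nbconvert` from the options above."""
    cmd = ["jupyter", "nbconvert", "--to", to]
    if no_input:
        cmd.append("--no-input")
    if output_dir is not None:
        cmd.append(f"--output-dir={output_dir}")
    if execute:
        cmd.append("--execute")
    cmd.append(notebook)
    return cmd


def export(notebook, **options):
    """Run nbconvert; raises CalledProcessError if the conversion fails."""
    return subprocess.run(nbconvert_cmd(notebook, **options), check=True)


cmd = nbconvert_cmd("data-visualization-demo.ipynb", no_input=True, output_dir="./pdf-output")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.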

Troubleshooting

Error: "xelatex not found"

  • LaTeX is not installed. See Installation for platform-specific steps.

Error: "PDF was not created"

  • Complex plots or Unicode may cause LaTeX issues. Try simplifying or using alternative fonts.
  • Use --debug flag for verbose output: jupyter nbconvert --debug --to pdf ...

Performance:

  • First run is slow (LaTeX compilation takes ~30s). Subsequent runs are faster.

When to Use

  • Simple, static documents with standard plots.
  • Archival: guaranteed PDF output format (backward compatible).
  • CI/CD pipelines: if LaTeX is already available in your environment.

HTML Export + Browser Print

Overview

This method exports the notebook to HTML (widely supported) and then uses your web browser's Print dialog to save as PDF. It's universal, works everywhere, and gives you control over page setup.

Step 1: Export to HTML

From the project directory:

jupyter nbconvert --to html data-visualization-demo.ipynb

This creates data-visualization-demo.html.

Step 2: Open in a Browser

Open data-visualization-demo.html in your preferred web browser (Chrome, Firefox, Safari, Edge, etc.).

Step 3: Print to PDF

  1. Open Print dialog: Ctrl+P (Windows/Linux) or Cmd+P (macOS).
  2. Configure:
    • Destination: "Save as PDF" (Chrome/Edge) or "Print to File" (Firefox).
    • Margins: Choose "None" to minimize whitespace.
    • Paper size: A4 or Letter as desired.
    • Background graphics: Enable this if you want background colors to print (in Chrome it is usually off by default).
  3. Save: Choose a filename and click "Save."

Advantages

  • No extra software: Works with any browser.
  • Visual control: Adjust margins, headers, footers in the Print dialog.
  • Compatibility: HTML is backward-compatible and future-proof.
  • Responsive: Modern browsers handle layout well.

Troubleshooting

Interactive plots don't appear

  • Plotly and other interactive content render only if JavaScript is enabled in your browser. Check browser console for JS errors.

Headers/footers in printed PDF

  • Adjust in the Print dialog. Some browsers allow custom headers/footers.

Page breaks

  • Long notebooks may split awkwardly. Use CSS (custom notebook template) to control page breaks if needed.

Chromium Headless

Overview

Headless Chromium (or Google Chrome) can render HTML and export to PDF programmatically without a GUI. This is a modern, fast alternative to LaTeX and works reliably across platforms.

Prerequisites

  1. HTML file exported (see HTML Export + Browser Print).
  2. Chromium or Google Chrome installed (see Installation).

CLI Usage

Basic Command

chromium --headless --disable-gpu --print-to-pdf=output.pdf data-visualization-demo.html

Or with Google Chrome:

google-chrome --headless --disable-gpu --print-to-pdf=output.pdf data-visualization-demo.html

This creates output.pdf.

Options

  • --print-to-pdf=<path> — output PDF path (required).
  • --print-to-pdf-margin-top=<mm> — top margin in millimeters.
  • --print-to-pdf-margin-bottom=<mm> — bottom margin.
  • --print-to-pdf-margin-left=<mm> — left margin.
  • --print-to-pdf-margin-right=<mm> — right margin.
  • --print-to-pdf-paper-width=<mm> — paper width (default 210 for A4).
  • --print-to-pdf-paper-height=<mm> — paper height (default 297 for A4).
  • --print-to-pdf-prefer-css-page-size — respect CSS page size if defined.

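Example with custom margins (if your Chromium build does not accept the margin flags, set margins and paper size with a CSS @page rule in the HTML instead):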

chromium --headless --disable-gpu \
  --print-to-pdf=output.pdf \
  --print-to-pdf-margin-top=10 \
  --print-to-pdf-margin-bottom=10 \
  data-visualization-demo.html

Automated in the Notebook

The notebook includes a Python cell that automates this process:

  1. Looks for Chromium/Chrome on your PATH.
  2. Detects the HTML file.
  3. Runs the headless print-to-pdf command.
  4. Reports success or suggests alternatives.
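
A rough sketch of such a helper is shown below. It is not the notebook's actual cell: the function names and the list of candidate executables are assumptions, and step 2 (detecting the HTML file) is replaced by an explicit path argument.

```python
import shutil
import subprocess
from pathlib import Path

# Executable names to try, in order (assumed; extend for your platform).
BROWSER_CANDIDATES = ("chromium", "chromium-browser", "google-chrome", "google-chrome-stable")


def find_browser(which=shutil.which):
    """Return the first Chromium/Chrome executable found on PATH, or None."""
    for name in BROWSER_CANDIDATES:
        path = which(name)
        if path:
            return path
    return None


def pdf_command(browser, html_path, pdf_path):
    """Build the headless print-to-pdf command for an HTML file."""
    return [
        browser,
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={Path(pdf_path).resolve()}",
        Path(html_path).resolve().as_uri(),  # file:// URL for the local page
    ]


def html_to_pdf(html_path, pdf_path):
    """Convert HTML to PDF; return True on success, False if no browser is found."""
    browser = find_browser()
    if browser is None:
        print("Chromium/Chrome not found; try HTML export + browser print instead.")
        return False
    subprocess.run(pdf_command(browser, html_path, pdf_path), check=True)
    return True
```

A call such as html_to_pdf('data-visualization-demo.html', 'output.pdf') then mirrors the basic command shown earlier.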

Advantages

  • No LaTeX: Avoids system dependency complexity.
  • Fast: PDF generation is quick (seconds).
  • Reliable: Chromium's rendering engine is modern and well-tested.
  • Portable: Works on Linux, macOS, Windows.
  • Customizable: Fine control over margins, page size, and other parameters.

Troubleshooting

Error: "Chromium not found"

  • Install Chromium or Chrome (see Installation).
  • Verify it's on your PATH: which chromium or which google-chrome.

Sandbox errors (Docker/WSL)

  • If sandboxing fails, try adding --no-sandbox (be aware of the security implications):
    chromium --headless --no-sandbox --disable-gpu --print-to-pdf=output.pdf file.html
    

PDF is blank

  • Ensure the HTML file is valid and self-contained. Plotly charts with CDN URLs should work.
  • Try opening the HTML in a browser first to confirm it renders.

See Also

  • Chromium docs: headless-mode
  • Supported flags: chromium --help | grep print-to-pdf

Troubleshooting

Common Issues & Solutions

Jupyter / Kernel Issues

"Kernel died" or "Connection lost"

Symptoms: Jupyter stops responding or crashes.

Solutions:

  1. Restart the kernel: Kernel → Restart in the Jupyter menu.
  2. Ensure you activated the virtual environment before launching Jupyter.
  3. Check available system memory; large datashader computations consume RAM.
  4. Update Jupyter: pip install --upgrade jupyter ipykernel.

"ImportError: No module named 'X'" or "ModuleNotFoundError"

Symptoms: A cell fails with ImportError or ModuleNotFoundError for a library.

Solutions:

  1. Install the missing package: pip install <package_name> or pip install -r requirements.txt for full environment.
  2. Ensure the virtual environment is activated: which python should show .venv/bin/python.
  3. Restart the kernel after installing: Kernel → Restart.
  4. Verify the package is in your environment: pip list | grep <package_name>.
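
Because which and grep are not available in every shell, the same check can be run from inside a notebook cell. This minimal sketch relies only on the standard library; the function name in_virtualenv is an illustrative choice.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("virtual environment active:", in_virtualenv())
```

If the interpreter path does not point into .venv, the kernel is using a different Python than the one where the packages were installed.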

Plotting Issues

Plots not showing inline

Symptoms: Code runs but no plot appears in the notebook.

Solutions:

  1. Ensure %matplotlib inline is in the setup cell and has been executed (it's in cell 1).
  2. Verify import matplotlib.pyplot as plt is run before plotting.
  3. Confirm matplotlib backend is set: %matplotlib inline should precede all plotting commands.
  4. For Jupyter Lab, ensure the Lab extensions are up to date: jupyter labextension list.

"No module named 'datashader'"

Symptoms: Datashader cell fails.

Solutions:

  1. Install datashader: pip install datashader colorcet.
  2. Restart the kernel.
  3. The notebook includes a fallback to Matplotlib hexbin.

Plotly charts show as blank boxes

Symptoms: Interactive plots don't render.

Solutions:

  1. Check browser console (F12) for JavaScript errors.
  2. Ensure JavaScript is enabled in your browser.
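  3. HTML exported with include_plotlyjs='cdn' loads plotly.js from a CDN, so it needs an internet connection when opened; pass include_plotlyjs=True to embed the library for offline use.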
  4. Try exporting to HTML: the notebook handles this automatically.

Export Issues

"nbconvert: command not found"

Symptoms: jupyter nbconvert --to pdf ... fails.

Solutions:

  1. Ensure nbconvert is installed: pip install nbconvert.
  2. Verify you're in the virtual environment.
  3. Use the full path: /path/to/.venv/bin/jupyter nbconvert ....

PDF export fails with LaTeX errors

Symptoms: jupyter nbconvert --to pdf produces errors like "xelatex not found."

Solutions:

  1. Install LaTeX (see Installation).
  2. Use alternative export methods (HTML + browser print, or Chromium headless).
  3. Try --debug flag for detailed error messages.

Chromium headless produces blank or corrupted PDF

Symptoms: PDF is blank, garbled, or partially rendered.

Solutions:

  1. Verify the HTML file opens correctly in a browser.
  2. Check for complex CSS or JavaScript that may not be supported.
  3. Try with explicit margins: --print-to-pdf-margin-top=10.
  4. Upgrade Chromium: apt update && apt install chromium-browser (Linux).

Performance Issues

Notebook is very slow

Symptoms: Cells take a long time to execute.

Solutions:

  1. The datashader example creates a large synthetic dataset. Reduce mult=2000 to a smaller value.
  2. Disable inline plotting if you don't need it (omit %matplotlib inline).
  3. Reduce plot resolution or number of points.
  4. Close unused tabs/applications to free system memory.

Animation export is very slow

Symptoms: anim.save() takes minutes or doesn't complete.

Solutions:

  1. Reduce number of frames: frames=100 instead of 200.
  2. Use GIF instead of MP4 (simpler encoding).
  3. Ensure ffmpeg is properly installed and on PATH.

Environment Issues

Virtual environment not activating

Symptoms: source .venv/bin/activate doesn't seem to work.

Solutions:

  1. Verify the venv exists: ls -la .venv/.
  2. Check shell type: echo $SHELL.
  3. For bash: . .venv/bin/activate (dot-space prefix).
  4. For PowerShell (Windows): .\.venv\Scripts\Activate.ps1.
  5. Check permissions: chmod +x .venv/bin/activate.

Different Python versions conflicting

Symptoms: python and python3 point to different versions; notebooks use the wrong interpreter.

Solutions:

  1. Always use the venv Python: which python should show .venv/bin/python.
  2. Create venv with explicit version: python3.10 -m venv .venv.
  3. In Jupyter, select the kernel from the venv: Kernel → Change Kernel.

Getting Help

Still Stuck?

  1. Check error messages carefully; they often suggest the fix.
  2. Search the library's documentation for your specific error.
  3. Search GitHub issues for similar problems.
  4. Consider running a fresh venv and re-installing all packages.

Good luck!