
Turn notebooks into reliable, rerunnable workflows your whole team can trust.

Scope Statement 

Jupyter & Reproducible Workflows is a self-paced course, equivalent to about two days of instruction, designed for analysts, data scientists, and engineers who rely on Jupyter Notebooks for real work, not just experimentation.

Across eight focused modules, students learn how to design, build, and share reproducible workflows using Jupyter. The course walks through reproducibility principles, environment setup, dependency pinning, parameterization, lightweight testing, provenance logging, and minimal Continuous Integration (CI) concepts. 

Through guided exercises, example notebooks, and a practical capstone, learners move from “it works on my machine” to notebooks that teammates can rerun with consistent, verifiable results. 

The course ends with a hands-on, exam-style capstone in which students fix and improve an existing notebook to meet best practices for reproducibility.

 

During this course, you will gain the skills to: 

Design Reproducible Jupyter Workflows 

  • Explain the core elements of a reproducible notebook workflow. 

  • Configure environments so notebooks run consistently across machines. 

Manage Environments & Dependencies 

  • Create isolated Python environments. 

  • Install and pin dependencies. 

  • Export and rebuild environments from requirements files. 
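
As a rough illustration of dependency pinning, the sketch below lists installed packages as exact `name==version` lines, similar in spirit to what `pip freeze` prints. The helper names `installed_versions` and `format_pins` are hypothetical and not part of the course materials.

```python
from importlib import metadata


def installed_versions():
    """Map each installed distribution name to its version string."""
    found = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip installs with missing or broken metadata
            found[name] = dist.version
    return found


def format_pins(versions):
    """Return sorted 'name==version' lines suitable for a requirements file."""
    ordered = sorted(versions.items(), key=lambda item: item[0].lower())
    return [f"{name}=={version}" for name, version in ordered]


if __name__ == "__main__":
    # Print pins for the active environment; redirect to a file to save them.
    print("\n".join(format_pins(installed_versions())))
```

A file written this way can be used to rebuild the environment with `pip install -r requirements.txt` inside a fresh virtual environment.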

Parameterize & Reuse Notebooks 

  • Understand why parameterization matters for repeatable runs. 

  • Add basic parameters to support different inputs and configurations. 

  • Apply Papermill-style concepts to support batch execution and templating. 
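
A minimal sketch of the parameterization idea, assuming a notebook keeps its defaults in one place and accepts overrides per run. Papermill works along similar lines by injecting values after a cell tagged `parameters`. The parameter names and the `resolve_parameters` helper here are hypothetical.

```python
# Default values that would normally live in a single "parameters" cell.
DEFAULTS = {
    "input_path": "data/sample.csv",  # hypothetical input file
    "threshold": 0.5,
    "seed": 42,
}


def resolve_parameters(overrides=None, defaults=DEFAULTS):
    """Merge run-specific overrides into the defaults, rejecting unknown names."""
    overrides = overrides or {}
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    return {**defaults, **overrides}


# One run per configuration, all sharing the same notebook logic.
for run_overrides in ({}, {"threshold": 0.8}, {"seed": 7}):
    params = resolve_parameters(run_overrides)
    print(params)
```

Rejecting unknown names is a design choice: a typo in an override fails loudly instead of silently running with the default.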

Add Lightweight Tests & Guards 

  • Apply simple tests and assertions to catch regressions in workflows. 

  • Move repeated logic into small functions for easier debugging and reuse. 
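
The two habits above can be sketched together: repeated cell logic moves into a small function, and a guard asserts what the rest of the notebook relies on. The function names, the `score` field, and the 0 to 1 range are assumptions chosen for illustration.

```python
def clean_scores(rows):
    """Drop rows with a missing score and return the rest as floats."""
    return [float(row["score"]) for row in rows if row.get("score") is not None]


def check_scores(scores, low=0.0, high=1.0):
    """Guard: fail fast if cleaning produced nothing or values are out of range."""
    assert scores, "no scores left after cleaning"
    assert all(low <= s <= high for s in scores), "score outside expected range"
    return scores


rows = [{"score": "0.25"}, {"score": None}, {"score": 0.9}]
scores = check_scores(clean_scores(rows))
print(scores)
```

If an upstream change starts producing scores on a 0 to 100 scale, the guard stops the run at this cell rather than letting the error surface later in a chart or summary.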

Convert Notebooks for Team Handoff 

  • Convert notebooks to scripts or HTML reports. 

  • Package reusable code into small modules. 

  • Assemble handoff bundles with requirements and clear run instructions. 

Log Provenance & Results 

  • Generate reproducibility logs that capture inputs, parameters, versions, and outputs. 

  • Treat notebooks like code by using version control and clean, rerunnable states. 
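
One possible shape for a reproducibility log, using only the standard library. The field names and the `build_run_log` helper are hypothetical; the course's own log format may differ.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone
from importlib import metadata


def file_sha256(path):
    """Hash a file so later runs can confirm they used identical bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_run_log(parameters, input_paths, output_paths, packages=()):
    """Collect parameters, file hashes, and version info for one run."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # record the gap instead of failing
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "parameters": dict(parameters),
        "inputs": {str(p): file_sha256(p) for p in input_paths},
        "outputs": {str(p): file_sha256(p) for p in output_paths},
        "packages": versions,
    }


def write_run_log(log, path):
    """Save the log as readable JSON next to the run's outputs."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(log, handle, indent=2, sort_keys=True)
```

Calling `build_run_log` in the last cell of a notebook and committing the resulting JSON alongside the outputs gives teammates a record to compare against when they rerun the work.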

Understand Minimal Continuous Integration (CI) Concepts 

  • See how linting, tests, and simple CI workflows can keep notebooks reliable over time. 

  • Connect local quality habits to automated checks used in team environments. 
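
As one example of a local habit that can become an automated check, the sketch below flags notebooks committed with leftover outputs or execution counts. It reads the notebook as plain JSON and assumes the standard version 4 notebook layout (a top-level `cells` list). Treating "clean" as "no stored outputs" is an assumption for illustration, and the function names are hypothetical.

```python
import json
import sys


def find_dirty_cells(notebook):
    """Return indexes of code cells that still carry outputs or execution counts."""
    dirty = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        if cell.get("outputs") or cell.get("execution_count") is not None:
            dirty.append(index)
    return dirty


def check_notebook_file(path):
    """Load a .ipynb file and report its dirty code cells."""
    with open(path, encoding="utf-8") as handle:
        return find_dirty_cells(json.load(handle))


if __name__ == "__main__":
    # Exit non-zero when any notebook passed on the command line is dirty,
    # so a CI job or pre-commit hook can fail the build.
    failures = {p: check_notebook_file(p) for p in sys.argv[1:]}
    failures = {p: cells for p, cells in failures.items() if cells}
    for path, cells in failures.items():
        print(f"{path}: code cells with stored state: {cells}")
    sys.exit(1 if failures else 0)
```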

 

Course Format 

This is a self-paced course designed to be completed in roughly 2 days of focused effort, but learners can move faster or slower as needed. 

  • 8 instructional modules with PPT-based explanations and worked examples 

  • Hands-on Jupyter exercises using provided .ipynb notebooks and project structure 

  • Command-line practice for environment and dependency management 

  • Incremental build-up from hygiene and structure → parameterization → logging → minimal CI 

  • A capstone notebook where students pull all skills together into a single, reproducible workflow 

Modules are designed so learners can pause, repeat, and revisit content as needed, making it ideal for mixed-experience teams. 

 

Examination / Capstone 

Instead of a multiple-choice test, this course uses a hands-on Jupyter-based examination: 

  • Students are given a notebook with deliberate issues and gaps in reproducibility. 

  • The task is to fix errors, improve environment handling, add parameters, add basic tests, and implement reproducibility logging. 

  • Success is measured by whether the improved notebook: 

      • Runs cleanly end-to-end 

      • Produces consistent results 

      • Is clearly documented and reproducible by others 

This mirrors real-world expectations more closely than a traditional written exam. 

 

Requirements 

To get the most from this course, students should have: 

  • Basic familiarity with Python and Jupyter Notebooks (execution, editing cells, basic Python syntax) 

  • Ability to access a command-line environment (OS terminal or Jupyter Terminal) 

  • Ability to install Python packages (e.g., via pip or conda) 

  • A local or hosted Jupyter environment (Anaconda, VS Code, JupyterLab, etc.) 

All exercise notebooks, starter files, and directory structures are provided. 

 

Who Should Attend? 

This course is ideal for: 

  • Data analysts & data scientists working heavily in notebooks 

  • Machine learning engineers and researchers prototyping in Jupyter 

  • Intelligence or mission analysts using notebooks for repeatable workflows 

  • Developers transitioning from ad-hoc notebooks to team-ready code 

  • Any technical professional tired of “it used to work, I don’t know what changed” 

If you share notebooks with others, or expect to, this course will teach you how to make them trustworthy, reusable, and production-aware. 

 

Why This Course Matters 

Jupyter Notebooks are powerful, but without discipline they quickly become: 

  • Hard to rerun 

  • Fragile to environment changes 

  • Nearly impossible for teammates to reuse 

This course helps you: 

  • Turn exploratory work into reliable workflows 

  • Avoid hidden environment “gotchas” 

  • Make your future self (and your teammates) grateful for clean structure and logs 

  • Lay the groundwork for team-ready, production-conscious notebook practices 

Reproducibility isn’t just good hygiene—it’s how your work scales beyond a single machine and a single analyst.