Jupyter Project
Because Python is a multipurpose programming language, its many applications and use cases introduce people to specialized ecosystems and projects the community has built to enhance the development experience for specific workflows. In the data science and machine learning scene, the Jupyter Project stands out as one of the most influential tools supporting this kind of work.
The Jupyter Project is an open-source initiative that defines standards and interfaces for interactive computing across multiple programming languages. It offers web-based tools, including notebooks, that combine code, data, and rich media in a single, interactive environment. While widely used in data science, machine learning, and scientific computing, Jupyter is not limited to these fields. It enables a unified workflow where users can write and run code alongside notes, equations, and visualizations, all in one place.
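To make the notebook idea concrete: a notebook is ultimately a JSON document made up of cells. As a rough sketch, assuming the `nbformat` package that ships with a standard Jupyter installation, one can be built programmatically (the file name below is illustrative):

```python
# Build a two-cell notebook and write it to disk with nbformat.
import nbformat as nbf

nb = nbf.v4.new_notebook()
nb.cells = [
    nbf.v4.new_markdown_cell("# Notes and equations live next to code"),
    nbf.v4.new_code_cell("print('hello from a code cell')"),
]
nbf.write(nb, "example.ipynb")
```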
IPython
The origins of the Jupyter Project are closely tied to IPython (Interactive Python), now an independent program. From 2011 to 2014, Jupyter and IPython were developed as a single, monolithic project before they eventually split to serve distinct purposes.
Interpreted languages such as Ruby and JavaScript provide REPLs (Read-Eval-Print Loops) like `irb` and the browser console, respectively; these tools enable rapid code testing and iteration. Python is no exception. Running the `python` command in a shell launches an interactive environment for executing Python code line by line, enabling a fast feedback loop for experimentation and learning. These REPL experiences vary between languages and implementations, each offering different levels of interactivity and tooling.
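As a point of reference, this is roughly what the default Python REPL looks like when started with the `python` command (a short illustrative session):

```python
>>> 2 + 2            # each line is read, evaluated, and its result printed
4
>>> "repl".upper()
'REPL'
```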
IPython goes beyond the default Python REPL implementation by offering a more powerful interactive environment. Key features include:
- Embedded syntax highlighting and auto-completion.
- Magic commands using `%` and `%%` syntax for tasks like timing code, running scripts, or managing the environment.
- Automatic storage of output values, allowing reuse of results via special variables like `_`, `__`, and numbered outputs (`_1`, `_2`, ...).
- Command history navigation across sessions.
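To give a feel for these features, here is a minimal sketch of a fresh IPython session (the lines assume they are typed into `ipython` rather than the plain `python` REPL):

```python
%timeit sum(range(1_000))   # line magic (%): reports average execution time

21 * 2                      # displayed and stored as Out[2] in this session
_ + 1                       # `_` holds the most recent output, so this is 43
_2 * 10                     # `_2` refers back to Out[2], so this is 420
```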
IPython can be used as a standalone shell command, just as Jupyter can be used without IPython. However, the two remain connected—IPython is the reference Python kernel used within Jupyter for executing Python code inside notebooks and other Jupyter interfaces.
Kernel
Borrowing the term from operating systems, the Jupyter ecosystem uses kernel to describe a different kind of bridge: one that connects an independent programming-language process with Jupyter clients, e.g. user interfaces like the Notebook. Using a well-defined messaging protocol, this design enables Jupyter to support multiple languages through interchangeable kernels while keeping the user experience consistent.
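As a rough illustration of that bridge, assuming the `jupyter_client` and `ipykernel` packages are installed, a client can launch a kernel as a separate process and exchange messages with it:

```python
# Start a Python kernel in its own process and talk to it over the
# Jupyter messaging protocol (ZeroMQ under the hood).
from jupyter_client.manager import start_new_kernel

km, kc = start_new_kernel(kernel_name="python3")  # kernel manager + client
kc.execute_interactive("print(2 + 2)")            # request/reply round trip
kc.stop_channels()
km.shutdown_kernel()
```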
Different programming languages, as well as different versions of the same language, can appear as kernels within a single Jupyter client. Note that JupyterLab and Jupyter Notebook are Python applications, meaning that launching them starts a Python process. That Python interpreter may also act as a kernel, but it does not have to.
The Jupyter Stack can be installed as a standalone tool, independent of the kernels it manages. This separation is encouraged: it allows a single Jupyter installation to serve multiple, isolated environments without duplicating Jupyter for each one. This is especially useful in centralized setups (e.g., IDEs or servers) where Jupyter manages and connects to various kernels across different environments.
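For example, the kernels a given Jupyter installation knows about can be inspected with `jupyter_client`; a small sketch, with output depending on which kernels have been registered:

```python
# List the kernel specs this Jupyter installation can launch.
from jupyter_client.kernelspec import KernelSpecManager

for name, info in KernelSpecManager().get_all_specs().items():
    spec = info["spec"]
    print(f"{name}: {spec['display_name']} ({spec['language']})")
```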
Integrated Workflows
While Jupyter can be run as a standalone application with its own user interface, it is also highly modular. Its UI can be replaced or extended—this is exactly what tools like VSCode’s Jupyter extension, Google Colab, and similar solutions do.
Jupyter’s architecture is particularly well-suited for a client–server model. The server handles code execution and computation, while the user interacts with Jupyter through a web browser—effectively abstracting away the underlying infrastructure.
This separation of interface and compute enables powerful, scalable setups—such as delegating workloads to cloud resources. Platforms like Amazon SageMaker, Azure Machine Learning Studio, Databricks Notebooks and others leverage this model to provide a seamless experience for running complex workflows with specialized computing needs.