uv Dependency Groups: PEP 735 vs Poetry

Every Python team that has lived through a Poetry-to-something-else migration knows the specific friction this article addresses. You have a pyproject.toml with [tool.poetry.dependencies] and a [tool.poetry.group.dev.dependencies] block, a poetry.lock that takes forty seconds to resolve, and a CI pipeline that installs the entire dev group just to run ruff on a pull request. Then uv lands, PEP 735 standardises dependency groups outside any single tool's namespace, and suddenly the question is no longer "is uv faster" but "can I express the same dependency topology in a tool-agnostic way without losing the determinism Poetry gave me." The honest answer requires actually doing the migration on a real project, lockfile and all, instead of reading another benchmark thread.

This walkthrough builds that project end to end. You will scaffold a minimal Python package, declare its runtime and dev dependencies the Poetry way, then convert the dev block to a PEP 735 [dependency-groups] table that uv consumes natively, with test, lint, and docs groups composed via include-group so CI jobs can install exactly the slice they need. You will generate both poetry.lock and uv.lock from the same inputs and diff them for resolver determinism, wire a GitHub Actions matrix that installs individual groups per job, and finish with a migration script plus a verification checklist that confirms the converted project still resolves to the same versions. The stack stays small on purpose: Python 3.12, uv, Poetry 1.8 for the before-state, pytest, ruff, and mkdocs as representative group members.

This is written for backend and platform engineers maintaining a Poetry-managed library or service who want to evaluate PEP 735 without committing to a one-way migration. By the end you will have a reproducible repo you can point teammates at, a concrete sense of where uv groups beat Poetry groups and where they do not, and a migration recipe you can apply to a production pyproject.toml the same afternoon.

Step 1: Anchoring the Project to a Single pyproject.toml

Before we can argue about Poetry's [tool.poetry.group.*] tables versus the PEP 735 [dependency-groups] table, we need a project that has neither. This step builds the smallest possible Python package — one module, one importable function, one passing pytest run — whose entire identity lives in a single pyproject.toml. Every later step in the comparison will mutate this file in two different directions, so the cleaner the baseline, the sharper the diff will be.

The goal here is not to write interesting code. The goal is to make sure that when we add Poetry and uv into the mix later, the only thing that changed is how dependency groups are declared: not the package layout, not the build backend, not the test runner.

Setup

We create a src/-layout package called groupcompare, a minimal pyproject.toml driven by Hatchling, and a tests/ folder for pytest. No runtime dependencies. No dev dependencies yet. The file tree after this step is:

codebase/
├── pyproject.toml
├── README.md
├── src/
│   └── groupcompare/
│       ├── __init__.py
│       └── core.py
└── tests/
    ├── __init__.py
    └── test_scaffold.py

The src/ layout is deliberate. It prevents the source tree from being importable purely because pytest happened to start in the project root: the package resolves only through an explicit route, either an installed copy or the pythonpath = ["src"] entry in the pytest configuration below. Either way the import path is declared rather than accidental, which makes any later wiring of dependency groups behave honestly.
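One way to confirm which copy of a package a test run picks up is to ask the import system where the module would come from. The following is a minimal sketch, not part of the article's repository: import_origin is a hypothetical helper, and json stands in for groupcompare so the snippet runs on any interpreter.

```python
from __future__ import annotations

import importlib.util
from pathlib import Path
from typing import Optional


def import_origin(module_name: str) -> Optional[Path]:
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin)


# "json" is a stand-in; in the project you would pass "groupcompare" and
# check whether the path points into src/ or into site-packages.
print(import_origin("json"))
print(import_origin("module_that_should_not_exist_xyz"))  # → None
```

Calling it from inside a test makes it visible whether the suite exercises the source tree or an installed build.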

Implementation

First, the pyproject.toml. Notice that [project] carries the PEP 621 metadata (name, version, Python floor, classifiers) while [build-system] pins Hatchling. There is no [tool.poetry] table and no [dependency-groups] table — those land in the next steps.

[project]
name = "groupcompare"
version = "0.1.0"
description = "Minimal Python package used to compare Poetry's [tool.poetry.group.*] layout against PEP 735 [dependency-groups] under uv."
readme = "README.md"
requires-python = ">=3.9"
license = { text = "MIT" }
authors = [
    { name = "vytharion", email = "vytharion@users.noreply.github.com" },
]
dependencies = []

[build-system]
requires = ["hatchling>=1.18"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/groupcompare"]

[tool.pytest.ini_options]
minversion = "7.0"
pythonpath = ["src"]
testpaths = ["tests"]
addopts = ["-ra", "--strict-markers", "--strict-config"]

Two design choices are worth flagging. The empty dependencies = [] list is intentional: we want the upcoming --group semantics to be the only moving part later. And the pytest.ini_options table lives inside pyproject.toml rather than in a separate pytest.ini, because we want a single configuration file that stays the same whether Poetry or uv manages the environment.

Next, the package itself. core.py exposes one function, describe_layout, which returns the TOML table path each tool uses to declare dependency groups. It is small on purpose — just enough behaviour to test against, and a placeholder we will extend in later steps.

from __future__ import annotations

from typing import Mapping

_KNOWN_LAYOUTS: Mapping[str, str] = {
    "poetry": "[tool.poetry.group.<name>.dependencies]",
    "pep735": "[dependency-groups]",
}


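def describe_layout(tool: str) -> str:
    """Return the TOML table path the given tool uses for dependency groups."""
    # Normalise so that "  Poetry  " and "poetry" resolve to the same key.
    key = tool.strip().lower()
    if key not in _KNOWN_LAYOUTS:
        raise ValueError(f"unknown layout: {tool!r}")
    return _KNOWN_LAYOUTS[key]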

The shape of describe_layout matches the rule in the codebase guide: a single guard clause that raises early, then the happy path. There is no nested branching, and there is no try/except around the dict lookup. We let ValueError carry the bad-input case, which keeps the control flow flat.

The __init__.py re-exports the function and a version string so consumers (and tests) can write from groupcompare import describe_layout without reaching into private modules.

from groupcompare.core import describe_layout

__all__ = ["__version__", "describe_layout"]

__version__ = "0.1.0"

Finally, the test file pins both the public API and a few behavioural invariants — that the version is exposed, that each known tool returns the expected TOML path, that input is case-insensitive, and that unknown tools raise ValueError. These tests become the safety net the rest of the article leans on whenever we mutate pyproject.toml.

import pytest

import groupcompare
from groupcompare import describe_layout


def test_package_exposes_version():
    assert groupcompare.__version__ == "0.1.0"


def test_describe_layout_returns_poetry_table_path():
    assert describe_layout("poetry") == "[tool.poetry.group.<name>.dependencies]"


def test_describe_layout_returns_pep735_table_path():
    assert describe_layout("pep735") == "[dependency-groups]"


def test_describe_layout_is_case_insensitive():
    assert describe_layout("  Poetry  ") == "[tool.poetry.group.<name>.dependencies]"


def test_describe_layout_rejects_unknown_tool():
    with pytest.raises(ValueError, match="unknown layout"):
        describe_layout("pipenv")

Verification

Run the test suite from the codebase/ directory:

python -m pytest

============================= test session starts ==============================
platform darwin -- Python 3.9.6, pytest-8.4.2, pluggy-1.6.0
rootdir: codebase
configfile: pyproject.toml
testpaths: tests
collected 5 items

tests/test_scaffold.py .....                                             [100%]

============================== 5 passed in 0.03s ===============================

Five green dots is the only signal we need at this stage. The harness is alive, the src/ layout is importable, and the configuration in pyproject.toml is being honoured by pytest.

What we built

We now have a Python project whose entire identity — name, version, Python floor, build backend, test runner configuration — is declared in one file. There is no setup.py, no setup.cfg, no pytest.ini, no tool.poetry table. That single-surface property is what makes the upcoming comparison meaningful.

We also have a tiny public API, describe_layout, plus a five-test suite that locks down its behaviour. Those tests are not throwaway — every later step keeps them green, which means any breakage caused by introducing Poetry groups or PEP 735 groups gets caught before it can be hidden by configuration changes.

The deliberate gap in this step is dependency declaration. dependencies = [] is empty, and there is no concept yet of "dev dependencies" or "lint dependencies" or "test dependencies". That gap is exactly the seam the next two steps will slice along: first Poetry's [tool.poetry.group.*] tables, then PEP 735's [dependency-groups] table, each grafted onto this same baseline.

With the baseline locked, the rest of the article becomes a controlled experiment. The independent variable is the dependency-group dialect; everything else — package layout, test runner, build backend — stays still.

Repository

The state of the code after this step: ebaee65

Step 2: Grafting Poetry's Group Tables onto pyproject.toml

Step 1 left us with a project whose entire identity sat in PEP 621 metadata — no [tool.poetry] table, no dependency groups, no dev tooling pinned anywhere. That was deliberate: the comparison only works if Poetry's tables and PEP 735's [dependency-groups] table arrive separately and visibly. This step adds the Poetry half. We graft [tool.poetry], [tool.poetry.dependencies], and [tool.poetry.group.dev.dependencies] onto the existing pyproject.toml, then write a tiny reader so pytest can assert that the tables landed exactly as Poetry expects them.

The point of writing a reader instead of trusting Poetry's own tooling is leverage. Step 3 will mirror this same data into the PEP 735 [dependency-groups] table, and we want the test suite — not Poetry, not uv — to be the arbiter of "the two layouts say the same thing". A small TOML reader keeps that contract honest.

Setup

We keep the src/ layout from step 1 and add one new module plus one new test file. Nothing about the build backend, the package layout, or the test runner moves. The only edits are:

codebase/pyproject.toml — add a runtime dependency, a [tool.poetry] table, a [tool.poetry.dependencies] table, and a [tool.poetry.group.dev.dependencies] table.
codebase/src/groupcompare/poetry_layout.py — new module exposing four read-only helpers.
codebase/src/groupcompare/__init__.py — re-export the new helpers next to describe_layout.
codebase/tests/test_poetry_layout.py — new test file that pins the shape of the Poetry tables.

The file tree after this step:

codebase/
├── pyproject.toml
├── README.md
├── src/
│   └── groupcompare/
│       ├── __init__.py
│       ├── core.py
│       └── poetry_layout.py
└── tests/
    ├── __init__.py
    ├── test_scaffold.py
    └── test_poetry_layout.py

We also need a way to parse TOML on Python 3.9 and 3.10, which do not ship tomllib in the standard library. The honest fix is to add tomli as a runtime dependency — but only under an environment marker, so 3.11+ installs do not drag in a useless package.

Implementation

The first edit is pyproject.toml. Below the existing [project] block we keep the PEP 621 view, then we mirror the same metadata into a [tool.poetry] table because Poetry 1.8 insists on its own copy of name, version, authors, and packages. Notice that the runtime dependencies appear twice, once in [project] (PEP 621) and once in [tool.poetry.dependencies] (Poetry's dialect). That is the kind of tool-specific duplication that standardised metadata is meant to remove: PEP 621 for runtime dependencies, PEP 735 for groups. Step 3 adds the PEP 735 side next to the Poetry tables so the two can be compared.

dependencies = [
    "tomli>=2.0; python_version < '3.11'",
]

[tool.poetry]
name = "groupcompare"
version = "0.1.0"
description = "Minimal Python package used to compare Poetry's [tool.poetry.group.*] layout against PEP 735 [dependency-groups] under uv."
authors = ["vytharion <vytharion@users.noreply.github.com>"]
readme = "README.md"
packages = [{ include = "groupcompare", from = "src" }]

[tool.poetry.dependencies]
python = "^3.9"
tomli = { version = "^2.0", python = "<3.11" }

[tool.poetry.group.dev.dependencies]
pytest = "^8.0"
ruff = "^0.6"
mypy = "^1.10"

Three Poetry idioms are worth pointing at. The python = "^3.9" line is Poetry's way of declaring the supported interpreter range (the caret expands to >=3.9,<4.0); it is a dependency, not a project-level field, which is one of the spots where Poetry diverges from PEP 621. The tomli = { version = "^2.0", python = "<3.11" } inline table is Poetry's syntax for environment markers; the strictly equivalent PEP 508 requirement would be tomli>=2.0,<3.0; python_version < '3.11', so the [project] entry above, which omits the upper bound, is slightly looser than Poetry's. And the dev tooling lives under [tool.poetry.group.dev.dependencies], which is the table whose ergonomics PEP 735 is reacting to.
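The translation from Poetry's dialect to a PEP 508 string is mechanical enough to sketch. The helpers below, caret_to_range and poetry_spec_to_pep508, are hypothetical names that are not part of the article's codebase. They assume plain numeric versions, handle only caret constraints and a single-operator python key, and ignore everything else Poetry's syntax allows.

```python
from __future__ import annotations

import re
from typing import Any, Mapping, Union

# Assumption: the "python" key holds exactly one operator and one version.
_PY_RE = re.compile(r"^\s*(<=|>=|==|!=|<|>)\s*([0-9.]+)\s*$")


def caret_to_range(constraint: str) -> str:
    """Expand a caret constraint such as "^8.0" into ">=8.0,<9.0"."""
    if not constraint.startswith("^"):
        raise ValueError(f"not a caret constraint: {constraint!r}")
    version = constraint[1:]
    parts = [int(piece) for piece in version.split(".")]
    # Bump the leftmost non-zero component; if all are zero, bump the last.
    idx = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = parts[:idx] + [parts[idx] + 1] + [0] * (len(parts) - idx - 1)
    return f">={version},<{'.'.join(str(p) for p in upper)}"


def poetry_spec_to_pep508(name: str, spec: Union[str, Mapping[str, Any]]) -> str:
    """Render a Poetry dependency entry as a PEP 508 requirement string."""
    if isinstance(spec, str):
        return f"{name}{caret_to_range(spec)}"
    requirement = f"{name}{caret_to_range(spec['version'])}"
    python = spec.get("python")
    if python is None:
        return requirement
    match = _PY_RE.match(python)
    if match is None:
        raise ValueError(f"unsupported python constraint: {python!r}")
    operator, version = match.groups()
    return f"{requirement}; python_version {operator} '{version}'"


print(poetry_spec_to_pep508("pytest", "^8.0"))
# → pytest>=8.0,<9.0
print(poetry_spec_to_pep508("tomli", {"version": "^2.0", "python": "<3.11"}))
# → tomli>=2.0,<3.0; python_version < '3.11'
```

Note how the caret on a 0.x version narrows the window: caret_to_range("^0.6") yields ">=0.6,<0.7", not "<1.0".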

Next, the reader. poetry_layout.py is a thin wrapper over tomllib/tomli that exposes four helpers: runtime_dependencies, python_constraint, dev_dependencies, and declared_groups. The shape is intentionally boring — pure functions, no class, no I/O outside one private loader.

from __future__ import annotations

import sys
from pathlib import Path
from typing import Any, Dict, List, Mapping, Optional

if sys.version_info >= (3, 11):
    import tomllib as _toml
else:
    import tomli as _toml


PYPROJECT_PATH = Path(__file__).resolve().parents[2] / "pyproject.toml"


def _load(path: Optional[Path]) -> Mapping[str, Any]:
    target = path or PYPROJECT_PATH
    with target.open("rb") as handle:
        return _toml.load(handle)


def _poetry_table(path: Optional[Path]) -> Mapping[str, Any]:
    return _load(path).get("tool", {}).get("poetry", {})

The single version check at the top is the only branch in the module that touches Python versions. Everything below speaks one TOML dialect. The Optional[Path] argument on every helper is what lets the tests drive the reader against fixture files later, without monkey-patching globals.

def runtime_dependencies(path: Optional[Path] = None) -> Dict[str, Any]:
    """Return [tool.poetry.dependencies] without the interpreter entry."""
    deps = dict(_poetry_table(path).get("dependencies", {}))
    deps.pop("python", None)
    return deps


def python_constraint(path: Optional[Path] = None) -> Optional[str]:
    """Return Poetry's interpreter constraint, or None if it is not declared."""
    return _poetry_table(path).get("dependencies", {}).get("python")


def dev_dependencies(path: Optional[Path] = None) -> Dict[str, Any]:
    """Return the contents of [tool.poetry.group.dev.dependencies]."""
    groups = _poetry_table(path).get("group", {})
    return dict(groups.get("dev", {}).get("dependencies", {}))


def declared_groups(path: Optional[Path] = None) -> List[str]:
    """Return the sorted names of every Poetry group declared in the file."""
    return sorted(_poetry_table(path).get("group", {}).keys())

runtime_dependencies strips the python key because Poetry overloads [tool.poetry.dependencies] to carry both the interpreter floor and real packages. Step 3 will need to compare runtime dependencies across dialects, and PEP 735 does not mix the interpreter into the same table — so we normalise here, once, and let the tests pin the normalised shape.
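To illustrate why a path argument is useful, here is a minimal sketch that points a reader of the same shape at a throwaway fixture file. It is self-contained rather than importing groupcompare, load_poetry_table is a hypothetical stand-in for the private loader, and the requests entry in the fixture is an invented example dependency.

```python
from __future__ import annotations

import sys
import tempfile
from pathlib import Path
from typing import Any, Dict, Mapping

if sys.version_info >= (3, 11):
    import tomllib as _toml
else:
    import tomli as _toml

# Invented fixture content: one interpreter entry, one real package.
FIXTURE = """
[tool.poetry.dependencies]
python = "^3.9"
requests = "^2.31"
"""


def load_poetry_table(path: Path) -> Mapping[str, Any]:
    """Read the [tool.poetry] table from the given pyproject file."""
    with path.open("rb") as handle:
        return _toml.load(handle).get("tool", {}).get("poetry", {})


def runtime_dependencies(path: Path) -> Dict[str, Any]:
    """Same normalisation as the article's helper: drop the python key."""
    deps = dict(load_poetry_table(path).get("dependencies", {}))
    deps.pop("python", None)
    return deps


with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "pyproject.toml"
    fixture.write_text(FIXTURE, encoding="utf-8")
    result = runtime_dependencies(fixture)

print(result)  # → {'requests': '^2.31'}
```

Because the reader never touches a global path here, the same pattern works with pytest's tmp_path fixture without any patching.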

The __init__.py re-exports the four helpers so consumers (and tests) can write from groupcompare import runtime_dependencies without reaching into a private module path. Keeping the public surface flat means later steps can swap the implementation without touching the import sites.

from groupcompare.core import describe_layout
from groupcompare.poetry_layout import (
    declared_groups,
    dev_dependencies,
    python_constraint,
    runtime_dependencies,
)

__all__ = [
    "__version__",
    "declared_groups",
    "describe_layout",
    "dev_dependencies",
    "python_constraint",
    "runtime_dependencies",
]

__version__ = "0.1.0"

Finally, the test file pins six invariants: tomli is declared as a runtime dependency, the interpreter floor is not leaking into the runtime set, the floor itself is ^3.9, the dev group pins pytest = "^8.0", only the dev group is declared, and the tomli spec carries the python = "<3.11" environment marker as a Poetry inline table.

from groupcompare import (
    declared_groups,
    dev_dependencies,
    python_constraint,
    runtime_dependencies,
)


def test_runtime_dependencies_include_tomli():
    deps = runtime_dependencies()
    assert "tomli" in deps


def test_runtime_dependencies_strip_python_constraint():
    deps = runtime_dependencies()
    assert "python" not in deps


def test_python_constraint_matches_project_floor():
    assert python_constraint() == "^3.9"


def test_dev_group_pins_pytest():
    deps = dev_dependencies()
    assert "pytest" in deps
    assert deps["pytest"] == "^8.0"


def test_declared_groups_contains_only_dev_for_now():
    assert declared_groups() == ["dev"]


def test_tomli_constraint_is_environment_marked():
    deps = runtime_dependencies()
    tomli_spec = deps["tomli"]
    assert isinstance(tomli_spec, dict)
    assert tomli_spec["python"] == "<3.11"
    assert tomli_spec["version"] == "^2.0"

These six assertions are what stops a careless edit in step 3 from quietly demoting tomli, dropping the environment marker, or smuggling in a second group before we have decided what its PEP 735 counterpart looks like.

Verification

Run the full suite from the codebase/ directory — step 1's five scaffold tests must stay green alongside the six new Poetry-layout tests:

python -m pytest

============================= test session starts ==============================
platform darwin -- Python 3.9.6, pytest-8.4.2, pluggy-1.6.0
rootdir: codebase
configfile: pyproject.toml
testpaths: tests
collected 11 items

tests/test_poetry_layout.py ......                                       [ 54%]
tests/test_scaffold.py .....                                             [100%]

============================== 11 passed in 0.04s ==============================

Eleven passing tests is the signal. Six of them are the new Poetry-layout assertions; the other five are the scaffold tests from step 1, which is exactly the regression net we wanted — adding Poetry tables did not perturb the baseline behaviour.

What we built

We now have a pyproject.toml that speaks two dialects at once: PEP 621 metadata for the build backend and Poetry's nested [tool.poetry.*] tables for dependency grouping. That duplication is not a mistake — it is the precise pain point the rest of the article is going to dissect.

The poetry_layout module is the lever that makes the next step writable. By turning Poetry's TOML tables into four boring Python functions, we have decoupled "what does Poetry say about dependencies" from "how is Poetry's CLI shaped". Step 3 will introduce a sibling module for PEP 735 and ask the test suite to prove the two views agree.

Six new tests pin the shape of the Poetry tables — runtime dependencies, the interpreter constraint, the dev group, the environment marker on tomli, and the fact that no other group has snuck in. Those assertions are deliberately rigid. Any later edit that drifts away from "Poetry says X" will fail loudly instead of silently re-shaping the comparison.

The remaining gap is the PEP 735 side. There is no [dependency-groups] table yet, no PEP 735 reader, and no cross-dialect assertion. That is precisely the seam step 3 will cut along.

Repository

The state of the code after this step: 47842e1

Step 3: Promoting the Dev Tooling into a PEP 735 [dependency-groups] Table for uv

Step 2 finished with a pyproject.toml that spoke Poetry fluently: [tool.poetry], [tool.poetry.dependencies], and [tool.poetry.group.dev.dependencies] all landed, and six pytest assertions pinned their shape. What it still lacked was the PEP 735 dialect that uv consumes natively — a top-level [dependency-groups] table standardised so any conformant resolver can read the same group without reaching into a tool-specific namespace. This step adds that table, mirrors the same three dev tools into it, ships a sibling reader, and gates the equivalence with a cross-dialect test.

The crux of this step is the equivalence test, not the table. Anyone can paste three lines into a TOML file; the hard problem is keeping two declarations of the same dev group from drifting once both are checked in. We push the answer to that question into the pytest suite, so any later commit that touches one table without touching the other fails before it can rot the comparison the rest of the article will rest on.

Setup

The src/ layout, the Hatchling build backend, and every byte of Poetry metadata from step 2 stay put. All four edits in this step are purely additive — we are layering a second dialect on top of the first, not rewriting either.

codebase/pyproject.toml — append a top-level [dependency-groups] table that lists dev as a PEP 508 array.
codebase/src/groupcompare/pep735_layout.py — new module with four read-only helpers plus a regex-based PEP 508 name extractor.
codebase/src/groupcompare/__init__.py — re-export the PEP 735 helpers alongside the Poetry helpers from step 2.
codebase/tests/test_pep735_layout.py — eight new assertions, one of which spans both dialects.

After this step the package tree gains exactly one source module and one test file:

codebase/
├── pyproject.toml
├── README.md
├── src/
│   └── groupcompare/
│       ├── __init__.py
│       ├── core.py
│       ├── pep735_layout.py
│       └── poetry_layout.py
└── tests/
    ├── __init__.py
    ├── test_pep735_layout.py
    ├── test_poetry_layout.py
    └── test_scaffold.py

No new third-party dependency is required. The tomli fallback we added in step 2 for Python 3.9 and 3.10 already covers both readers — PEP 735 lives in the same pyproject.toml and uses no syntax that vanilla TOML cannot already decode.

Implementation

The first edit lands in pyproject.toml. PEP 735 reserves the dependency-groups key at the top level of pyproject.toml, not under [tool.something], and that flat placement is one of the surfaces uv reads directly. We mirror the three tools we declared under Poetry's nested table, keeping the version intent the same but rewriting it in plain PEP 508 syntax, because the requirement strings PEP 735 uses have no caret operator.

[dependency-groups]
dev = [
    "pytest>=8.0,<9.0",
    "ruff>=0.6,<0.7",
    "mypy>=1.10,<2.0",
]

Two details about the table are worth being explicit about. The values are bare strings, not inline tables — so a plain tomllib.load decodes the section as dict[str, list[str]] with no dialect-aware glue. And every version range carries an explicit upper bound rewritten by hand, because where Poetry's ^8.0 quietly expands to >=8.0,<9.0, PEP 735 leaves that algebra to the human and ships exactly the characters you typed to every consumer.
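Both details can be checked in a few lines. The sketch below decodes the table from an inline string and flags any requirement without an explicit upper bound; specs_without_upper_bound is a hypothetical helper, and its check is a naive substring test rather than a real specifier parser.

```python
from __future__ import annotations

import sys
from typing import List

if sys.version_info >= (3, 11):
    import tomllib as _toml
else:
    import tomli as _toml

SAMPLE = """
[dependency-groups]
dev = [
    "pytest>=8.0,<9.0",
    "ruff>=0.6,<0.7",
    "mypy>=1.10,<2.0",
]
"""


def specs_without_upper_bound(specs: List[str]) -> List[str]:
    """Return specs with no '<' clause before the environment marker.

    Naive on purpose: it only looks for '<' in the version part, so a
    pinned '==' or a '~=' spec would also be reported.
    """
    return [spec for spec in specs if "<" not in spec.split(";", 1)[0]]


groups = _toml.loads(SAMPLE)["dependency-groups"]
print(type(groups["dev"][0]).__name__)           # → str
print(specs_without_upper_bound(groups["dev"]))  # → []
```

The split on ";" matters: a marker such as python_version < '3.11' contains a "<" that says nothing about the package's own version range.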

Next, the reader. pep735_layout.py is structurally a clone of poetry_layout.py from step 2 — same _load helper, same Optional[Path] argument on every public function, same flat module surface. Keeping the two readers shaped identically is what lets the test file compare both dialects without privileging one as the canonical view.

from __future__ import annotations

import re
import sys
from pathlib import Path
from typing import Any, List, Mapping, Optional

if sys.version_info >= (3, 11):
    import tomllib as _toml
else:
    import tomli as _toml


PYPROJECT_PATH = Path(__file__).resolve().parents[2] / "pyproject.toml"

_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def _load(path: Optional[Path]) -> Mapping[str, Any]:
    target = path or PYPROJECT_PATH
    with target.open("rb") as handle:
        return _toml.load(handle)


def _dependency_groups_table(path: Optional[Path]) -> Mapping[str, Any]:
    return _load(path).get("dependency-groups", {})

The single sys.version_info branch at the top is the only place in the module that cares which interpreter is running. Everything below it speaks one TOML dialect, which keeps the helpers readable and the test surface small. _NAME_RE is a deliberately minimal head-match: it pulls a distribution name off a PEP 508 requirement and stops at the first non-name character. It does not normalise case or separators, but that is enough to compare against Poetry's {"pytest": "^8.0"} keys without dragging a full PEP 508 parser into the article.

def pep735_groups(path: Optional[Path] = None) -> List[str]:
    """Return the sorted names of every group in [dependency-groups]."""
    return sorted(_dependency_groups_table(path).keys())


def pep735_dev_dependencies(path: Optional[Path] = None) -> List[str]:
    """Return the dev group's requirement strings exactly as written."""
    return list(_dependency_groups_table(path).get("dev", []))


def project_name(requirement: str) -> str:
    """Extract the leading distribution name from a PEP 508 requirement."""
    match = _NAME_RE.match(requirement.strip())
    if match is None:
        raise ValueError(f"unrecognised PEP 508 requirement: {requirement!r}")
    return match.group(0)


def pep735_dev_dependency_names(path: Optional[Path] = None) -> List[str]:
    """Return the sorted distribution names listed in the dev group."""
    return sorted(project_name(spec) for spec in pep735_dev_dependencies(path))

pep735_groups and pep735_dev_dependencies are straight reads: they hand the test file the keys and the array exactly as they sit on disk. project_name is the bridge between dialects, and pep735_dev_dependency_names composes it across the dev array so the equivalence check downstream is a single line. Every helper stays inside the codebase's "max two levels of conditional nesting" rule; apart from the interpreter check at the top of the module, the only branch is the guard clause inside project_name, which raises on unparseable input.
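It helps to see what the head-match does with requirement strings that are busier than the three in this project. The sketch below repeats the regex and project_name from the module and adds normalized_name, a hypothetical helper that applies the PEP 503 name normalisation rule; the article's reader does not normalise, so treat that part as an optional extension.

```python
from __future__ import annotations

import re

_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def project_name(requirement: str) -> str:
    """Extract the leading distribution name from a PEP 508 requirement."""
    match = _NAME_RE.match(requirement.strip())
    if match is None:
        raise ValueError(f"unrecognised PEP 508 requirement: {requirement!r}")
    return match.group(0)


def normalized_name(requirement: str) -> str:
    """Extract the name, then lowercase it and collapse runs of -, _ and ."""
    return re.sub(r"[-_.]+", "-", project_name(requirement)).lower()


# Extras and markers are cut off at the first non-name character.
print(project_name("requests[security]>=2.31"))             # → requests
print(project_name("tomli>=2.0; python_version < '3.11'"))  # → tomli
# Case and separators are preserved unless normalised explicitly.
print(project_name("Typing_Extensions>=4.0"))     # → Typing_Extensions
print(normalized_name("Typing_Extensions>=4.0"))  # → typing-extensions
```

For the three lowercase, separator-free names in this project the two functions agree, which is why the plain extractor is sufficient here.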

The package __init__.py then re-exports the four new helpers alongside the four from step 2. Keeping the public surface flat is what lets the test file write from groupcompare import ... instead of poking into a private module path, and it makes any later refactor of either reader an internal concern.

from groupcompare.core import describe_layout
from groupcompare.pep735_layout import (
    pep735_dev_dependencies,
    pep735_dev_dependency_names,
    pep735_groups,
    project_name,
)
from groupcompare.poetry_layout import (
    declared_groups,
    dev_dependencies,
    python_constraint,
    runtime_dependencies,
)

__all__ = [
    "__version__",
    "declared_groups",
    "describe_layout",
    "dev_dependencies",
    "pep735_dev_dependencies",
    "pep735_dev_dependency_names",
    "pep735_groups",
    "project_name",
    "python_constraint",
    "runtime_dependencies",
]

__version__ = "0.1.0"

The last edit is the test file. It pins eight invariants. Five describe the PEP 735 side in isolation: that dev is declared, that it is the only declared group for now, that it lists three tools, that the entries are strings rather than inline tables, and that the pytest pin carries explicit upper bounds. Two cover the regex extractor: it parses each of the three requirement strings, and it refuses an empty input with a clear error. The eighth compares the PEP 735 names against the Poetry dev group.

import pytest

from groupcompare import (
    dev_dependencies,
    pep735_dev_dependencies,
    pep735_dev_dependency_names,
    pep735_groups,
    project_name,
)


def test_pep735_dev_group_is_declared():
    assert "dev" in pep735_groups()


def test_pep735_groups_contain_only_dev_for_now():
    assert pep735_groups() == ["dev"]


def test_pep735_dev_group_lists_three_tools():
    assert len(pep735_dev_dependencies()) == 3


def test_pep735_dev_group_specs_are_strings():
    specs = pep735_dev_dependencies()
    assert all(isinstance(spec, str) for spec in specs)


def test_pep735_dev_group_pins_pytest_with_pep508_bounds():
    specs = pep735_dev_dependencies()
    pytest_spec = next(spec for spec in specs if spec.startswith("pytest"))
    assert pytest_spec == "pytest>=8.0,<9.0"


def test_pep735_dev_dependency_names_match_poetry_group():
    poetry_names = sorted(dev_dependencies().keys())
    assert pep735_dev_dependency_names() == poetry_names


def test_project_name_extracts_distribution_from_spec():
    assert project_name("pytest>=8.0,<9.0") == "pytest"
    assert project_name("ruff>=0.6,<0.7") == "ruff"
    assert project_name("mypy>=1.10,<2.0") == "mypy"


def test_project_name_rejects_empty_requirement():
    with pytest.raises(ValueError, match="unrecognised PEP 508 requirement"):
        project_name("")

The cross-dialect test, test_pep735_dev_dependency_names_match_poetry_group, is the one this whole step was built around. It is the first assertion in the project that does not sit inside a single dialect: it holds both readers side by side and demands that the sorted distribution names match. Any future edit that grows the PEP 735 table without growing the Poetry table (or vice versa) trips it loudly instead of silently breaking the comparison the article will draw later.
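The failure mode that test guards against is easy to reproduce in isolation. The sketch below is self-contained and uses a trimmed two-tool sample; name_drift is a hypothetical helper that returns the names present in only one of the two declarations.

```python
from __future__ import annotations

import re
import sys
from typing import Set

if sys.version_info >= (3, 11):
    import tomllib as _toml
else:
    import tomli as _toml

_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")

IN_SYNC = """
[tool.poetry.group.dev.dependencies]
pytest = "^8.0"
ruff = "^0.6"

[dependency-groups]
dev = ["pytest>=8.0,<9.0", "ruff>=0.6,<0.7"]
"""

# Simulate a commit that adds mkdocs to the PEP 735 table only.
DRIFTED = IN_SYNC.replace(
    '"ruff>=0.6,<0.7"]', '"ruff>=0.6,<0.7", "mkdocs>=1.6,<2.0"]'
)


def name_drift(toml_text: str) -> Set[str]:
    """Return names declared in exactly one of the two dev groups."""
    doc = _toml.loads(toml_text)
    poetry = set(doc["tool"]["poetry"]["group"]["dev"]["dependencies"])
    pep735 = {_NAME_RE.match(spec).group(0) for spec in doc["dependency-groups"]["dev"]}
    return poetry ^ pep735


print(name_drift(IN_SYNC))  # → set()
print(name_drift(DRIFTED))  # → {'mkdocs'}
```

Like the test in the suite, this compares names only; it would not notice the two sides disagreeing on a version range.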

Verification

Run the full pytest suite from the codebase/ directory. Step 1's five scaffold tests and step 2's six Poetry-layout tests must stay green next to the eight new PEP 735 tests — anything less and the baseline has regressed:

python -m pytest

============================= test session starts ==============================
platform darwin -- Python 3.9.6, pytest-8.4.2, pluggy-1.6.0
rootdir: codebase
configfile: pyproject.toml
testpaths: tests
collected 19 items

tests/test_pep735_layout.py ........                                     [ 42%]
tests/test_poetry_layout.py ......                                       [ 73%]
tests/test_scaffold.py .....                                             [100%]

============================== 19 passed in 0.06s ==============================

Nineteen passing tests is the signal we wanted. Eight of them are the new PEP 735 assertions, six are the Poetry-layout net from step 2, and five are the scaffold tests from step 1. The fact that adding the [dependency-groups] table did not perturb either previous suite is what makes the comparison the rest of the article will lean on honest.

What we built

The pyproject.toml now carries both dependency-group dialects at once: Poetry's nested [tool.poetry.group.dev.dependencies] from step 2 and a brand-new top-level [dependency-groups] table that uv reads natively. Both list the same three dev tools, and the duplication is intentional — it is the precise diff the article exists to dissect.

A sibling reader, pep735_layout, now lives next to poetry_layout. The two modules each define the same _load helper, take an Optional[Path] argument on every function that reads the file, and expose a flat public surface re-exported through groupcompare. That symmetry is what makes the cross-dialect test possible to write without privileging either tool.

The cross-dialect assertion deserves its own paragraph. test_pep735_dev_dependency_names_match_poetry_group is the first invariant in the project that does not live inside one dialect — it spans both, and refuses to let them drift. A hand-maintained mirror cannot give you that guarantee; a test can, and now does.

What remains for step 4 is composition. PEP 735 lets one group pull in another through include-group, which is how we will split the current monolithic dev array into smaller pieces (a test group, a lint group, a docs group) and re-assemble them under dev. The cross-dialect equivalence we locked in here is the floor that split will be built on; without it, every later refactor would silently risk losing parity with the Poetry view.

Repository

The state of the code after this step: 9e21340

Step 4: Decomposing the Dev Group into Test, Lint, and Docs with PEP 735 include-group

Step 3 ended with two parallel declarations of the same dev group — Poetry's nested [tool.poetry.group.dev.dependencies] and a top-level [dependency-groups] table — and a cross-dialect test that pinned them to the same set of distribution names. That parity held because the group contained three tools any human could reread without losing track. This step breaks that easy case on purpose: we split the dev tooling into three purpose-built groups (test, lint, docs), recompose dev from them, and watch the two dialects diverge in shape while they stay equivalent in content.

The interesting half of the step is what the two dialects do with composition. PEP 735 ships a first-class include-group directive — dev becomes a list of { include-group = "..." } tables and the resolver expands them recursively. Poetry has no equivalent, so [tool.poetry.group.dev.dependencies] has to redundantly enumerate every tool from test + lint + docs and stay in sync by hand. This step makes that asymmetry visible inside pyproject.toml, encodes a recursive resolver on the PEP 735 side, and pins the equivalence-after-resolution invariant with a test that catches a stale Poetry mirror immediately instead of letting it rot quietly.
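The recursive expansion can be sketched independently of any file reading. The function below is an illustration under stated assumptions, not the article's pep735_resolve_group: resolve_group_sketch is a hypothetical name, the groups arrive as an already-decoded dict mirroring the four groups this step declares, and group-name normalisation is skipped.

```python
from __future__ import annotations

from typing import Any, List, Mapping, Tuple

# Already-decoded stand-in for the [dependency-groups] table.
GROUPS = {
    "test": ["pytest>=8.0,<9.0"],
    "lint": ["ruff>=0.6,<0.7", "mypy>=1.10,<2.0"],
    "docs": ["mkdocs>=1.6,<2.0"],
    "dev": [
        {"include-group": "test"},
        {"include-group": "lint"},
        {"include-group": "docs"},
    ],
}


def resolve_group_sketch(
    groups: Mapping[str, List[Any]],
    name: str,
    _seen: Tuple[str, ...] = (),
) -> List[str]:
    """Expand a group into requirement strings, following include-group."""
    if name not in groups:
        raise ValueError(f"unknown dependency group: {name!r}")
    if name in _seen:
        chain = " -> ".join(_seen + (name,))
        raise ValueError(f"cycle in include-group: {chain}")
    resolved: List[str] = []
    for item in groups[name]:
        if isinstance(item, str):
            resolved.append(item)
            continue
        # Anything that is not a string is treated as an include table.
        resolved.extend(
            resolve_group_sketch(groups, item["include-group"], _seen + (name,))
        )
    return resolved


print(resolve_group_sketch(GROUPS, "dev"))
# → ['pytest>=8.0,<9.0', 'ruff>=0.6,<0.7', 'mypy>=1.10,<2.0', 'mkdocs>=1.6,<2.0']
```

Tracking the chain of visited names in a tuple, rather than a shared set, keeps a group that is legitimately included twice through different branches from being mistaken for a cycle.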

Setup

The src/ layout, the Hatchling build backend, and Poetry's runtime metadata stay untouched. Everything we touch in this step is additive on the dialect surfaces (three new groups on each side plus a rewritten composite) and additive on the reader (three new helpers and one rewritten one-liner). No new third-party dependencies — the new behaviour is all parsing and composition over the already-installed tomllib / tomli fallback.

codebase/pyproject.toml — split the dev tooling into test, lint, and docs on both dialect sides. Add mkdocs as a fourth tool so the new groups have distinct shapes. Rewrite the PEP 735 dev group as three { include-group = "..." } items. Mirror the same four tools redundantly into [tool.poetry.group.dev.dependencies], with a comment explaining why the duplication is unavoidable.
codebase/src/groupcompare/pep735_layout.py — add pep735_group_items, pep735_resolve_group, and pep735_resolved_group_names. Switch pep735_dev_dependencies to a one-line wrapper over pep735_resolve_group("dev") so every step-3 call site keeps working with no edit.
codebase/src/groupcompare/__init__.py — re-export the three new helpers next to the existing surface.
codebase/tests/test_pep735_layout.py and codebase/tests/test_poetry_layout.py — relax the "only dev declared" assertions from step 3 to the new four-group state, then add coverage for per-group reads, include-group resolution, cycle and unknown-group error paths, and a cross-dialect equivalence check over the resolved dev group.

The directory layout neither gains nor loses files in this step — we are growing existing ones rather than scattering new modules.

Implementation

The pyproject.toml changes are where the asymmetry between the two dialects becomes visible at a glance. On the PEP 735 side, we declare three focused groups and a fourth composite group built from them:

[dependency-groups]
test = [
    "pytest>=8.0,<9.0",
]
lint = [
    "ruff>=0.6,<0.7",
    "mypy>=1.10,<2.0",
]
docs = [
    "mkdocs>=1.6,<2.0",
]
dev = [
    { include-group = "test" },
    { include-group = "lint" },
    { include-group = "docs" },
]

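dev is now a list of inline tables, not strings, which is exactly the case the resolver has to handle. Each { include-group = "..." } entry is the PEP 735 spec's composition primitive: a name lookup the consumer expands transitively. We keep version pins inside their owning group so the lint job in CI can install only ruff and mypy, with no pytest or mkdocs slowing down the install or cluttering the environment. One caveat: uv treats dev as a default group, so a bare uv sync --group lint still pulls in dev and, through its includes, every other tool. The lint job has to run uv sync --group lint --no-default-groups, which is the form the CI workflow in step 6 uses.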
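To make the two entry shapes concrete, here is a minimal sketch that parses the table above from an inline string and separates plain requirement strings from include tables. It assumes Python 3.11+ for tomllib, with tomli as the fallback the project already relies on. The classify helper and the PYPROJECT_SNIPPET constant are hypothetical names used only for illustration and are not part of groupcompare.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: tomli is installed on older interpreters

# Inline copy of the [dependency-groups] table from this step.
PYPROJECT_SNIPPET = """
[dependency-groups]
test = ["pytest>=8.0,<9.0"]
lint = ["ruff>=0.6,<0.7", "mypy>=1.10,<2.0"]
docs = ["mkdocs>=1.6,<2.0"]
dev = [
    { include-group = "test" },
    { include-group = "lint" },
    { include-group = "docs" },
]
"""


def classify(groups):
    """Hypothetical helper: split each group into requirements and includes."""
    shapes = {}
    for name, items in groups.items():
        requirements = [item for item in items if isinstance(item, str)]
        includes = [item["include-group"] for item in items if isinstance(item, dict)]
        shapes[name] = {"requirements": requirements, "includes": includes}
    return shapes


groups = tomllib.loads(PYPROJECT_SNIPPET)["dependency-groups"]
shapes = classify(groups)
print(shapes["dev"])
```

In this sketch the leaf groups carry only requirement strings and dev carries only includes, which is the property that lets CI treat test, lint, and docs as installable slices.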

Poetry has no include-group. The closest analogue is hand-duplicating every tool into the parent group, which is exactly what we do — and we leave a comment in the file admitting it:

[tool.poetry.group.test.dependencies]
pytest = "^8.0"

[tool.poetry.group.lint.dependencies]
ruff = "^0.6"
mypy = "^1.10"

[tool.poetry.group.docs.dependencies]
mkdocs = "^1.6"

# Poetry has no equivalent of PEP 735 `include-group`, so the dev group
# must redundantly enumerate every tool from test + lint + docs. Any new
# tool added to a sub-group has to be hand-copied here or `poetry install
# --with dev` silently drifts from the PEP 735 view.
[tool.poetry.group.dev.dependencies]
pytest = "^8.0"
ruff = "^0.6"
mypy = "^1.10"
mkdocs = "^1.6"

The comment is doing real work — it is the warning label the next person to add a tool will read. Without it, a contributor adding pytest-cov to the PEP 735 test group would have no signal that they also need to extend [tool.poetry.group.dev.dependencies]. With it, the duplication is at least conscious, and the equivalence test we add below turns silent drift into a loud failure.

The PEP 735 reader gains an include-group resolver. It is short, recursive, and refuses to spin forever on an accidental cycle:

def _resolve_item(
    table: Mapping[str, Any], item: Any, seen: Set[str]
) -> List[str]:
    if isinstance(item, str):
        return [item]
    if isinstance(item, dict) and "include-group" in item:
        return _resolve(table, item["include-group"], seen)
    raise ValueError(f"unrecognised dependency-groups entry: {item!r}")


def _resolve(
    table: Mapping[str, Any], name: str, seen: Set[str]
) -> List[str]:
    if name in seen:
        raise ValueError(f"include-group cycle detected at {name!r}")
    if name not in table:
        raise KeyError(f"undefined dependency group: {name!r}")
    next_seen = seen | {name}
    resolved: List[str] = []
    for item in table[name]:
        resolved.extend(_resolve_item(table, item, next_seen))
    return resolved

Two design choices are worth flagging. seen is passed as a frozen-per-frame snapshot (seen | {name}) rather than a mutable accumulator, so a recursive walk that fans out across two sibling groups never leaks state from one branch into the other. And the helpers stay inside the codebase's "max two levels of conditional nesting" rule — _resolve_item is a straight chain of guard clauses with no nested if body, and _resolve keeps a single for loop that delegates per-item handling back to _resolve_item.
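The value of the per-frame snapshot shows up with a diamond: two sibling groups that both include a common base. The sketch below is a compact reimplementation of the same idea under hypothetical names (resolve, TABLE); the base, a, b, diamond, and loop groups are invented for illustration and do not exist in the project.

```python
from typing import Any, FrozenSet, List, Mapping


def resolve(
    table: Mapping[str, Any], name: str, seen: FrozenSet[str] = frozenset()
) -> List[str]:
    """Flatten a group, tracking only the groups on the current include path."""
    if name in seen:
        raise ValueError(f"include-group cycle detected at {name!r}")
    if name not in table:
        raise KeyError(f"undefined dependency group: {name!r}")
    next_seen = seen | {name}  # new frozenset; the caller's snapshot is untouched
    resolved: List[str] = []
    for item in table[name]:
        if isinstance(item, str):
            resolved.append(item)
        elif isinstance(item, dict) and "include-group" in item:
            resolved.extend(resolve(table, item["include-group"], next_seen))
        else:
            raise ValueError(f"unrecognised dependency-groups entry: {item!r}")
    return resolved


# Invented groups: "diamond" reaches "base" through two sibling branches.
TABLE = {
    "base": ["pytest"],
    "a": [{"include-group": "base"}, "ruff"],
    "b": [{"include-group": "base"}, "mypy"],
    "diamond": [{"include-group": "a"}, {"include-group": "b"}],
    "loop": [{"include-group": "loop"}],
}

diamond = resolve(TABLE, "diamond")

try:
    resolve(TABLE, "loop")
    cycle_error = ""
except ValueError as exc:
    cycle_error = str(exc)

print(diamond)
print(cycle_error)
```

With a shared mutable set, the second visit to base would be misreported as a cycle. With snapshots the diamond resolves, at the cost of base's entries appearing once per path, so any deduplication has to happen at a later stage.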

The public surface sits on top of those internals. Two of the new helpers, plus the rewritten pep735_dev_dependencies, look like this:

def pep735_resolve_group(
    name: str, path: Optional[Path] = None
) -> List[str]:
    return _resolve(_dependency_groups_table(path), name, set())


def pep735_dev_dependencies(path: Optional[Path] = None) -> List[str]:
    return pep735_resolve_group("dev", path)


def pep735_resolved_group_names(
    name: str, path: Optional[Path] = None
) -> List[str]:
    return _names(pep735_resolve_group(name, path))

pep735_dev_dependencies is now a one-liner over pep735_resolve_group("dev") — every caller from step 3 keeps working, and the include-group expansion is invisible from the outside. pep735_resolved_group_names is what the cross-dialect test consumes: the set of distribution names dev actually installs, after every include has been expanded.
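The _names helper is not shown in this excerpt. As a rough idea of what such a step involves, the sketch below extracts distribution names from PEP 508 strings with a regular expression and normalises them. The function name names_from_requirements, the regex, and the normalisation rule are assumptions for illustration, not the project's actual implementation.

```python
import re
from typing import Iterable, List

# A PEP 508 requirement starts with the distribution name; extras, version
# specifiers, and markers that follow are ignored by this pattern.
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def names_from_requirements(requirements: Iterable[str]) -> List[str]:
    """Hypothetical stand-in for _names: sorted, de-duplicated, normalised names."""
    names = set()
    for requirement in requirements:
        match = _NAME_RE.match(requirement)
        if match is None:
            raise ValueError(f"cannot find a distribution name in {requirement!r}")
        # Collapse runs of '-', '_' and '.' and lowercase, as index names are compared.
        names.add(re.sub(r"[-_.]+", "-", match.group(1)).lower())
    return sorted(names)


resolved = [
    "pytest>=8.0,<9.0",
    "ruff>=0.6,<0.7",
    "mypy>=1.10,<2.0",
    "mkdocs>=1.6,<2.0",
]
dev_names = names_from_requirements(resolved)
print(dev_names)
```

Returning a sorted list rather than a set keeps the comparison against sorted Poetry keys a plain equality check.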

Verification

Run the full pytest suite from the codebase/ directory. Step 1's scaffold tests, step 2's Poetry-layout tests (now updated to reflect four declared groups instead of one), and step 3's PEP 735 tests (updated the same way) must stay green next to the new resolver, sub-group, and error-path assertions:

python -m pytest

============================= test session starts ==============================
platform darwin -- Python 3.12.5, pytest-8.4.2, pluggy-1.6.0
rootdir: codebase
configfile: pyproject.toml
testpaths: tests
collected 32 items

tests/test_pep735_layout.py .................                            [ 53%]
tests/test_poetry_layout.py ..........                                   [ 84%]
tests/test_scaffold.py .....                                             [100%]

============================== 32 passed in 0.08s ==============================

Thirty-two passing tests is the signal we wanted. Seventeen now sit on the PEP 735 side — the original eight from step 3, plus new ones for test / lint / docs being declared, for pep735_resolve_group expanding dev through three includes, for pep735_resolved_group_names agreeing with the (now hand-mirrored) Poetry dev group, and for the three error paths the resolver guards (unknown group, include cycle, unknown item shape). Ten cover the Poetry side, including per-group reads for the three new tables plus the explicit union check that catches drift in the hand-copied dev table. The five scaffold tests from step 1 still pass unchanged — a small reassurance that none of this dialect work perturbed the package surface.

What we built

pyproject.toml now declares four PEP 735 groups — test, lint, docs, and a composite dev — and the composite is built from the other three via include-group. Adding a new tool to the test group, say pytest-cov, automatically flows into dev with no second edit. The Poetry side declares the same four groups and pays the price for not having composition: [tool.poetry.group.dev.dependencies] is a hand-maintained mirror, and the comment in the file is the warning label that says so.

The PEP 735 reader picked up a recursive resolver. pep735_resolve_group(name) returns the fully-flattened PEP 508 strings for any group, expanding include-group references transitively, refusing cycles with a clear error, and rejecting any TOML shape that is neither a string nor a table carrying an include-group key. (PEP 735 itself is stricter and requires the include table to have exactly that one key; the resolver shown above does not enforce that.) pep735_dev_dependencies is now a wrapper around it, so every step-3 caller keeps working unchanged.

The test suite grew the load-bearing invariant of this step: equivalence-after-resolution. pep735_resolved_group_names("dev") and sorted(dev_dependencies().keys()) must produce the same list, even though one side is computed by walking includes and the other is hand-maintained by typing the same name into two tables. Any future commit that adds a tool to a PEP 735 sub-group without also extending the Poetry dev group (or vice versa) trips that assertion before it can rot the comparison the article will draw later.
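A minimal sketch of how such drift can be reported, assuming both sides have already been reduced to lists of distribution names. The describe_drift helper is hypothetical, and the pytest-cov scenario is the one the text uses as its running example.

```python
from typing import Dict, Iterable, List


def describe_drift(
    pep735_names: Iterable[str], poetry_names: Iterable[str]
) -> Dict[str, List[str]]:
    """Hypothetical helper: list names that appear on one side only."""
    pep735, poetry = set(pep735_names), set(poetry_names)
    return {
        "missing_from_poetry": sorted(pep735 - poetry),
        "missing_from_pep735": sorted(poetry - pep735),
    }


# Simulate adding pytest-cov to the PEP 735 test group without updating
# the hand-maintained Poetry dev table.
pep735_dev = ["mkdocs", "mypy", "pytest", "pytest-cov", "ruff"]
poetry_dev = ["mkdocs", "mypy", "pytest", "ruff"]

drift = describe_drift(pep735_dev, poetry_dev)
in_sync = not any(drift.values())
print(drift, in_sync)
```

Reporting the two differences separately tells the contributor which table to edit, which a bare list inequality does not.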

What remains for step 5 is the lockfile angle. poetry.lock and uv.lock have to be regenerated against this new four-group layout, and we want to confirm they pin the same versions when fed the same inputs. The composition we locked in here is exactly the harder case for that comparison — uv lock has to walk include-group to find the dev set, and any mismatch with what Poetry's lock says about its hand-mirrored dev group becomes interesting evidence rather than noise.

Repository

The state of the code after this step: 2a200ab

Step 5: Locking the Four-Group Layout with poetry.lock and uv.lock, Then Diffing Them in Pytest

Step 4 ended with the test, lint, docs, and composite dev groups declared in both dialects, an include-group resolver on the PEP 735 side, and a pytest invariant pinning the two views to the same set of distribution names. That invariant stops the source-level tables from drifting, but it says nothing about what each resolver actually does with those tables once it sits down to lock. This step closes that gap: we run both tools against the same pyproject.toml, check the resulting poetry.lock and uv.lock into the repo, and write a lock reader that lets pytest hold the two graphs side by side.

The point is not to declare a winner — it is to make the difference legible. Both resolvers see the same four groups and the same requires-python = ">=3.9" constraint, so the question we are actually asking is: do they pin the same packages to the same versions, and where they diverge, is that divergence load-bearing or accidental? We answer with a short reader (lock_compare.py), eighteen new tests, and one explicit acknowledgement that uv.lock ships a different shape than poetry.lock because uv's resolution-markers let one dependency fan out across Python ranges where Poetry collapses to a single pin.

Setup

The source tree from step 4 is untouched. We add one new module and one new test file, append a re-export block to __init__.py, and check in the two lockfiles plus a tiny poetry.toml that keeps Poetry from creating its own virtualenv during local runs. No edits to pyproject.toml — the whole point of this step is that the inputs to both resolvers stay identical.

codebase/poetry.lock — generated by poetry lock against the existing four-group pyproject.toml. Thirty packages, each pinned to exactly one version, with sdist + wheel hashes per entry.
codebase/uv.lock — generated by uv lock against the same file. Thirty distinct package names plus the editable root project, but a handful of names appear twice because uv keeps both versions when resolution-markers branch on the active Python range.
codebase/poetry.toml — two lines: [virtualenvs] create = false. Tells Poetry to skip its own venv plumbing so the lock step plays nicely with the existing .venv/ from earlier work.
codebase/src/groupcompare/lock_compare.py — flat reader: two TOML loads, a {name: {versions}} collector, three set-algebra helpers, two hash-presence checks, and an exact-pin check. The editable root project is filtered out of the uv side so the diff stays apples-to-apples.
codebase/src/groupcompare/__init__.py — re-export the eleven new helpers next to the existing surface so the test file can from groupcompare import ....
codebase/tests/test_lock_compare.py — eighteen assertions: real-lock checks, fixture-driven parser shape checks, and a deliberately-permissive guard around uv's multi-version fan-out.

The directory now carries both lockfiles next to pyproject.toml:

codebase/
├── pyproject.toml
├── poetry.toml
├── poetry.lock
├── uv.lock
├── src/
│   └── groupcompare/
│       ├── __init__.py
│       ├── core.py
│       ├── lock_compare.py
│       ├── pep735_layout.py
│       └── poetry_layout.py
└── tests/
    ├── __init__.py
    ├── test_lock_compare.py
    ├── test_pep735_layout.py
    ├── test_poetry_layout.py
    └── test_scaffold.py

Implementation

The reader is structurally a third sibling to poetry_layout.py and pep735_layout.py from earlier steps — same tomllib/tomli fallback at the top, same Optional[Path] argument on every public function, same flat module surface. Keeping the shape uniform is what lets the test file read either side without privileging one as canonical.

PROJECT_ROOT = Path(__file__).resolve().parents[2]
POETRY_LOCK_PATH = PROJECT_ROOT / "poetry.lock"
UV_LOCK_PATH = PROJECT_ROOT / "uv.lock"
ROOT_PROJECT_NAME = "groupcompare"


def _load(path: Path) -> Mapping[str, Any]:
    with path.open("rb") as handle:
        return _toml.load(handle)


def _packages(data: Mapping[str, Any]) -> List[Mapping[str, Any]]:
    return list(data.get("package", []))


def _is_editable(pkg: Mapping[str, Any]) -> bool:
    source = pkg.get("source")
    return isinstance(source, dict) and "editable" in source

Both lockfiles store their package list under a top-level [[package]] array, so _packages is a one-liner with no dialect glue. _is_editable is the only piece of asymmetry the reader has to absorb: uv records the workspace root itself as a [[package]] entry with source = { editable = "." }, and we filter that single entry out so the cross-tool diff covers only resolved dependencies. Poetry's lockfile never lists the root project, so the equivalent guard on its side is a no-op.

The version collector is the load-bearing helper. It maps each package name to the set of versions that lockfile records — usually a singleton, but uv legitimately stores two when resolution-markers branch on the Python range. Modelling the value as a set instead of a string is what lets version_disagreements later compare both shapes without raising.

def _collect_versions(
    packages: List[Mapping[str, Any]], skip_editable: bool
) -> Dict[str, Set[str]]:
    versions: Dict[str, Set[str]] = {}
    for pkg in packages:
        if skip_editable and _is_editable(pkg):
            continue
        name = pkg["name"]
        versions.setdefault(name, set()).add(pkg["version"])
    return versions


def poetry_lock_versions(
    path: Optional[Path] = None,
) -> Dict[str, Set[str]]:
    data = _load(path or POETRY_LOCK_PATH)
    return _collect_versions(_packages(data), skip_editable=False)


def uv_lock_versions(
    path: Optional[Path] = None,
) -> Dict[str, Set[str]]:
    data = _load(path or UV_LOCK_PATH)
    return _collect_versions(_packages(data), skip_editable=True)

The two thin wrappers exist so the public surface mentions each lockfile by name — every caller is unambiguous about which side it is reading. The skip_editable=False on the Poetry side is deliberate: a future Poetry version could start writing a self entry, and we want the read to surface that loudly rather than silently filter it.

Set algebra on top of {name: {versions}} gives us the three queries the article is built around — overlap, Poetry-only, uv-only — plus the version disagreement map that records both sides when a name pins differently across the two tools:

def packages_in_both(
    poetry_path: Optional[Path] = None,
    uv_path: Optional[Path] = None,
) -> List[str]:
    poetry = set(poetry_lock_versions(poetry_path).keys())
    uv = set(uv_lock_versions(uv_path).keys())
    return sorted(poetry & uv)


def version_disagreements(
    poetry_path: Optional[Path] = None,
    uv_path: Optional[Path] = None,
) -> Dict[str, Tuple[Set[str], Set[str]]]:
    poetry = poetry_lock_versions(poetry_path)
    uv = uv_lock_versions(uv_path)
    differences: Dict[str, Tuple[Set[str], Set[str]]] = {}
    for name in set(poetry) & set(uv):
        if poetry[name] != uv[name]:
            differences[name] = (poetry[name], uv[name])
    return differences

version_disagreements keeps both version sets in its return shape because either side may be the surprising one. For this run, every Poetry value is a singleton and every uv value is either the same singleton or a superset that adds an older version pinned to python_full_version < '3.10'. Returning the raw sets lets the test, the CLI, or a future report renderer decide how to surface that — the helper itself stays opinion-free.
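The superset case is easier to see on a toy input. The sketch below runs the same collect-then-compare logic over two trimmed, synthetic lockfile strings. The alpha and beta packages and their versions are invented, and real lockfile entries carry many more fields than these fixtures do.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: tomli is installed on older interpreters

from typing import Dict, Set

# Synthetic fixtures, not excerpts from the real lockfiles.
POETRY_LOCK = """
[[package]]
name = "alpha"
version = "2.0.0"

[[package]]
name = "beta"
version = "1.4.0"
"""

UV_LOCK = """
version = 1
requires-python = ">=3.9"

[[package]]
name = "alpha"
version = "1.9.0"
source = { registry = "https://pypi.org/simple" }

[[package]]
name = "alpha"
version = "2.0.0"
source = { registry = "https://pypi.org/simple" }

[[package]]
name = "beta"
version = "1.4.0"
source = { registry = "https://pypi.org/simple" }

[[package]]
name = "groupcompare"
version = "0.1.0"
source = { editable = "." }
"""


def collect(lock_text: str, skip_editable: bool) -> Dict[str, Set[str]]:
    """Map each package name to the set of versions the lock text records."""
    versions: Dict[str, Set[str]] = {}
    for pkg in tomllib.loads(lock_text).get("package", []):
        source = pkg.get("source")
        if skip_editable and isinstance(source, dict) and "editable" in source:
            continue  # drop the editable root project
        versions.setdefault(pkg["name"], set()).add(pkg["version"])
    return versions


poetry = collect(POETRY_LOCK, skip_editable=False)
uv = collect(UV_LOCK, skip_editable=True)
disagreements = {
    name: (poetry[name], uv[name])
    for name in poetry.keys() & uv.keys()
    if poetry[name] != uv[name]
}
# True when every disagreement is uv keeping extra versions next to Poetry's pin.
fan_out_only = all(p < u for p, u in disagreements.values())
print(disagreements, fan_out_only)
```

A caller that only cares about genuine conflicts can use the proper-subset check to separate fan-out from cases where the two tools picked different versions outright.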

The hash and exact-pin checks make the reproducibility guarantees programmatic instead of folklore. We dig under files, wheels, and sdist because Poetry packs hashes under files = [...] while uv splits them across sdist = { hash = "..." } and wheels = [{ hash = "..." }, ...]:

def _has_hash_entries(pkg: Mapping[str, Any]) -> bool:
    files = pkg.get("files")
    if isinstance(files, list) and any("hash" in f for f in files):
        return True
    wheels = pkg.get("wheels")
    if isinstance(wheels, list) and any("hash" in w for w in wheels):
        return True
    sdist = pkg.get("sdist")
    return isinstance(sdist, dict) and "hash" in sdist


def poetry_lock_is_hashed(path: Optional[Path] = None) -> bool:
    data = _load(path or POETRY_LOCK_PATH)
    packages = _packages(data)
    return bool(packages) and all(_has_hash_entries(pkg) for pkg in packages)


def uv_lock_is_hashed(path: Optional[Path] = None) -> bool:
    data = _load(path or UV_LOCK_PATH)
    packages = [pkg for pkg in _packages(data) if not _is_editable(pkg)]
    return bool(packages) and all(_has_hash_entries(pkg) for pkg in packages)


def poetry_lock_pins_are_exact(path: Optional[Path] = None) -> bool:
    versions = poetry_lock_versions(path)
    return all(len(v) == 1 for v in versions.values())

poetry_lock_pins_are_exact exists in isolation — there is no corresponding uv_lock_pins_are_exact, because the uv contract explicitly allows multiple pins per name under different resolution markers. Writing the helper only for the side where the invariant holds is the article's first concrete signal that "deterministic" means different things to the two tools.

The test file then turns every paragraph above into an assertion. Eleven tests hit the real lockfiles checked into the repo (existence, core tools present, root project filtered correctly, hashes recorded, Poetry's one-pin-per-name contract, the overlap set covering shared tooling, the algebraic identity between both | only_poetry and the Poetry name set, and a permissive guard that documents uv's multi-version fan-out without asserting it always happens). Seven more drive the parser against synthetic fixtures so the structural contract is locked in even if the real lockfiles change shape later — fixture cases for the happy path, for version drift, for a Poetry entry with no hash, for duplicate-version Poetry entries, and for an editable-only uv root with no artifacts.

def test_poetry_lock_pins_each_package_to_one_exact_version():
    assert poetry_lock_pins_are_exact() is True
    versions = poetry_lock_versions()
    assert versions, "poetry.lock parsed empty"
    for name, vs in versions.items():
        assert len(vs) == 1, f"{name} has {len(vs)} versions in poetry.lock"


def test_uv_may_pin_multiple_versions_when_resolver_branches_on_python():
    # uv supports `resolution-markers`, so the same dependency can resolve
    # to two versions under different Python ranges. Poetry collapses to
    # one. We don't assert it always happens, but if it does, the version
    # set has length > 1 — which is the determinism contract uv documents.
    uv = uv_lock_versions()
    multi = {name: vs for name, vs in uv.items() if len(vs) > 1}
    for name, vs in multi.items():
        assert all(isinstance(v, str) and v for v in vs)

The asymmetry between those two tests is the whole step in miniature. Poetry's lockfile invariant is strong enough to assert; uv's is strong enough only to describe — and the test that describes it would still pass on a future uv release that converged to single pins, because the body iterates over the multi-version subset and is a no-op when it is empty.

Verification

Run pytest from codebase/. All four test files must stay green together — step 1's scaffold, step 2's Poetry-layout reader, step 3's PEP 735 reader (extended in step 4), and the new lock-compare suite:

python -m pytest

============================= test session starts ==============================
platform darwin -- Python 3.9.6, pytest-8.4.2, pluggy-1.6.0
rootdir: codebase
configfile: pyproject.toml
testpaths: tests
collected 50 items

tests/test_lock_compare.py ..................                            [ 36%]
tests/test_pep735_layout.py .................                            [ 70%]
tests/test_poetry_layout.py ..........                                   [ 90%]
tests/test_scaffold.py .....                                             [100%]

============================== 50 passed in 1.56s ==============================

Fifty passing tests is the signal we wanted. Eighteen are the new lock-compare assertions, seventeen are the PEP 735 net from step 4, ten are the Poetry-layout suite, and five are the scaffold from step 1. Both real lockfiles parse, both record the four core tools (pytest, ruff, mypy, mkdocs), both pass the hash-presence check, the Poetry side passes the exact-pin check, and the uv side exhibits — at the time of this commit — version fan-out on click, markdown, mypy, platformdirs, and iniconfig, where the resolver branches on python_full_version < '3.10' and keeps an older pin for the legacy range.

What we built

poetry.lock and uv.lock are now checked into the repository alongside the pyproject.toml that produced them, so anyone cloning the codebase can install from exactly the resolutions recorded here instead of re-resolving against whatever the index serves that day. Thirty distinct package names live in each, the overlap set covers all thirty, and the four tools the article is built around (pytest, ruff, mypy, mkdocs) are present on both sides at the versions their PEP 508 / caret ranges allow.

The new lock_compare.py module turns that pair of files into queryable structure. {name: {versions}} is the right shape because it absorbs both the singleton-per-name case Poetry guarantees and the multi-pin case uv documents under resolution-markers. packages_in_both, packages_only_in_poetry, packages_only_in_uv, and version_disagreements give the test suite — and any future CLI — a small, opinion-free vocabulary for talking about what the two tools agreed on and where they did not.

The eighteen new tests pin three different categories of invariant. Existence and shape: both files parse, both record the core tools, the editable root is filtered out exactly once. Reproducibility content: every package on both sides carries artifact hashes, and every Poetry pin is exact. Cross-tool symmetry: the union of packages_in_both with each side's exclusive set equals that side's full name list, which is the algebraic identity that catches a parser bug before it can quietly distort the diff.
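The cross-tool symmetry identity can be sketched over plain sets. The partition helper and the package names below are invented for illustration; the real helpers read the lockfiles rather than taking sets as arguments.

```python
from typing import Set, Tuple


def partition(
    poetry_names: Set[str], uv_names: Set[str]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Hypothetical helper: split names into shared, Poetry-only, and uv-only."""
    both = poetry_names & uv_names
    only_poetry = poetry_names - uv_names
    only_uv = uv_names - poetry_names
    return both, only_poetry, only_uv


# Invented name sets with one exclusive package per side.
poetry_names = {"alpha", "beta", "gamma"}
uv_names = {"alpha", "beta", "delta"}

both, only_poetry, only_uv = partition(poetry_names, uv_names)

# The identities a parser bug would break: each side is exactly the shared
# names plus its exclusive names, and the three parts never overlap.
identity_holds = (
    both | only_poetry == poetry_names
    and both | only_uv == uv_names
    and not (both & only_poetry)
    and not (both & only_uv)
    and not (only_poetry & only_uv)
)
print(sorted(both), sorted(only_poetry), sorted(only_uv), identity_holds)
```

If one reader dropped or duplicated an entry under a different name, the union on that side would stop matching its full name set, which is why the identity works as a cheap parser sanity check.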

The asymmetry we surfaced is the most useful deliverable of the step. Poetry collapses every dependency to one version; uv may keep two when resolution-markers branch on Python range. Both behaviours are deterministic given the same inputs, but they are not the same shape of determinism — and the test suite now records that distinction explicitly instead of leaving readers to discover it by diffing the lockfiles in their heads.

Repository

The state of the code after this step: 1ca50d7

Step 6: Proving Group Isolation in CI with a GitHub Actions Matrix Over uv sync --no-default-groups

Step 5 left the codebase with poetry.lock and uv.lock checked in side by side and an eighteen-test reader pinning the determinism contract — same dependency graph, exact pins on the Poetry side, a documented multi-version fan-out on the uv side under resolution-markers. What the test suite could not yet prove was that the four groups we declared back in step 4 actually install independently on a clean machine. Locally, a stale .venv/ can mask a missing include-group reference or a leaked default group, and the pytest assertions only ever see the source-level TOML tables.

This step closes that loop in CI. We add a single .github/workflows/ci.yml with two jobs: one matrix-driven job that installs test, lint, and docs one group at a time on a fresh Ubuntu runner, and a second job that installs the composed dev group end to end. Both jobs pass --no-default-groups to every uv sync so a forgotten default group cannot silently rescue a broken isolation contract. A new ci_workflows.py module reads the workflow back as text and turns its shape into pytest assertions, so a future drive-by edit to the YAML cannot quietly undo the guarantee.

Setup

The source tree from step 5 stays put. We add one workflow file, one parser module, append a small block of re-exports to __init__.py, and write one new test file:

codebase/.github/workflows/ci.yml — two jobs. The group-install job uses strategy.matrix.group: [test, lint, docs] and runs uv sync --group ${{ matrix.group }} --no-default-groups followed by a case "${{ matrix.group }}" block that picks the right tool for that leg (pytest -q, ruff check src tests, mkdocs --version). The dev-install job is a single leg that runs uv sync --group dev --no-default-groups and then confirms pytest, ruff, mypy, and mkdocs are all on PATH.
codebase/src/groupcompare/ci_workflows.py — a small text + regex reader. Four compiled patterns plus a handful of Optional[Path] entry points. No PyYAML dependency: this article is about uv dependency groups, not YAML parsing, and a few regexes are enough for the shape we authored.
codebase/src/groupcompare/__init__.py — re-export the seven new helpers so the test file can from groupcompare import ... next to the existing surface.
codebase/tests/test_ci_workflows.py — twenty-four assertions: real-workflow checks against the file we just landed, plus synthetic-fixture checks that lock the parser shape down independent of the live workflow.

No edits to pyproject.toml, no new runtime dependency, no change to the lockfiles. The point of the step is to take the four-group source we already had and surface it on an external machine; pulling extra packages in would muddy that signal.

Implementation

The workflow is the centrepiece. group-install declares a matrix over the three leaf groups and lets the --group flag get interpolated from matrix.group so each leg installs exactly one group on a clean runner:

jobs:
  group-install:
    name: Install only --group ${{ matrix.group }}
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        group: [test, lint, docs]
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v3
        with:
          version: "0.4.27"
          enable-cache: true
      - run: uv python install 3.12
      - name: Install only the ${{ matrix.group }} dependency-group
        run: uv sync --group ${{ matrix.group }} --no-default-groups

fail-fast: false is deliberate. If the test leg breaks, we still want the lint and docs legs to report — the whole point of CI is to see which group regressed, not just that something did. enable-cache: true on the official astral-sh/setup-uv action reuses the uv download cache across runs, which keeps the per-leg sync under a second once the cache is warm.

The follow-up step dispatches the right tool for each group via a case block, with --no-sync on every uv run invocation so we reuse the environment from the preceding uv sync instead of re-resolving:

      - name: Exercise the ${{ matrix.group }} group's tooling
        run: |
          case "${{ matrix.group }}" in
            test) uv run --no-sync pytest -q ;;
            lint) uv run --no-sync ruff check src tests ;;
            docs) uv run --no-sync mkdocs --version ;;
            *)
              echo "Unknown group: ${{ matrix.group }}" >&2
              exit 1
              ;;
          esac

Without --no-sync, uv run would first sync the environment itself, pulling in the project's default groups, which is exactly the leak the isolation contract is supposed to forbid. The wildcard branch, *), terminates with exit 1 so a typo in a future matrix entry fails loudly rather than being treated as a no-op.

The dev-install job mirrors the composition story from step 4. dev is declared as three include tables, { include-group = "test" }, { include-group = "lint" }, and { include-group = "docs" }, so installing --group dev should pull every leaf tool in one shot:

  dev-install:
    name: Install the composed dev group
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v3
        with: { version: "0.4.27", enable-cache: true }
      - run: uv python install 3.12
      - run: uv sync --group dev --no-default-groups
      - name: Confirm every tool from test + lint + docs is on PATH
        run: |
          uv run --no-sync pytest --version
          uv run --no-sync ruff --version
          uv run --no-sync mypy --version
          uv run --no-sync mkdocs --version

ci_workflows.py then translates that YAML shape into queryable structure without dragging a YAML parser into the runtime install. Four compiled regexes — one for the matrix list, one for the per-group sync command, one for the dev-group sync command, one for the case branch bodies — plus a couple of one-line entry points:

_MATRIX_RE = re.compile(r"^\s*group:\s*\[([^\]]+)\]\s*$", re.MULTILINE)
_PER_GROUP_SYNC_RE = re.compile(
    r"uv sync\s+--group\s+\$\{\{\s*matrix\.group\s*\}\}\s+--no-default-groups"
)
_DEV_SYNC_RE = re.compile(r"uv sync\s+--group\s+dev\s+--no-default-groups")
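_CASE_BRANCH_RE = re.compile(
    # \s* after the parenthesis accepts `test) uv run ...` on one line, as in
    # the workflow above, and also a body placed on the following line.
    r"^\s*(?P<name>[A-Za-z0-9_-]+)\)\s*(?P<body>uv run[^\n]+)",
    re.MULTILINE,
)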


def matrix_groups(path: Optional[Path] = None) -> List[str]:
    match = _MATRIX_RE.search(workflow_text(path))
    if match is None:
        return []
    raw = match.group(1)
    items = [chunk.strip().strip("'\"") for chunk in raw.split(",")]
    return [item for item in items if item]

Each helper accepts an Optional[Path] so synthetic fixtures can drive the same code path the live workflow does. The isolated_install_uses_no_default_groups check is the load-bearing one — it scans every line that contains uv sync and asserts each one passes --no-default-groups, which is the single invariant that turns the workflow from "looks right" into "cannot accidentally leak default groups." If even one future sync line forgets the flag, the test fails before the regression reaches CI.
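The body of that check is not reproduced here. A minimal sketch of the idea, under hypothetical names (sync_lines, every_sync_opts_out) and with inline workflow fragments standing in for the real file, might look like this:

```python
from typing import List


def sync_lines(text: str) -> List[str]:
    """Return every line of the workflow text that invokes uv sync."""
    return [line.strip() for line in text.splitlines() if "uv sync" in line]


def every_sync_opts_out(text: str) -> bool:
    """True only if at least one uv sync line exists and all pass the flag."""
    lines = sync_lines(text)
    return bool(lines) and all("--no-default-groups" in line for line in lines)


# Fragments modelled on the two sync commands in the workflow.
GOOD = """
      - run: uv sync --group ${{ matrix.group }} --no-default-groups
      - run: uv sync --group dev --no-default-groups
"""
# The same fragment with one extra sync line that forgets the flag.
BAD = GOOD + "      - run: uv sync --group docs\n"

print(every_sync_opts_out(GOOD), every_sync_opts_out(BAD))
```

Requiring at least one match means an empty or renamed workflow fails the check instead of passing vacuously. A line-based scan like this also counts commented-out uv sync lines, which errs on the strict side.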

Verification

Run pytest from codebase/. All five test files must stay green together — step 1's scaffold, step 2's Poetry layout, step 3 + 4's PEP 735 reader, step 5's lockfile diff, and the new workflow reader:

python -m pytest

============================= test session starts ==============================
platform darwin -- Python 3.12.5, pytest-8.4.2, pluggy-1.6.0
rootdir: codebase
configfile: pyproject.toml
testpaths: tests
collected 74 items

tests/test_ci_workflows.py ........................                      [ 32%]
tests/test_lock_compare.py ..................                            [ 56%]
tests/test_pep735_layout.py .................                            [ 79%]
tests/test_poetry_layout.py ..........                                   [ 93%]
tests/test_scaffold.py .....                                             [100%]

============================== 74 passed in 0.80s ==============================

Seventy-four passing tests, twenty-four of them new. Sixteen of the new ones hit the real ci.yml we just committed (matrix shape, trigger events, the --no-default-groups invariant on every sync line, the case branches dispatching pytest, ruff, and mkdocs, the dev job installing the composed group), and eight more drive the parser against synthetic fixtures so the regex contract is locked in even if the live workflow is rearranged later.

What we built

.github/workflows/ci.yml now exists and turns the source-level group declarations into an external, reproducible signal. Three matrix legs each install one PEP 735 leaf group on a clean Ubuntu runner with no default groups in scope, and a separate dev-install job confirms the include-group composition still pulls every leaf tool when the composed group is requested directly. The fail-fast: false setting plus the case block's wildcard exit keep regressions loud and specific instead of letting one broken leg mask the others.

The new ci_workflows.py module gives the test suite a small, stable vocabulary for talking about that workflow without dragging a YAML library in. Four regexes plus eight thin helpers cover the matrix list, the two sync commands, the per-group tool dispatch table, and the global --no-default-groups invariant. Every helper accepts an optional path so synthetic fixtures can exercise the parser independent of whatever shape the live workflow currently holds.
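To make the regex contract concrete, the sketch below runs the two patterns shown earlier against a synthetic workflow fragment. The fixture text and the helper names (matrix_groups_from, dispatch_table, groups_without_branch) are assumptions made for this example; the real helpers take an optional path rather than a string.

```python
import re
from typing import Dict, List

_MATRIX_RE = re.compile(r"^\s*group:\s*\[([^\]]+)\]\s*$", re.MULTILINE)
_CASE_BRANCH_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9_-]+)\)\s*\n\s*(?P<body>uv run[^\n]+)",
    re.MULTILINE,
)

# Hypothetical fragment shaped like the matrix job described in the article.
FIXTURE = """\
    strategy:
      fail-fast: false
      matrix:
        group: [test, lint, docs]
    steps:
      - run: |
          case "${{ matrix.group }}" in
            test)
              uv run --no-sync pytest
              ;;
            lint)
              uv run --no-sync ruff check .
              ;;
            docs)
              uv run --no-sync mkdocs build --strict
              ;;
            *)
              echo "unknown group" >&2
              exit 1
              ;;
          esac
"""


def matrix_groups_from(text: str) -> List[str]:
    """Parse the inline `group: [...]` list into group names."""
    match = _MATRIX_RE.search(text)
    if match is None:
        return []
    items = [chunk.strip().strip("'\"") for chunk in match.group(1).split(",")]
    return [item for item in items if item]


def dispatch_table(text: str) -> Dict[str, str]:
    """Map each named case branch to the `uv run` command it dispatches."""
    return {m.group("name"): m.group("body") for m in _CASE_BRANCH_RE.finditer(text)}


def groups_without_branch(text: str) -> List[str]:
    """Matrix groups that have no matching case branch (should be empty)."""
    table = dispatch_table(text)
    return [g for g in matrix_groups_from(text) if g not in table]


print(matrix_groups_from(FIXTURE))     # ['test', 'lint', 'docs']
print(dispatch_table(FIXTURE)["lint"])  # uv run --no-sync ruff check .
print(groups_without_branch(FIXTURE))   # []
```

The wildcard branch is skipped by design: `*` is outside the character class in the name group, so only real group names end up in the table.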

Twenty-four pytest tests encode three categories of guarantee. Existence and trigger shape: the workflow file is at the canonical path, runs on push and pull request to main, and uses the official astral-sh/setup-uv and actions/checkout actions. Isolation contract: the matrix lists exactly the leaf groups, never dev; every uv sync opts out of default groups; every uv run uses --no-sync so the run step cannot re-resolve. Composition contract: a separate dev-install job installs the composed group in isolation and confirms every leaf tool (pytest, ruff, mypy, mkdocs) is on PATH afterwards.

What this unlocks is a head-to-head comparison story that does not have to hedge. Both tools support per-group installation in a single command (uv sync --group <name> --no-default-groups on the uv side, poetry install --only <group> on the Poetry side), but of the two tools as pinned in this article, only uv reads the standardised PEP 735 table directly, and Poetry 1.8's group tables have no equivalent of the include-group composition that step 4 was built around. The CI signal we landed in this step is what makes any cross-tool comparison fair rather than aspirational: the matrix proves the standard-side claim on a fresh runner, so a future Poetry workflow can be measured against the same clean-machine baseline instead of asserted from folklore.

Repository

The state of the code after this step: 49c4cf5

Step 7: Migrating Poetry Groups to PEP 735 with a Caret-Aware Converter and a PASS/FAIL Checklist

Step 6 closed the loop on the source contract: the matrix workflow proved on a clean Ubuntu runner that uv sync --group <name> --no-default-groups installs each PEP 735 leaf group in isolation, and the dev-install job confirmed include-group composition still pulls every leaf tool when the composed group is requested directly. What the six previous steps have not given the reader is a tool that turns their own Poetry project into a PEP 735 project: every conversion so far was hand-authored and lived in our pyproject.toml.

This step ships that tool. A new groupcompare.migrate module reads [tool.poetry.group.*] tables, translates each Poetry version constraint to PEP 508 (caret and tilde get the right PEP 440 ranges, exact and wildcard pass through, table-shaped entries become markers), emits an equivalent [dependency-groups] block with dev rebuilt out of include-group references, and exposes a verification checklist whose human-readable report prints [PASS] for every line when the conversion is structurally sound. The reader runs the checklist, watches every line say PASS, and only then deletes the Poetry tables — drift between the two views can no longer hide.

Setup

The source tree from step 6 stays put. We add one module, append a small block of re-exports to __init__.py, and write one new test file:

codebase/src/groupcompare/migrate.py — three concerns in one module. (1) Constraint conversion: caret_to_pep440, tilde_to_pep440, plus a _bound_constraint helper for the pass-through cases. (2) Group migration: migrate_group for a single mapping, migrate_groups to walk every Poetry group + rebuild dev from include-group entries, render_dependency_groups to format the result as TOML. (3) Verification: a ChecklistItem dataclass plus verification_checklist, checklist_passes, and checklist_report that compare the live Poetry view against the live PEP 735 view and surface the diff as PASS / FAIL lines.
codebase/src/groupcompare/__init__.py — re-export the new public surface (caret_to_pep440, tilde_to_pep440, poetry_to_pep508, migrate_group, migrate_groups, render_dependency_groups, verification_checklist, checklist_passes, checklist_report, ChecklistItem) so callers can from groupcompare import migrate_groups next to the existing helpers.
codebase/tests/test_migrate.py — twenty-six tests covering caret + tilde edge cases, the pass-through PEP 440 range, wildcard erasure, table-with-marker shape, group migration including the dev rebuild, TOML rendering that round-trips through tomllib, and the checklist on both the live pyproject.toml and three failing synthetic fixtures.

No edits to pyproject.toml. No new runtime dependency: the module relies on the stdlib re, pathlib, and dataclasses, plus the tomllib reader already pulled in via groupcompare.poetry_layout. Adding a heavier converter would defeat the point — the article is about dependency groups, and a hundred lines of regex is enough for the constraint shapes Poetry actually ships.

Implementation

The constraint conversion is the load-bearing piece, because every other concern in the module funnels through it. Poetry's ^X.Y and ~X.Y shorthands are not valid PEP 440, so a literal copy from Poetry to PEP 735 produces specifiers that a PEP 508 parser rejects, and the tempting near-miss of rewriting ~1.6 as ~=1.6 silently widens the range from <1.7 to <2.0. Two compiled regexes, one for caret and one for tilde, pin down what we accept; everything else is normalised through a single _bound_constraint choke point:

_CARET_RE = re.compile(r"^\^(\d+)(?:\.(\d+))?(?:\.(\d+))?$")
_TILDE_RE = re.compile(r"^~(\d+)(?:\.(\d+))?(?:\.(\d+))?$")


def caret_to_pep440(constraint: str) -> str:
    match = _CARET_RE.match(constraint)
    if match is None:
        raise ValueError(f"not a caret constraint: {constraint!r}")
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) is not None else 0
    patch_raw = match.group(3)
    floor_parts = [str(major), str(minor)]
    if patch_raw is not None:
        floor_parts.append(patch_raw)
    floor = ".".join(floor_parts)
    ceiling = _caret_ceiling(major, minor, patch_raw)
    return f">={floor},<{ceiling}"

The non-obvious part is the ceiling rule, which Poetry inherits from Cargo: ^1.6 bumps to <2.0, but ^0.6 bumps to <0.7, and ^0.0.3 bumps to <0.0.4. Major-zero versions are treated as unstable, so the left-most non-zero component is the one that stays pinned, and the ceiling is its next value. A helper called _caret_ceiling keeps that logic in one place; the public function stays under fifteen lines and never nests an if more than one level deep. One edge to check by hand: for the all-zero shorthands ^0 and ^0.0 the helper as written returns <0.0.1, whereas Poetry's documentation lists <1.0.0 and <0.1.0 for those two forms.

def _caret_ceiling(major: int, minor: int, patch_raw: Optional[str]) -> str:
    if major > 0:
        return f"{major + 1}.0"
    if minor > 0:
        return f"0.{minor + 1}"
    patch = int(patch_raw) if patch_raw is not None else 0
    return f"0.0.{patch + 1}"
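The same ceiling rule can also be phrased as "find the pivot component, then increment it". The sketch below is an alternative formulation, not the module's code: caret_bounds and _dotted are hypothetical names, and the two-component padding is an assumption chosen to match the output style used in this article. Because it works from the components that were actually written, it also produces the ranges Poetry documents for the all-zero forms ^0 and ^0.0.

```python
import re
from typing import List

_CARET_RE = re.compile(r"^\^(\d+)(?:\.(\d+))?(?:\.(\d+))?$")


def _dotted(parts: List[int]) -> str:
    """Join version parts with dots, padding to at least MAJOR.MINOR."""
    padded = parts + [0] * (2 - len(parts))
    return ".".join(str(p) for p in padded)


def caret_bounds(constraint: str) -> str:
    """Translate a Poetry caret constraint into a PEP 440 range."""
    match = _CARET_RE.match(constraint)
    if match is None:
        raise ValueError(f"not a caret constraint: {constraint!r}")
    # Keep only the components the author actually wrote.
    parts = [int(g) for g in match.groups() if g is not None]
    # Pivot: the left-most non-zero component, or the last written
    # component when every written component is zero (^0, ^0.0, ^0.0.0).
    pivot = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    ceiling = parts[:pivot] + [parts[pivot] + 1]
    return f">={_dotted(parts)},<{_dotted(ceiling)}"


for spec in ("^1.6", "^0.6", "^0.0.3", "^0.0", "^0"):
    print(spec, "->", caret_bounds(spec))
# ^1.6 -> >=1.6,<2.0
# ^0.6 -> >=0.6,<0.7
# ^0.0.3 -> >=0.0.3,<0.0.4
# ^0.0 -> >=0.0,<0.1
# ^0 -> >=0.0,<1.0
```

Tracking which components were written, instead of defaulting a missing minor to zero up front, is what lets ^0 and ^0.0 be told apart.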

The tilde rule is simpler but still asymmetric: the ceiling bumps the minor whenever a minor is written, and the major otherwise. ~1 (major only) bumps the major; ~1.6 and ~1.6.4 both bump the minor. Anything that doesn't match the regex raises so the caller has to look at it by hand instead of getting a silently-wrong migration. A dispatcher then routes a single Poetry entry to the right strategy:

def poetry_to_pep508(name: str, constraint: Any) -> str:
    if isinstance(constraint, str):
        bound = _bound_constraint(constraint)
        return f"{name}{bound}"
    if isinstance(constraint, Mapping):
        return _poetry_table_to_spec(name, constraint)
    raise ValueError(f"unsupported Poetry constraint shape: {constraint!r}")
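The article does not print tilde_to_pep440, so here is a minimal sketch of the tilde rule as described above. The function name tilde_bounds and the exact formatting of the floor are assumptions for illustration; only the ranges themselves (minor bump when a minor is written, major bump otherwise) come from Poetry's documented behaviour.

```python
import re

_TILDE_RE = re.compile(r"^~(\d+)(?:\.(\d+))?(?:\.(\d+))?$")


def tilde_bounds(constraint: str) -> str:
    """Translate a Poetry tilde constraint into a PEP 440 range."""
    match = _TILDE_RE.match(constraint)
    if match is None:
        raise ValueError(f"not a tilde constraint: {constraint!r}")
    major = int(match.group(1))
    minor_raw, patch_raw = match.group(2), match.group(3)
    if minor_raw is None:
        # Major only: ~1 allows anything below the next major.
        return f">={major}.0,<{major + 1}.0"
    minor = int(minor_raw)
    floor = f"{major}.{minor}"
    if patch_raw is not None:
        floor = f"{floor}.{patch_raw}"
    # A minor was written, so the minor is the component that bumps.
    return f">={floor},<{major}.{minor + 1}"


for spec in ("~1", "~1.6", "~1.6.4"):
    print(spec, "->", tilde_bounds(spec))
# ~1 -> >=1.0,<2.0
# ~1.6 -> >=1.6,<1.7
# ~1.6.4 -> >=1.6.4,<1.7
```

Note that Poetry's ~1.6 is narrower than PEP 440's ~=1.6, which allows everything below 2.0, so the two operators are not interchangeable.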

Poetry's table form — tomli = { version = "^2.0", python = "<3.11" } — translates to a PEP 508 marker on the right-hand side of a semicolon: tomli>=2.0,<3.0; python_version < '3.11'. _PYTHON_MARKER_RE pins the comparator + the version literal so we can quote the version when the marker is rendered, which is what PEP 508 demands. Wildcard (*) collapses to a bare name, exact and pre-bounded >=X,<Y strings pass through untouched, and anything else raises.
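The marker step can be sketched on its own. The pattern below is an assumed stand-in for _PYTHON_MARKER_RE, whose actual definition is not shown in this article, and python_marker and attach_marker are hypothetical helper names. The sketch takes a version spec that has already been converted to PEP 440 and appends the quoted python_version marker.

```python
import re
from typing import Optional

# Assumed pattern: a comparison operator followed by a dotted version literal.
_PYTHON_MARKER_RE = re.compile(r"^\s*(<=|>=|==|!=|<|>)\s*(\d+(?:\.\d+)*)\s*$")


def python_marker(constraint: str) -> str:
    """Render a Poetry `python = "<3.11"` value as a PEP 508 marker."""
    match = _PYTHON_MARKER_RE.match(constraint)
    if match is None:
        # Caret or multi-clause python constraints need a human decision.
        raise ValueError(f"unsupported python constraint: {constraint!r}")
    comparator, version = match.groups()
    # PEP 508 markers compare against quoted string literals.
    return f"python_version {comparator} '{version}'"


def attach_marker(spec: str, python: Optional[str]) -> str:
    """Append the marker after a semicolon, or return the spec unchanged."""
    if python is None:
        return spec
    return f"{spec}; {python_marker(python)}"


print(attach_marker("tomli>=2.0,<3.0", "<3.11"))
# tomli>=2.0,<3.0; python_version < '3.11'
print(attach_marker("tomli>=2.0,<3.0", None))
# tomli>=2.0,<3.0
```

Listing the two-character operators before the one-character ones in the alternation keeps `<=` from being read as `<` followed by a stray `=`.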

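migrate_groups walks [tool.poetry.group.*] once. Every group except dev becomes a leaf list of converted PEP 508 specs. Poetry's own dev dependencies are never read, so a package listed only under dev and in no leaf group would be dropped; move any such package into a leaf before migrating. dev itself is then rebuilt from scratch as a sorted list of {"include-group": name} entries, because the whole point of PEP 735 over Poetry is that dev should compose the leaves rather than redundantly enumerate them: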

def migrate_groups(path: Optional[Path] = None) -> Dict[str, List[Any]]:
    poetry = _poetry_table(path).get("group", {})
    leaf_groups: Dict[str, List[Any]] = {}
    for name, body in poetry.items():
        if name == "dev":
            continue
        leaf_groups[name] = migrate_group(body.get("dependencies", {}))
    members = sorted(leaf_groups.keys())
    leaf_groups["dev"] = [{"include-group": m} for m in members]
    return leaf_groups

render_dependency_groups then turns that dictionary into TOML by hand, because the stdlib only ships tomllib (reader) not tomli_w (writer), and adding a writer dependency just to emit ten lines of static text would be overkill. The renderer handles two entry shapes — plain strings and the include-group inline table — and raises on anything else, which means a future regression that introduces a third shape fails loudly at render time instead of producing a malformed pyproject.toml.

The verification checklist is what makes the script trustworthy. It's not enough to convert; the reader needs an objective signal that the conversion preserved the dependency set before they delete the Poetry tables. verification_checklist walks three global invariants plus one per-leaf check, and packages each result as a frozen ChecklistItem carrying a passed flag and a diagnostic detail string:

@dataclass(frozen=True)
class ChecklistItem:
    name: str
    passed: bool
    detail: str = ""


def verification_checklist(
    path: Optional[Path] = None,
) -> List[ChecklistItem]:
    poetry_leaves = sorted(g for g in declared_groups(path) if g != "dev")
    pep_leaves = sorted(g for g in pep735_groups(path) if g != "dev")
    items: List[ChecklistItem] = [
        _check_leaf_group_names(poetry_leaves, pep_leaves),
        _check_dev_uses_includes_only(path),
        _check_dev_includes_match_leaves(path, pep_leaves),
    ]
    items.extend(_check_each_leaf_membership(path, poetry_leaves, pep_leaves))
    return items

The three globals are that the leaf-group names line up between Poetry and PEP 735, that the new dev is composed exclusively of include-group entries (never raw strings), and that the includes inside dev cover every leaf exactly once. Then comes one per-leaf membership check: for each name that appears in both views, the package names declared inside Poetry's group must equal the package names extracted from the PEP 735 resolution. checklist_report renders each item as [PASS] <name> or [FAIL] <name> followed by a detail line, which gives the reader a printable artifact they can paste into a commit message before deleting the Poetry tables.
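The per-leaf membership check and the report format can be sketched together. ChecklistItem matches the dataclass shown above; check_membership and report are hypothetical stand-ins for the private helpers and checklist_report, and the item names and detail wording are assumptions rather than the module's exact output.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ChecklistItem:
    name: str
    passed: bool
    detail: str = ""


def check_membership(
    group: str, poetry_names: Iterable[str], pep_names: Iterable[str]
) -> ChecklistItem:
    """Compare the package names one leaf group declares in each view."""
    missing = sorted(set(poetry_names) - set(pep_names))
    extra = sorted(set(pep_names) - set(poetry_names))
    passed = not missing and not extra
    detail = "" if passed else f"missing={missing} extra={extra}"
    return ChecklistItem(name=f"membership:{group}", passed=passed, detail=detail)


def report(items: Iterable[ChecklistItem]) -> str:
    """Render one [PASS]/[FAIL] line per item, plus an indented detail line."""
    lines: List[str] = []
    for item in items:
        tag = "PASS" if item.passed else "FAIL"
        lines.append(f"[{tag}] {item.name}")
        if item.detail:
            lines.append(f"       {item.detail}")
    return "\n".join(lines)


items = [
    check_membership("test", ["pytest"], ["pytest"]),
    check_membership("lint", ["ruff", "mypy"], ["ruff"]),
]
print(report(items))
# [PASS] membership:test
# [FAIL] membership:lint
#        missing=['mypy'] extra=[]
print(all(item.passed for item in items))  # False
```

Reporting both directions of the set difference matters here: a package missing from the PEP 735 side and a package that only exists there are different migration mistakes.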

Verification

Run pytest from codebase/. All six test files must stay green together — step 1's scaffold, step 2's Poetry layout reader, step 3 + 4's PEP 735 reader, step 5's lockfile diff, step 6's workflow reader, and the new migration suite:

python -m pytest

============================= test session starts ==============================
platform darwin -- Python 3.12.5, pytest-8.4.2, pluggy-1.6.0
rootdir: codebase
configfile: pyproject.toml
testpaths: tests
collected 100 items

tests/test_ci_workflows.py ........................                      [ 24%]
tests/test_lock_compare.py ..................                            [ 42%]
tests/test_migrate.py ..........................                         [ 68%]
tests/test_pep735_layout.py .................                            [ 85%]
tests/test_poetry_layout.py ..........                                   [ 95%]
tests/test_scaffold.py .....                                             [100%]

============================= 100 passed in 0.88s ==============================

One hundred passing tests, twenty-six of them new. Seven cover the caret + tilde regexes and their rejection of mismatched input, six pin down poetry_to_pep508 against the constraint shapes the article actually uses (caret string, pass-through PEP 440 range, wildcard, table with marker, table without marker, unsupported type), three exercise migrate_groups against the live pyproject.toml, four round-trip the rendered TOML back through tomllib, and the remaining six drive the checklist against the live project and three synthetic fixtures that each fail one invariant at a time.

What we built

migrate.py is now a thirteen-function module that any reader can drop into a real Poetry project. The constraint converter handles every shape we have encountered in the wild: caret with the major-zero quirk, tilde with its minor-or-major bump, plain PEP 440 ranges, wildcard, and the {version = ..., python = ...} table. It raises on anything else, so an unsupported shape fails loudly instead of being migrated incorrectly.

The group migrator walks the existing Poetry tables and produces a [dependency-groups] block with the leaf groups carried over verbatim (just translated) and the dev group rebuilt from include-group references. That single change — switching from hand-copied membership to composed membership — is the entire qualitative gap between Poetry's group model and PEP 735 that the article has been circling around, and the script materialises it in a form the reader can apply to their own repository in one shot.

The verification checklist is the safety harness. Four invariants — leaf names match, dev uses only includes, dev covers every leaf, each leaf has the same package set on both sides — print as PASS / FAIL lines with diagnostic detail strings. A reader who runs checklist_report() and sees nothing but [PASS] lines has objective evidence that the conversion preserved the dependency set, which is the prerequisite for the destructive next step of deleting [tool.poetry.group.*] from pyproject.toml.

This is the last code-bearing step in the series. With the migration script, the verification checklist, the lockfile parity reader from step 5, and the CI matrix from step 6, the article now ships a self-contained kit: a reader can clone the companion repo, point the migrator at their own pyproject.toml, watch the checklist print PASS, generate a uv.lock next to their existing poetry.lock, confirm the two agree on every package version, and commit the workflow that proves group isolation on a clean machine. The Poetry-to-PEP-735 transition stops being a leap of faith and becomes a sequence of mechanical checks.

Repository

The state of the code after this step: 63ac540

Repository

Full source at https://github.com/vytharion/uv-dependency-groups-pep-735-vs-poetry.

Walk the lessons by stepping through the git commits in the repo — each major step has its own commit you can git checkout and rerun.