uv run syncs the project dependencies and the tests group on demand, so the default test suite needs no upfront install: just run uv run pytest --numprocesses=auto dimos (pytest-xdist parallelizes across cores). Self-hosted tests need the heavy optional extras (LFS data, perception models, simulation, hardware SDKs, …). Sync them explicitly before running:
uv sync --all-groups              # all dependency groups (tests-self-hosted, lint, …)
uv sync --group tests-self-hosted # just what CI installs on the self-hosted runner

Types of tests

Tests are commonly classified by their goal:
| Type | Description | Mocking | Speed |
|---|---|---|---|
| Unit | Test a small individual piece of code | All external systems | Very fast |
| Integration | Test the integration between multiple units of code | Most external systems | Some fast, some slow |
| Functional | Test a particular desired functionality | Some external systems | Some fast, some slow |
| End-to-end | Test the entire system as a whole from the perspective of the user | None | Very slow |
The distinction between unit, integration, and functional tests is often debated and rarely productive. Rather than waste time on classifying tests, it’s better to separate tests by how they are used:
| Test Group | When to run | Typical usage |
|---|---|---|
| default | after each code change | often run with filesystem watchers so tests rerun whenever a file is saved |
| self-hosted | every once in a while to make sure you haven’t broken anything | maybe every commit, but definitely before publishing a PR |
The purpose of running tests in a loop is to get immediate feedback. The faster the loop, the easier it is to identify a problem since the source is the tiny bit of code you changed. Self-hosted tests are marked with @pytest.mark.self_hosted (they need LFS, ROS, CUDA, or other heavy deps); the default suite is everything else.
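As a minimal sketch of how a test joins the self-hosted group, the marker is applied as a decorator. The test names and bodies below are hypothetical placeholders; only the self_hosted marker name comes from this page.

```python
import pytest


@pytest.mark.self_hosted
def test_detector_loads_weights():
    # Hypothetical self-hosted test: imagine this needs LFS data or CUDA.
    weights = {"backbone": [0.1, 0.2]}  # stand-in for a real model load
    assert "backbone" in weights


def test_vector_addition():
    # No marker, so this belongs to the default suite.
    assert (1 + 2, 3 + 4) == (3, 7)
```

With the default -m filter described under Usage, only the second test would run in the fast loop.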

Usage

Default suite

./bin/pytest-fast
This is the same as:
pytest --numprocesses=auto dimos
The default addopts in pyproject.toml includes a -m filter that excludes self_hosted/mujoco, so plain pytest dimos runs only the default suite; --numprocesses=auto parallelizes across cores via pytest-xdist.

Self-hosted tests

./bin/pytest-slow
This is a shortcut for pytest -m 'not mujoco' dimos, which runs the default suite and the self-hosted tests but not the mujoco tests. When writing or debugging a specific self-hosted test, override -m yourself to run it:
pytest -m self_hosted dimos/path/to/test_something.py

Testing on a fresh Ubuntu install

CI tests dimos with pre-built images and cached deps, so it can’t catch gaps between what installation/ubuntu.md tells a new user to do and what a clean machine actually needs (e.g. a system package we require but forgot to document). The misc/fresh-ubuntu-tests/ harness closes that gap. It replays the documented install + test flow inside a fresh, official, unmodified Ubuntu Desktop 24.04 VM (VirtualBox). It’s intended to be executed locally.
cd misc/fresh-ubuntu-tests

./vmtest.sh build   # download + verify the official ISO, install, snapshot "golden" (once, ~15-30 min)
./vmtest.sh run     # clone golden, run the doc flow, report PASS/FAIL
./vmtest.sh clean   # delete leftover run clones and logs (keeps the ISO + golden VM)

Writing tests

Test files live next to the code they test. If you have dimos/core/pubsub.py, its tests go in dimos/core/test_pubsub.py. When writing tests you probably want to limit the run to whatever tests you’re writing:
pytest -sv dimos/core/test_my_code.py

Fixtures

Pytest fixtures are very useful for making sure test failures don’t affect other tests. Whenever you have something that needs to be cleaned up when the test is over (disconnect, close, delete temp files, etc.), you should use a fixture. Simple example code:
import pytest

class RobotArm:
    def __init__(self, device: str) -> None:
        self.device = device
        self._position = (0.0, 0.0, 0.0)

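    def connect(self) -> None:
        # Stub: a real arm would open the device connection here.
        return None

    def disconnect(self) -> None:
        # Stub: a real arm would release the device here.
        return None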

    def move_to(self, x: float, y: float, z: float) -> None:
        self._position = (x, y, z)

    @property
    def position(self) -> tuple[float, float, float]:
        return self._position

@pytest.fixture
def arm():
    arm = RobotArm(device="/dev/ttyUSB0")
    arm.connect()
    yield arm
    arm.disconnect()

def test_arm_moves_to_position(arm):
    arm.move_to(x=0.5, y=0.3, z=0.1)
    assert arm.position == (0.5, 0.3, 0.1)
The yield is key: everything before it is setup, everything after is teardown. The teardown runs even if the test fails, so you never leak resources between tests.
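For temp files specifically, pytest ships a built-in tmp_path fixture that hands each test its own fresh directory as a pathlib.Path, so you rarely need to write that fixture yourself. A minimal sketch follows; save_pose and load_pose are hypothetical helpers invented for this example, not dimos functions.

```python
import json
from pathlib import Path


def save_pose(path: Path, pose: tuple[float, float, float]) -> None:
    # Hypothetical helper: write a pose to disk as JSON.
    path.write_text(json.dumps(list(pose)))


def load_pose(path: Path) -> tuple[float, float, float]:
    # Hypothetical helper: read the pose back as a tuple.
    x, y, z = json.loads(path.read_text())
    return (x, y, z)


def test_pose_round_trip(tmp_path):
    # tmp_path is a per-test temporary directory provided by pytest.
    pose_file = tmp_path / "pose.json"
    save_pose(pose_file, (0.5, 0.3, 0.1))
    assert load_pose(pose_file) == (0.5, 0.3, 0.1)
```

pytest manages the lifetime of that directory itself, so the test body has nothing to clean up.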

Mocking

The mocker fixture (from pytest-mock) is easier to use than unittest.mock directly. It automatically undoes all patches when the test ends, so you don’t need with blocks. Patching a method:
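from dimos.hardware import RobotArm  # illustrative module path, matching the patch target

def test_uses_cached_position(mocker):
    # Patch the method on the class; mocker restores the original when the test ends.
    mocker.patch("dimos.hardware.RobotArm.get_position", return_value=(0.0, 0.0, 0.0))
    arm = RobotArm()
    assert arm.get_position() == (0.0, 0.0, 0.0)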
There are other useful things in mocker, like mocker.MagicMock() for creating fake objects.

Useful pytest options

| Option | Description |
|---|---|
| -s | Show stdout/stderr output |
| -v | More verbose test names |
| -x | Stop on first failure |
| -k foo | Only run tests matching foo |
| --lf | Rerun only the tests that failed last time |
| --pdb | Drop into the debugger when a test fails |
| --tb=short | Shorter tracebacks |
| --durations=0 | Measure the speed of each test |

Tool files

Dev-only pseudo-tests (the kind that need human interaction or make no assertions) live in tool_*.py files (e.g. dimos/protocol/pubsub/benchmark/tool_benchmark.py). pytest does not collect them during a normal run, because the filename doesn’t match the test_*.py pattern, so the run stays clean. Run one on demand by naming it directly:
pytest -s dimos/path/to/tool_file.py
(-s keeps stdout/stdin open for prints and interactive input; add --timeout=0 for long-running or interactive ones.)
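For illustration, a tool file might look like the sketch below. The file name, function names, and workload are all hypothetical. It assumes the project keeps pytest's default rule that functions must still start with test_ to be collected, even inside a tool_*.py file.

```python
# tool_timing.py (hypothetical file name)
import time


def time_sleep_loop(iterations: int) -> float:
    # Stand-in workload: return the average seconds per iteration.
    start = time.perf_counter()
    for _ in range(iterations):
        time.sleep(0)
    return (time.perf_counter() - start) / iterations


def test_print_loop_timing():
    # No assertions: the number is meant to be read by a human, which is
    # why this lives in a tool_*.py file rather than a test_*.py file.
    average = time_sleep_loop(iterations=1000)
    print(f"average per iteration: {average * 1e6:.1f} us")
```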

Markers

We have a few markers in use now.
  • self_hosted: used to mark tests that need the self-hosted runner (LFS, ROS, CUDA, heavy deps).
  • mujoco: tests which use MuJoCo. These are very slow and don’t work in CI currently.
If a test needs to be skipped for some reason, please use one of the markers below, or add another one:
  • skipif_in_ci: tests which cannot run in GitHub Actions
  • skipif_no_openai: tests which require OPENAI_API_KEY in the env
  • skipif_no_alibaba: tests which require ALIBABA_API_KEY in the env
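If you do add another marker, one common way to wire it up with stock pytest hooks is sketched below for a conftest.py. This is an assumption about approach, not a description of how dimos defines its existing markers; the marker name skipif_no_example_key and the variable EXAMPLE_API_KEY are hypothetical.

```python
import os

import pytest


def pytest_configure(config):
    # Register the marker so pytest knows the name and lists it in `pytest --markers`.
    config.addinivalue_line(
        "markers",
        "skipif_no_example_key: tests which require EXAMPLE_API_KEY in the env",
    )


def pytest_collection_modifyitems(config, items):
    # If the key is present, marked tests run normally.
    if os.environ.get("EXAMPLE_API_KEY"):
        return
    skip = pytest.mark.skip(reason="EXAMPLE_API_KEY is not set")
    for item in items:
        # Only touch tests that carry the (hypothetical) marker.
        if item.get_closest_marker("skipif_no_example_key") is not None:
            item.add_marker(skip)
```

A test then opts in with @pytest.mark.skipif_no_example_key and is reported as skipped, with the reason shown, whenever the variable is missing.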