Performance

This document describes the performance characteristics of proxmox-sdk and the optimisations applied to startup time, per-request throughput, and the SDK hot paths. It focuses on the v0.0.8 performance work: generated route metadata, lazy Pydantic model shards, SQLite/WAL mock state, optional faster JSON serialization, and the decision to keep native acceleration at the ASGI server boundary through the Granian Docker image.

The short version: proxmox-sdk stays Python-first internally. The largest wins came from doing less work at startup and from changing the mock-state storage shape. A custom Rust/PyO3 extension would add packaging and maintenance cost without addressing the largest bottlenecks that were measured.

v0.0.8 Performance Summary

Area	Change	Result
Route registration	Load generated `route_metadata.json` instead of rebuilding topology from OpenAPI	Startup avoids repeated topology, path, and child-relationship derivation
Pydantic models	Lazy-load `models/<group>.py` shards using `model_index.json`	App import no longer executes the full aggregate `pydantic_models.py` module
Mock state	Default to SQLite rows in WAL mode	Avoids whole-state JSON serialization under a process-wide lock
Serialization	Use `orjson` when installed, stdlib `json` otherwise	Faster payload blob encoding without custom native code
Server runtime	Keep using the Granian Docker variant for Rust ASGI/TLS	Native HTTP server gains without changing package internals
CI guards	Verify metadata/model indexes for every supported schema tag	Prevents stale generated artifacts from shipping

Startup Time

Lazy package imports

Running import proxmox_sdk no longer constructs a FastAPI app or imports the SDK. All top-level exports are resolved on first attribute access via a module-level __getattr__. This means:

CLI tools that import proxmox_sdk for __version__ or SDK classes start without paying the FastAPI app construction cost.
The app and mock_app attributes are only materialised when accessed (e.g. by uvicorn).
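
The lazy-export mechanism described above can be sketched with a module-level __getattr__ (PEP 562). This is a minimal illustration, not the package's actual code: make_lazy_module, LAZY_EXPORTS, and the stdlib targets are hypothetical stand-ins for the real export table.

```python
import importlib
import types

# Hypothetical export table: public name -> (module to import, attribute to fetch).
# Stdlib targets stand in for the real SDK and app modules.
LAZY_EXPORTS = {
    "dumps": ("json", "dumps"),
    "OrderedDict": ("collections", "OrderedDict"),
}


def make_lazy_module(name: str, exports: dict[str, tuple[str, str]]) -> types.ModuleType:
    """Build a module whose exports are imported on first attribute access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        # Python calls this only when normal attribute lookup on the module fails.
        try:
            target_module, target_attr = exports[attr]
        except KeyError:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}") from None
        value = getattr(importlib.import_module(target_module), target_attr)
        setattr(module, attr, value)  # cache so later lookups bypass __getattr__
        return value

    module.__getattr__ = __getattr__
    return module


lazy = make_lazy_module("lazy_demo", LAZY_EXPORTS)
```

Nothing is imported until an attribute such as lazy.dumps is first touched, and the resolved value is then cached on the module so the hook is not consulted again for that name.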

Generated route metadata

675 operations across 449 Proxmox VE 9.2 paths are registered at startup. Runtime registration now prefers pre-generated route_metadata.json artifacts, so it does not rebuild topology, parameter maps, mounted paths, model names, and child collection relationships from OpenAPI on every app import.

Step	Before	After
Child path discovery	O(P²) scan across every path pair	O(P) single-pass index during codegen
Schema fingerprint	`model_dump` + `json.dumps` + `sha256` on every call	Computed once, cached as `@cached_property`
Route topology	Derived from OpenAPI during startup	Loaded from `route_metadata.json`
Generated Pydantic models	Full module generated and executed during startup	Route-group model shards lazy-load on first use
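
The child path discovery row can be illustrated as follows. This is a sketch of the general single-pass technique, assuming that a child is any path whose direct parent is also a known path; build_child_index is a hypothetical name and the real codegen may define the relationship differently.

```python
import posixpath
from collections import defaultdict


def build_child_index(paths: list[str]) -> dict[str, list[str]]:
    """Map each known path to its direct children in one pass over the paths."""
    known = set(paths)  # O(1) membership checks replace the pairwise scan
    children: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        parent = posixpath.dirname(path)
        if parent in known:
            children[parent].append(path)
    return dict(children)


index = build_child_index(
    ["/nodes", "/nodes/{node}", "/nodes/{node}/qemu", "/version"]
)
```

Each path is visited once and its parent is found by a set lookup, so the work grows linearly with the number of paths instead of with the number of path pairs.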

The codegen pipeline writes three runtime artifacts per schema version:

route_metadata.json — mounted paths, operation IDs, parameter metadata, and mock topology.
model_index.json — operation-to-model and operation-to-shard lookup.
models/<group>.py — lazy Pydantic model shards grouped by first API path segment.

The runtime loading path is:

register_generated_proxmox_mock_routes() loads the bundled OpenAPI document for the selected PROXMOX_MOCK_SCHEMA_VERSION.
It calls load_route_metadata(version_tag).
If route_metadata.json is present, route registration iterates the metadata routes and builds lightweight FastAPI dispatcher closures.
Each dispatcher uses the precomputed mounted path, operation ID, parameter metadata, request schema, response schema, and mock topology.
The generated OpenAPI document is merged directly into /openapi.json.
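
The dispatcher-closure step can be sketched without FastAPI. The JSON layout, make_dispatcher, and build_dispatch_table below are assumptions made for illustration; the real route_metadata.json format and registration code are not shown in this document.

```python
import json
from typing import Any, Callable

# Hypothetical excerpt of a metadata file; field names are assumptions.
ROUTE_METADATA = json.loads("""
{
  "routes": [
    {"method": "GET", "path": "/api2/json/nodes", "operation_id": "get_nodes"},
    {"method": "GET", "path": "/api2/json/version", "operation_id": "get_version"}
  ]
}
""")

Dispatcher = Callable[[dict[str, Any]], dict[str, Any]]


def make_dispatcher(route: dict[str, Any]) -> Dispatcher:
    """Return a closure bound to one route's precomputed metadata."""
    operation_id = route["operation_id"]  # captured once, at registration time

    def dispatch(params: dict[str, Any]) -> dict[str, Any]:
        # A real handler would consult mock state; this only echoes its inputs.
        return {"operation_id": operation_id, "params": params}

    return dispatch


def build_dispatch_table(metadata: dict[str, Any]) -> dict[tuple[str, str], Dispatcher]:
    """Iterate the metadata routes and build one lightweight closure per route."""
    return {
        (route["method"], route["path"]): make_dispatcher(route)
        for route in metadata["routes"]
    }


table = build_dispatch_table(ROUTE_METADATA)
```

Using a factory function gives each closure its own captured operation_id, which avoids the late-binding mistake of defining the handler directly inside the loop.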

FastAPI routes are registered as lightweight request dispatchers and excluded from automatic schema generation. The application merges the bundled generated OpenAPI paths into /openapi.json, preserving Swagger/ReDoc detail without importing every generated model at startup.

If metadata is missing, the route builder falls back to the older OpenAPI walk. That fallback is intentionally retained for development and compatibility, but the shipped package and CI-tested schemas use the metadata path.

Lazy Pydantic model shards

The aggregate pydantic_models.py file still ships for compatibility, but it is no longer the primary startup path. The codegen pipeline also writes models/<group>.py files and model_index.json:

proxmox_sdk/generated/proxmox/latest/
├── openapi.json
├── pydantic_models.py       # compatibility aggregate
├── route_metadata.json      # route registration input
├── model_index.json         # operation -> shard/model lookup
└── models/
    ├── access.py
    ├── cluster.py
    ├── nodes.py
    ├── pools.py
    ├── storage.py
    └── version.py

When a route needs a request or response model, the runtime asks load_operation_model(version_tag, operation_id, role). That function reads the operation entry from model_index.json, imports only the matching models/<group>.py shard, rebuilds models in that module, and returns the specific class. Requests to /api2/json/nodes therefore import the nodes shard and do not import the aggregate module.

The behavior is covered by tests/mock/test_routes.py, which asserts that loading get_nodes imports proxmox_sdk._generated_models.<tag>.nodes and does not import the aggregate module.
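
The shard lookup can be sketched as below. This version takes a directory instead of a version tag, uses plain classes in place of Pydantic models, and omits the model rebuild step; the index field names and the demo_shard_ module prefix are assumptions, not the package's real layout.

```python
import importlib.util
import json
import sys
import tempfile
from pathlib import Path


def load_operation_model(root: Path, operation_id: str, role: str) -> type:
    """Import only the shard that defines the requested model class."""
    index = json.loads((root / "model_index.json").read_text())
    entry = index[operation_id]
    module_name = f"demo_shard_{entry['shard']}"  # hypothetical module naming
    module = sys.modules.get(module_name)
    if module is None:
        spec = importlib.util.spec_from_file_location(
            module_name, root / "models" / f"{entry['shard']}.py"
        )
        module = importlib.util.module_from_spec(spec)
        sys.modules[module_name] = module  # cache: later calls skip the import
        spec.loader.exec_module(module)
    return getattr(module, entry[role])


# Demo fixture: two shards on disk, only one of which should be imported.
root = Path(tempfile.mkdtemp())
(root / "models").mkdir()
(root / "models" / "nodes.py").write_text("class GetNodesResponse:\n    pass\n")
(root / "models" / "access.py").write_text("class GetAccessResponse:\n    pass\n")
(root / "model_index.json").write_text(json.dumps({
    "get_nodes": {"shard": "nodes", "response": "GetNodesResponse"},
    "get_access": {"shard": "access", "response": "GetAccessResponse"},
}))

model = load_operation_model(root, "get_nodes", "response")
```

After the call, only the nodes shard has been executed; the access shard stays untouched until an operation that needs it is requested.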

Per-Request Throughput (Mock Mode)

SQLite/WAL state store

The default mock state backend is now SQLiteMockStore, selected by PROXMOX_MOCK_STORE=sqlite or by leaving the variable unset. It stores objects, collection members, and tombstones as separate SQLite rows instead of serialising the entire mock state into one shared-memory JSON blob for every read/write.

Configuration:

PROXMOX_MOCK_STORE=sqlite              # default
PROXMOX_MOCK_STATE_PATH=/tmp/mock.db   # optional explicit DB path
PROXMOX_MOCK_STORE=shared-memory       # legacy shared-memory backend
PROXMOX_MOCK_STORE=dict                # process-local test fallback

The store opens SQLite in WAL mode with one connection per process, and it still resets state when the loaded schema fingerprint changes. Instead of storing one global JSON object, it stores separate rows for:

scalar/object resources
collection members
tombstones for deleted resources
metadata, including the active schema fingerprint

This matters more than Python-versus-Rust serialization for realistic mock state sizes. Whole-state JSON serialization under a lock makes every request pay for unrelated mock data. Row-level storage keeps reads and writes scoped to the resource path being accessed.

Payload blobs use orjson when it is installed, falling back to the standard json module otherwise. This is the preferred native-speed JSON path before considering custom PyO3 serialization.
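
The optional-dependency pattern looks roughly like this. dumps_blob and loads_blob are hypothetical names; the sketch normalises both code paths to bytes because orjson.dumps returns bytes while json.dumps returns str.

```python
import json
from typing import Any

try:
    import orjson  # optional accelerator; may not be installed
except ImportError:
    orjson = None


def dumps_blob(value: Any) -> bytes:
    """Encode a payload to bytes, preferring orjson when it is available."""
    if orjson is not None:
        return orjson.dumps(value)
    return json.dumps(value, separators=(",", ":")).encode("utf-8")


def loads_blob(blob: bytes) -> Any:
    """Decode a payload produced by dumps_blob."""
    if orjson is not None:
        return orjson.loads(blob)
    return json.loads(blob)
```

Callers see the same bytes-in, bytes-out contract either way, so installing orjson changes speed but not behaviour for ordinary JSON-compatible payloads.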

Shared-memory locking

The legacy shared-memory backend remains available. It keeps state in a shared-memory segment that is protected by a filesystem lock (fcntl.flock).

Operation	Before	After
Read (GET)	`LOCK_EX` — serialises all readers	`LOCK_SH` — concurrent reads allowed
Write (POST/PUT/DELETE)	`LOCK_EX`	`LOCK_EX` (unchanged)

Concurrent GET requests no longer block each other.
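
The difference between the two lock modes can be sketched with fcntl.flock. This is a Unix-only illustration; state_lock and the lock file path are hypothetical, and the real backend's lock handling is not shown in this document.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def state_lock(lock_path: str, *, exclusive: bool):
    """Hold a shared (read) or exclusive (write) flock for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


lock_path = os.path.join(tempfile.mkdtemp(), "state.lock")

# Two readers can hold LOCK_SH at the same time without blocking each other.
with state_lock(lock_path, exclusive=False):
    with state_lock(lock_path, exclusive=False):
        readers_overlapped = True
```

A writer requesting LOCK_EX still waits until every shared holder has released, so writes keep their exclusive view of the state.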

Deleted-item tracking

The set of deleted keys was stored as a JSON list (O(n) membership checks). It is now materialised as a Python set during each request (O(1) checks) and serialised back to a list only when writing state.
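
A minimal sketch of that round trip, assuming the deleted keys live under a hypothetical "deleted" field of the serialised state:

```python
import json


def load_deleted(raw_state: str) -> set[str]:
    """Materialise the stored list as a set for O(1) membership checks."""
    return set(json.loads(raw_state).get("deleted", []))


def dump_deleted(deleted: set[str]) -> str:
    """Serialise back to a list; sorting keeps the output deterministic."""
    return json.dumps({"deleted": sorted(deleted)})


deleted = load_deleted('{"deleted": ["/nodes/pve1/qemu/100"]}')
deleted.add("/nodes/pve1/qemu/101")
serialised = dump_deleted(deleted)
```

JSON has no set type, which is why the conversion happens at the read and write boundaries and the set exists only in memory.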

SDK Hot Paths

URL construction (`https.py`)

_url_for() is called on every API request to join the base URL with the request path. Previously it re-parsed self._base_url with urlsplit/urlunsplit on every call.

Now the components (scheme, netloc, base_path) are parsed once in __init__ and reused:

# __init__
_parsed = urlsplit(self._base_url)
self._base_scheme = _parsed.scheme
self._base_netloc = _parsed.netloc
self._base_path = _parsed.path or "/"

# _url_for (called per-request)
joined = posixpath.join(self._base_path, path.lstrip("/"))
return urlunsplit((self._base_scheme, self._base_netloc, joined, "", ""))

posixpath is now imported once at module level, rather than through an import statement inside the method that ran on every call.

Path joining (`resource.py`)

_url_join() is called on every attribute navigation step (proxmox.nodes, .qemu, etc.). A fast path now avoids urlsplit/urlunsplit for plain path strings (no ://), which is the common case for SDK usage:

def _url_join(base: str, *args: str) -> str:
    if "://" not in base:
        return posixpath.join(base or "/", *[str(a) for a in args])
    # full URL parsing only for absolute URLs
    ...

None filtering (`resource.py`)

_filter_none() is called once per HTTP method call to strip None values from params/data dicts. A fast path skips the dict comprehension when no None values are present (common for well-formed requests) and returns the input dict itself rather than a copy:

def _filter_none(d: dict) -> dict:
    if all(v is not None for v in d.values()):
        return d  # avoid allocation
    return {k: v for k, v in d.items() if v is not None}

Task Polling

Tasks.blocking_status() previously polled at a fixed interval (2 s), which meant:

Short tasks could be reported up to 2 s after they had finished.
Long tasks (VM migrations, backups) generated many unnecessary API calls.

It now uses exponential backoff:

1s → 2s → 4s → 8s → 16s → 30s (cap)

Timeout tracking uses time.monotonic() to measure elapsed time against a deadline instead of summing the sleep intervals. The cumulative sum underestimated the actual elapsed time by the duration of each HTTP request.
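
The schedule and the deadline check can be sketched together. backoff_delays and wait_until are hypothetical names, and the injectable sleep and clock parameters exist only to make the sketch testable; this is not the body of Tasks.blocking_status().

```python
import itertools
import time
from typing import Callable, Iterator, Optional


def backoff_delays(initial: float = 1.0, cap: float = 30.0) -> Iterator[float]:
    """Yield 1, 2, 4, 8, 16, then the cap forever."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay *= 2


def wait_until(
    check: Callable[[], Optional[str]],
    timeout: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll check() with backoff until it returns a result or the deadline passes."""
    deadline = clock() + timeout  # monotonic clock: time spent inside check() counts too
    for delay in backoff_delays():
        result = check()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            return None
        sleep(min(delay, remaining))  # never sleep past the deadline
    return None  # unreachable: backoff_delays() is infinite


first_delays = list(itertools.islice(backoff_delays(), 7))
```

Because the deadline is compared against the clock on every iteration, slow status requests shorten the remaining budget instead of silently extending the timeout.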

Request Retry

HttpsBackend.request() supports opt-in automatic retry for transient failures.

When retries occur

Condition	Retried?
HTTP 502, 503, 504	Yes (GET/HEAD only)
`asyncio.TimeoutError`	Yes (GET/HEAD only)
`aiohttp.ClientError` (DNS, refused, etc.)	Yes (GET/HEAD only)
SSL error	Never (not transient)
HTTP 4xx (client error)	Never
POST / PUT / DELETE	Never (not safe to duplicate)

Backoff strategy

Retries use exponential backoff. With retry_backoff=0.5 the delays are:

attempt 1: backoff * 2⁰  → 0.5s
attempt 2: backoff * 2¹  → 1.0s
attempt 3: backoff * 2²  → 2.0s
…                          (capped at 30s)

Default max_retries=0 preserves existing behaviour with no delay or overhead on the hot path.
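
The retry rules and the delay formula can be written out as a small sketch. should_retry and retry_delay are hypothetical helpers that restate the table and schedule above for HTTP status codes only; they are not the internals of HttpsBackend.request().

```python
RETRYABLE_STATUSES = frozenset({502, 503, 504})
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD"})


def should_retry(method: str, status: int) -> bool:
    """Retry only transient gateway errors on methods that are safe to repeat."""
    return method.upper() in IDEMPOTENT_METHODS and status in RETRYABLE_STATUSES


def retry_delay(attempt: int, backoff: float = 0.5, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (1-based): backoff * 2**(attempt - 1), capped."""
    return min(backoff * 2 ** (attempt - 1), cap)


delays = [retry_delay(n) for n in (1, 2, 3)]
```

With the default backoff of 0.5 this reproduces the 0.5 s, 1.0 s, 2.0 s sequence, and the cap only takes effect from the seventh retry onward.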

Configuration

ProxmoxSDK(
    host="pve.example.com",
    user="admin@pam", password="secret",
    max_retries=3,       # 3 retries = 4 total attempts
    retry_backoff=0.5,   # 0.5s, 1.0s, 2.0s
)

Or via environment:

PROXMOX_API_RETRIES=3
PROXMOX_API_RETRY_BACKOFF=0.5

Runtime Server

The project keeps native server acceleration at the ASGI boundary. Use the existing Granian Docker variant when you want a Rust-based HTTP/TLS server without adding PyO3 or native extensions to the SDK internals:

docker run -p 8443:8000 emersonfelipesp/proxmox-sdk:latest-granian

The SDK, route metadata, lazy model loading, and mock state code remain pure Python so source installs and generated artifacts stay simple.

Why not a custom Rust/PyO3 extension?

The measured bottlenecks were not "Python loops are slow" in isolation. They were:

repeated OpenAPI topology work during startup
eager import and execution of the full generated model module
whole-state JSON serialization under locks
ASGI/server overhead that is already handled by Granian

PyO3 would not remove the need to generate route metadata, shard models, or change the mock-state storage shape. It would also introduce native build requirements for a package that currently installs as pure Python. The current policy is therefore:

pre-generate static route metadata where the schema is already known
lazy-load generated Python models by route group
use proven serializers such as orjson or msgspec before custom native code
fix the storage design if state grows beyond SQLite/WAL needs
use Granian for Rust-native ASGI/server wins

Revisit PyO3 only if profiling shows a narrow, stable CPU hotspot that remains after those design issues are addressed and cannot be solved with an existing native dependency.

Config Loading

ProxmoxConfig.from_env() previously copied the entire os.environ dict (50–200+ keys) on every call. It now reads only the ~20 specific keys it needs:

_KEYS = (
    "PROXMOX_API_MODE", "PROXMOX_API_URL", "PROXMOX_API_TOKEN_ID", ...
)
env_config = {k: v for k in _KEYS if (v := os.environ.get(k)) is not None}

yaml (PyYAML) is only imported when a YAML config file is actually present, avoiding a heavy import on every startup.

Dead Code Removed

mock/schema_helpers.py previously contained ~200 lines of private functions (_seed_int, _field_hint, _semantic_string_value, etc.) that duplicated the same logic inside ProxmoxSchemaValue in schema.py. None of the module's public functions called these copies — they all delegated directly to ProxmoxSchemaValue. The dead code was removed, reducing module load time and memory footprint.

Regex Pre-compilation

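The pattern r"(^|_)name$" used in semantic mock value generation was previously passed as a string to re.search(pattern, hint). The re module caches compiled patterns, so this did not recompile the pattern each time, but it did pay for a cache lookup on every call. The pattern is now compiled once at module load and used directly: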

_RE_NAME_HINT = re.compile(r"(^|_)name$")

CI and Packaging Guards

The generated performance artifacts are part of the package contract. They are included through pyproject.toml package data:

[tool.setuptools.package-data]
proxmox_sdk = [
    "generated/proxmox/**/*.json",
    "generated/proxmox/**/*.py",
]

The integrity tests check all supported schema tags (latest, 9.2, and 9.1.11):

openapi.json SHA matches the generated model metadata
route_metadata.json source SHA matches openapi.json
route count and path count match the OpenAPI document
every operation in route_metadata.json is covered by model_index.json
every shard listed by model_index.json exists under models/
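
Checks of this kind can be sketched as below. The field names (source_sha256, routes, operation_id, shard) and the function name are assumptions for illustration; the real checks live in tests/test_generated_integrity.py and may be structured differently.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_integrity_problems(root: Path) -> list[str]:
    """Return a list of human-readable problems; an empty list means the tree is consistent."""
    problems: list[str] = []
    metadata = json.loads((root / "route_metadata.json").read_text())
    index = json.loads((root / "model_index.json").read_text())

    # The metadata must have been generated from the openapi.json sitting next to it.
    if metadata["source_sha256"] != sha256_of(root / "openapi.json"):
        problems.append("route_metadata.json is stale relative to openapi.json")

    # Every registered operation needs a model index entry.
    for route in metadata["routes"]:
        if route["operation_id"] not in index:
            problems.append(f"no model index entry for {route['operation_id']}")

    # Every shard named by the index must exist on disk.
    for operation_id, entry in index.items():
        if not (root / "models" / f"{entry['shard']}.py").is_file():
            problems.append(f"missing shard {entry['shard']}.py for {operation_id}")

    return problems
```

Returning a list of problems rather than failing on the first one makes a stale artifact tree easier to diagnose in a single CI run.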

Run the focused guard suite locally with:

uv run pytest tests/test_generated_integrity.py tests/mock/test_routes.py

The GitHub CI and release workflows run the full test suite across:

PROXMOX_MOCK_SCHEMA_VERSION=latest
PROXMOX_MOCK_SCHEMA_VERSION=9.2
PROXMOX_MOCK_SCHEMA_VERSION=9.1.11

The scheduled/manual schema update workflow also runs proxmox-sdk-codegen. That command writes route_metadata.json, model_index.json, and models/<group>.py for both the package-internal tree and the source-of-record generated/proxmox/ tree.

Performance Characteristics (Updated)

Mock Mode

Metric	Value
Startup (schema load + route registration)	~2.1 s full app import in local smoke test
Route registration only	~0.75 s in local smoke test
Request latency (SQLite/WAL, warm state)	~4 ms via local TestClient smoke test
Concurrent read throughput	SQLite WAL readers avoid whole-state JSON serialisation
Memory (schema + state)	~100 MB

Real Mode

Metric	Value
Startup	~500 ms
Request latency	Proxmox latency + ~10–20 ms validation overhead
Memory	~80 MB

SDK

Operation	Notes
First request	One SSL context build + one session open
Subsequent requests	Connection reused from aiohttp pool
Ticket renewal	Automatic, uses same SSL context as API requests
Task polling	Exponential backoff, accurate monotonic timeout

Profiling Tips

To measure startup time:

python -c "
import time
t0 = time.perf_counter()
from proxmox_sdk.main import app
print(f'app ready in {time.perf_counter() - t0:.3f}s')
"

To measure mock route registration specifically:

python -c "
import time
from proxmox_sdk.mock.routes import register_generated_proxmox_mock_routes
from proxmox_sdk.schema import load_proxmox_generated_openapi
from fastapi import FastAPI

doc = load_proxmox_generated_openapi()
app = FastAPI()
t0 = time.perf_counter()
register_generated_proxmox_mock_routes(app, openapi_document=doc)
print(f'route registration: {time.perf_counter() - t0:.3f}s')
"