Dev container template for projects, with Python, dev containers, and tests
# Agent Instructions
This document defines the development standards for this project. All LLM agents and coding assistants must follow these rules strictly.
## Language
- All code, variable names, function names, class names, and file names must be in English
- Commit messages in English
- This file and README may be in any language
## Code Style
- No unnecessary comments — clean class and function design is preferred over explanation
- Comments are only allowed when something is genuinely non-obvious or required by compliance
- Always use type hints — no untyped functions or variables
- No wildcard imports (`from module import *`) — always import explicitly
- Use dataclasses or Pydantic for data models, never raw dicts
- Never hardcode secrets — use `.env` files and environment variables
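A minimal sketch of these rules in combination (the `ApiConfig` model and `load_api_config` helper are illustrative, not part of the template; `OPENAI_API_KEY` is the env var used elsewhere in this document):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiConfig:
    """Typed data model instead of a raw dict."""

    api_key: str
    timeout_seconds: float = 30.0


def load_api_config() -> ApiConfig:
    # The secret comes from the environment (e.g. via a .env file), never from code.
    return ApiConfig(api_key=os.environ["OPENAI_API_KEY"])
```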
## Project Structure

```
project/
├── src/
│   ├── core/            # Basic functions, data models, shared utilities
│   ├── <module_a>/      # Feature module with clear responsibility
│   └── <module_b>/      # Feature module with clear responsibility
├── tests/
│   ├── unit/            # Unit tests mirroring src/ structure
│   └── integration/     # Integration tests with real external calls
├── .env.example
├── pyproject.toml       # Single source of configuration
└── AGENTS.md
```
## Module Rules
- All modules live under `src/`
- Every module must have a clear, single responsibility
- No circular dependencies — dependency flow must be strictly one-directional
- `core/` is the base layer: it may not import from other modules
- Other modules may import from `core/`, but not from each other unless explicitly justified
## Configuration
- `pyproject.toml` is the single source of truth for all tooling configuration
- No `setup.py`, no standalone `requirements.txt`
- Dependencies managed via `pyproject.toml` (e.g. with `uv` or `pip`)
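A sketch of what such a single-file configuration could look like (project name, version pins, and tool settings are placeholders, not prescribed values):

```toml
[project]
name = "my-project"
requires-python = ">=3.12"
dependencies = ["pydantic", "jinja2", "pyyaml"]

[dependency-groups]
dev = ["ruff", "mypy", "pytest"]

[tool.ruff]
line-length = 100

[tool.mypy]
strict = true
```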
## Linting & Formatting
- Ruff must always be run and all issues must be resolved before committing
- Ruff ignores are only permitted in exceptional cases (e.g. test code) and must be justified inline:
  `result = eval(expr)  # noqa: S307 — required for dynamic expression evaluation in sandbox`
- mypy must pass with no errors
- Pre-commit hooks enforce Ruff + mypy on every commit
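A possible `.pre-commit-config.yaml` enforcing this (the `rev` values are placeholders and should be pinned to current releases):

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0  # placeholder: pin to the current release
    hooks:
      - id: ruff
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.0  # placeholder: pin to the current release
    hooks:
      - id: mypy
```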
## Testing
- Tests mirror the `src/` module structure exactly
- Unit tests cover all functions and core logic in isolation
- External services (APIs, databases) must be mocked
- Integration tests use real external calls (e.g. real LLM API calls) — no mocking
- Fixtures are centralized in `conftest.py`
- Minimum test coverage: 80%
- Tests are named descriptively: `test_<what>_<condition>_<expected_result>`
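A hedged sketch of a unit test following these conventions, with the external LLM service mocked (the `summarize` function is hypothetical, not part of the template):

```python
from unittest.mock import Mock


def summarize(text: str, client: Mock) -> str:
    """Hypothetical function under test: delegates to an LLM client."""
    return client.complete(f"Summarize: {text}")


def test_summarize_with_valid_text_returns_client_output() -> None:
    # The external LLM service is mocked — no real API call in a unit test.
    client = Mock()
    client.complete.return_value = "short summary"

    assert summarize("long text", client) == "short summary"
    client.complete.assert_called_once_with("Summarize: long text")
```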
## Dev Container
- A `devcontainer.json` must be present with all required extensions pre-configured
- The container must be fully self-contained — no manual setup steps after opening
- All tools (Ruff, mypy, pytest) must be available inside the container
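A minimal `devcontainer.json` sketch along these lines (the container name, `postCreateCommand`, and extension selection are assumptions, not the template's actual file):

```json
{
  "name": "python-dev",
  "build": { "dockerfile": "Dockerfile" },
  "postCreateCommand": "uv sync",
  "customizations": {
    "vscode": {
      "extensions": [
        "ms-python.python",
        "charliermarsh.ruff",
        "ms-python.mypy-type-checker"
      ]
    }
  }
}
```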
## Prompt Management
- All prompts are stored as Jinja2 template files (`.j2`) — never inline strings in code
- Prompts live in a dedicated `prompts/` directory, globally accessible across modules
- Use macros and partials (via `{% macro %}` and `{% include %}`) to reuse common prompt building blocks
- Static prompt sections always come first to maximize KV cache efficiency — dynamic content goes at the end
- Prompt structure example:

```
prompts/
├── partials/
│   ├── _system_base.j2    # Shared system prompt base
│   ├── _output_format.j2  # Reusable output format instructions
│   └── _safety.j2         # Shared safety/constraint block
├── macros/
│   └── _common.j2         # Shared macros (e.g. format_examples)
└── tasks/
    ├── classification.j2
    └── summarization.j2
```
- Example prompt structure (static first, dynamic last):

```jinja
{% include 'partials/_system_base.j2' %}
{% include 'partials/_output_format.j2' %}
{# Static context above this line — benefits from KV cache #}
{# Dynamic content below — changes per request #}
User input: {{ user_input }}
Context: {{ context }}
```
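Loading such templates from code could look like the following sketch (the `build_prompt_env` and `render_prompt` helpers are assumptions, not part of the template):

```python
from jinja2 import Environment, FileSystemLoader, StrictUndefined


def build_prompt_env(prompts_dir: str = "prompts/") -> Environment:
    # StrictUndefined makes missing template variables fail loudly instead of
    # rendering empty strings.
    return Environment(loader=FileSystemLoader(prompts_dir), undefined=StrictUndefined)


def render_prompt(env: Environment, task: str, **variables: object) -> str:
    # Task templates live under prompts/tasks/, e.g. tasks/summarization.j2.
    return env.get_template(f"tasks/{task}.j2").render(**variables)
```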
## LLM Retry Strategy
- All LLM calls must be wrapped in a retry block
- Retry is triggered by:
  - LLM API errors (rate limits, timeouts, server errors)
  - Failed output validators (e.g. Pydantic parsing errors, schema violations, custom business logic checks)
- Retry configuration (max attempts, backoff) comes from `settings.yaml` — never hardcoded
- Example pattern:
```python
# Runs inside an async function; settings, call_llm, and validate are defined elsewhere.
for attempt in range(settings.llm.retry.max_attempts):
    try:
        response = call_llm(prompt)
        result = validate(response)  # raises on invalid output
        break
    except (LLMApiError, ValidationError):
        if attempt == settings.llm.retry.max_attempts - 1:
            raise
        await asyncio.sleep(settings.llm.retry.backoff_seconds * (attempt + 1))
```
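The same pattern can be exercised end to end with a synchronous, self-contained sketch (the flaky stub and `RetryConfig` values here are illustrative; in real code the configuration comes from `settings.yaml`):

```python
import time
from collections.abc import Callable
from dataclasses import dataclass


class LLMApiError(Exception):
    """Stand-in for rate limits, timeouts, and server errors."""


@dataclass
class RetryConfig:
    max_attempts: int = 3
    backoff_seconds: float = 0.0  # demo value; real values come from settings.yaml


def call_with_retry(call: Callable[[], str], retry: RetryConfig) -> str:
    for attempt in range(retry.max_attempts):
        try:
            return call()
        except LLMApiError:
            if attempt == retry.max_attempts - 1:
                raise
            time.sleep(retry.backoff_seconds * (attempt + 1))
    raise AssertionError("unreachable")


attempt_log: list[int] = []


def flaky_llm_call() -> str:
    # Fails twice, then succeeds on the third attempt.
    attempt_log.append(1)
    if len(attempt_log) < 3:
        raise LLMApiError("rate limited")
    return "ok"
```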
## Settings
- All configuration lives in `settings.yaml` — no scattered config files
- Settings are loaded and validated via Pydantic (`BaseSettings` or `BaseModel`)
- Sensitive values (API keys etc.) are injected via environment variables, never in `settings.yaml`
- `settings.yaml` may reference env vars: `api_key: ${OPENAI_API_KEY}`
- Settings structure example:
```yaml
llm:
  model: gpt-4o
  temperature: 0.2
  retry:
    max_attempts: 3
    backoff_seconds: 2
prompts:
  dir: prompts/
logging:
  level: INFO
```
- Corresponding Pydantic model:

```python
import yaml
from pydantic import BaseModel


class RetrySettings(BaseModel):
    max_attempts: int = 3
    backoff_seconds: float = 2.0


class LLMSettings(BaseModel):
    model: str
    temperature: float = 0.0
    retry: RetrySettings = RetrySettings()


class Settings(BaseModel):
    llm: LLMSettings
    prompts_dir: str = "prompts/"


def load_settings(path: str = "settings.yaml") -> Settings:
    with open(path) as f:
        data = yaml.safe_load(f)
    return Settings(**data)
- A `settings.yaml.example` must always be committed — never the real `settings.yaml` if it contains sensitive defaults
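One possible way to honor `${OPENAI_API_KEY}`-style references when loading settings (the `expand_env_vars` helper is an assumption, not part of the template; `os.path.expandvars` leaves unknown variables untouched):

```python
import os
from typing import Any


def expand_env_vars(value: Any) -> Any:
    """Recursively expand ${VAR} references in loaded settings data."""
    if isinstance(value, str):
        return os.path.expandvars(value)
    if isinstance(value, dict):
        return {key: expand_env_vars(val) for key, val in value.items()}
    if isinstance(value, list):
        return [expand_env_vars(item) for item in value]
    return value
```

Applied to the parsed YAML dict before constructing `Settings`, this keeps secrets out of the file while still validating them through the model.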
## What to Avoid
- Over-engineering: no abstractions without a clear use case
- Over-documenting: docstrings only for public API functions
- God classes or modules that do too much
- Any pattern that introduces circular imports
- Hardcoded values of any kind