For Agents

Reverse-chronological session log. Newest entries at top, grouped by date (## YYYY-MM-DD). Each bullet: one piece of work, short summary, wikilinks to docs touched. Updated by obsidian-documenter on every project doc write. Read by historian at bootstrap (top ~15 entries).

2026-05-26

  • Investigation captured (diagnosed, not fixed): ErrorWithStepStatus::log(status, message) returned from any flow step silently drops message at the parent log site. StepResult::log() (mando-lib/src/workflow/mod.rs:150-229) destructures the Log variant with .. (lines 155-159), dropping message; only the generic wrapper string + flow.step.status + flow.step.execution_time reach tracing::error!/warn! (lines 196-225). Display impl (mod.rs:443) is #[error("status: {status}")] — message also dropped from stringification. status_or_error (mod.rs:231-239) collapses LogOk(status), losing it again. Top-level catch at mando-bess/src/workflow/flow.rs:289 only sees "status: Error". Error(anyhow) arm is correctly logged via error! (mod.rs:160-171) — only ::Log is broken. Tests at mod.rs:548-605 assert level + wrapper string only, never message payload — how the regression shipped. Affects all environments. Possible overlap with follow-up commits c945514e, 4c543cb1, 567373a5, 63d6fa69, 8f9297e4 on feature/BE-2272 — diff before patching. Secondary: mando-lib/src/app/dd_formatter.rs:122-124 record_error uses value.to_string() (Display only), but niche path. → flow-step-log-message-dropped-2026-05-26.
  • mando-cli session: triaged Gabi’s 2026-05-25 bug report against v0.4.0 compose-runtime rewrite (commit 6ba0d61). Both reported bugs CONFIRMED real. (1) src/runtime/templates/runconfig/build.yml:11 + build_dev.yml:18 use context: . (Compose resolves relative paths from the compose-file’s parent dir, so context becomes <project>/runconfig/, which has no Dockerfile). Fix: context: .. (resolves to <project>/). (2) Mocked runconfig: mocked.yml only defines <service>-mocks, but up.rs:192 passes bare slug. Reporter’s diagnosis was incomplete: the exact “Must specify either image or build” error originates in render_override (templates.rs:298-325), which emits a malformed <service>: stub per docker_target into .mando/override.builtin.yaml. The stub is normally dormant via profiles: ["{run_tag}"] (templates.rs:317); that’s a load-bearing invariant. Cleanest fix: rename positional arg AND skip mocked entries in override generation. Smoke-test round (8de8f64) missed both: Bug 1 masked by image cache; Bug 2 not exercised with mando-mocked-algos set as default profile against fresh checkout. Fixes not yet committed. → mando-cli-v0.4.0-compose-bugs-triage-2026-05-26.
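
The message-drop mechanism from the ErrorWithStepStatus::log investigation above can be illustrated with a minimal sketch. The types here are hypothetical stand-ins, not the real mando-lib definitions: only the Log variant and its status and message fields come from the entry, and the wrapper text is invented for illustration.

```rust
// Hypothetical stand-in for the step status type (an assumption).
#[derive(Debug, Clone, PartialEq)]
enum Status {
    Warning,
    Error,
}

// Hypothetical stand-in for StepResult; only the `Log` variant is modelled.
#[derive(Debug)]
enum StepResult {
    Log { status: Status, message: String },
}

// Buggy shape: the rest pattern `..` discards `message`, so only the
// generic wrapper text and the status reach the log line.
fn render_dropping(result: &StepResult) -> String {
    match result {
        StepResult::Log { status, .. } => format!("step failed: status: {status:?}"),
    }
}

// One possible fix: bind `message` explicitly and forward it.
fn render_preserving(result: &StepResult) -> String {
    match result {
        StepResult::Log { status, message } => {
            format!("step failed: status: {status:?}, message: {message}")
        }
    }
}
```

A test that asserts on the message payload (rather than only on level and wrapper string) would catch the first shape: render_dropping never contains the original message text, while render_preserving does.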

2026-05-22

  • mando-cli session: documented the local macOS (Apple Silicon) cross-compile recipe for producing a Linux x86_64 / WSL release binary of mando v0.4.0. Target x86_64-unknown-linux-musl (static-pie). Two gotchas captured: (1) Docker pulls the arm64 image on Apple Silicon → ring 0.17 C build fails with cc1: unrecognized command-line option -m64 → fix is --platform linux/amd64; (2) optional query feature has path deps into ../mando/.worktrees/BE-1595/* that Cargo reads during resolution even when disabled → must mount the poc/ parent dir. Verified binary in ubuntu:24.04 + alpine (mando --version → mando cli 0.4.0). Distinct from the CI build mirror; cross-linked both ways. → mando-cli-wsl-linux-build.

2026-05-18

  • Investigation captured (diagnosed, not yet fixed): Calculated and Virtual DPs leak rows past to in all four retrieval methods (retrieve/retrieve_at/retrieve_history/retrieve_client). Root cause in MandoServiceBase::handle_data_point_types (mando-lib/src/service_base.rs:94-159) — Virtual/Calculated branches lack a final filter_data_frame_by_range after Polars transformations. Two leak mechanisms: (A) convert_to_metadata upsampling explodes 1 row → N (convert_resolution.rs:39-95); (B) evaluate_expression Full-join/concat-group_by produces union of dep timestamps (evaluation.rs:62-72, 175-178). EvaluationMetaData.range is plumbed but only consumed by FillMissing. Proposed fix: trim per-DP at final Virtual/Calculated branches using evaluation_metadata[&dp_id].range. → calculated-virtual-dp-range-cutoff-bug-2026-05-18.
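
Leak mechanism (A) and the proposed final trim can be sketched without Polars. This is an illustration only: rows are plain (timestamp, value) tuples, the upsample helper is hypothetical, and treating `to` as an inclusive bound is an assumption that should be checked against filter_data_frame_by_range.

```rust
/// Requested retrieval window in epoch seconds.
/// Assumption: both bounds are inclusive.
#[derive(Debug, Clone, Copy)]
struct Range {
    from: i64,
    to: i64,
}

/// Hypothetical upsampling: each coarse row is expanded into `factor`
/// rows spaced `step` seconds apart, so rows near `to` spill past it.
fn upsample(rows: &[(i64, f64)], step: i64, factor: i64) -> Vec<(i64, f64)> {
    rows.iter()
        .flat_map(|&(ts, v)| (0..factor).map(move |i| (ts + i * step, v)))
        .collect()
}

/// Final trim applied after all transformations: drop rows outside the window.
fn trim_to_range(rows: Vec<(i64, f64)>, range: Range) -> Vec<(i64, f64)> {
    rows.into_iter()
        .filter(|&(ts, _)| ts >= range.from && ts <= range.to)
        .collect()
}
```

With two hourly rows at 0 and 3600 upsampled to 15-minute resolution, upsample yields eight rows, three of them past `to` = 3600; trim_to_range brings the result back inside the window.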

2026-05-06

  • mando-cli session: 5 fixes shipped + 1 design shelved.
    • 0be3458 feat: mando mock down with idempotent teardown (404 from remove_container = success). Pins canonical 7-step pattern for docker-backed lifecycle commands. → mando-cli-mock-down-idempotent-2026-05-06.
    • f8a54bf fix: WireMock healthcheck targets /__admin/health (200) instead of /__admin (302→404) using curl -fsS. Diagnostic technique: docker inspect --format '{{json .State.Health}}' (wget exit 8 = HTTP error). → mando-cli-mock-down-idempotent-2026-05-06.
    • 68bcc63 fix: mando status made read-only and bounded under 2s. New connect_readonly (single connect + 2s timeout, no retries, no ensure_database) and table_exists helpers in db/flyway.rs; sets statement_timeout = '2s' post-connect. Status commands must be pure reads. → mando-cli-status-readonly-2026-05-06.
    • 6b1f7c7 feat: yaml-driven build context to stop COPY-everything hangs. New build.context_includes: Vec<String> on ServiceBuildDef + new runtime/build_context.rs::build_filtered_tar used by both commands/build.rs and runtime/runner.rs. Caught + fixed runner.rs hard-coded "Dockerfile" regression in same commit. → mando-cli-build-context-filter-2026-05-06.
    • 302be50 feat: shipped context_includes defaults for all 5 app services in src/config/defaults/*.yaml. → mando-cli-build-context-filter-2026-05-06.
    • SHELVED: profile-driven build variants (dev runtime-only Dockerfile + cargo build --release pre-step vs release multi-stage chef Dockerfile). Captured design + open questions; no code shipped. → mando-cli-build-variants-shelved-2026-05-06.
  • Parallel-release CI restructure shipped to mando-cli-github-build-mirror (a939117 on master): split monolithic gitlab-release job into init-gitlab-release → build matrix (each matrix job uploads + links its own binary) → release + gitlab-finalize (checksums only). Linux/macOS no longer block on Windows aarch64. New “Parallel release flow (2026-05)” section in the doc.
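
The idempotent-teardown rule from the mando mock down entry above (404 from remove_container = success) might look roughly like the following. RemoveError and idempotent_remove are hypothetical names for illustration; the real command presumably maps whatever error type its Docker client returns.

```rust
/// Hypothetical error type standing in for the Docker client's error.
#[derive(Debug, PartialEq)]
enum RemoveError {
    /// The container does not exist (HTTP 404 from the daemon).
    NotFound,
    /// Any other failure, kept as-is.
    Other(String),
}

/// Treat "already gone" as success so repeated teardown calls are safe.
/// All other errors still propagate to the caller.
fn idempotent_remove(result: Result<(), RemoveError>) -> Result<(), RemoveError> {
    match result {
        Err(RemoveError::NotFound) => Ok(()),
        other => other,
    }
}
```

Running teardown twice then behaves the same as running it once: the second call sees NotFound and still reports success, while a genuine failure such as an unreachable daemon is not swallowed.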

2026-05-05

  • BE-2272 branch feature/BE-2272 (renamed from prior bugfix/BE-2023) — continuation of the BE-1842 Datadog Observability arc; flattens DD log JSON.
  • Removed the span.* namespace from formatter output: flow.exec_id, flow.context, step.name, step.connection now sit at the document root alongside error.* / http.* (symmetric DD facet layout).
  • Single-file change in mando-lib src/app/dd_formatter.rs (+295/-16): dropped serialize_entry("span", ...), added MapVisitor: tracing::field::Visit to collect event fields into serde_json::Map<String, Value>, span-fields-first / event-fields-second merge with explicit event-wins precedence.
  • Removed magic name injection in collect_span_fields (was outermost span name; unused in DD dashboards).
  • 11 unit tests added with a reusable capture harness (tracing::subscriber::with_default + custom MakeWriter over Mutex<Vec<u8>>); pattern reusable for future dd_formatter changes.
  • 262 workspace tests pass, 0 regressions; scope strictly contained to the formatter.
  • Plan in repo: docs/superpowers/plans/2026-05-05-flatten-log-fields-to-root.md.
  • Open follow-ups: DD dashboard column migration (@span.X → @X), execution.id vs flow.exec_id naming unification, dead ErrorCode derive arms in mando-lib-macro.
  • Branch state: local-only on feature/BE-2272, uncommitted.
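
The span-fields-first / event-fields-second merge described above can be sketched with the standard library alone. This is a simplified stand-in: the real formatter collects into serde_json::Map<String, Value>, whereas this sketch uses BTreeMap<String, String>, and merge_fields is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Flatten span and event fields into one root-level map.
/// Span fields are inserted first and event fields second, so an event
/// field overwrites a span field with the same key (event wins).
fn merge_fields(
    span_fields: &[(&str, &str)],
    event_fields: &[(&str, &str)],
) -> BTreeMap<String, String> {
    let mut root = BTreeMap::new();
    for &(key, value) in span_fields {
        root.insert(key.to_string(), value.to_string());
    }
    for &(key, value) in event_fields {
        // `insert` replaces any existing value for `key`.
        root.insert(key.to_string(), value.to_string());
    }
    root
}
```

If both a span and an event carry step.name, the merged document holds the event's value, while keys that appear on only one side (for example flow.exec_id from the span) pass through unchanged.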

2026-05-04

  • dc7b4259 chore: bumped Cargo.lock for py-mando after pulling in thiserror dep.
  • cd97fc35 fix: converted py-mando error logs to mando_core::error! macro so Python-binding errors carry typed error.kind (parity with Rust pattern from MR !481).
  • 923603f0 refactor: removed inline step.name/step.connection event fields now that the step span carries them — children inherit via dd_formatter root→leaf scope walk.
  • b8d3278d fix: added step.name and step.connection onto the step span at mando-lib/src/workflow/mod.rs:350 so child events inherit them in Datadog (see BE-1842 Datadog Observability).
  • c945514e fix: log step errors at the failure site to preserve real error.kind instead of generic wrapper at the catch boundary.
  • 4c543cb1 fix: downgraded parent flow error logs to warn when the child step has already logged the error (deduplicates Datadog noise).
  • ab622e29 fix: instrumented every tokio::spawn call with tracing spans so async tasks no longer drop trace context.
  • 5618603e fix: removed per-layer FilterFn from the OTel layer — the filter was suppressing events and breaking span field inheritance (root cause of BE-1842 Datadog Observability regressions).
  • 117f7b58 fix: foundation commit on bugfix/BE-2023 — deduped step error logging, upgraded OTel deps, threaded execution_id through FlowInfo.
  • All 9 commits are follow-ups to MR !481 (feat: error handling redesign, BE-2023) addressing reviewer feedback (Balazs Mracsko, Krisztian Fekete) and Datadog defects; iterative debugging captured in screenshots under /Volumes/bandi/coding/poc/mando/ (datadog-tab2-broken.png, dd-doublelog-1.png, dd-current-state.png, dd-log-expanded.png, etc.). Context: Agent Context.
  • Initialized activity log.