Changelog
Every release, in reverse order
Notable changes only. The rest are in the audit log, where they belong.
The agent pivots to bearer-token auth with sealed credentials. The control plane grows a verification pipeline and an alerts module.
Agent
Added
- Bearer-token authentication. The agent loads a token and a wrapping key from the environment, persists the wrapping key locally, and unseals the credentials envelope with AES-GCM at use time.
- db_probe command handler: live database-connectivity checks dispatched through the same lifecycle as backup and restore. The probe result is reported back over PostProbe.
- Bounded-concurrency worker pool with durable lifecycle. In-flight rows are reconciled on startup by the recoverer, and the daemon drains the pool on shutdown.
- End-to-end command handlers: BackupHandler, RestoreHandler, and the probe handler all run their respective run_* flows through the worker pool.
- Heartbeat dispatch. Pending jobs are parsed from the heartbeat response and run through the worker pool. No long-poll, no side channel.
- Halt-on-401 in Daemon.Tick: the daemon refuses to keep talking to a control plane that has revoked its token.
- SFTP fingerprint mode in the credentials wire shape, aligned with the server's expectations.
- OSS-prep: LICENSE, SECURITY.md, a bearer-aware HTTP client, and daemon entrypoint polish for the first public release.
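The sealed-credentials entry above describes unsealing an envelope with AES-GCM at use time. A minimal sketch of that round trip follows. The envelope layout (nonce followed by ciphertext), the 32-byte key size, and the function names seal and unseal are assumptions made for illustration; the changelog does not specify the agent's actual wire format.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from a 16-, 24-, or 32-byte wrapping key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal returns nonce || ciphertext. This layout is an assumption of the sketch.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the ciphertext and tag to its first argument.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// unseal splits the nonce off the envelope and authenticates the rest.
func unseal(key, envelope []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(envelope) < gcm.NonceSize() {
		return nil, errors.New("envelope shorter than nonce")
	}
	nonce, ciphertext := envelope[:gcm.NonceSize()], envelope[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := make([]byte, 32) // stand-in for the wrapping key from the environment
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	envelope, err := seal(key, []byte("example-credential"))
	if err != nil {
		panic(err)
	}
	plaintext, err := unseal(key, envelope)
	fmt.Println(string(plaintext), err)
}
```

Because GCM is authenticated, a tampered envelope fails in unseal with an error rather than yielding corrupted credentials.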
Changed
- Credentials wire shape aligned with the server's sealed-envelope format. crl_cache table dropped via a v1 to v2 state migration.
- Binary registry. Published Postgres binaries are now sourced from a static-musl bash matrix.
- CI: race-detector step runs with CGO enabled, golangci-lint upgraded to v2 with the real lint debt cleared, and the licensed gitleaks action replaced with a binary install.
Control plane
Added
- Verification module, server-side. A dedicated bkpdb-verification q-cluster runs a Docker-based pipeline against the latest backup. Includes a trigger on successful upload, an on-demand verify-now action on the backup detail page, a per-result detail page, and a policy config tab on the database detail page.
- Boolean-returning SQL checks in the verification policy. statement_timeout is set via PGOPTIONS so a runaway query does not block the worker.
- Alerts module. Events with dedup, incidents with recovery, channel CRUD with address verification, per-database channel scope, and a policy/events/incidents dashboard.
- Alert channels: email via QueuedEmailBackend, Slack via incoming webhook. A handler registry routes events to the right one with retry and backoff.
- Alert wiring: backup, verification, and retention modules fire failed and recovered events. A still-failing reminder cron fires for incidents that have not recovered.
- Restore from the backup detail page, and ad-hoc-target restore with cancel and polling.
- Containerised dev stack with separate webapp and worker services.
Changed
- Agent auth: mTLS replaced with bearer tokens and sealed credentials. The operator picks agent_name at token issue, not on the agent host.
- /agent/v1/ and /api/v1/ surfaces force JSON responses.
- Heartbeat pending_jobs entries include attempt so the agent can distinguish retries from first runs.
- Environment variables renamed from DBCRATE_* to BKPDB_*.
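On the agent side, the attempt field mentioned above can be read straight off the decoded heartbeat. The sketch below is an illustration only: pending_jobs and attempt come from this changelog, while the id and command fields, the type names, and the assumption that attempt starts at 1 for a first run are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pendingJob is one entry of the heartbeat's pending_jobs array.
type pendingJob struct {
	ID      string `json:"id"`      // hypothetical field name
	Command string `json:"command"` // hypothetical field name
	Attempt int    `json:"attempt"` // named in the changelog
}

type heartbeatResponse struct {
	PendingJobs []pendingJob `json:"pending_jobs"`
}

// parseHeartbeat decodes the response body and returns the jobs to dispatch.
func parseHeartbeat(body []byte) ([]pendingJob, error) {
	var resp heartbeatResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode heartbeat: %w", err)
	}
	return resp.PendingJobs, nil
}

// isRetry assumes attempt is 1-based, so anything above 1 is a retry.
func isRetry(j pendingJob) bool { return j.Attempt > 1 }

func main() {
	body := []byte(`{"pending_jobs":[
		{"id":"j1","command":"run_backup","attempt":1},
		{"id":"j2","command":"run_restore","attempt":2}]}`)
	jobs, err := parseHeartbeat(body)
	if err != nil {
		panic(err)
	}
	for _, j := range jobs {
		fmt.Printf("%s %s retry=%v\n", j.ID, j.Command, isRetry(j))
	}
}
```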
Fixed
- Alerts: record_event enqueues fan-out reliably, and the reconnect recoverer no longer crashes on an unparsable agent primary key.
Schedules start firing. The agent is pinned to a database.
Agent
Added
- Heartbeat dispatch: pending jobs are parsed from the heartbeat response and run through the worker pool. No long-poll, no side channel.
- run_probe command handler for live database-connectivity checks; the dispatcher maps it through the same lifecycle as backup and restore.
- --log-level flag with debug-level wire tracing for heartbeat and /config.
- SOPS-managed key material for builds; the published binary registry is now sourced from a static-musl bash matrix.
Changed
- /config wire shape: expected_postgres_version → postgres_major (nullable). The probe-result contract carries the detected major back.
Control plane
Added
- Schedule firing wired end-to-end. Schedule rows synchronise into the django-q2 scheduler on signal, a boot check refuses to start if the two have drifted, a scheduler heartbeat gates /readyz for the first three minutes, and a missed-fire detector flips stale QUEUED runs so the next cron tick can pick them up.
- Cron → Job dispatch and pending-jobs in the heartbeat response.
- Agent ↔ database pinning. Each database is pinned to a single agent in its organisation. The dispatcher refuses to send work to any other agent, and /config narrows to the agent's pinned databases only.
- On-demand “Run backup now” from the database detail page, async, with a flash, hidden until the pinned agent has been heard from.
- First-run welcome pages on databases, agents, backups, and storage when an organisation has no rows yet, with the next CTA inside the empty-state hero.
- Redesigned backups list and detail: window / outcome / database facets, cross-field search, fixed-layout responsive rows, and a presigned download link for the encrypted object.
- Newsreader / Inter / JetBrains Mono self-hosted across the console, theme-aware wordmark and favicon on every page, and the canonical overview dashboard built from real organisation data.
Fixed
- User-defined schedules that never fired because of a stale next_run and a leftover second cluster. Q2 is collapsed to one cluster, and Schedule changes drive the row directly.
- Empty-list pages no longer issue duplicate queries; an EXISTS-first check short-circuits the empty case.
The agent becomes a daemon. The dashboard learns to drive it.
Agent
Added
- Production CLI shape: bkpdb enroll (one-shot, token → identity) and bkpdb run (daemon), plus bkpdb version. Re-enrolling is a deliberate operator gesture; the identity directory must be removed first.
- Identity: Ed25519 keypair generated on the host, CSR built locally, certificate persisted alongside SQLite-backed durable state so in-flight job rows and identity survive a host restart.
- mTLS HTTP client that fetches identity per request and rotates on renewal.
- Daemon: heartbeat loop, state machine, halt-on-401, graceful shutdown that drains the worker pool.
- Worker pool: bounded-concurrency executor with a durable lifecycle; a recoverer reconciles stale in-flight rows on startup.
- Command handlers for run_backup and run_restore, plus result-reporting endpoints (PostBackup, PostRestore) that carry structured failure reasons back to the control plane.
- Certificate renewal goroutine, with a ShouldRenewAt decider and a renewal RPC that obtains a fresh certificate well before expiry.
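The worker-pool entry above combines two ideas: a cap on how many jobs run at once, and a drain step on shutdown. One common way to get both in Go is a buffered channel used as a semaphore plus a sync.WaitGroup, sketched below. The pool type and its method names are hypothetical, and the durable lifecycle (SQLite rows, the recoverer) is deliberately left out.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pool runs at most cap(sem) jobs concurrently.
type pool struct {
	sem chan struct{}
	wg  sync.WaitGroup
}

func newPool(limit int) *pool {
	return &pool{sem: make(chan struct{}, limit)}
}

// submit blocks until a slot is free, then runs the job in its own goroutine.
func (p *pool) submit(job func()) {
	p.sem <- struct{}{} // acquire a slot
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		defer func() { <-p.sem }() // release the slot
		job()
	}()
}

// drain waits for every submitted job to finish, as a shutdown path would.
func (p *pool) drain() { p.wg.Wait() }

// peakConcurrency pushes n short jobs through a pool and reports the highest
// number seen running at once, plus how many completed. Demo helper only.
func peakConcurrency(limit, n int) (peak, done int) {
	var mu sync.Mutex
	running := 0
	p := newPool(limit)
	for i := 0; i < n; i++ {
		p.submit(func() {
			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			time.Sleep(time.Millisecond) // stand-in for real work

			mu.Lock()
			running--
			done++
			mu.Unlock()
		})
	}
	p.drain()
	return peak, done
}

func main() {
	peak, done := peakConcurrency(3, 20)
	fmt.Printf("peak=%d done=%d\n", peak, done)
}
```

Blocking in submit gives natural back-pressure: the dispatcher cannot queue more work than the pool is allowed to run.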
Control plane
Added
- Database creation wizard. Five steps: connection, agent pick-or-install, storage destinations (one primary, optional mirrors), schedule rules, review. The agent step live-polls for fresh heartbeats so the operator can watch a freshly installed agent come up in the same tab.
- Storage destination probe: an end-to-end head_bucket → put_object → get_object (with byte match) → delete_object sweep, run synchronously at submit and exposed as an HTMX endpoint for the “Test now” button. Results are cached on the destination.
- Storage form expanded to cover every S3-compatible provider with a tested config, plus a generic custom-endpoint option.
- Agent certificate renewal endpoint and a CRL endpoint.
Backups become unreadable to us. The agent encrypts on the host; the control plane keeps nothing it can decrypt.
Agent
Added
- age v1 encryption pipeline (X25519 + ChaCha20-Poly1305). Encrypt, decrypt, and recipient-fingerprint paths, exercised end-to-end against vanilla upstream age on the test path.
Control plane
Added
- End-to-end encryption to the organisation's public key. Per-organisation X25519 keypair, auto-issued at organisation creation, with a one-time recovery-key download gated behind recent-auth on the security tab. A backfill command exists for organisations created before the keypair work landed.
- Per-backup fingerprint pinned at upload time. A backup's file key is unwrapped on demand with a ranged GET against the age header, so only the few bytes needed reach the control plane.
- Keypair rotation as a one-button operation on the organisation security tab. The rotation re-wraps each backup's age header (the file body is untouched), records progress in a KeypairRotationJob, and handles backups that arrive mid-rotation.
- Settings shell with tabs (Organisation / People / Security) for admins, mirroring the operator profile (Identity / Security / Preferences / Activity). Timezone middleware and a user_dt filter route every timestamp through the operator's preferences.
- Organisation invitations, with SHA-256-hashed tokens and an accept flow.
Removed
- An earlier in-database cache of unwrapped DEKs. Rotation operates on the age header directly; nothing decrypted lives at rest in the metadata DB.
The control plane comes online. The agent stops talking to YAML and starts talking to a server.
Agent
Changed
- Project pivots to a server-driven architecture. The dev-mode YAML harness is kept for local development; the agent's source of truth in production is now the control plane it heartbeats to.
Control plane
Added
- First public cut. Django 6 application with operator accounts, organisations, role-based access control across four roles, an append-only audit log, envelope encryption for stored credentials, redacted structured logs, /healthz and /readyz endpoints, a CSP-enforcing middleware, and a django-q2 task cluster wired into boot.
- Databases, storage destinations, schedules, and retention policies as first-class records. The default retention is GFS (7 daily, 4 weekly, 12 monthly), overridable per database, with an optional hard age bound.
- Mutual TLS between the agent and the control plane. The agent generates its keypair on the host, sends a CSR, and receives a leaf certificate signed by the agent CA. The private key never leaves the host.
- Heartbeat and /config endpoints with hash short-circuiting. The server tells the agent it is up-to-date without re-sending the whole config when nothing has changed.
- Single-use job credential exchange. Database passwords and storage secrets are fetched per job, never stored on the agent's disk, and the issuance is recorded in the audit log.
- Operator-facing backups history and a console dashboard. One-shot restore initiation from the console, with the agent reporting the result back.
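The GFS default above (7 daily, 4 weekly, 12 monthly) can be read as: keep the newest backup in each of the last 7 days, 4 ISO weeks, and 12 calendar months that contain a backup. The sketch below implements that reading. It is one plausible interpretation, not the control plane's actual selection logic (which runs in Django); the function names, UTC bucketing, and ISO-week boundaries are assumptions, and the optional hard age bound is omitted.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// at builds a UTC timestamp; a small helper for the demo.
func at(year int, month time.Month, day, hour int) time.Time {
	return time.Date(year, month, day, hour, 0, 0, 0, time.UTC)
}

// keepGFS returns the backups to retain, newest first. For each tier it keeps
// the newest backup in each of the most recent `limit` distinct buckets.
func keepGFS(backups []time.Time, daily, weekly, monthly int) []time.Time {
	sorted := append([]time.Time(nil), backups...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].After(sorted[j]) })

	keep := make([]bool, len(sorted))
	mark := func(limit int, bucket func(time.Time) string) {
		seen := map[string]bool{}
		for i, t := range sorted {
			b := bucket(t.UTC())
			if seen[b] {
				continue // an older backup in a bucket that already has a keeper
			}
			if len(seen) == limit {
				break // this tier is full
			}
			seen[b] = true
			keep[i] = true
		}
	}
	mark(daily, func(t time.Time) string { return t.Format("2006-01-02") })
	mark(weekly, func(t time.Time) string {
		y, w := t.ISOWeek()
		return fmt.Sprintf("%d-W%02d", y, w)
	})
	mark(monthly, func(t time.Time) string { return t.Format("2006-01") })

	var out []time.Time
	for i, t := range sorted {
		if keep[i] {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	var backups []time.Time
	for d := 0; d < 60; d++ {
		backups = append(backups, at(2024, 1, 1+d, 2)) // time.Date normalises day overflow
	}
	kept := keepGFS(backups, 7, 4, 12)
	fmt.Println(len(kept), "of", len(backups), "backups kept")
}
```

A backup selected by more than one tier is kept once, so the result never exceeds the sum of the three limits.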
And to talk to SFTP.
Agent
Added
- SFTP storage backend, planned and built across forty-nine scenarios. OpenSSH + SFTP on the integration path, sat behind the same StorageBackend interface the S3 path uses.
- Upload / Download / Delete / List / ResumeUpload on SFTP, with the same crash-resume receipt the S3 path uses.
Changed
- ConfigProvider rejects S3 and SFTP destinations missing required fields up front, instead of failing partway through a job.
The agent learns to talk to S3.
Agent
Added
- S3-compatible storage backend, built on AWS SDK v2 with a configurable endpoint. Works against AWS, Cloudflare R2, Backblaze B2, MinIO, Wasabi, Hetzner Object Storage, DigitalOcean Spaces, and anything else that speaks the protocol.
- Resumable multipart uploads. The agent persists a small receipt after every part, so a crashed upload restarts where it left off instead of from byte zero.
- Integration suite that drives the backend against a real MinIO: Upload, Download, Delete, List, and ResumeUpload, plus credential-leak negative tests.
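The receipt-per-part idea behind resumable multipart uploads can be shown without any S3 client at all. In the sketch below, the receipt type, uploadParts, and the put and save callbacks are all hypothetical stand-ins: put represents the real part upload and save represents persisting the receipt to disk. It is meant to show the resume logic only, not the agent's actual receipt format.

```go
package main

import (
	"errors"
	"fmt"
)

// receipt records which parts the store has already accepted.
type receipt struct {
	UploadID string
	ETags    map[int]string // part number -> ETag returned by the store
}

// uploadParts sends every part not yet in the receipt and saves the receipt
// after each success, so a later call can skip what is already done.
func uploadParts(parts [][]byte, r *receipt,
	put func(partNumber int, data []byte) (string, error),
	save func(receipt) error) error {
	if r.ETags == nil {
		r.ETags = map[int]string{}
	}
	for i, data := range parts {
		n := i + 1 // S3 part numbers start at 1
		if _, done := r.ETags[n]; done {
			continue
		}
		etag, err := put(n, data)
		if err != nil {
			return fmt.Errorf("part %d: %w", n, err)
		}
		r.ETags[n] = etag
		if err := save(*r); err != nil {
			return fmt.Errorf("persist receipt after part %d: %w", n, err)
		}
	}
	return nil
}

// resumeDemo fails the first attempt at part failAt, retries with the same
// receipt, and returns the part numbers actually sent on each attempt.
func resumeDemo(total, failAt int) (first, second []int) {
	parts := make([][]byte, total)
	r := &receipt{UploadID: "demo"}
	save := func(receipt) error { return nil } // a real agent would write to disk
	crashed := false
	sent := &first
	put := func(n int, _ []byte) (string, error) {
		if n == failAt && !crashed {
			crashed = true
			return "", errors.New("simulated crash")
		}
		*sent = append(*sent, n)
		return fmt.Sprintf("etag-%d", n), nil
	}
	_ = uploadParts(parts, r, put, save) // first attempt stops at failAt
	sent = &second
	_ = uploadParts(parts, r, put, save) // second attempt resumes
	return first, second
}

func main() {
	first, second := resumeDemo(4, 3)
	fmt.Println("first attempt sent:", first)
	fmt.Println("second attempt sent:", second)
}
```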
Changed
- Backups are keyed automatically from the database name and a UTC timestamp when no explicit key is supplied; the previous required flag is now optional.
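A sketch of that fallback, assuming a key layout of database name, a slash, a compact UTC timestamp, and a .dump suffix. The layout, the suffix, and the backupKey name are illustrative assumptions; the changelog only states that the key derives from the database name and a UTC timestamp.

```go
package main

import (
	"fmt"
	"time"
)

// backupKey returns the explicit key when one is supplied, otherwise a key
// derived from the database name and the current time converted to UTC.
func backupKey(explicit, database string, now time.Time) string {
	if explicit != "" {
		return explicit
	}
	// The timestamp layout below is an assumption made for this sketch.
	return fmt.Sprintf("%s/%s.dump", database, now.UTC().Format("20060102T150405Z"))
}

// demoTime is 14:05:09 at UTC+1, which is 13:05:09 UTC.
var demoTime = time.Date(2024, 3, 10, 14, 5, 9, 0, time.FixedZone("UTC+1", 3600))

func main() {
	fmt.Println(backupKey("", "orders", demoTime))
	fmt.Println(backupKey("manual/orders.dump", "orders", demoTime))
}
```

Converting to UTC before formatting keeps keys sortable and comparable across hosts in different time zones.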
Postgres binaries get a registry, a signature, and a trust chain that survives rotation.
Agent
Added
- HTTPS binary registry with signed manifests. The agent fetches a pg_dump / pg_restore bundle for the server's major on demand and verifies the manifest signature before extraction.
- Two-key trust hierarchy: a long-lived ROOT public key baked into the agent at build time signs a short-lived MSK certificate; the MSK signs every published manifest. Rotation does not require a new agent build.
- bkpdb binaries list, bkpdb backup, and bkpdb restore as one-shot dev CLI subcommands against the configured registry.
- Real-fixture error tests for the registry path: server failures, network failures, cache corruption. Each one returns a clean, well-typed error rather than a panic.
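The two-key hierarchy above amounts to two signature checks in sequence: ROOT vouches for the MSK, and the MSK vouches for the manifest. The sketch below uses Ed25519 purely as an example; the changelog does not say which signature scheme the registry uses. The mskCert and signedManifest types are hypothetical, and the MSK certificate's validity window is omitted for brevity.

```go
package main

import (
	"crypto/ed25519"
	"errors"
	"fmt"
)

// mskCert binds an MSK public key to the ROOT key via a signature.
type mskCert struct {
	MSKPublic ed25519.PublicKey
	RootSig   []byte // ROOT's signature over MSKPublic
}

// signedManifest is a manifest body plus the MSK's signature over it.
type signedManifest struct {
	Body []byte
	Sig  []byte
}

// verifyManifest walks the chain: ROOT -> MSK certificate -> manifest.
func verifyManifest(root ed25519.PublicKey, cert mskCert, m signedManifest) error {
	if len(root) != ed25519.PublicKeySize || len(cert.MSKPublic) != ed25519.PublicKeySize {
		return errors.New("malformed public key")
	}
	if !ed25519.Verify(root, cert.MSKPublic, cert.RootSig) {
		return errors.New("MSK certificate is not signed by ROOT")
	}
	if !ed25519.Verify(cert.MSKPublic, m.Body, m.Sig) {
		return errors.New("manifest is not signed by the MSK")
	}
	return nil
}

// demoChain generates fresh ROOT and MSK keys and signs body. Demo helper only.
func demoChain(body []byte) (ed25519.PublicKey, mskCert, signedManifest) {
	rootPub, rootPriv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		panic(err)
	}
	mskPub, mskPriv, err := ed25519.GenerateKey(nil)
	if err != nil {
		panic(err)
	}
	cert := mskCert{MSKPublic: mskPub, RootSig: ed25519.Sign(rootPriv, mskPub)}
	return rootPub, cert, signedManifest{Body: body, Sig: ed25519.Sign(mskPriv, body)}
}

func main() {
	root, cert, manifest := demoChain([]byte(`{"bundle":"example"}`))
	fmt.Println("verify:", verifyManifest(root, cert, manifest))
}
```

Only the ROOT public key needs to be compiled into the verifier, which is why replacing the MSK does not require a new build.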
Changed
- Build pipeline: make build-dev and make build-prod bake the registry URL into the binary via -ldflags. The runtime YAML block becomes optional.
Fixed
- Response size caps on registry fetches, a deterministic clock for manifest-validity checks, and a defensive path for resolving the cert URL relative to the manifest.
The agent learns to take a backup, put it back, and do both across every supported Postgres major.
Agent
Added
- End-to-end streaming backup pipeline: pg_dump → zstd → authenticated encryption → upload. Nothing of size is written to local disk.
- Inverse restore pipeline: download → decrypt → decompress → pg_restore, closing the dev-mode loop.
- Locked-flag pg_dump and pg_restore adapters. The agent always emits Postgres custom format with a fixed set of flags.
- Cross-version matrix: pg_dump pinned to the source's major, pg_restore chosen as max(source, target), with a small unencrypted descriptor sidecar so the restore knows which binary pair to ask for.
- End-to-end test matrix that runs backup and restore against real Postgres 13 through 18 in Docker.
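The cross-version rule above is small enough to state as code. In this sketch the binaryPair type and pickBinaries function are hypothetical names; only the rule itself (pg_dump at the source's major, pg_restore at the larger of source and target) comes from the entry.

```go
package main

import "fmt"

// binaryPair names the Postgres majors whose client binaries a job needs.
type binaryPair struct {
	DumpMajor    int // pg_dump is pinned to the source server's major
	RestoreMajor int // pg_restore uses max(source, target)
}

// pickBinaries applies the rule to a source and target server major.
func pickBinaries(sourceMajor, targetMajor int) binaryPair {
	restore := sourceMajor
	if targetMajor > restore {
		restore = targetMajor
	}
	return binaryPair{DumpMajor: sourceMajor, RestoreMajor: restore}
}

func main() {
	fmt.Printf("%+v\n", pickBinaries(13, 16)) // restoring into a newer server
	fmt.Printf("%+v\n", pickBinaries(17, 14)) // restoring into an older server
}
```

Taking the larger major for pg_restore means the restore binary is never older than the pg_dump that wrote the archive.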
The project is laid out. The agent is given a shape, and four interfaces to talk to the world.
Agent
Added
- Provider interfaces locked at the agent boundary: ConfigProvider, StorageBackend, BinaryRegistry, Notifier.
- Dev-mode implementations (a YAML config loader, a local filesystem backend, a local-directory binary registry, and a stderr notifier) behind a dual-signal dev-mode gate.
- Pre-flight Postgres major-version detection. The agent refuses to back up a server whose major it has no matching dump binary for, with a clear error rather than a half-finished archive.
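A pre-flight check of this kind can be sketched from the integer Postgres reports as server_version_num, which for version 10 and later is the major times 10000 plus the minor. How the agent actually detects the major is not stated here, so the approach, the preflight function, and the binary paths below are illustrative assumptions.

```go
package main

import "fmt"

// majorOf extracts the major from server_version_num (valid for Postgres 10+,
// where the value is major*10000 + minor, e.g. 160004 for 16.4).
func majorOf(serverVersionNum int) int {
	return serverVersionNum / 10000
}

// preflight returns the pg_dump binary to use, or an error before any
// archive bytes are produced when no binary matches the server's major.
func preflight(serverVersionNum int, dumpBinaries map[int]string) (string, error) {
	major := majorOf(serverVersionNum)
	path, ok := dumpBinaries[major]
	if !ok {
		return "", fmt.Errorf("postgres major %d: no matching pg_dump binary available", major)
	}
	return path, nil
}

func main() {
	// Hypothetical local binary locations, keyed by major.
	binaries := map[int]string{
		15: "/opt/example/pg15/pg_dump",
		16: "/opt/example/pg16/pg_dump",
	}
	fmt.Println(preflight(160004, binaries))
	fmt.Println(preflight(180001, binaries))
}
```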
❦The end of the record❦
Earlier alpha builds are kept in the engineering archive. They are not, in good conscience, fit for a public document.