
Daemon

The daemon is the one Omnitron component that stays up. Everything else — CLI invocations, webapp connections, child app processes — is transient. The daemon owns the process tree, the on-disk state, the lock, and the RPC surface.

Verified against src/daemon/:

daemon/
├── daemon.ts (1591 lines) — main daemon class
├── daemon.rpc-service.ts — Netron-facing service
├── daemon.module.ts — Titan module wiring
├── daemon-client.ts — Typed RPC client (used by CLI)
├── daemon-entry.ts — Standalone entrypoint
├── daemon-scheduler.ts — Periodic task scheduler
├── pid-manager.ts — PID file lock + liveness sweep
└── state-store.ts — Persistent state on disk

Lifecycle

PID manager — pid-manager.ts

Owns the daemon lock file ~/.omnitron/daemon.pid. Responsibilities:

  • Acquire on boot. Write process.pid to the file with an exclusive open. If the file exists and the PID inside is still alive, abort.
  • Signature verification. The file carries a process-start signature (PID + start time fingerprint). If the operating system has recycled a stale PID for an unrelated process, the signature no longer matches and the old holder is treated as dead.
  • Liveness sweep. A periodic check confirms the lock owner is still alive and reclaims the lock if the holder crashed without releasing it.
  • Release on SIGTERM / SIGINT / normal exit.

When omnitron up reports "daemon already running", this is the check that fired.
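The acquire-and-check flow described above can be sketched with Node's standard library. This is a minimal illustration, not the code in pid-manager.ts: the function names `isAlive` and `acquirePidLock` are hypothetical, and the sketch omits the start-time signature, so it cannot detect a recycled PID.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

/** Returns true if a process with this PID currently exists. */
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0); // signal 0 only checks existence, nothing is delivered
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as { code?: string }).code === 'EPERM';
  }
}

/** Tries to take the lock; returns false if a live holder already owns it. */
function acquirePidLock(pidFile: string): boolean {
  try {
    // 'wx' is an exclusive create: it fails with EEXIST if the file is there.
    const fd = fs.openSync(pidFile, 'wx', 0o600);
    fs.writeSync(fd, String(process.pid));
    fs.closeSync(fd);
    return true;
  } catch (err) {
    if ((err as { code?: string }).code !== 'EEXIST') throw err;
  }
  const holder = Number.parseInt(fs.readFileSync(pidFile, 'utf8').trim(), 10);
  if (Number.isInteger(holder) && holder > 0 && isAlive(holder)) {
    return false; // the "daemon already running" case
  }
  // Stale or unreadable lock: remove it and try the exclusive create again.
  fs.unlinkSync(pidFile);
  return acquirePidLock(pidFile);
}

// Demo against a throwaway directory rather than ~/.omnitron.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'pidlock-demo-'));
const demoPidFile = path.join(demoDir, 'daemon.pid');
console.log(acquirePidLock(demoPidFile)); // first caller takes the lock
console.log(acquirePidLock(demoPidFile)); // holder (this process) is alive, so refused
```

The exclusive open is what makes two racing callers safe: only one `openSync(..., 'wx')` can succeed, and the loser falls through to the liveness check.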

State store — state-store.ts

Persists daemon intent to ~/.omnitron/state.json. Atomic write semantics: write to a temp file, fsync, rename over the old file. The previous file is kept as state.json.bak after a successful write — the daemon falls back to the backup if the primary becomes unreadable.
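As a rough sketch of the write-temp, fsync, rename sequence with a backup fallback on read: the helper names `writeStateAtomic` and `readState` are hypothetical, and the point at which the backup is copied (just before the rename) is an assumption rather than something confirmed from state-store.ts.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

/** Writes state so that readers see either the old file or the new one, never a partial file. */
function writeStateAtomic(stateFile: string, state: unknown): void {
  const tmp = `${stateFile}.${process.pid}.tmp`;
  const fd = fs.openSync(tmp, 'w', 0o600);
  try {
    fs.writeSync(fd, JSON.stringify(state, null, 2));
    fs.fsyncSync(fd); // flush contents to disk before the rename
  } finally {
    fs.closeSync(fd);
  }
  // Assumption: keep the previous good copy as the backup, then swap in the new file.
  if (fs.existsSync(stateFile)) fs.copyFileSync(stateFile, `${stateFile}.bak`);
  fs.renameSync(tmp, stateFile); // atomic on POSIX within one filesystem
}

/** Reads the primary file, falling back to the backup if the primary is unreadable. */
function readState(stateFile: string): unknown {
  for (const candidate of [stateFile, `${stateFile}.bak`]) {
    try {
      return JSON.parse(fs.readFileSync(candidate, 'utf8'));
    } catch {
      // missing or corrupt: try the next candidate
    }
  }
  return null;
}

// Demo against a throwaway directory rather than ~/.omnitron.
const demoStateDir = fs.mkdtempSync(path.join(os.tmpdir(), 'state-demo-'));
const demoStateFile = path.join(demoStateDir, 'state.json');
writeStateAtomic(demoStateFile, { apps: { api: 'online' } });
writeStateAtomic(demoStateFile, { apps: { api: 'stopped' } });
console.log(readState(demoStateFile));
```

Writing to a temp file in the same directory matters because `rename` is only atomic within a single filesystem.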

Two modes:

| Mode | When | Layout |
|---|---|---|
| classic | App was launched via classic launcher (single fork) | One PID + status per app |
| bootstrap | App was launched via module-worker spawner (per-process) | Per-process PID + status, parent app aggregates |

State is persisted on every status transition (start, stop, crash, restart) plus a baseline flush every 30 s. A crash of the daemon itself leaves the file in a consistent state — on next boot, the daemon rehydrates and relaunches whatever was running.

Daemon scheduler — daemon-scheduler.ts

A small in-process scheduler tied to the daemon's lifecycle. It runs periodic background tasks; all timers are .unref()-ed so they don't pin the event loop alive.

| Task | Default cadence | Source |
|---|---|---|
| Health probe sweep | per app config (default 15 s) | titan-health integration |
| Metrics aggregation tick | 5 s | from monitoring.metrics.interval |
| State persistence flush | on transition + 30 s | state-store |
| Per-app crash backoff timers | exponential | per IRestartPolicy |
| Cluster heartbeat (if cluster.enabled) | 2 s | daemon config |
| Node health-monitor sweep | 60 s | DEFAULT_DAEMON_CONFIG.healthMonitor |

The cluster heartbeat and node-health sweep only run when explicitly enabled — they do not consume resources on a standalone single-node setup.
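To illustrate the unref'd-timer approach, here is a minimal sketch of a named-interval scheduler. The class `MiniScheduler` and its methods are hypothetical and are not the API of daemon-scheduler.ts; only the cadences in the demo come from the table above.

```typescript
import { clearInterval, setInterval } from 'node:timers';

type Task = () => void | Promise<void>;

class MiniScheduler {
  private readonly timers = new Map<string, ReturnType<typeof setInterval>>();

  /** Registers (or replaces) a periodic task under a name. */
  every(name: string, intervalMs: number, task: Task): void {
    this.cancel(name);
    const timer = setInterval(() => {
      // Catch both sync throws and rejected promises so one bad task
      // cannot take the process down.
      Promise.resolve()
        .then(task)
        .catch((err: unknown) => console.error(`[scheduler] ${name} failed:`, err));
    }, intervalMs);
    timer.unref(); // this timer alone will not keep the event loop alive
    this.timers.set(name, timer);
  }

  cancel(name: string): void {
    const timer = this.timers.get(name);
    if (timer !== undefined) {
      clearInterval(timer);
      this.timers.delete(name);
    }
  }

  /** Clears every timer, for example on daemon shutdown. */
  stopAll(): void {
    for (const name of Array.from(this.timers.keys())) this.cancel(name);
  }

  get size(): number {
    return this.timers.size;
  }
}

// Demo: cadences taken from the table above.
const demoScheduler = new MiniScheduler();
demoScheduler.every('state-flush', 30000, () => console.log('flush state'));
demoScheduler.every('metrics-tick', 5000, () => console.log('aggregate metrics'));
```

Because every timer is unref'd, a script that only registers tasks, like the demo, exits immediately; in a daemon the listening sockets keep the process alive instead.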

Daemon configuration

The daemon's own config (not project / ecosystem config) lives in the DEFAULT_DAEMON_CONFIG object — most operators never touch it. Values:

| Field | Default |
|---|---|
| socketPath | ~/.omnitron/daemon.sock |
| port (TCP) | 9700 |
| host (TCP) | 0.0.0.0 |
| httpPort | 9800 |
| pidFile | ~/.omnitron/daemon.pid |
| stateFile | ~/.omnitron/state.json |
| logDir | ~/.omnitron/logs |
| role | master |
| cluster.enabled | false |
| cluster.discovery | redis |
| cluster.electionTimeout | 5 000–15 000 ms (jittered) |
| cluster.heartbeatInterval | 2 000 ms |
| secrets.provider | file |
| secrets.path | ~/.omnitron/secrets.enc |
| healthMonitor.intervalMs | 60 000 (1/min) |
| healthMonitor.concurrency | 20 |
| healthMonitor.offlineTimeoutMs | 90 000 |
| healthMonitor.retentionDays | 90 (uptime history) |
| healthMonitor.uptimeIntervalMs | 86 400 000 (24 h per bar segment) |

Override via CLI flags on omnitron up or via the daemon configuration file (advanced — not part of the project ecosystem config).

RPC surface — OmnitronDaemon service

The daemon registers @Service({ name: 'OmnitronDaemon' }) on the Netron bus. Verified from src/daemon/daemon.rpc-service.ts.

Every method is gated by role:

| Role | Members | Methods |
|---|---|---|
| viewer | viewer, operator, admin | list, getApp, status, getMetrics, getHealth, getLogs, inspect, getEnv, getDependencyGraph, getWatchStatus |
| operator | operator, admin | startApp, startAll, stopApp, stopAll, restartApp, restartAll, reloadApp, scale, exec, enableWatch, disableWatch |
| admin | admin only | shutdown, reloadConfig, setMetricsEnabled |
| anonymous | anyone with socket access | ping (allowAnonymous) |
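The role table amounts to a ranked hierarchy in which each role inherits the methods of the roles below it. A minimal sketch of such a check, assuming a simple numeric rank; the names `canCall`, `RANK`, and `METHODS_BY_ROLE` are hypothetical and are not taken from daemon.rpc-service.ts:

```typescript
type Role = 'viewer' | 'operator' | 'admin';
type Requirement = Role | 'anonymous';

// Higher rank includes every lower rank's permissions.
const RANK: Record<Role, number> = { viewer: 0, operator: 1, admin: 2 };

// Method lists copied from the role table above.
const METHODS_BY_ROLE: Record<Requirement, string[]> = {
  anonymous: ['ping'],
  viewer: [
    'list', 'getApp', 'status', 'getMetrics', 'getHealth', 'getLogs',
    'inspect', 'getEnv', 'getDependencyGraph', 'getWatchStatus',
  ],
  operator: [
    'startApp', 'startAll', 'stopApp', 'stopAll', 'restartApp', 'restartAll',
    'reloadApp', 'scale', 'exec', 'enableWatch', 'disableWatch',
  ],
  admin: ['shutdown', 'reloadConfig', 'setMetricsEnabled'],
};

// Invert the table into method -> minimum requirement.
const REQUIRED = new Map<string, Requirement>();
for (const requirement of Object.keys(METHODS_BY_ROLE) as Requirement[]) {
  for (const method of METHODS_BY_ROLE[requirement]) REQUIRED.set(method, requirement);
}

/** Returns true if a caller with this role (null = unauthenticated) may call the method. */
function canCall(method: string, caller: Role | null): boolean {
  const required = REQUIRED.get(method);
  if (required === undefined) return false; // unknown methods are denied
  if (required === 'anonymous') return true;
  return caller !== null && RANK[caller] >= RANK[required];
}

console.log(canCall('restartApp', 'viewer'));
console.log(canCall('restartApp', 'admin'));
```

Denying unknown methods by default keeps a newly added RPC method closed until it is explicitly assigned a role.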

Methods, by intent

App lifecycle (operator)

| Method | Returns |
|---|---|
| startApp({ name }) | ProcessInfoDto |
| startAll() | ProcessInfoDto[] |
| stopApp({ name, force?, timeout? }) | { success: boolean } |
| stopAll({ force? }) | { count: number } |
| restartApp({ name }) | ProcessInfoDto |
| restartAll() | ProcessInfoDto[] |
| reloadApp({ name }) | ProcessInfoDto |
| scale({ name, instances }) | ProcessInfoDto |

Inspection (viewer)

| Method | Returns |
|---|---|
| list() | ProcessInfoDto[] |
| getApp({ name }) | ProcessInfoDto |
| status() | DaemonStatusDto |
| getMetrics({ name? }) | AggregatedMetricsDto |
| getHealth({ name? }) | AggregatedHealthDto |
| getLogs({ name?, lines? }) | LogEntryDto[] |
| inspect({ name }) | AppDiagnosticsDto |
| getEnv({ name }) | Record<string, string> |
| getDependencyGraph({ name }) | Graph object |
| getWatchStatus() | { enabled, apps: [...] } |

Watch control (operator)

| Method | Returns |
|---|---|
| enableWatch({ apps? }) | { watching: [...] } |
| disableWatch() | { success: boolean } |

Admin (admin)

| Method | Returns |
|---|---|
| shutdown({ force? }) | { success: boolean } |
| reloadConfig() | { success: boolean } |
| setMetricsEnabled({ name?, enabled }) | { success: boolean } |

Heartbeat (anonymous)

| Method | Returns |
|---|---|
| ping() | { uptime, version, pid } |

RPC plumbing (operator)

| Method | Returns |
|---|---|
| exec({ name, service, method, args }) | unknown (whatever the called method returns) |

The CLI command omnitron exec api users findById u_42 routes through exec.
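One way to picture the mapping from that command line to the exec payload is the sketch below. The helper `toExecRequest` is hypothetical, and passing arguments through as plain strings is an assumption; the real CLI may parse or coerce them differently.

```typescript
interface ExecRequest {
  name: string;     // target app
  service: string;  // service inside the app
  method: string;   // method on that service
  args: string[];   // remaining positional arguments
}

/** Maps the positional arguments after "omnitron exec" onto the exec payload. */
function toExecRequest(argv: string[]): ExecRequest {
  const [name, service, method, ...args] = argv;
  if (!name || !service || !method) {
    throw new Error('expected: <app> <service> <method> [args...]');
  }
  return { name, service, method, args };
}

console.log(toExecRequest(['api', 'users', 'findById', 'u_42']));
```

The resulting object has the same shape as the argument passed to exec in the daemon client example.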

Auth flow

Authentication is intentionally split between transports: the local Unix socket is trusted, and the network transports are not.

The Unix socket auth-bypass is deliberate: file mode 0o600 means only the daemon's UID can open it. If the OS trusts you to write to the socket, the daemon trusts the request. TCP and HTTP always require JWT.
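The transport split and the file-mode condition can be expressed as two small predicates. This is only a sketch of the rule as stated above; the names `requiresJwt` and `isOwnerOnly` are hypothetical, and the real daemon may enforce the rule elsewhere (for example when creating the socket).

```typescript
type Transport = 'unix' | 'tcp' | 'http';

/** Network transports always need a token; the local socket relies on file permissions. */
function requiresJwt(transport: Transport): boolean {
  return transport !== 'unix';
}

/** True when the permission bits are exactly rw for the owner and nothing for anyone else. */
function isOwnerOnly(mode: number): boolean {
  return (mode & 0o777) === 0o600; // mask off the file-type bits first
}

console.log(requiresJwt('unix'), requiresJwt('tcp'), requiresJwt('http'));
console.log(isOwnerOnly(0o140600)); // socket type bits plus 0o600 permissions
```

In practice the mode would come from `fs.statSync(socketPath).mode`; checking it at startup is one way to notice a socket that has been made group- or world-accessible.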

Daemon client — daemon-client.ts

The typed Netron client used by every CLI command:

import { DaemonClient } from '@omnitron-dev/omnitron/internal';

const client = new DaemonClient();
await client.ping(); // probe
await client.startApp({ name: 'api' });
await client.getMetrics({ name: 'api' });
await client.exec({
  name: 'api',
  service: 'users',
  method: 'findById',
  args: ['u_42'],
});

The same client opens connections to remote daemons (when addressed by alias) over TCP.

Auto-restart and backoff

When an app crashes, the daemon consults its IRestartPolicy (per IProcessEntry.restartPolicy or default supervision.backoff). The default exponential backoff:

| Attempt | Delay before relaunch |
|---|---|
| 1 | initial × 1 (1 s) |
| 2 | initial × 2 (2 s) |
| 3 | initial × 4 (4 s) |
| ... | capped at max (30 s default) |

After maxRestarts within window (default 5 in 60 s), the supervisor stops trying and marks the app crashed. State is persisted so an operator can inspect and intervene.
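The schedule above follows delay = min(initial × 2^(attempt − 1), max). A minimal sketch using the stated defaults; the interface and function names are hypothetical rather than the actual IRestartPolicy shape, and counting restarts with a sliding window is an assumption about how "within window" is evaluated.

```typescript
interface BackoffPolicy {
  initialMs: number;   // delay before the first relaunch
  maxMs: number;       // upper bound on any delay
  maxRestarts: number; // restarts tolerated inside the window
  windowMs: number;    // length of the window
}

// Defaults as stated in the text: 1 s initial, 30 s cap, 5 restarts in 60 s.
const DEFAULT_BACKOFF: BackoffPolicy = {
  initialMs: 1000,
  maxMs: 30000,
  maxRestarts: 5,
  windowMs: 60000,
};

/** Delay before relaunch number `attempt` (1-based). */
function backoffDelay(attempt: number, policy: BackoffPolicy = DEFAULT_BACKOFF): number {
  return Math.min(policy.initialMs * Math.pow(2, attempt - 1), policy.maxMs);
}

/** True once the number of restarts inside the window reaches maxRestarts. */
function shouldGiveUp(
  restartTimesMs: number[],
  nowMs: number,
  policy: BackoffPolicy = DEFAULT_BACKOFF,
): boolean {
  const recent = restartTimesMs.filter((t) => nowMs - t < policy.windowMs);
  return recent.length >= policy.maxRestarts;
}

for (let attempt = 1; attempt <= 7; attempt++) {
  console.log(attempt, backoffDelay(attempt));
}
```

With these defaults the delay reaches the cap on the sixth attempt, where 32 s is clamped to 30 s.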

Failure scenarios

| Scenario | What the daemon does |
|---|---|
| Child process exits 0 | Treat as graceful; don't restart unless autoRestart: always |
| Child process exits non-zero | Apply restart policy; backoff; persist |
| Child process hangs (no heartbeat) | Health probe fails; restart after grace period |
| Daemon itself crashes | systemd / launchd respawns; state.json rehydrates |
| Lock-file PID is alive but daemon hung | omnitron kill force-removes the lock |
| Two omnitron up race | Second loses the PID lock; exits with error |
| Storage failure persisting state | Logs at error; in-memory state continues; next flush retries |
| Disk full | State persistence + logs fail loudly |
| Cluster split (cluster.enabled) | Both sides may elect a leader; manual step-down resolves |

Operating runbook

| Goal | Command |
|---|---|
| Is the daemon up? | omnitron ping |
| What's running right now? | omnitron status / omnitron --json status |
| Force shutdown | omnitron down (graceful) or omnitron kill (forceful) |
| Reload daemon config without restart | omnitron --json status then admin call (or restart) |
| See the lock holder | cat ~/.omnitron/daemon.pid |
| Reset state from cold | omnitron down, then rm ~/.omnitron/state.json, then omnitron up |
| Reset secret store from cold | omnitron down, then rm ~/.omnitron/secrets.enc, then omnitron up |

Anti-patterns

  • Editing state.json by hand. The state shape may evolve; hand-edits often break on next restart. Prefer using the CLI to drive state changes.
  • Running two daemons against the same ~/.omnitron/. They fight over the lock. Multiple daemons need different home directories.
  • Exposing the TCP port without a JWT secret. Anyone reaching the port can inspect at minimum, and operate / admin with default roles. Configure auth before opening port 9700.
  • Treating kill as routine. Force-kill skips graceful shutdown — pending logs may be lost, child processes may be reparented to init. Use down first.
  • Removing the lock file while the daemon is running. Causes inconsistent state where two daemons can subsequently start.
