
Genset and ATS Monitoring Is an Operations Problem Before It Is a Dashboard Problem

A production-grade guide to backup power monitoring with normalized telemetry, incident routing, WhatsApp status commands, role access, and monthly asset reports.

Backup power is only useful if people know what happened

A genset can be installed correctly and still be operationally invisible. Mains fails, the generator starts, the ATS does not transfer, and the first real alert is a phone call from a panicked site user.

That is not only a hardware problem. It is an awareness problem. The controller may know the state. The operator may not.

Do not turn OpenClaw into a controller

The genset controller, ATS logic, PLC, protection devices, and interlocks must keep deterministic field control. OpenClaw should read normalized state, route alarms, log incidents, answer status commands, and escalate to the right people.

Read-only monitoring is often the correct MVP. Remote start and stop can come later, and only with interlocks, approvals, and a site SOP in place.

Genset and ATS monitoring architecture: normalized state, incident routing, WhatsApp visibility, and asset reporting.

Normalize status before sending it to the agent

Raw inputs like DI_01, alarm_17, or register_40031 mean nothing to operators. Normalize them into operator-readable state: mains available, genset running, ATS on generator, fuel percent, battery voltage, common alarm, trip state.

The agent should reason over normalized data. It should not need to interpret device-specific register maps on every request.

type BackupPowerStatus = {
  siteId: string
  mains: 'available' | 'failed'
  genset: 'stopped' | 'starting' | 'running' | 'tripped'
  atsSource: 'utility' | 'generator' | 'unknown'
  fuelPercent: number
  batteryVoltage?: number
  activeAlarms: string[]
  updatedAt: string
}
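As a rough illustration of that normalization step, the sketch below maps one raw snapshot into a BackupPowerStatus. The RawSnapshot shape, the meaning assigned to each input (DI_01 as mains healthy, register_40031 as fuel in tenths of a percent, and so on), and the normalize function are all assumptions made for the example; a real mapping comes from the controller's documentation.

```typescript
// Hypothetical raw snapshot. The names echo the examples above, but what each
// input means is an assumption, not a real register map.
type RawSnapshot = {
  DI_01: boolean          // assumed: mains-healthy contact
  DI_02: boolean          // assumed: genset running feedback
  DI_03: boolean          // assumed: ATS in generator position
  alarm_17: boolean       // assumed: genset trip alarm
  register_40031: number  // assumed: fuel level in tenths of a percent
}

type BackupPowerStatus = {
  siteId: string
  mains: 'available' | 'failed'
  genset: 'stopped' | 'starting' | 'running' | 'tripped'
  atsSource: 'utility' | 'generator' | 'unknown'
  fuelPercent: number
  batteryVoltage?: number
  activeAlarms: string[]
  updatedAt: string
}

function normalize(siteId: string, raw: RawSnapshot, now: Date): BackupPowerStatus {
  // A trip takes priority over the running feedback.
  const genset = raw.alarm_17 ? 'tripped' : raw.DI_02 ? 'running' : 'stopped'
  return {
    siteId,
    mains: raw.DI_01 ? 'available' : 'failed',
    genset,
    atsSource: raw.DI_03 ? 'generator' : 'utility',
    // Scale to percent and clamp to 0..100 so a bad reading cannot leak out.
    fuelPercent: Math.min(100, Math.max(0, raw.register_40031 / 10)),
    activeAlarms: raw.alarm_17 ? ['genset_trip'] : [],
    updatedAt: now.toISOString(),
  }
}
```

Keeping this mapping in one place per controller model means the agent, the alert rules, and the reports all depend on the same normalized shape rather than on device details.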

Collapse event noise into incident updates

A power event can generate many state transitions in seconds: mains fail, start signal, voltage available, ATS transfer, load on generator, utility restore, retransfer, cooldown. If every transition becomes a WhatsApp message, the group becomes unreadable.

A better system collapses the sequence into one incident update. It tells the operator what changed, whether the expected sequence completed, and whether action is required.
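One way to sketch that collapse is to compare the transitions seen so far against an expected sequence and emit a single update. Everything here is illustrative: the PowerEvent shape, the event names, the EXPECTED_SEQUENCE list, and the collapseEvents function are assumptions, and a real system would also wait for a settling window before deciding that a step is missing.

```typescript
// Hypothetical event shape; `at` is epoch milliseconds.
type PowerEvent = { at: number; kind: string }

// Assumed expected sequence after a mains failure. Real sites may differ.
const EXPECTED_SEQUENCE = ['mains_fail', 'genset_start', 'ats_transfer', 'load_on_generator']

type IncidentUpdate = {
  siteId: string
  startedAt: number | null
  missingSteps: string[]
  actionRequired: boolean
  message: string
}

function collapseEvents(siteId: string, events: PowerEvent[]): IncidentUpdate {
  // Sort a copy so late-arriving events do not change the caller's array.
  const sorted = events.slice().sort((a, b) => a.at - b.at)
  const missingSteps = EXPECTED_SEQUENCE.filter(step => !sorted.some(e => e.kind === step))
  const actionRequired = missingSteps.length > 0
  const message = actionRequired
    ? `${siteId}: sequence incomplete, missing ${missingSteps.join(', ')}. Action required.`
    : `${siteId}: mains failed, load is on generator. No action required.`
  const first = sorted[0]
  return { siteId, startedAt: first ? first.at : null, missingSteps, actionRequired, message }
}
```

The group then receives one message per incident, and the missingSteps list tells the operator exactly where the sequence stalled.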

The critical alerts are predictable

The most important alerts are not exotic: mains failed and genset did not start; genset running but ATS did not transfer; genset tripped under load; fuel low; battery charger fault; telemetry offline.

Each alert needs a severity, an owner, an acknowledgement step, an escalation timer, and an incident log entry. Without ownership, alerts become noise.

// Each `when` is a descriptive condition string, not evaluated code.
// The timing suffixes (after_15s, after_20s) imply a delay before the rule fires.
const alertRules = [
  { type: 'fail_start', when: 'mains_failed && !genset_running_after_15s', severity: 'critical' },
  { type: 'fail_transfer', when: 'genset_running && ats_not_on_generator_after_20s', severity: 'critical' },
  { type: 'low_fuel', when: 'fuel_percent < 25', severity: 'warning' },
  { type: 'telemetry_offline', when: 'last_seen > 120s', severity: 'warning' }
]
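The rule table above describes its conditions as strings. A minimal sketch of how the same four rules could be evaluated is to replace each string with a predicate over a timing-aware snapshot. The Snapshot fields, the Rule type, and the evaluate function are assumptions for illustration; the thresholds (15 s, 20 s, 25%, 120 s) are taken from the table.

```typescript
// Hypothetical snapshot with elapsed times precomputed by the telemetry layer.
type Snapshot = {
  mainsFailedForS: number | null    // seconds since mains failed, null if mains is available
  gensetRunningForS: number | null  // seconds since genset reported running, null if not running
  atsOnGenerator: boolean
  fuelPercent: number
  lastSeenS: number                 // seconds since the last telemetry message
}

type Rule = {
  type: string
  severity: 'critical' | 'warning'
  when: (s: Snapshot) => boolean
}

const rules: Rule[] = [
  { type: 'fail_start', severity: 'critical',
    when: s => s.mainsFailedForS !== null && s.mainsFailedForS > 15 && s.gensetRunningForS === null },
  { type: 'fail_transfer', severity: 'critical',
    when: s => s.gensetRunningForS !== null && s.gensetRunningForS > 20 && !s.atsOnGenerator },
  { type: 'low_fuel', severity: 'warning', when: s => s.fuelPercent < 25 },
  { type: 'telemetry_offline', severity: 'warning', when: s => s.lastSeenS > 120 },
]

// Returns the types of all rules that currently match, in table order.
function evaluate(s: Snapshot): string[] {
  return rules.filter(r => r.when(s)).map(r => r.type)
}
```

Predicates keep the delay logic explicit and testable. Owner, acknowledgement, and escalation would hang off the matched rule type rather than live inside the predicate.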

WhatsApp status should read like an operator summary

A good /genset status reply is short: utility source available, ATS on normal source, genset in auto mode, fuel 63%, battery 25.8V, no active alarms, running hours 1842h.

The user should not have to know the controller brand, Modbus address, or alarm code. The system should translate hardware state into operational language.
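A small formatter shows what that translation might look like. This is a sketch under assumptions: the formatStatus function and its wording are invented for the example, and it only uses fields from the BackupPowerStatus type, so genset mode and running hours from the sample reply are left out.

```typescript
type BackupPowerStatus = {
  siteId: string
  mains: 'available' | 'failed'
  genset: 'stopped' | 'starting' | 'running' | 'tripped'
  atsSource: 'utility' | 'generator' | 'unknown'
  fuelPercent: number
  batteryVoltage?: number
  activeAlarms: string[]
  updatedAt: string
}

function formatStatus(s: BackupPowerStatus): string {
  const ats = s.atsSource === 'utility' ? 'on normal source'
    : s.atsSource === 'generator' ? 'on generator'
    : 'position unknown'
  const lines = [
    `Site ${s.siteId}`,
    `Utility: ${s.mains}`,
    `ATS: ${ats}`,
    `Genset: ${s.genset}`,
    `Fuel: ${Math.round(s.fuelPercent)}%`,
  ]
  // Battery voltage is optional in the status type, so only print it when present.
  if (s.batteryVoltage !== undefined) lines.push(`Battery: ${s.batteryVoltage.toFixed(1)}V`)
  lines.push(s.activeAlarms.length === 0
    ? 'No active alarms'
    : `Active alarms: ${s.activeAlarms.join(', ')}`)
  return lines.join('\n')
}
```

No register names or alarm codes reach the operator; anything device-specific has already been absorbed by the normalization layer.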

Role access keeps monitoring safe

Viewer can read status. Operator can acknowledge alarms. Supervisor can manage escalations. Admin can manage users, sites, rules, and integrations. Remote control, if allowed at all, requires stricter policy.

Every command should be logged with actor, phone number, role, target site, timestamp, input, output, and result. Auditability is not optional around backup power.
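The role check and the audit record can be sketched together. The names below (Action, MIN_ROLE, handleCommand, the in-memory auditLog) are hypothetical, and treating the four roles as a strict hierarchy where each role inherits the permissions of the ones below it is an assumption, not something the role list guarantees.

```typescript
type Role = 'viewer' | 'operator' | 'supervisor' | 'admin'
type Action = 'read_status' | 'ack_alarm' | 'manage_escalation' | 'manage_config'

// Assumption: roles form a strict hierarchy, lowest to highest.
const ROLE_RANK: Record<Role, number> = { viewer: 0, operator: 1, supervisor: 2, admin: 3 }
const MIN_ROLE: Record<Action, Role> = {
  read_status: 'viewer',
  ack_alarm: 'operator',
  manage_escalation: 'supervisor',
  manage_config: 'admin',
}

type AuditEntry = {
  actor: string
  phone: string
  role: Role
  siteId: string
  timestamp: string
  input: string
  output: string
  result: 'allowed' | 'denied'
}

// In-memory log for the sketch; a real deployment would use durable storage.
const auditLog: AuditEntry[] = []

function handleCommand(
  user: { name: string; phone: string; role: Role },
  siteId: string,
  action: Action,
  input: string,
  now: Date,
): AuditEntry {
  const allowed = ROLE_RANK[user.role] >= ROLE_RANK[MIN_ROLE[action]]
  const entry: AuditEntry = {
    actor: user.name,
    phone: user.phone,
    role: user.role,
    siteId,
    timestamp: now.toISOString(),
    input,
    output: allowed ? `${action} accepted` : `${action} requires ${MIN_ROLE[action]} or higher`,
    result: allowed ? 'allowed' : 'denied',
  }
  auditLog.push(entry) // denied attempts are logged too
  return entry
}
```

Logging denials alongside successes matters here: a viewer repeatedly attempting an operator command is itself something a supervisor may want to see.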

Monthly reports turn events into asset governance

The incident log should become a monthly backup power summary: outage count, total outage duration, genset runtime, failed starts, transfer failures, fuel trend, battery trend, open alarms, and missed tests.
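As a sketch of that roll-up, the function below aggregates a month of incident records into the countable parts of the summary. The Incident and MonthlySummary shapes and the summarizeMonth function are assumptions, timestamps are assumed to be ISO 8601 in UTC, and trend items such as fuel and battery are omitted because they need time-series data rather than incident records.

```typescript
// Hypothetical incident record as it might be stored in the incident log.
type Incident = {
  startedAt: string         // ISO 8601, assumed UTC
  endedAt: string           // ISO 8601, assumed UTC
  gensetRuntimeMin: number
  failedStart: boolean
  transferFailure: boolean
}

type MonthlySummary = {
  month: string             // 'YYYY-MM'
  outageCount: number
  totalOutageMin: number
  gensetRuntimeMin: number
  failedStarts: number
  transferFailures: number
}

function summarizeMonth(month: string, incidents: Incident[]): MonthlySummary {
  // An incident belongs to the month in which it started.
  const inMonth = incidents.filter(i => i.startedAt.slice(0, 7) === month)
  const minutes = (i: Incident) => (Date.parse(i.endedAt) - Date.parse(i.startedAt)) / 60000
  return {
    month,
    outageCount: inMonth.length,
    totalOutageMin: Math.round(inMonth.reduce((sum, i) => sum + minutes(i), 0)),
    gensetRuntimeMin: inMonth.reduce((sum, i) => sum + i.gensetRuntimeMin, 0),
    failedStarts: inMonth.filter(i => i.failedStart).length,
    transferFailures: inMonth.filter(i => i.transferFailure).length,
  }
}
```

Because the summary is derived entirely from the incident log, the report is only as good as the incident records, which is another reason to collapse events into well-formed incidents first.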

That report is how a monitoring system proves value after the emergency is over.