Health Checker & Repair
The health checker tracks the state of every entry in your library and runs scheduled sweeps that re-probe only the entries that actually need checking. Each entry carries a health status visible in Browse, and you can force a recheck of a single entry at any time — independent of the global schedule.
Concepts
Entry health
Every entry has exactly one health record with one of these statuses:
| Status | Meaning |
|---|---|
| `healthy` | All probed files are reachable. |
| `broken` | At least one file failed its probe. |
| `repairing` | A sweep is currently probing or fixing this entry. |
| `unknown` | Probe was inconclusive (no files broken, none confirmed OK). |
| `stale` | Last check is older than `recheck_interval`. |
| `unsupported` | The entry has no probe path on this protocol. |
Each record also stores the file fingerprint, last-checked / last-OK / last-failed timestamps, the next-due time, and the list of broken files (with their Arr IDs when the source is Arr).
When an entry’s file set changes, its fingerprint is invalidated and the entry is marked dirty, forcing a re-probe on the next sweep.
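The page does not say how the fingerprint is computed. As a rough illustration only, the sketch below assumes it is an order-independent hash over each file's path and size; `file_set_fingerprint` and `is_dirty` are hypothetical names, not part of the product.

```python
import hashlib

def file_set_fingerprint(files):
    """Hash a file set given as (path, size_bytes) pairs.

    Assumption: path and size are enough to detect a changed file set.
    Sorting makes the result independent of enumeration order.
    """
    digest = hashlib.sha256()
    for path, size in sorted(files):
        digest.update(f"{path}\0{size}\n".encode("utf-8"))
    return digest.hexdigest()

def is_dirty(stored_fingerprint, files):
    """An entry is dirty when its current file set no longer matches the stored fingerprint."""
    return stored_fingerprint != file_set_fingerprint(files)
```

Under this assumption, adding, removing, or resizing any file changes the digest, which is what would push the entry back into the next sweep.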
Sweeps
A sweep is a single pass that selects candidate entries, probes them, and (when `auto_repair` is on) attempts to fix the broken ones. Sweeps are recorded as runs (history-only; you cannot create a run by hand).
A sweep does not scan your whole library every time. It selects only entries where:
- status is not `healthy`, or
- the entry is dirty, or
- the entry has never been checked, or
- the entry’s `next_check_due_at` has passed.
Healthy entries that haven’t changed since the last successful check are skipped.
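The selection rules above can be read as a single predicate. The following is a minimal sketch, assuming a simplified health record; `EntryHealth` and its field names other than `next_check_due_at` are hypothetical and only mirror the concepts described on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

@dataclass
class EntryHealth:
    # Hypothetical, simplified view of a health record.
    status: str = "unknown"
    dirty: bool = False
    last_checked_at: Optional[datetime] = None
    next_check_due_at: Optional[datetime] = None

def is_candidate(health: EntryHealth, now: datetime) -> bool:
    """Return True when a sweep should re-probe this entry."""
    if health.status != "healthy":
        return True   # broken, unknown, stale, ... always qualify
    if health.dirty:
        return True   # file set changed since the last check
    if health.last_checked_at is None:
        return True   # never checked
    # Healthy and unchanged: only due once the next-check time has passed.
    return health.next_check_due_at is not None and health.next_check_due_at <= now

now = datetime(2024, 1, 8, tzinfo=timezone.utc)
fresh = EntryHealth("healthy", False, now - timedelta(hours=1), now + timedelta(hours=167))
print(is_candidate(fresh, now))  # a recently checked healthy entry is skipped
```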
Single-entry recheck
You can force a recheck on any one entry from the Browse UI or the API. This works even when the global health checker is disabled, does not produce a run record, and can optionally also repair (`fix=true`).
Configuration
```json
{
  "repair": {
    "enabled": true,
    "source": "arr",
    "schedule": "0 4 * * *",
    "workers": 5,
    "strategy": "per_entry",
    "recheck_interval": "168h",
    "arrs": [],
    "auto_repair": true,
    "notify_on_complete": false,
    "nntp_connection_percent": 20
  }
}
```

| Field | Description | Default |
|---|---|---|
| `enabled` | Master switch. When off, no scheduled sweep runs (single rechecks still work). | `false` |
| `source` | Where to enumerate from: `arr` (walk Arr media) or `managed` (walk managed entries). | `arr` |
| `schedule` | Cron expression for the recurring sweep. Required when `enabled` is `true`. | (none) |
| `workers` | Number of concurrent probe workers. | `5` |
| `strategy` | `per_entry` (stop after the first broken file in an entry) or `per_file` (probe all files). | `per_entry` |
| `recheck_interval` | How long a healthy entry’s check stays fresh before it becomes a candidate again. | `168h` |
| `arrs` | Optional filter when `source=arr`. An empty list means all eligible Arrs. | `[]` |
| `auto_repair` | When `true`, broken entries are repaired in the same sweep. When `false`, the sweep is detect-only. | `false` |
| `notify_on_complete` | Send a notification when a sweep finishes. | `false` |
| `nntp_connection_percent` | Percentage of NNTP connections the probe is allowed to use. Avoids starving downloads. | `20` |
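To make the constraints in the table concrete, here is a small sketch of a client-side pre-check you could run before submitting a config. It is not the server's validator: `validate_repair_config` is a hypothetical helper, and it only encodes rules stated in the table (allowed `source` and `strategy` values, a positive `workers` count, and `schedule` being required when `enabled` is true).

```python
def validate_repair_config(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    errors = []
    if cfg.get("source", "arr") not in ("arr", "managed"):
        errors.append("source must be 'arr' or 'managed'")
    if cfg.get("strategy", "per_entry") not in ("per_entry", "per_file"):
        errors.append("strategy must be 'per_entry' or 'per_file'")
    workers = cfg.get("workers", 5)
    if not isinstance(workers, int) or workers < 1:
        errors.append("workers must be a positive integer")
    # The cron syntax itself is not checked here; only its presence.
    schedule = str(cfg.get("schedule") or "")
    if cfg.get("enabled", False) and not schedule.strip():
        errors.append("schedule is required when enabled is true")
    return errors

print(validate_repair_config({"enabled": True}))
# -> ['schedule is required when enabled is true']
```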
Strategies
- `per_entry`: the probe stops at the first broken file in an entry. Faster on broken libraries; sufficient when you only need to know “is this entry intact?”.
- `per_file`: probe every file. Use when you want a complete broken-file list per entry (e.g. to drive a per-file Arr re-acquire).
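The difference between the two strategies comes down to whether the loop stops early. A minimal sketch, assuming a caller-supplied `probe` function that returns `True` for a reachable file (`probe_entry` and `probe` are illustrative names, not the product's internals):

```python
from typing import Callable, Iterable, List

def probe_entry(files: Iterable[str],
                probe: Callable[[str], bool],
                strategy: str = "per_entry") -> List[str]:
    """Return the broken files found in one entry under the given strategy."""
    broken = []
    for name in files:
        if not probe(name):
            broken.append(name)
            if strategy == "per_entry":
                break  # one broken file is enough to mark the entry broken
    return broken
```

With `per_entry` the returned list holds at most one file, so it answers "is this entry intact?" cheaply; `per_file` pays for probing every file in exchange for the complete list.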
Sources
- `arr`: walks Sonarr / Radarr media, resolves each item to an entry through the symlink target, and probes that entry. Broken files are enriched with Arr identifiers so repair can call `DeleteFiles` + `SearchMissing` directly.
- `managed`: walks managed entries directly. Used when you don’t have an Arr in front, or for entries imported outside the Arr flow. Broken NZB-only managed entries have no Arr re-acquire path; they stay `broken` with `failure_reason=usenet_segment_missing` for you to react to manually.
Workflow
- The cron tick (or a manual Run Now) opens a new run.
- Candidates are selected per the rules above.
- For each candidate, the entry is marked `repairing` and probed by a worker.
- The entry’s health is updated with the probe outcome (`healthy` / `broken` / `unknown`), broken files, fingerprint, and timestamps.
- If `auto_repair` is on and the entry has Arr-known broken files, they are batched per Arr and submitted via delete + search. After the repair attempt, only those entries are re-probed.
- The run record’s stats are updated live throughout, so the status panel reflects progress without needing to wait for completion.
A failure to persist a single entry’s health update is logged but never aborts the sweep.
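The "batched per Arr" step in the workflow can be pictured as a simple grouping. This is a sketch under assumptions: the dictionary keys `arr` and `arr_file_id` are hypothetical stand-ins for whatever Arr identifiers the health record actually stores.

```python
from collections import defaultdict

def batch_broken_by_arr(broken_files):
    """Group Arr-known broken files by Arr name, one batch per Arr.

    Files without an Arr identifier are left out, since they have
    no delete + search path.
    """
    batches = defaultdict(list)
    for item in broken_files:
        arr_name = item.get("arr")
        if arr_name is None:
            continue
        batches[arr_name].append(item["arr_file_id"])
    return dict(batches)
```

Each resulting batch would then be submitted to its Arr in one delete + search round rather than one call per file.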
Stop and disable
- Stop cancels the active run. In-flight probes return at the next checkpoint, the run is marked `cancelled`, and entries that hadn’t yet finished probing revert to their previous status.
- Disable (`enabled=false`) cancels any in-flight run and unregisters the cron job. Per-entry health data is kept, so Browse badges stay meaningful and single-entry rechecks remain available.
Singleton invariants
- At most one sweep runs at a time. Triggering a second one returns `409 Conflict`.
- A force-recheck on an entry that is already being probed by the active sweep returns `409 Conflict`.
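One way to picture these two invariants is a small gate object guarding sweep starts and in-flight entries. This is a hedged sketch, not the actual implementation: `SweepGate` and its methods are hypothetical, and a request handler would answer `409 Conflict` whenever one of the checks returns `False`.

```python
import threading

class SweepGate:
    """Tracks whether a sweep is active and which entries it is probing."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = False
        self._in_flight = set()

    def try_start_sweep(self) -> bool:
        """Return False if a sweep is already running (caller responds 409)."""
        with self._lock:
            if self._running:
                return False
            self._running = True
            return True

    def finish_sweep(self) -> None:
        with self._lock:
            self._running = False
            self._in_flight.clear()

    def mark_probing(self, name: str) -> None:
        with self._lock:
            self._in_flight.add(name)

    def done_probing(self, name: str) -> None:
        with self._lock:
            self._in_flight.discard(name)

    def can_force_recheck(self, name: str) -> bool:
        """Return False while the active sweep is probing this entry (caller responds 409)."""
        with self._lock:
            return name not in self._in_flight
```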
API
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/api/repair/config` | Read current config. |
| `PUT` | `/api/repair/config` | Update config (validates cron, workers, source). |
| `GET` | `/api/repair/status` | Active run summary, last completed run, and counts of entries by status. |
| `POST` | `/api/repair/run` | Trigger a sweep now. |
| `POST` | `/api/repair/stop` | Cancel the active run. |
| `GET` | `/api/repair/runs` | Run history. |
| `GET` | `/api/repair/runs/{id}` | Run detail. |
| `DELETE` | `/api/repair/runs` | Clear run history. |
| `GET` | `/api/repair/health` | List entry health (optional `?status=broken`). |
| `GET` | `/api/repair/health/{name}` | Per-entry health, including the broken-file list. |
| `POST` | `/api/repair/health/{name}/check` | Force-recheck this entry. Add `?fix=true` to also repair. |
| `POST` | `/api/repair/recheck/media` | Recheck a single Arr media item. Body: `{arr, media_id, fix}`. |
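If you script against these endpoints, entry names placed in the path should be URL-encoded. The sketch below builds (but does not send) the force-recheck request using only the Python standard library; `build_check_request` is a hypothetical helper, and the base URL and bearer-token header are taken from the curl examples on this page.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8282"  # same host and port as the curl examples

def build_check_request(name: str, token: str, fix: bool = False,
                        base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST to /api/repair/health/{name}/check, optionally with ?fix=true."""
    path = "/api/repair/health/" + urllib.parse.quote(name, safe="") + "/check"
    if fix:
        path += "?fix=true"
    return urllib.request.Request(
        base_url + path,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )

req = build_check_request("My.Show.S01", "YOUR_API_TOKEN", fix=True)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; an HTTP 409 then surfaces as `urllib.error.HTTPError`, which matches the conflict case described under the singleton invariants.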
Examples
Run a sweep now:

```bash
curl -X POST http://localhost:8282/api/repair/run \
  -H "Authorization: Bearer YOUR_API_TOKEN"
```

Stop the active sweep:

```bash
curl -X POST http://localhost:8282/api/repair/stop \
  -H "Authorization: Bearer YOUR_API_TOKEN"
```

Force-recheck a single entry, and repair it if broken:

```bash
curl -X POST 'http://localhost:8282/api/repair/health/My.Show.S01/check?fix=true' \
  -H "Authorization: Bearer YOUR_API_TOKEN"
```

Recheck a single Sonarr series:

```bash
curl -X POST http://localhost:8282/api/repair/recheck/media \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"arr":"Sonarr","media_id":"123","fix":true}'
```

Browse page
Each entry row carries a health badge whose colour reflects the current status (healthy / broken / repairing / unknown / stale / unsupported). Hovering shows the last-checked time, broken-file count, and top failure reason. The row’s action menu has Recheck and Recheck & fix, and clicking the badge opens a side panel listing every broken file with its reason and Arr link.
The badges flip live during a sweep — entries move through repairing → healthy/broken as the workers report results, without needing to refresh after the run finishes.
Per-Arr and per-protocol skips
Disable repair for specific Arrs:

```json
{
  "arrs": [
    { "name": "Sonarr", "skip_repair": true }
  ]
}
```

Skip repair for all NZB / Usenet entries:

```json
{
  "usenet": { "skip_repair": true }
}
```

Rate limiting
Probes against debrid providers respect a separate rate limit so they don’t starve regular downloads:

```json
{
  "debrids": [
    { "provider": "realdebrid", "repair_rate_limit": "60/minute" }
  ]
}
```

For Usenet probing, `nntp_connection_percent` (above) caps the share of NNTP connections the health checker uses.
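To get a feel for what these two limits mean in practice, the sketch below turns them into concrete numbers. It rests on assumptions the page does not confirm: that a rate limit has the form `count/unit`, that the units shown here are accepted, and that the connection budget is rounded down with a floor of one. All function names are hypothetical.

```python
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}  # assumed unit names

def parse_rate_limit(spec: str):
    """Split a 'count/unit' string into (count, period_in_seconds)."""
    count_str, unit = spec.split("/", 1)
    return int(count_str), _UNIT_SECONDS[unit.strip().lower()]

def min_interval_seconds(spec: str) -> float:
    """Average spacing between probes that stays within the limit."""
    count, period = parse_rate_limit(spec)
    return period / count

def probe_connection_budget(total_connections: int, percent: int) -> int:
    """Connections available to probes; assumes rounding down, but never below one."""
    return max(1, total_connections * percent // 100)

print(min_interval_seconds("60/minute"))   # one probe per second on average
print(probe_connection_budget(50, 20))     # 10 of 50 connections
```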
Troubleshooting
Sweep takes too long
Increase `workers`, or raise `recheck_interval` so fewer entries become candidates each tick:

```json
{
  "repair": {
    "workers": 10,
    "recheck_interval": "336h"
  }
}
```

Sweep is too aggressive
Lower `workers`, change `schedule` so sweeps run less often, or set `auto_repair: false` to run detect-only and review broken entries before fixing.
One entry keeps flipping broken
Open its detail panel from Browse to see the broken-file list and reason. For Arr-sourced entries you can use Recheck & fix for a one-shot repair without waiting for the next sweep.
Migration from v1
The job-based repair system (`repair_jobs`, `repair_keys`, the Add Job UI) has been removed. On first boot of the new system the legacy buckets are dropped, and existing recurring schedules do not carry over; re-enable repair once with the new config.
Per-entry health records (repair_state) are forward-compatible; new fields default to empty and populate on the next probe.