What We’re Testing
The Health page includes a “Network Relay Health” card that performs an on-demand probe of DERP relay servers. This is the closest thing to an alerting mechanism on the Health page — it indicates degraded relay connectivity that could affect machines falling back to DERP routing.
The probe is triggered by clicking the Check Health button, which calls:
POST /api/derp-health
Body: { "org_id": "<org_id>" }
This hits handleDerpHealth in backend/src/handlers/derp-health.ts. The handler:
- Looks up org-configured DERP regions from derp_regions WHERE org_id = ?
- Falls back to the default server (vpn.quickztna.com, region US-1) if no custom regions are configured
- For each region, makes an HTTP GET to https://<hostname>/derp/probe with a 5-second timeout
- Considers a region healthy if the probe response status is 200, 404, or 426 (any of these indicate the DERP server is reachable and responding)
- Queries relay_sessions joined to machines for sessions with last_heartbeat within the last 2 minutes
- Returns a payload with region_health, active_sessions, total_sessions, and per-session data
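The probing steps in the handler description can be sketched in TypeScript as follows. This is a minimal illustration, not the actual contents of derp-health.ts: the DerpRegion and RegionHealth shapes, the FetchLike type, and the names resolveRegions, isHealthyStatus, and probeRegion are all hypothetical. Only the default server, the accepted status codes, the probe URL, and the 5-second timeout come from the description.

```typescript
// Hypothetical shapes; the real handler's types are not shown in this document.
interface DerpRegion {
  hostname: string;
  region_code: string;
}

interface RegionHealth extends DerpRegion {
  healthy: boolean;
}

// Minimal fetch-like signature so the probe can be exercised without a network.
type FetchLike = (
  url: string,
  init: { method: string; signal: AbortSignal },
) => Promise<{ status: number }>;

// Default server used when the org has no custom regions.
const DEFAULT_REGION: DerpRegion = { hostname: "vpn.quickztna.com", region_code: "US-1" };

// Status codes treated as "reachable and responding".
const HEALTHY_STATUSES = new Set([200, 404, 426]);

function resolveRegions(custom: DerpRegion[]): DerpRegion[] {
  return custom.length > 0 ? custom : [DEFAULT_REGION];
}

function isHealthyStatus(status: number): boolean {
  return HEALTHY_STATUSES.has(status);
}

async function probeRegion(
  region: DerpRegion,
  fetchImpl: FetchLike,
  timeoutMs = 5000,
): Promise<RegionHealth> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`https://${region.hostname}/derp/probe`, {
      method: "GET",
      signal: controller.signal,
    });
    return { ...region, healthy: isHealthyStatus(res.status) };
  } catch {
    // DNS failure, connection refused, and timeout aborts all count as unhealthy.
    return { ...region, healthy: false };
  } finally {
    clearTimeout(timer);
  }
}
```

Injecting the fetch function is a design choice for this sketch only; it lets the timeout and status logic be tested with a stub instead of a live DERP server.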
The frontend (HealthPage.tsx) calls this via api.functions.invoke("derp-health"), which maps to POST /api/derp-health (the functions.invoke helper posts to /<fnName>).
After a successful probe, the card renders:
- Relay Status badge: Healthy (green) if derpHealth.healthy !== false, otherwise Degraded (red)
- Active Sessions count: from derpHealth.active_sessions
- Region label: from derpHealth.region (falls back to "Primary" if not present)
- Probe Results table (only shown if probe_results is a non-empty array in the response)
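The rendering rules for the card can be expressed as a small pure function. This is a sketch under assumptions: the DerpHealthState and RelayCardView types and the toRelayCardView name are hypothetical and are not taken from HealthPage.tsx, and the use of ?? for the region fallback is a guess at how "not present" is handled.

```typescript
// Hypothetical view-model helper; field names follow the ones the card reads.
interface DerpHealthState {
  healthy?: boolean;
  active_sessions?: number;
  region?: string;
  probe_results?: unknown;
}

interface RelayCardView {
  badge: "Healthy" | "Degraded";
  activeSessions: number | undefined;
  regionLabel: string;
  showProbeTable: boolean;
}

function toRelayCardView(derpHealth: DerpHealthState): RelayCardView {
  return {
    // Only an explicit `false` produces the Degraded badge.
    badge: derpHealth.healthy !== false ? "Healthy" : "Degraded",
    activeSessions: derpHealth.active_sessions,
    // Assumed fallback operator; the page may use a different check.
    regionLabel: derpHealth.region ?? "Primary",
    showProbeTable:
      Array.isArray(derpHealth.probe_results) && derpHealth.probe_results.length > 0,
  };
}
```

One consequence of the `!== false` comparison is that a response with no healthy field at all still renders as Healthy, which is worth keeping in mind when a Degraded badge is expected but does not appear.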
Note: The derp-health endpoint requires authentication (getAuthedUser) and org membership (isOrgMember).
Your Test Setup
| Machine | Role |
|---|---|
| ⊞ Win-A | Browser observation + direct API testing of the health check endpoint |
ST1 — DERP Health Check Button Triggers Probe
What it verifies: Clicking “Check Health” sends a POST to /api/derp-health and the card populates with the probe result.
Steps:
- On ⊞ Win-A, navigate to /health.
- Find the Network Relay Health card. Before clicking, it shows: Click "Check Health" to probe DERP relay status
- Click the Check Health button. The button shows a spinning loader icon while the request is in flight.
- Once the request completes (typically within about 5 seconds for a single region, given the 5-second probe timeout per region), the card body should populate with three sub-panels: Relay Status, Active Sessions, and Region.
- Verify the same result via direct API call from ⊞ Win-A:
TOKEN="YOUR_ADMIN_TOKEN"
ORG_ID="YOUR_ORG_ID"
curl -s -X POST https://login.quickztna.com/api/derp-health \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{\"org_id\":\"$ORG_ID\"}" | python3 -m json.tool
Expected API response:
{
"success": true,
"data": {
"derp_server": {
"hostname": "vpn.quickztna.com",
"stun_port": 3478,
"derp_port": 443,
"region_code": "US-1",
"region_name": "QuickZTNA Primary",
"healthy": true,
"stun_endpoint": "vpn.quickztna.com:3478",
"derp_endpoint": "vpn.quickztna.com:443"
},
"region_health": [
{
"hostname": "vpn.quickztna.com",
"stun_port": 3478,
"derp_port": 443,
"region_code": "US-1",
"region_name": "QuickZTNA Primary",
"healthy": true,
"stun_endpoint": "vpn.quickztna.com:3478",
"derp_endpoint": "vpn.quickztna.com:443"
}
],
"regions": [],
"active_sessions": 2,
"total_sessions": 3,
"sessions": [
{
"machine_id": "uuid",
"machine_name": "Linux-C",
"tailnet_ip": "100.64.0.3",
"public_ip": "203.0.113.5",
"status": "ready",
"last_heartbeat": "2026-03-17T10:30:45.123Z"
}
],
"stats": {
"relay_regions": 1,
"machines_using_relay": 2,
"direct_connections": 0
}
}
}
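If the API check is scripted rather than read by eye, the expected response can be described with types and a light runtime guard. The interfaces below simply mirror the example JSON; the type names and the isDerpHealthResponse helper are hypothetical, and the guard only checks the fields these sub-tests rely on.

```typescript
interface RegionHealth {
  hostname: string;
  stun_port: number;
  derp_port: number;
  region_code: string;
  region_name: string;
  healthy: boolean;
  stun_endpoint: string;
  derp_endpoint: string;
}

interface RelaySession {
  machine_id: string;
  machine_name: string;
  tailnet_ip: string;
  public_ip: string;
  status: string;
  last_heartbeat: string; // ISO 8601 timestamp
}

interface DerpHealthData {
  derp_server: RegionHealth;
  region_health: RegionHealth[];
  regions: unknown[];
  active_sessions: number;
  total_sessions: number;
  sessions: RelaySession[];
  stats: { relay_regions: number; machines_using_relay: number; direct_connections: number };
}

interface DerpHealthResponse {
  success: true;
  data: DerpHealthData;
}

// Narrow an unknown parsed JSON value to the success shape.
function isDerpHealthResponse(value: unknown): value is DerpHealthResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { success?: unknown; data?: unknown };
  if (v.success !== true || typeof v.data !== "object" || v.data === null) return false;
  const d = v.data as Record<string, unknown>;
  return (
    Array.isArray(d.region_health) &&
    Array.isArray(d.sessions) &&
    typeof d.active_sessions === "number" &&
    typeof d.total_sessions === "number"
  );
}
```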
Pass: Button triggers the probe, card populates with relay status, active sessions count, and region. Relay Status badge shows Healthy when the probe returns 200/404/426.
Fail / Common issues:
- Button spinner never stops: the probe request may have timed out (5-second timeout per region). Check browser DevTools for the /api/derp-health request duration.
- 401 UNAUTHORIZED from the API: ensure the TOKEN is valid (not expired). The derp-health handler calls getAuthedUser, which validates the JWT.
- 400 MISSING_FIELDS: the request body must include org_id. The frontend sends it via api.functions.invoke("derp-health") with the org_id injected by the context (verify currentOrg.id is set).
ST2 — Healthy DERP Server Probe Result
What it verifies: The probe correctly identifies a healthy DERP server based on HTTP status codes 200, 404, or 426 from /derp/probe.
Steps:
- From ⊞ Win-A, manually probe the default DERP server endpoint:
curl -sv https://vpn.quickztna.com/derp/probe 2>&1 | grep "< HTTP"
Expected output:
< HTTP/2 426
or
< HTTP/2 200
DERP servers typically return 426 (Upgrade Required) to plain HTTP GET requests on the probe endpoint because the client is expected to upgrade to a WebSocket connection. The handler treats 200, 404, and 426 all as “healthy” responses.
- Trigger the health check via the UI or API and confirm "healthy": true in the region_health array for vpn.quickztna.com.
- Verify the Relay Status badge on the Health page shows Healthy (green badge, not destructive/red).
Expected: Direct curl to /derp/probe returns 200, 404, or 426. Health check reports healthy: true. Badge is green.
Pass: DERP probe returns one of the three accepted status codes. Handler reports healthy: true. UI badge is Healthy.
Fail / Common issues:
- Probe returns 5xx or connection refused: the DERP server may be down. Check vpn.quickztna.com availability via ping vpn.quickztna.com.
- Probe returns 200 but healthy: false in the response: this would indicate a logic error. The handler uses (res.status === 200 || res.status === 404 || res.status === 426). Double-check by running the raw API call and inspecting region_health[0].healthy.
ST3 — Degraded State When DERP Probe Fails
What it verifies: When the DERP probe cannot reach the relay server, healthy is false and the UI shows a Degraded badge.
Steps:
- This test is best done by pointing the org to a non-existent DERP region hostname. If you have admin access, insert a test region:
curl -s -X POST https://login.quickztna.com/api/db/derp_regions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"org_id\": \"$ORG_ID\",
\"hostname\": \"nonexistent.test.invalid\",
\"stun_port\": 3478,
\"derp_port\": 443,
\"region_code\": \"TEST-1\",
\"region_name\": \"Test Unreachable\",
\"priority\": 100
}" | python3 -m json.tool
- Run the health check:
curl -s -X POST https://login.quickztna.com/api/derp-health \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{\"org_id\":\"$ORG_ID\"}" | python3 -m json.tool
Expected:
{
"data": {
"region_health": [
{
"hostname": "nonexistent.test.invalid",
"healthy": false,
...
}
]
}
}
- On the Health page, click Refresh (the button text changes to “Refresh” after the first probe). Confirm the Relay Status badge now shows Degraded (red destructive badge).
- Clean up: delete the test region:
curl -s -X DELETE "https://login.quickztna.com/api/db/derp_regions?org_id=$ORG_ID" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{\"_filters\":[{\"column\":\"region_code\",\"op\":\"=\",\"value\":\"TEST-1\"},{\"column\":\"org_id\",\"op\":\"=\",\"value\":\"$ORG_ID\"}]}"
Pass: healthy: false returned for unreachable hostname. UI badge switches to Degraded.
Fail / Common issues:
- Probe does not time out quickly: the handler uses a 5-second AbortController timeout. DNS resolution for .invalid domains should fail quickly, but in some environments DNS lookups can take several seconds before failing.
- Badge still shows Healthy: the UI renders based on derpHealth.healthy !== false, so a missing healthy field also renders as Healthy. In the API response, healthy sits on the top-level derp_server object (which is region_health[0]), not on a probe_results array. Check which of these fields the derpHealth object in the UI actually carries.
ST4 — Active Sessions Count Reflects Relay Usage
What it verifies: The Active Sessions count shown in the health card accurately reflects the number of machines using the DERP relay in the last 2 minutes.
Steps:
- Ensure ⊞ Win-A has been running with ztna up for more than 2 minutes so that a relay session exists (each heartbeat refreshes relay_sessions.last_heartbeat).
- Run the health check and note the active_sessions and total_sessions values:
curl -s -X POST https://login.quickztna.com/api/derp-health \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{\"org_id\":\"$ORG_ID\"}" | python3 -m json.tool
- Cross-check by querying relay_sessions directly:
curl -s "https://login.quickztna.com/api/db/relay_sessions?org_id=$ORG_ID&select=machine_id,status,last_heartbeat" \
-H "Authorization: Bearer $TOKEN" | python3 -m json.tool
- From the relay_sessions response, manually count how many sessions have last_heartbeat within the last 2 minutes. This should match active_sessions from the health check response.
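The manual count can also be done with a few lines of TypeScript. This is only a sketch: it assumes the rows have already been extracted from the relay_sessions response into an array, the RelaySessionRow type and countActiveSessions name are hypothetical, and treating a heartbeat exactly 2 minutes old as active is an assumption about the boundary. It applies the time window only, not any other filter the handler's join to machines may add.

```typescript
interface RelaySessionRow {
  machine_id: string;
  status: string;
  last_heartbeat: string; // ISO 8601 timestamp
}

const ACTIVE_WINDOW_MS = 2 * 60 * 1000;

// Count rows whose last_heartbeat falls within the last 2 minutes of `now`.
function countActiveSessions(rows: RelaySessionRow[], now: Date = new Date()): number {
  const cutoff = now.getTime() - ACTIVE_WINDOW_MS;
  return rows.filter((row) => {
    const heartbeat = Date.parse(row.last_heartbeat);
    // Unparseable timestamps are treated as stale.
    return !Number.isNaN(heartbeat) && heartbeat >= cutoff;
  }).length;
}
```

Passing an explicit now (for example the time the health check was run) avoids drift between the two requests.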
Expected: active_sessions equals the count of relay_sessions rows with last_heartbeat newer than 2 minutes ago.
Pass: active_sessions count matches the manual count from relay_sessions with last_heartbeat < 2 min ago.
Fail / Common issues:
- active_sessions is 0 despite machines being online: machines that achieve direct P2P WireGuard connections may not have a relay_sessions entry, or their session may have expired. This is expected: active_sessions counts relay-dependent machines, not all online machines.
- Sessions appear in relay_sessions but active_sessions is still 0: check whether last_heartbeat timestamps are stale. The heartbeat handler updates relay_sessions.last_heartbeat only while the machine is not quarantined or admin-disabled.
ST5 — Unauthenticated and Missing org_id Errors
What it verifies: The derp-health endpoint correctly rejects requests missing authentication or org_id.
Steps:
- Test without an auth token:
curl -s -X POST https://login.quickztna.com/api/derp-health \
-H "Content-Type: application/json" \
-d "{\"org_id\":\"$ORG_ID\"}" | python3 -m json.tool
Expected:
{
"success": false,
"error": {
"code": "UNAUTHORIZED",
"message": "..."
}
}
- Test with a valid token but no org_id:
curl -s -X POST https://login.quickztna.com/api/derp-health \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{}" | python3 -m json.tool
Expected:
{
"success": false,
"error": {
"code": "MISSING_FIELDS",
"message": "org_id required"
}
}
- Test with a valid token but an org_id the user does not belong to:
curl -s -X POST https://login.quickztna.com/api/derp-health \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{\"org_id\":\"00000000-0000-0000-0000-000000000000\"}" | python3 -m json.tool
Expected:
{
"success": false,
"error": {
"code": "FORBIDDEN",
"message": "Not a member"
}
}
Pass: 401 for missing token, 400 for missing org_id, 403 for non-member org.
Fail / Common issues:
- Missing org_id returns 401 instead of 400: the handler reads org_id from query params or request body, but the auth check runs first, so a 401 means the request failed authentication before the missing-fields check was reached. Confirm the Authorization header carries a valid token, and ensure the Content-Type: application/json header is present so the body parses.
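The order of checks that these three cases exercise can be sketched as a pure function. This is an illustration of the ordering implied by the expected results, not the handler's actual code: the ValidationInput and ApiError types and the validateDerpHealthRequest name are hypothetical, and the user and isMember fields stand in for the outcomes of getAuthedUser and isOrgMember.

```typescript
interface ApiError {
  status: number;
  code: "UNAUTHORIZED" | "MISSING_FIELDS" | "FORBIDDEN";
}

interface ValidationInput {
  user: { id: string } | null; // stand-in for the result of getAuthedUser
  orgId: string | undefined;
  isMember: boolean; // stand-in for the result of isOrgMember
}

// Returns the first failing check, or null if the request may proceed.
function validateDerpHealthRequest(input: ValidationInput): ApiError | null {
  if (!input.user) return { status: 401, code: "UNAUTHORIZED" };
  if (!input.orgId) return { status: 400, code: "MISSING_FIELDS" };
  if (!input.isMember) return { status: 403, code: "FORBIDDEN" };
  return null;
}
```

Because authentication is checked first, a request with neither a token nor an org_id yields 401 rather than 400 in this model.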
Summary
| Sub-test | What it proves | Pass condition |
|---|---|---|
| ST1 | Check Health button triggers DERP probe | POST to /api/derp-health fires; card populates with region data |
| ST2 | Healthy DERP server detected correctly | HTTP 200/404/426 from /derp/probe → healthy: true → green badge |
| ST3 | Degraded state for unreachable relay | Connection failure → healthy: false → Degraded red badge |
| ST4 | Active sessions count is accurate | active_sessions matches relay_sessions rows with heartbeat under 2 min |
| ST5 | Auth and input validation | 401 no token, 400 no org_id, 403 non-member org |