Troubleshooting
Vector circuit breaker
The Vectorize / Qdrant client is wrapped in a circuit breaker so a vector-store outage degrades gracefully instead of failing the visitor turn.
What the breaker does
- Counts consecutive errors from the underlying client in a sliding window.
- Once VECTOR_CIRCUIT_THRESHOLD errors land inside VECTOR_CIRCUIT_WINDOW seconds, the circuit opens.
- While open (VECTOR_CIRCUIT_COOLDOWN seconds):
  - search() returns an empty array immediately. The LLM still answers; it just has no retrieved <source> chunks for that turn.
  - upsertPoints() / delete* / ensureCollection() / dropCollection() raise CircuitOpenException. Queued jobs requeue automatically.
- After cooldown, the next call probes the underlying client. A successful probe clears the failure counter; a failed probe re-opens the circuit for another cooldown window.
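
The open, cooldown, and probe cycle described above can be sketched as a small state machine. This is a minimal illustration in Python, not the project's implementation: the CircuitBreaker class, its call() method, and the fallback parameter are hypothetical names, and re-raising the underlying error while the circuit is still closed is an assumption the text does not confirm.

```python
import time
from collections import deque


class CircuitOpenException(Exception):
    """Raised when a call without a fallback hits an open circuit."""


class CircuitBreaker:
    # Hypothetical sketch; defaults mirror the documented env var defaults.
    def __init__(self, threshold=5, window=60, cooldown=60, clock=time.monotonic):
        self.threshold = threshold
        self.window = window
        self.cooldown = cooldown
        self.clock = clock
        self.failures = deque()  # timestamps of recent errors
        self.opened_at = None    # None means the circuit is closed

    def is_open(self):
        # Open only while the cooldown has not yet elapsed.
        return self.opened_at is not None and self.clock() - self.opened_at < self.cooldown

    def call(self, operation, fallback=None):
        if self.is_open():
            if fallback is not None:
                return fallback()  # e.g. search() returning an empty list
            raise CircuitOpenException("vector store circuit is open")
        probing = self.opened_at is not None  # cooldown elapsed: this call is the probe
        try:
            result = operation()
        except Exception:
            self._record_failure(probing)
            raise  # assumption: errors surface while the circuit is closed
        # Success (including a successful probe) clears the failure counter.
        self.failures.clear()
        self.opened_at = None
        return result

    def _record_failure(self, probing):
        now = self.clock()
        if probing:
            # A failed probe re-opens the circuit for another cooldown.
            self.opened_at = now
            return
        self.failures.append(now)
        # Drop errors that have rolled out of the sliding window.
        while self.failures and now - self.failures[0] > self.window:
            self.failures.popleft()
        if len(self.failures) >= self.threshold:
            self.opened_at = now
```

In this sketch a read path would pass `fallback=lambda: []` so that it degrades to "no retrieved chunks", while write paths pass no fallback and let CircuitOpenException propagate to the queue worker.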
Configuration
| Env var | Default | What it does |
|---|---|---|
| VECTOR_CIRCUIT_THRESHOLD | 5 | Errors needed to trip the breaker. |
| VECTOR_CIRCUIT_WINDOW | 60 | Sliding-window seconds for the failure counter. |
| VECTOR_CIRCUIT_COOLDOWN | 60 | Seconds the breaker stays open before the next probe. |
The breaker keeps its state in whichever cache store cache.default resolves to.
On a production stack backed by Redis, that state is shared across
workers, so an outage trips the breaker for the whole cluster at once
instead of being rediscovered by each worker.
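
To see why a shared cache matters, the following Python sketch compares two workers that share one failure counter with two workers that each keep their own. It is an illustration under stated assumptions, not the project's code: the Worker class, the cache key name, and the plain dict standing in for the cache store are all hypothetical, and the sliding window and atomic increments are left out for brevity.

```python
THRESHOLD = 5  # mirrors the VECTOR_CIRCUIT_THRESHOLD default


class Worker:
    def __init__(self, store):
        self.store = store  # dict standing in for the cache backend

    def record_failure(self):
        # Hypothetical cache key; the real key name is not documented here.
        count = self.store.get("vector_circuit:failures", 0) + 1
        self.store["vector_circuit:failures"] = count
        return count >= THRESHOLD  # True once the breaker would trip


def failures_until_trip(workers):
    """Spread failures round-robin across workers; count them until one trips."""
    total = 0
    while True:
        for worker in workers:
            total += 1
            if worker.record_failure():
                return total


shared = {}
shared_total = failures_until_trip([Worker(shared), Worker(shared)])  # 5
isolated_total = failures_until_trip([Worker({}), Worker({})])        # 9
```

With a shared store the fifth failure anywhere in the cluster trips the breaker; with per-worker state, each worker has to accumulate five failures of its own before it stops calling the vector store.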
What an open circuit looks like
From laravel.log:
vector.circuit_open {"client":"vectorize","count":5,"cooldown":60,"error":"Cloudflare 502 Bad Gateway"}
The widget continues to serve responses, just without retrieval. Once Vectorize recovers, the breaker auto-closes on the next successful probe; no operator action is required.
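
For alerting or ad hoc log review, the entry above can be picked out of the log and its JSON context decoded. The Python sketch below assumes the message name is followed by a single JSON object, as in the sample line; parse_circuit_open is a hypothetical helper, and any prefix before the message name (timestamp, channel, level) is simply skipped.

```python
import json

MARKER = "vector.circuit_open "


def parse_circuit_open(line):
    """Return the JSON context of a vector.circuit_open log line, or None."""
    index = line.find(MARKER)
    if index == -1:
        return None
    payload = line[index + len(MARKER):].lstrip()
    try:
        # raw_decode parses the leading JSON value and ignores trailing text.
        context, _end = json.JSONDecoder().raw_decode(payload)
    except json.JSONDecodeError:
        return None
    return context


sample = (
    'vector.circuit_open {"client":"vectorize","count":5,'
    '"cooldown":60,"error":"Cloudflare 502 Bad Gateway"}'
)
event = parse_circuit_open(sample)
```

The fields in `event` are exactly those shown in the sample entry (client, count, cooldown, error), so a monitoring script could, for instance, alert when the same client reports repeated openings.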
When to tune the values
- Lower the threshold if you'd rather degrade than push retries through during a flap. Default of 5 is reasonable; for very low-traffic sites consider 3.
- Raise the cooldown if your provider tends to misbehave for 5 to 10 minutes. Default 60s probes aggressively; setting 300 reduces probe traffic during a long outage.
- Shrink the window if you only want to trip on a true outage, not a daily blip. Default 60s sliding window means transient flaps roll off quickly.
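
The trade-offs above can be checked with some rough arithmetic before changing a value. The Python helpers below are hypothetical and deliberately simplified: probes_during_outage assumes steady traffic, so a probe fires as soon as each cooldown ends, and would_trip ignores any successful calls between the errors.

```python
def probes_during_outage(outage_seconds, cooldown_seconds):
    """Approximate probe count for an outage measured from when the circuit opens."""
    return outage_seconds // cooldown_seconds


def would_trip(error_times, threshold, window):
    """True if any `threshold` errors fall within `window` seconds of each other."""
    times = sorted(error_times)
    for i in range(threshold - 1, len(times)):
        if times[i] - times[i - threshold + 1] <= window:
            return True
    return False


# A 10-minute outage: cooldown 60 gives about 10 probes, cooldown 300 about 2.
probes_default = probes_during_outage(600, 60)
probes_relaxed = probes_during_outage(600, 300)

# Five errors spread over 58 seconds trip a 60s window but not a 30s window.
blip = [0, 20, 40, 55, 58]
trips_default_window = would_trip(blip, threshold=5, window=60)
trips_short_window = would_trip(blip, threshold=5, window=30)
```

The same five errors that open the circuit under the default window are ignored with a 30-second window, which is the effect the last bullet describes.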