SLANUSZ-25: Photo capture not working again (Fotózás újra nem működik)
P1 — Production screenshot capture broken for all NUSZ operators
Summary
Operators at NUSZ cannot take screenshots during videochat sessions. The IpFilterService rate limiter blocks ALL operators after 5 screenshot attempts within 10 seconds from any combination of operators, because all operators share a single corporate NAT IP (91.82.81.14). Cooldown is 10 minutes, causing recurring 10-15 minute outage windows.
A secondary issue compounds this: 698 RPC transport timeouts (rpc-transport-css) indicate vuer_css is frequently unresponsive to screenshot RPC calls, likely due to customer WebSocket disconnections.
Original Complaint
“Sziasztok! Kérelk nézzétek meg mi lehet a probléma. Ismét nem tududnk fotókat készíteni vagy ha mégis akkor aaz alábbi hibaüzenet érekezik a rendszertől” — SzaboneNagy.Zsuzsa
Translation: “Hi! Please look at what the problem could be. Again we can’t take photos or if we do, the following error message arrives from the system.”
Root Cause 1: IP-Based Rate Limiting (PRIMARY)
Evidence Chain
Confirmed with high confidence
- server/web/api/operator/screenshot.js:10 applies createIpMonitorAndFilter('videochat-operator-screenshot')
- IpFilterService (server/service/IpFilterService.js) tracks requests by {IP}:{tag} key
- Config: suppressAfterAttempts=5, throttlePeriodMs=10000 (10s), coolDownPeriodMs=600000 (10min)
- uncheckedIps is empty (no whitelist)
- All NUSZ operators exit through corporate NAT: 91.82.81.14
- After 5 screenshots from ANY operator combo within 10s → ALL operators blocked for 10 minutes
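The evidence chain above can be reproduced with a small simulation. This is a minimal sketch, not the real IpFilterService: the class name SharedIpLimiter and the sliding-window bookkeeping are assumptions, and only the three thresholds, the NAT IP, the tag, and the {IP}:{tag} key format are taken from the investigation.

```javascript
// Hypothetical stand-in for IpFilterService. Only the thresholds and the
// {IP}:{tag} key format come from the investigation; the rest is assumed.
const CONFIG = {
  suppressAfterAttempts: 5,
  throttlePeriodMs: 10000,
  coolDownPeriodMs: 600000,
};

class SharedIpLimiter {
  constructor(config) {
    this.config = config;
    this.state = new Map(); // key -> { attempts: number[], blockedUntil: number }
  }

  // Returns true if the request is allowed, false if it is throttled.
  check(ip, tag, nowMs) {
    const key = `${ip}:${tag}`;
    const entry = this.state.get(key) || { attempts: [], blockedUntil: 0 };
    this.state.set(key, entry);

    // Still inside the cooldown window: reject without counting.
    if (nowMs < entry.blockedUntil) return false;

    // Keep only attempts that fall inside the throttle window.
    entry.attempts = entry.attempts.filter(
      (t) => nowMs - t < this.config.throttlePeriodMs
    );

    // Threshold reached: start the cooldown for the whole key.
    if (entry.attempts.length >= this.config.suppressAfterAttempts) {
      entry.blockedUntil = nowMs + this.config.coolDownPeriodMs;
      entry.attempts = [];
      return false;
    }

    entry.attempts.push(nowMs);
    return true;
  }
}

const limiter = new SharedIpLimiter(CONFIG);
const NAT_IP = '91.82.81.14';
const TAG = 'videochat-operator-screenshot';

// Six requests, one per second, from any mix of operators behind the NAT.
// The limiter never sees who sent them, only the shared IP.
const results = [];
for (let i = 0; i < 6; i++) {
  results.push(limiter.check(NAT_IP, TAG, i * 1000));
}
console.log(results); // [ true, true, true, true, true, false ]

// Five minutes later every operator behind the NAT is still rejected.
const stillBlocked = !limiter.check(NAT_IP, TAG, 5 * 60 * 1000);
// Once the 10-minute cooldown has elapsed, requests pass again.
const recovered = limiter.check(NAT_IP, TAG, 5000 + 600000);
console.log(stillBlocked, recovered); // true true
```

Because the key contains no operator identity, the sixth request is rejected regardless of who sends it, which matches the symptom reported by the NUSZ operators.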
Audit Log Evidence
| Timestamp | Target User | IP | Type |
|---|---|---|---|
| 2026-01-30T08:37:41 | 63 | 91.82.81.14 | videochat-operator-screenshot |
| 2026-02-27T08:28:01 | 9 | 91.82.81.14 | videochat-operator-screenshot |
| 2026-03-13T10:32:38 | 74 | 91.82.81.14 | videochat-operator-screenshot |
| 2026-03-16T12:54:51 | 56 | 91.82.81.14 | videochat-operator-screenshot |
4 different operators, same IP, same throttle type.
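The same conclusion can be checked mechanically by grouping the audit rows by IP and throttle type and counting distinct target users. A minimal sketch, assuming the rows have been exported as plain objects; the field names timestamp, targetUser, ip, and type are hypothetical and simply mirror the table columns.

```javascript
// Rows copied from the audit log table; field names are assumed.
const auditRows = [
  { timestamp: '2026-01-30T08:37:41', targetUser: 63, ip: '91.82.81.14', type: 'videochat-operator-screenshot' },
  { timestamp: '2026-02-27T08:28:01', targetUser: 9, ip: '91.82.81.14', type: 'videochat-operator-screenshot' },
  { timestamp: '2026-03-13T10:32:38', targetUser: 74, ip: '91.82.81.14', type: 'videochat-operator-screenshot' },
  { timestamp: '2026-03-16T12:54:51', targetUser: 56, ip: '91.82.81.14', type: 'videochat-operator-screenshot' },
];

// Count how many distinct users were throttled under each IP:type key.
function distinctUsersPerKey(rows) {
  const users = new Map();
  for (const row of rows) {
    const key = `${row.ip}:${row.type}`;
    if (!users.has(key)) users.set(key, new Set());
    users.get(key).add(row.targetUser);
  }
  return new Map([...users].map(([key, set]) => [key, set.size]));
}

const counts = distinctUsersPerKey(auditRows);
console.log(counts.get('91.82.81.14:videochat-operator-screenshot')); // 4
```

A count greater than one for a single key indicates that several users share one throttle bucket.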
Flow Diagram
sequenceDiagram
    participant OpA as Operator A
    participant OpB as Operator B
    participant API as screenshot.js
    participant IPF as IpFilterService
    participant Audit as AuditLog
    Note over IPF: Tracking: 91.82.81.14:videochat-operator-screenshot
    OpA->>API: POST /api/screenshot (attempt 1)
    API->>IPF: check(91.82.81.14, screenshot)
    IPF-->>API: OK (1/5)
    OpA->>API: POST /api/screenshot (attempt 2)
    IPF-->>API: OK (2/5)
    OpB->>API: POST /api/screenshot (attempt 3)
    IPF-->>API: OK (3/5)
    OpA->>API: POST /api/screenshot (attempt 4)
    IPF-->>API: OK (4/5)
    OpB->>API: POST /api/screenshot (attempt 5)
    IPF-->>API: OK (5/5)
    OpA->>API: POST /api/screenshot (attempt 6)
    API->>IPF: check(91.82.81.14, screenshot)
    IPF->>Audit: user.user.throttled
    IPF-->>API: BLOCKED (cooldown 10min)
    API-->>OpA: ERROR_FKIPFT02
    Note over IPF: ALL operators from 91.82.81.14 blocked for 10 minutes
    OpB->>API: POST /api/screenshot
    IPF-->>API: BLOCKED
    API-->>OpB: ERROR_FKIPFT02
Root Cause 2: RPC Transport Timeouts (SECONDARY)
142 screenshot failures + 698 total transport timeouts
Even without throttling, screenshots fail because the RPC call from vuer_oss → vuer_css (rpc-transport-css) times out.
Failed to create remote screenshot Error: RPCCLIENT MESSAGE TIMEOUT rpc-transport-css
Failure distribution by day
| Date | Screenshot Failures |
|---|---|
| 2026-02-09 | 17 |
| 2026-02-23 | 13 |
| 2026-02-27 | 10 |
| 2026-02-04 | 9 |
| 2026-01-21 | 9 |
| 2026-03-13 | 7 |
| 2026-03-16 | 6 |
| 2026-03-10 | 6 |
CSS-side evidence
- TransportError events in vuer_css logs
- Customer transport close disconnections (“the user has lost connection, or the network was changed from WiFi to 4G”)
- UNABLE TO MATCH RPC REPLY warnings: responses arrive after timeout
Likely cause
The customer’s browser disconnects (network change, WiFi→4G, poor connectivity) before the screenshot RPC completes. The operator requests a remote screenshot → vuer_oss sends RPC to vuer_css → vuer_css tries to capture from customer’s WebRTC stream → customer is disconnected → timeout.
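The pairing of MESSAGE TIMEOUT errors with UNABLE TO MATCH RPC REPLY warnings is what a pending-call table would produce when a reply arrives after its deadline. The sketch below illustrates that mechanism only; the class PendingRpcTable, its method names, the call IDs, and the 5000 ms timeout are all hypothetical and are not taken from server/queue/rpc_client/.

```javascript
// Hypothetical pending-call table; not the actual rpc_client implementation.
class PendingRpcTable {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs;
    this.pending = new Map(); // callId -> deadline in ms
  }

  // Record an outgoing call together with its deadline.
  send(callId, nowMs) {
    this.pending.set(callId, nowMs + this.timeoutMs);
  }

  // Drop calls whose deadline has passed; the caller sees a timeout error.
  expire(nowMs) {
    const timedOut = [];
    for (const [callId, deadline] of this.pending) {
      if (nowMs >= deadline) {
        this.pending.delete(callId);
        timedOut.push(callId);
      }
    }
    return timedOut;
  }

  // Match an incoming reply against a pending call. A reply for a call that
  // has already expired has nothing left to match.
  receive(callId) {
    return this.pending.delete(callId) ? 'matched' : 'unmatched';
  }
}

const table = new PendingRpcTable(5000); // timeout value is an assumption
table.send('screenshot-1', 0);
table.send('screenshot-2', 0);

const onTime = table.receive('screenshot-1'); // reply arrives before the deadline
const timedOut = table.expire(5000);          // screenshot-2 hits its deadline
const late = table.receive('screenshot-2');   // reply arrives after the timeout

console.log(onTime, timedOut, late); // matched [ 'screenshot-2' ] unmatched
```

If the real client behaves this way, a high count of unmatched replies would suggest that vuer_css does answer, only too late, which would favor tuning the timeout over treating every failure as a lost customer connection.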
Affected Components
| Component | File | Role |
|---|---|---|
| IpFilterService | server/service/IpFilterService.js | Rate limiting by IP+tag |
| Screenshot API | server/web/api/operator/screenshot.js:10 | Applies throttle middleware |
| Audit log | server/auditlog.js:98-99 | Records throttle events |
| Config | config/docker.json > security.ipFilter | Thresholds |
| VideoChatService | server/service/VideoChatService.js | processRemoteScreenshot() |
| TransportCss RPC | server/queue/rpc_client/ | rpc-transport-css channel |
Recommended Fixes
| # | Fix | Type | Time | Risk |
|---|---|---|---|---|
| 1 | Whitelist 91.82.81.14 in uncheckedIps | Config | 5 min | Disables ALL rate limiting for that IP |
| 2 | Increase suppressAfterAttempts to 50+ | Config | 5 min | Weakens login protection globally |
| 3 | Per-tag throttle overrides in IpFilterService | Code | 2h | None — proper separation |
| 4 | User-based throttling for authenticated endpoints | Refactor | 4h | Best long-term solution |
| 5 | Investigate RPC transport timeout root cause | Debug | ? | May need timeout increase or retry logic |
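Fix 3 amounts to merging tag-specific thresholds over the global defaults before the limiter evaluates a request. A minimal sketch, assuming an overrides map keyed by tag under security.ipFilter; the function name resolveThrottleConfig and the 'login' tag are hypothetical, and the existing IpFilterService may need a different integration point.

```javascript
// Hypothetical helper: returns the effective thresholds for one tag.
function resolveThrottleConfig(ipFilter, tag) {
  // Separate the overrides map from the global defaults.
  const { overrides = {}, ...defaults } = ipFilter;
  const override = Object.prototype.hasOwnProperty.call(overrides, tag)
    ? overrides[tag]
    : {};
  // Tag-specific values win; anything not overridden falls back to defaults.
  return { ...defaults, ...override };
}

const ipFilter = {
  suppressAfterAttempts: 5,
  throttlePeriodMs: 10000,
  coolDownPeriodMs: 600000,
  overrides: {
    'videochat-operator-screenshot': {
      suppressAfterAttempts: 100,
      throttlePeriodMs: 60000,
      coolDownPeriodMs: 60000,
    },
  },
};

const screenshotLimits = resolveThrottleConfig(ipFilter, 'videochat-operator-screenshot');
const loginLimits = resolveThrottleConfig(ipFilter, 'login'); // 'login' is an assumed tag

console.log(screenshotLimits.suppressAfterAttempts); // 100
console.log(loginLimits.suppressAfterAttempts); // 5
```

Tags without an override keep the strict defaults, so login protection is unaffected while the screenshot endpoint gets limits sized for a shared NAT.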
Fix 3 Example (Recommended)
{
  "security": {
    "ipFilter": {
      "suppressAfterAttempts": 5,
      "throttlePeriodMs": 10000,
      "coolDownPeriodMs": 600000,
      "overrides": {
        "videochat-operator-screenshot": {
          "suppressAfterAttempts": 100,
          "throttlePeriodMs": 60000,
          "coolDownPeriodMs": 60000
        }
      }
    }
  }
}
Prevention
- All IP-based rate limiting should be reviewed for NAT-awareness
- Authenticated endpoints should throttle by req.user.id, not req.ip
- Rate limiter should emit a more descriptive error (not HTTP 200 with error string)
- Consider adding a monitoring alert for user.throttled events
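The second prevention item could be realized by choosing the throttle key from the authenticated user when one is present and falling back to the IP otherwise. A minimal sketch, assuming an Express-style request object with req.user.id and req.ip; the function name throttleKey and the key prefixes are hypothetical.

```javascript
// Hypothetical key selector for a rate limiter.
function throttleKey(req, tag) {
  // Authenticated requests get a per-user bucket, so operators behind the
  // same NAT no longer share one counter.
  if (req.user && req.user.id != null) {
    return `user:${req.user.id}:${tag}`;
  }
  // Anonymous requests still fall back to the IP-based bucket.
  return `ip:${req.ip}:${tag}`;
}

const tag = 'videochat-operator-screenshot';
const operatorA = { ip: '91.82.81.14', user: { id: 63 } };
const operatorB = { ip: '91.82.81.14', user: { id: 9 } };
const anonymous = { ip: '91.82.81.14' };

const keyA = throttleKey(operatorA, tag);
const keyB = throttleKey(operatorB, tag);
const keyAnon = throttleKey(anonymous, tag);

console.log(keyA);    // user:63:videochat-operator-screenshot
console.log(keyB);    // user:9:videochat-operator-screenshot
console.log(keyAnon); // ip:91.82.81.14:videochat-operator-screenshot
```

With distinct keys per operator, one operator exceeding the threshold would no longer lock out colleagues who share the corporate NAT address.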
Related
- FaceKom — Platform overview
- vuer_oss — Backend architecture (IpFilterService in Services Layer section)
- security-audit — Rate limiting listed as positive finding, but this shows a gap
- breakage-risks — NUSZ has 74 core file modifications
- customization-branches — NUSZ branch details
- debug-agents — Agent pipeline used for this investigation
- room-export-blueprint — Room export analysis methodology