# User Story 009: Web Worker Architecture (Main-thread Safe Web Library)
## Story
**As a** user building robotics UIs that also render live camera previews and interactive controls
**I want** `@lerobot/web` to run heavy control/recording work off the main thread
**So that** my UI stays smooth (no flicker/jank) even when teleoperation and recording are active
## Background
The current browser implementation runs teleoperation control loops, dataset assembly, and export logic on the main thread. When activating keyboard teleoperation while previewing a camera stream, the preview can flicker due to main-thread contention. This is a UX blocker for real-world apps that combine live video, UI interactions, and hardware control.
A worker-based architecture lets us move CPU-intensive, frequent, or bursty work off the main thread. The main thread remains responsible for DOM, video rendering and user interactions. The library must preserve the existing API (`calibrate()`, `teleoperate()`, `record()`) while transparently using workers when available, and cleanly falling back to the current approach otherwise.
## Goals
- Identical public API to today's `@lerobot/web` (no breaking changes)
- Main-thread safe by default: heavy or frequent work executes in a Web Worker
- Graceful fallback when workers or specific APIs aren't available
- Type-safe, minimal-copy message protocol using Transferables when possible
- Strict library/demo separation: UI and storage remain in demos
- Maintain Python lerobot UX parity and behavior
## Non-Goals (for this story)
- Changing dataset formats or camera acquisition approach
- Moving Web Serial API usage into a worker (browser support for Web Serial inside workers is limited)
- Introducing new external dependencies
## Acceptance Criteria
- Smooth UI under load:
  - With at least one active camera preview and keyboard teleoperation at 60-120 Hz, the preview does not flicker and UI remains responsive at ~60 FPS
- API compatibility:
  - `calibrate()`, `teleoperate()`, `record()` signatures and return shapes are unchanged
  - Feature-detect workers; automatically use worker-backed runtime when available, otherwise use current main-thread runtime
- Clear separation of responsibilities:
  - Worker executes control loops, interpolation, dataset assembly, export packaging, and CPU-heavy transforms
  - Main thread owns DOM/UI and browser-only APIs that are unavailable in workers (e.g., Web Serial write calls)
- Type-safe protocol:
  - Strongly typed request/response messages with versioned `type` fields; Transferable payloads used for large data
- Reliability & fallback:
  - If the worker crashes or becomes unavailable, operations fail gracefully with descriptive errors and suggest retry
  - Fallback path (main-thread) is automatically used when worker creation fails
- Tests & docs:
  - Unit tests cover protocol routing and basic round-trips
  - Planning docs updated; README notes main-thread-safe architecture
## Architecture Overview
### Worker Boundaries
- Execute in Worker:
  - Control loop scheduling and target computation for teleoperation (keyboard/direct and future teleoperators)
  - Episode/frame buffering and interpolation (regularization) for recording
  - Dataset assembly (tables/metadata), packaging (ZIP writer), and background export streaming
  - Lightweight telemetry aggregation for UI
- Execute on Main Thread:
  - DOM, UI, and camera previews (`<video>` elements)
  - Web Serial API read/write bridge (if browser does not permit worker access)
  - MediaRecorder handling (browser-optimized implementation already off main CPU in many engines)
### Threading Model
- Main thread spawns one worker per "process" instance as needed:
  - TeleoperationProcess → TeleopWorker
  - RecordProcess → RecordWorker (can be shared or composed with teleop worker depending on lifecycle)
- The public process objects returned from `teleoperate()`/`record()` are proxies. Method calls post messages to the worker and return promises where appropriate.
- SerialBridge (main-thread): worker requests motor write/read; main thread performs Web Serial operations and returns results. This preserves worker advantages while respecting browser API constraints.
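The proxy idea above can be sketched as a small request/response correlator. This is a minimal illustration, not the library's implementation: `WorkerProxy` and `MessageEndpoint` are hypothetical names, and the reply shape (`requestId`, `result`, `error`) is an assumption.

```typescript
// Minimal structural type for anything that exchanges messages
// (a Worker, a MessagePort, or a test double). Hypothetical name.
interface MessageEndpoint {
  postMessage(message: unknown): void;
  onmessage: ((event: { data: any }) => void) | null;
}

type Pending = {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
};

class WorkerProxy {
  private nextId = 1;
  private readonly pending = new Map<number, Pending>();
  private readonly endpoint: MessageEndpoint;

  constructor(endpoint: MessageEndpoint) {
    this.endpoint = endpoint;
    // Route each reply back to the promise waiting on its requestId.
    endpoint.onmessage = (event) => {
      const { requestId, result, error } = event.data ?? {};
      const entry = this.pending.get(requestId);
      if (!entry) return; // unsolicited message (e.g. a state update); ignored here
      this.pending.delete(requestId);
      if (error) entry.reject(new Error(String(error)));
      else entry.resolve(result);
    };
  }

  // Posts a typed message and resolves when the matching reply arrives.
  request(type: string, payload: object = {}): Promise<unknown> {
    const requestId = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(requestId, { resolve, reject });
      this.endpoint.postMessage({ type, requestId, ...payload });
    });
  }
}
```

A process object returned from `teleoperate()` could then implement `start()` as `proxy.request('teleop/start')`, keeping the public method signature unchanged while the work happens in the worker.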
### Message Protocol (Typed)
All messages include a discriminant `type` and a `requestId` when a response is expected.
- Teleoperation (examples):
  - `teleop/start`, `teleop/stop`
  - `teleop/update_key_state` { key, pressed }
  - `teleop/move_motor` { motorName, position }
  - `teleop/state_update` { motorConfigs, keyStates, lastUpdate } (worker → main)
  - `serial/write_position` { id, position } (worker → main) → `serial/ack`
- Recording (examples):
  - `record/start`, `record/stop`, `record/next_episode`
  - `record/frame_append` { payload transferable }
  - `record/export_zip` { options } → streaming progress events
- Error & lifecycle:
  - `worker/error`, `worker/ready`, `worker/teardown`
Use Transferables (ArrayBuffer/MessagePort) for large payloads to avoid copies.
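One way to express a subset of these messages is a discriminated union with a runtime guard. The sketch below is illustrative only: the `version` field, `PROTOCOL_VERSION`, `isWorkerMessage`, and `transferablesOf` are assumed names, and the payload types are guesses based on the examples listed above.

```typescript
const PROTOCOL_VERSION = 1; // hypothetical version number

type TeleopMessage =
  | { type: 'teleop/start'; requestId: number }
  | { type: 'teleop/stop'; requestId: number }
  | { type: 'teleop/update_key_state'; key: string; pressed: boolean }
  | { type: 'teleop/move_motor'; motorName: string; position: number };

type RecordMessage =
  | { type: 'record/start'; requestId: number }
  | { type: 'record/frame_append'; payload: ArrayBuffer };

// Every message carries the protocol version (an assumption of this sketch).
type WorkerMessage = (TeleopMessage | RecordMessage) & { version: number };

// Runtime guard: checks only the envelope, not each payload field.
function isWorkerMessage(value: unknown): value is WorkerMessage {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as { type?: unknown; version?: unknown };
  return typeof candidate.type === 'string' && candidate.version === PROTOCOL_VERSION;
}

// Lists the buffers that should be transferred rather than copied.
function transferablesOf(message: WorkerMessage): ArrayBuffer[] {
  return message.type === 'record/frame_append' ? [message.payload] : [];
}
```

A sender would call `worker.postMessage(message, transferablesOf(message))`. After the call, a transferred `ArrayBuffer` is detached on the sending side, so the sender must not read it again.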
### File Structure (web package)
```
packages/web/src/
├── workers/
│   ├── teleop.worker.ts      # Teleoperation control loop
│   ├── record.worker.ts      # Recording assembly/export
│   ├── protocol.ts           # Message types & guards
│   └── utils.worker.ts       # Worker-side helpers (interpolation, zip)
├── bridges/
│   └── serial-bridge.ts      # Main-thread serial proxy for workers
├── teleoperate.ts            # Spawns worker, returns proxy process
├── record.ts                 # Spawns worker, returns proxy process
└── types/
    └── worker.ts             # Public worker-related types (narrow)
```
### Lifecycle & Fallback
- On `teleoperate()`/`record()` call:
  - Try to instantiate the corresponding worker via `new Worker(new URL(...), { type: 'module' })`
  - On success: wire protocol channels and return a proxy-backed process
  - On failure: fall back to the current main-thread implementation (no behavioral changes)
- On `process.stop()` or page unload: send `worker/teardown` and terminate the worker
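The fallback decision could look roughly like the sketch below. `Runtime`, `createRuntime`, and the two factory parameters are hypothetical; in the real package the first factory would wrap the `new Worker(new URL(...), { type: 'module' })` call and the second would return today's main-thread implementation.

```typescript
type Runtime = { kind: 'worker' | 'main-thread'; stop(): void };

function createRuntime(
  createWorkerRuntime: () => Runtime,
  createMainThreadRuntime: () => Runtime,
  // Feature detection: a global Worker constructor must exist.
  workersSupported: boolean =
    typeof (globalThis as { Worker?: unknown }).Worker === 'function',
): Runtime {
  if (workersSupported) {
    try {
      return createWorkerRuntime();
    } catch {
      // Worker construction can throw (e.g. blocked by CSP or a bad URL);
      // fall through to the main-thread runtime below.
    }
  }
  return createMainThreadRuntime();
}
```

Because both factories return the same `Runtime` shape, callers of `teleoperate()` and `record()` would not need to know which path was taken.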
### Performance Notes
- Control loop cadence generated inside worker to avoid main-thread timers
- Batch serial commands from worker to main-thread bridge to minimize postMessage overhead
- Use coarse-to-fine update: high-rate calculations in worker; lower-rate UI state updates to main thread (e.g., 10-20 Hz) for rendering
- For export, stream chunks from worker; main thread triggers download or HF upload
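Batching and throttling as described above might be sketched as follows. `SerialBatcher` and `throttle` are hypothetical helpers. Coalescing repeated writes to the same motor within a batch is an assumption of this sketch, not something the story specifies.

```typescript
type PositionWrite = { id: number; position: number };

// Collects motor writes during a control tick and sends them as one message.
class SerialBatcher {
  private readonly queue = new Map<number, PositionWrite>();
  private readonly send: (batch: PositionWrite[]) => void;

  constructor(send: (batch: PositionWrite[]) => void) {
    this.send = send;
  }

  // A later write to the same motor replaces the earlier one in this batch.
  enqueue(write: PositionWrite): void {
    this.queue.set(write.id, write);
  }

  flush(): void {
    if (this.queue.size === 0) return;
    this.send(Array.from(this.queue.values()));
    this.queue.clear();
  }
}

// Returns a function that forwards at most one value per interval and
// drops the rest. The clock is injectable so the behavior can be tested.
function throttle<T>(
  intervalMs: number,
  emit: (value: T) => void,
  now: () => number = Date.now,
): (value: T) => void {
  let last = -Infinity;
  return (value: T): void => {
    const t = now();
    if (t - last >= intervalMs) {
      last = t;
      emit(value);
    }
  };
}
```

With `intervalMs` set to 50-100, the worker would forward state to the UI at roughly 10-20 Hz while the control loop keeps running at its own rate. This simple throttle drops intermediate values, so a final state update on `teleop/stop` would still need to be sent explicitly.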
### Error Handling
- All request/response messages enforce timeouts with descriptive errors
- Worker initialization guarded with feature detection and clear fallback
- Protocol version field enables future evolution without breaking older callers
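A request timeout with a descriptive error can be layered on top of any promise-returning call. The helper below is a sketch; `withTimeout` and the error wording are illustrative, not part of the existing API.

```typescript
// Rejects with a descriptive error if `promise` does not settle within `ms`.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms} ms; the worker may be unresponsive, please retry`));
    }, ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```

Note that the timeout only rejects the caller's promise; it does not cancel work already running in the worker, so a timed-out request may still need an explicit stop or teardown message.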
## Phased Implementation Plan
### Phase 1: Dataset & Export Offload (Low Risk)
- Move episode interpolation, dataset assembly, and ZIP packaging to `record.worker.ts`
- Main thread keeps MediaRecorder and camera preview as-is
- Public API unchanged; verify ZIP download and HF upload via streamed messages
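As an illustration of the interpolation step that would move into `record.worker.ts`, the sketch below resamples irregularly timed frames onto a fixed-rate grid. Linear interpolation, the `Frame` shape, and the `resample` name are assumptions; the actual regularization method is not specified in this story.

```typescript
// Hypothetical frame shape: timestamp in seconds plus motor positions.
type Frame = { t: number; positions: number[] };

// Resamples frames (sorted by t) onto a regular grid at `fps`,
// linearly interpolating each motor position between neighbors.
function resample(frames: Frame[], fps: number): Frame[] {
  if (frames.length === 0) return [];
  const out: Frame[] = [];
  const start = frames[0].t;
  const end = frames[frames.length - 1].t;
  const step = 1 / fps;
  let j = 0; // index of the source frame at or before the current grid time
  for (let k = 0; start + k * step <= end + 1e-9; k++) {
    const t = start + k * step;
    while (j < frames.length - 2 && frames[j + 1].t <= t) j++;
    const a = frames[j];
    const b = frames[Math.min(j + 1, frames.length - 1)];
    const span = b.t - a.t;
    // Interpolation weight, clamped to [0, 1]; 0 when both frames coincide.
    const w = span > 0 ? Math.min(1, Math.max(0, (t - a.t) / span)) : 0;
    out.push({ t, positions: a.positions.map((p, i) => p + w * (b.positions[i] - p)) });
  }
  return out;
}
```

Because the function is pure and works on plain arrays, it can run unchanged in a worker or on the main thread, which keeps the fallback path simple.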
### Phase 2: Teleoperation Offload with SerialBridge
- Move control loop scheduling and target computation to `teleop.worker.ts`
- Implement SerialBridge on main thread for Web Serial commands
- Worker posts motor write requests; main thread executes and responds
- Throttle state updates to UI while maintaining high-rate control internally
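The main-thread side of the SerialBridge could be a single handler that performs the write and replies with an acknowledgement. In this sketch, `MotorPort` is a hypothetical abstraction over the Web Serial writer, and the `serial/error` reply type is an assumption (the story only names `serial/ack`).

```typescript
// Hypothetical abstraction; in the browser this would wrap a Web Serial writer.
interface MotorPort {
  writePosition(id: number, position: number): Promise<void>;
}

type SerialRequest = {
  type: 'serial/write_position';
  requestId: number;
  id: number;
  position: number;
};

type SerialReply =
  | { type: 'serial/ack'; requestId: number }
  | { type: 'serial/error'; requestId: number; message: string }; // assumed reply type

// Runs on the main thread: executes the write and builds the reply
// that is posted back to the worker.
async function handleSerialRequest(port: MotorPort, request: SerialRequest): Promise<SerialReply> {
  try {
    await port.writePosition(request.id, request.position);
    return { type: 'serial/ack', requestId: request.requestId };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { type: 'serial/error', requestId: request.requestId, message };
  }
}
```

The handler never throws, so a failed write always produces a reply and the worker's pending request can settle instead of waiting for a timeout.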
### Phase 3: Fine-Grained Optimizations
- Introduce Transferables for large buffers
- Optional OffscreenCanvas pipelines for future video transforms (not required for current scope)
- Tune batching and message cadence under hardware testing
### Phase 4: Reliability & Observability
- Heartbeat messages and auto-restart policy for worker failures
- Dev diagnostics toggles; production minimal logging
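A heartbeat check with a bounded restart policy might be sketched as below. `HeartbeatMonitor`, its parameters, and the fixed restart cap are all assumptions made for illustration; timestamps are passed in explicitly so the logic is independent of any timer API.

```typescript
class HeartbeatMonitor {
  private lastBeat: number;
  private restarts = 0;
  private readonly timeoutMs: number;
  private readonly maxRestarts: number;

  constructor(timeoutMs: number, maxRestarts: number, now: number) {
    this.timeoutMs = timeoutMs;
    this.maxRestarts = maxRestarts;
    this.lastBeat = now;
  }

  // Call whenever a heartbeat message arrives from the worker.
  beat(now: number): void {
    this.lastBeat = now;
  }

  // The worker counts as alive if a heartbeat arrived within the timeout.
  isAlive(now: number): boolean {
    return now - this.lastBeat <= this.timeoutMs;
  }

  // Returns true and counts the attempt if another restart is allowed.
  tryRestart(now: number): boolean {
    if (this.restarts >= this.maxRestarts) return false;
    this.restarts++;
    this.lastBeat = now; // give the new worker a fresh timeout window
    return true;
  }
}
```

The main thread would poll `isAlive()` on a slow interval and, once it returns false, either respawn the worker via `tryRestart()` or surface a descriptive error when the cap is reached.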
## Risks & Mitigations
- Web Serial availability in workers: use main-thread SerialBridge (design accounts for this)
- Message overhead at high Hz: batch commands and reduce UI state update frequency
- Browser differences: feature-detect and test on Chromium, Firefox (where supported), Safari Technology Preview
## Definition of Done
- UI remains smooth with active camera preview and keyboard teleoperation; no flicker observed in manual tests
- Worker-backed runtime enabled by default when available; fallback path verified
- `calibrate()`, `teleoperate()`, `record()` maintain identical signatures and behavior
- Typed protocol implemented with Transferables where applicable
- Unit tests for protocol routing and error timeouts
- Documentation updated (this user story + README note)