File size: 8,702 Bytes
696222f
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
# User Story 009: Web Worker Architecture (Main-thread Safe Web Library)

## Story

**As a** user building robotics UIs that also render live camera previews and interactive controls
**I want** `@lerobot/web` to run heavy control/recording work off the main thread
**So that** my UI stays smooth (no flicker/jank) even when teleoperation and recording are active

## Background

The current browser implementation runs teleoperation control loops, dataset assembly, and export logic on the main thread. When activating keyboard teleoperation while previewing a camera stream, the preview can flicker due to main-thread contention. This is a UX blocker for real-world apps that combine live video, UI interactions, and hardware control.

A worker-based architecture lets us move CPU-intensive, frequent, or bursty work off the main thread. The main thread remains responsible for DOM, video rendering and user interactions. The library must preserve the existing API (`calibrate()`, `teleoperate()`, `record()`) while transparently using workers when available, and cleanly falling back to the current approach otherwise.

## Goals

- Identical public API to today’s `@lerobot/web` (no breaking changes)
- Main-thread safe by default: heavy or frequent work executes in a Web Worker
- Graceful fallback when workers or specific APIs aren’t available
- Type-safe, minimal-copy message protocol using Transferables when possible
- Strict library/demo separation: UI and storage remain in demos
- Maintain Python lerobot UX parity and behavior

## Non-Goals (for this story)

- Changing dataset formats or camera acquisition approach
- Rewriting Web Serial API usage into worker (browser support is limited in workers)
- Introducing new external dependencies

## Acceptance Criteria

- Smooth UI under load:
  - With at least one active camera preview and keyboard teleoperation at 60–120 Hz, the preview does not flicker and UI remains responsive at ~60 FPS
- API compatibility:
  - `calibrate()`, `teleoperate()`, `record()` signatures and return shapes are unchanged
  - Feature-detect workers; automatically use worker-backed runtime when available, otherwise use current main-thread runtime
- Clear separation of responsibilities:
  - Worker executes control loops, interpolation, dataset assembly, export packaging, and CPU-heavy transforms
  - Main thread owns DOM/UI and browser-only APIs that are unavailable in workers (e.g., Web Serial write calls)
- Type-safe protocol:
  - Strongly typed request/response messages with versioned `type` fields; Transferable payloads used for large data
- Reliability & fallback:
  - If the worker crashes or becomes unavailable, operations fail gracefully with descriptive errors and suggest retry
  - Fallback path (main-thread) is automatically used when worker creation fails
- Tests & docs:
  - Unit tests cover protocol routing and basic round-trips
  - Planning docs updated; README notes main-thread-safe architecture

## Architecture Overview

### Worker Boundaries

- Execute in Worker:
  - Control loop scheduling and target computation for teleoperation (keyboard/direct and future teleoperators)
  - Episode/frame buffering and interpolation (regularization) for recording
  - Dataset assembly (tables/metadata), packaging (ZIP writer), and background export streaming
  - Lightweight telemetry aggregation for UI
- Execute on Main Thread:
  - DOM, UI, and camera previews (`<video>` elements)
  - Web Serial API read/write bridge (if browser does not permit worker access)
  - MediaRecorder handling (browser-optimized implementation already off main CPU in many engines)

### Threading Model

- Main thread spawns one worker per β€œprocess” instance as needed:
  - TeleoperationProcess β†’ TeleopWorker
  - RecordProcess β†’ RecordWorker (can be shared or composed with teleop worker depending on lifecycle)
- The public process objects returned from `teleoperate()`/`record()` are proxies. Method calls post messages to the worker and return promises where appropriate.
- SerialBridge (main-thread): worker requests motor write/read; main thread performs Web Serial operations and returns results. This preserves worker advantages while respecting browser API constraints.

### Message Protocol (Typed)

All messages include a discriminant `type` and a `requestId` when a response is expected.

- Teleoperation (examples):
  - `teleop/start`, `teleop/stop`
  - `teleop/update_key_state` { key, pressed }
  - `teleop/move_motor` { motorName, position }
  - `teleop/state_update` { motorConfigs, keyStates, lastUpdate } (worker β†’ main)
  - `serial/write_position` { id, position } (worker β†’ main) β†’ `serial/ack`
- Recording (examples):
  - `record/start`, `record/stop`, `record/next_episode`
  - `record/frame_append` { payload transferable }
  - `record/export_zip` { options } β†’ streaming progress events
- Error & lifecycle:
  - `worker/error`, `worker/ready`, `worker/teardown`

Use Transferables (ArrayBuffer/MessagePort) for large payloads to avoid copies.

### File Structure (web package)

```
packages/web/src/
β”œβ”€β”€ workers/
β”‚   β”œβ”€β”€ teleop.worker.ts             # Teleoperation control loop
β”‚   β”œβ”€β”€ record.worker.ts             # Recording assembly/export
β”‚   β”œβ”€β”€ protocol.ts                  # Message types & guards
β”‚   └── utils.worker.ts              # Worker-side helpers (interpolation, zip)
β”œβ”€β”€ bridges/
β”‚   └── serial-bridge.ts             # Main-thread serial proxy for workers
β”œβ”€β”€ teleoperate.ts                   # Spawns worker, returns proxy process
β”œβ”€β”€ record.ts                        # Spawns worker, returns proxy process
└── types/
    └── worker.ts                    # Public worker-related types (narrow)
```

### Lifecycle & Fallback

- On `teleoperate()`/`record()` call:
  - Try to instantiate corresponding worker via `new Worker(new URL(...), { type: 'module' })`
  - If success: wire protocol channels and return proxy-backed process
  - If fail: fall back to current main-thread implementation (no behavioral changes)
- On `process.stop()` or page unload: send `worker/teardown` and terminate the worker

### Performance Notes

- Control loop cadence generated inside worker to avoid main-thread timers
- Batch serial commands from worker to main-thread bridge to minimize postMessage overhead
- Use coarse-to-fine update: high-rate calculations in worker; lower-rate UI state updates to main thread (e.g., 10–20 Hz) for rendering
- For export, stream chunks from worker; main thread triggers download or HF upload

### Error Handling

- All request/response messages enforce timeouts with descriptive errors
- Worker initialization guarded with feature detection and clear fallback
- Protocol version field enables future evolution without breaking older callers

## Phased Implementation Plan

### Phase 1: Dataset & Export Offload (Low Risk)

- Move episode interpolation, dataset assembly, and ZIP packaging to `record.worker.ts`
- Main thread keeps MediaRecorder and camera preview as-is
- Public API unchanged; verify ZIP download and HF upload via streamed messages

### Phase 2: Teleoperation Offload with SerialBridge

- Move control loop scheduling and target computation to `teleop.worker.ts`
- Implement SerialBridge on main thread for Web Serial commands
- Worker posts motor write requests; main thread executes and responds
- Throttle state updates to UI while maintaining high-rate control internally

### Phase 3: Fine-Grained Optimizations

- Introduce Transferables for large buffers
- Optional OffscreenCanvas pipelines for future video transforms (not required for current scope)
- Tune batching and message cadence under hardware testing

### Phase 4: Reliability & Observability

- Heartbeat messages and auto-restart policy for worker failures
- Dev diagnostics toggles; production minimal logging

## Risks & Mitigations

- Web Serial availability in workers: use main-thread SerialBridge (design accounts for this)
- Message overhead at high Hz: batch commands and reduce UI state update frequency
- Browser differences: feature-detect and test on Chromium, Firefox (where supported), Safari Technology Preview

## Definition of Done

- UI remains smooth with active camera preview and keyboard teleoperation; no flicker observed in manual tests
- Worker-backed runtime enabled by default when available; fallback path verified
- `calibrate()`, `teleoperate()`, `record()` maintain identical signatures and behavior
- Typed protocol implemented with Transferables where applicable
- Unit tests for protocol routing and error timeouts
- Documentation updated (this user story + README note)