Gideon committed on
Commit
9deac74
·
1 Parent(s): 85f62fa

Add DOCS modal, tool icons

README.md CHANGED
@@ -1,102 +1,241 @@
1
  ---
2
- title: VoiceKit MCP Server
3
- emoji: 🎙️
4
- colorFrom: indigo
5
- colorTo: purple
6
  sdk: gradio
7
- sdk_version: 6.0.0
8
  app_file: app.py
9
  pinned: false
10
- license: mit
11
  tags:
12
  - building-mcp-track-creative
 
13
  ---
14
 
15
- # VoiceKit MCP Server
16
 
17
- **Voice analysis toolkit exposing 6 MCP tools for AI assistants.**
18
 
19
- VoiceKit provides comprehensive voice processing capabilities through the Model Context Protocol (MCP), enabling Claude and other AI assistants to analyze, compare, transcribe, and process audio.
20
 
21
- ## Purpose
 
 
22
 
23
- VoiceKit bridges the gap between AI assistants and advanced voice analysis. It allows:
24
- - **Voice comparison** for mimicry games, pronunciation practice
25
- - **Audio transcription** in multiple languages
26
- - **Acoustic analysis** for voice coaching, music production
27
- - **Background removal** for clean audio extraction
28
 
29
- ## MCP Endpoint
 
 
30
 
31
  ```
32
- https://MCP-1st-Birthday-voicekit.hf.space/gradio_api/mcp/sse
33
  ```
34
 
35
- ## Quick Start
 
 
36
 
37
- Add to your `claude_desktop_config.json`:
 
 
38
 
39
  ```json
40
  {
41
- "mcpServers": {
42
- "voicekit": {
43
- "url": "https://MCP-1st-Birthday-voicekit.hf.space/gradio_api/mcp/sse"
44
- }
45
  }
 
46
  }
47
  ```
48
 
49
- ## Available Tools (6)
50
 
51
- ### Primitive Tools
 
 
 
 
52
 
53
- | Tool | Purpose | Input | Output |
54
- |------|---------|-------|--------|
55
- | `extract_embedding` | Get voice fingerprint | Audio file | 768-dim Wav2Vec2 vector |
56
- | `compare_voices` | Measure voice similarity | 2 audio files | Similarity score (0-1) |
57
- | `analyze_acoustic_features` | Analyze voice characteristics | Audio file | Pitch, energy, rhythm, tempo |
58
- | `transcribe_audio` | Speech-to-text | Audio + language | Transcribed text |
59
- | `isolate_voice` | Remove background noise/music | Audio file | Clean voice audio |
60
 
61
- ### Composite Tool
 
 
62
 
63
- | Tool | Purpose | Input | Output |
64
- |------|---------|-------|--------|
65
- | `analyze_voice_similarity` | Full voice analysis | 2 audios + text | 5 metrics + overall score |
66
 
67
- ## Use Cases
68
 
69
- ### Voice Mimicry Game
70
- ```
71
- User: "Compare my voice to this movie clip"
72
- Claude: [uses analyze_voice_similarity] Returns pronunciation, tone, pitch, rhythm, energy scores
73
- ```
 
 
 
74
 
75
- ### Audio Transcription
76
- ```
77
- User: "What does this Korean audio say?"
78
- Claude: [uses transcribe_audio with language="ko"] → Returns Korean text
79
- ```
80
 
81
- ### Clean Audio Extraction
82
- ```
83
- User: "Remove the background music from this meme"
84
- Claude: [uses isolate_voice] → Returns isolated voice track
85
- ```
86
 
87
- ## Architecture
 
 
 
 
 
 
88
 
89
- ```
90
- ┌─────────────────┐ MCP/SSE ┌─────────────────┐ API ┌─────────────────┐
91
- │ Claude Desktop ◄──────────────► │ HF Space │ ◄──────────► │ Modal GPU │
92
- │ (MCP Client) │ │ (Gradio) │ │ (Inference) │
93
- └─────────────────┘ └─────────────────┘ └─────────────────┘
94
- ```
 
 
 
 
 
 
 
 
95
 
96
- - **Frontend**: Gradio 6 MCP Server on Hugging Face Spaces
97
- - **Backend**: Modal serverless GPU for ML inference
98
- - **Models**: Wav2Vec2, ElevenLabs Scribe STT, Voice Isolator
 
 
 
 
 
 
 
 
 
 
 
99
 
100
- ## Demo
101
 
102
- Try each tool directly in the tabs on the Space UI!
 
1
  ---
2
+ title: VoiceKit MCP
3
+ emoji: 🎤
4
+ colorFrom: purple
5
+ colorTo: indigo
6
  sdk: gradio
7
+ sdk_version: "6.0.0"
8
  app_file: app.py
9
  pinned: false
 
10
  tags:
11
  - building-mcp-track-creative
12
+ - mcp-server
13
  ---
14
 
15
+ # 🎤 VoiceKit MCP
16
 
17
+ > **Professional voice analysis as MCP tools: extract embeddings, compare voices, transcribe speech, and more.**
18
 
19
+ 6 powerful MCP tools for voice processing, all accepting base64-encoded audio.
20
 
21
+ 📢 **Social Post:** [View on X/Twitter](#) <!-- TODO: Add link to your social media post --><br>
22
+ 🎬 **Demo Video:** [Watch (1-5 min)](#) <!-- TODO: Add link to your demo video --><br>
23
+ 👥 **Team:** [@EricYoun](https://huggingface.co/EricYoun), [@NickEo](https://huggingface.co/NickEo), [@HYENA-WON](https://huggingface.co/HYENA-WON), [@jjin6573](https://huggingface.co/jjin6573), [@cocoajoa](https://huggingface.co/cocoajoa)
24
 
25
+ ---
26
+
27
+ ## 📋 Submission Info
28
+
29
+ | | |
30
+ |---|---|
31
+ | **Track** | Building MCP — Creative |
32
+ | **MCP Endpoint** | `https://mcp-1st-birthday-voicekit.hf.space/gradio_api/mcp/sse` |
33
+ | **Framework** | Gradio 6.0 |
34
+
35
+ ---
36
+
37
+ ## ✅ Track 1 Requirements
38
+
39
+ | Requirement | How We Fulfill It |
40
+ |-------------|-------------------|
41
+ | **Functioning MCP Server** | 6 MCP tools exposed via Gradio's `mcp_server=True` |
42
+ | **MCP Client Demo** | Video shows integration with Claude Desktop / MCP client |
43
+ | **Documented Tools** | Full API documentation with inputs/outputs below |
44
+ | **Gradio App** | Interactive demo UI + hidden MCP tool interfaces |
45
+
46
+ ---
47
+
48
+ ## 🛠️ MCP Tools (6 Tools)
49
+
50
+ All tools accept **base64-encoded audio** as input.
51
+
52
+ ### 1. `extract_embedding` <img src="icons/extract_embedding.svg" width="20" height="20">
53
+ Extract voice embeddings using Wav2Vec2 model.
54
+
55
+ | | |
56
+ |---|---|
57
+ | **Input** | `audio_base64` (base64-encoded audio) |
58
+ | **Output** | `embedding_preview` (first 5 values), `embedding_length` (768) |
59
+ | **Use Case** | Speaker identification, voice fingerprinting |
60
+
61
+ <img src="imgs/extract_embedding.jpg" height="300">
62
+
63
+ ### 2. `match_voice` <img src="icons/match_voice.svg" width="20" height="20">
64
+ Compare similarity between two voices.
65
+
66
+ | | |
67
+ |---|---|
68
+ | **Inputs** | `audio1_base64`, `audio2_base64` |
69
+ | **Output** | `similarity` (0-1), `tone_score` (0-100) |
70
+ | **Use Case** | Voice cloning verification, speaker matching |
71
+
72
+ <img src="imgs/match_voice.jpg" height="300">
73
+
74
+ ### 3. `analyze_acoustics` <img src="icons/analyze_acoustics.svg" width="20" height="20">
75
+ Extract detailed acoustic characteristics.
76
+
77
+ | | |
78
+ |---|---|
79
+ | **Input** | `audio_base64` |
80
+ | **Output** | Pitch, energy, rhythm, tempo, spectral info |
81
+ | **Use Case** | Emotional tone detection, voice profiling |
82
+
83
+ <img src="imgs/analyze_acoustics.jpg" height="300">
84
+
85
+ ### 4. `transcribe_audio` <img src="icons/transcribe_audio.svg" width="20" height="20">
86
+ Convert speech to text (multilingual).
87
+
88
+ | | |
89
+ |---|---|
90
+ | **Inputs** | `audio_base64`, `language` (default: "en") |
91
+ | **Output** | Transcribed text, detected language |
92
+ | **Model** | ElevenLabs Scribe v1 |
93
+ | **Languages** | English, Korean, Japanese, and 15+ more |
94
+
95
+ <img src="imgs/transcribe_audio.jpg" height="300">
96
+
97
+ ### 5. `isolate_voice` <img src="icons/isolate_voice.svg" width="20" height="20">
98
+ Remove background music/noise and extract clean voice.
99
+
100
+ | | |
101
+ |---|---|
102
+ | **Input** | `audio_base64` (audio with background sounds) |
103
+ | **Output** | Isolated audio (base64), BGM detection status |
104
+ | **Use Case** | Audio cleanup for memes, songs, movies |
105
+
106
+ <img src="imgs/isolate_voice.jpg" height="300">
107
+
108
+ ### 6. `grade_voice` <img src="icons/grade_voice.svg" width="20" height="20">
109
+ Comprehensive voice comparison with multi-metric scoring.
110
+
111
+ | | |
112
+ |---|---|
113
+ | **Inputs** | `user_audio_base64`, `reference_audio_base64`, `reference_text` (optional), `category` (meme\|song\|movie) |
114
+ | **Output** | Pitch, rhythm, energy, pronunciation scores (0-100), overall score, user transcription |
115
+ | **Use Case** | Voice mimicry evaluation, pronunciation games |
116
+
117
+ <img src="imgs/grade_voice.jpg" height="300">
118
 
119
+ ---
120
+
121
+ ## 🏗️ Architecture
122
 
123
  ```
124
+ ┌─────────────────────────────────────────────────────────────────┐
125
+ │ VoiceKit MCP │
126
+ ├─────────────────────────────────────────────────────────────────┤
127
+ │ │
128
+ │ ┌────────────────────────────────────────────────────────────┐ │
129
+ │ │ MCP Client (Claude) │ │
130
+ │ │ base64 audio → SSE endpoint │ │
131
+ │ └──────────────────────────┬─────────────────────────────────┘ │
132
+ │ ↓ │
133
+ │ ┌────────────────────────────────────────────────────────────┐ │
134
+ │ │ Gradio MCP Server (app.py) │ │
135
+ │ │ mcp_server=True • 6 tool interfaces │ │
136
+ │ └──────────────────────────┬─────────────────────────────────┘ │
137
+ │ ↓ │
138
+ │ ┌────────────────────────────────────────────────────────────┐ │
139
+ │ │ Modal GPU Container (T4) │ │
140
+ │ │ Wav2Vec2 • librosa • ElevenLabs APIs • DTW │ │
141
+ │ └──────────────────────────┬─────────────────────────────────┘ │
142
+ │ ↓ │
143
+ │ ┌────────────────────────────────────────────────────────────┐ │
144
+ │ │ JSON Response │ │
145
+ │ │ embeddings • scores • transcripts • audio │ │
146
+ │ └────────────────────────────────────────────────────────────┘ │
147
+ │ │
148
+ └─────────────────────────────────────────────────────────────────┘
149
  ```
150
 
151
+ ---
152
+
153
+ ## 🔌 How to Connect
154
 
155
+ ### Claude Desktop / MCP Client
156
+
157
+ Add to your MCP configuration:
158
 
159
  ```json
160
  {
161
+ "mcpServers": {
162
+ "voicekit": {
163
+ "url": "https://mcp-1st-birthday-voicekit.hf.space/gradio_api/mcp/sse"
 
164
  }
165
+ }
166
  }
167
  ```
168
 
169
+ ### Example Usage
170
 
171
+ ```python
172
+ # 1. Encode audio to base64
173
+ import base64
174
+ with open("audio.wav", "rb") as f:
175
+     audio_base64 = base64.b64encode(f.read()).decode()
176
 
177
+ # 2. Call MCP tool
178
+ result = mcp_client.call("extract_embedding", {"audio_base64": audio_base64})
 
 
 
 
 
179
 
180
+ # 3. Read the documented output fields (a 5-value preview of the 768-dim embedding)
181
+ preview = result["embedding_preview"]
182
+ ```
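Since every tool takes base64-encoded audio and `isolate_voice` is documented as returning base64 audio, a small round-trip helper may be useful on the client side. The sketch below uses only the standard library; the helper names `encode_audio` and `decode_audio` and the file paths are hypothetical and do not come from the repository.

```python
import base64
from pathlib import Path


def encode_audio(path: str) -> str:
    """Read an audio file and return its bytes as a base64 ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def decode_audio(audio_base64: str, out_path: str) -> int:
    """Decode a base64 string, write the bytes to out_path, return the byte count."""
    data = base64.b64decode(audio_base64)
    Path(out_path).write_bytes(data)
    return len(data)
```

Usage would look like `payload = {"audio_base64": encode_audio("audio.wav")}` before a tool call, and `decode_audio(returned_base64, "clean.wav")` afterwards, where `returned_base64` stands for whichever response field carries the isolated audio.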
183
 
184
+ ---
 
 
185
 
186
+ ## 🛠️ Tech Stack
187
 
188
+ | Component | Technology |
189
+ |-----------|------------|
190
+ | MCP Server | Gradio 6.0 (`mcp_server=True`) |
191
+ | GPU Compute | Modal (T4 GPU) |
192
+ | Embeddings | Wav2Vec2 (facebook/wav2vec2-base-960h) |
193
+ | Speech-to-Text | ElevenLabs Scribe v1 |
194
+ | Voice Isolation | ElevenLabs Voice Isolator |
195
+ | Acoustic Analysis | librosa + scipy |
196
 
197
+ ---
 
 
 
 
198
 
199
+ ## Performance
 
 
 
 
200
 
201
+ | Metric | Value |
202
+ |--------|-------|
203
+ | Response Time (warm) | <200ms |
204
+ | Cold Start | 1-3s (memory snapshot optimized) |
205
+ | Embedding Dimensions | 768 |
206
+ | Supported Audio | Any format (auto-converts to WAV) |
207
+ | Max Duration | Tested up to 10 minutes |
208
 
209
+ ---
210
+
211
+ ## 🎯 Why VoiceKit MCP?
212
+
213
+ | Criteria | Our Approach |
214
+ |----------|--------------|
215
+ | **Functionality** | 6 production-ready tools covering full voice analysis pipeline |
216
+ | **Innovation** | First MCP server for comprehensive voice analysis |
217
+ | **Documentation** | Complete API docs with inputs/outputs/use cases |
218
+ | **Real-world Impact** | Powers Voice Sementle game; applicable to voice cloning, accessibility, language learning |
219
+
220
+ ---
221
+
222
+ ## 🎮 Interactive Demo
223
 
224
+ 👆 **Click the interface above to try each tool!**
225
+
226
+ 1. Upload or record audio
227
+ 2. Select a tool to test
228
+ 3. View JSON results with scores and analysis
229
+ 4. Copy embeddings or transcripts for your app
230
+
231
+ ---
232
+
233
+ ## 🔗 Related Projects
234
+
235
+ - **[Voice Sementle](https://huggingface.co/spaces/MCP-1st-Birthday/Voice-Sementle)** — Daily voice puzzle game powered by VoiceKit MCP
236
+
237
+ ---
238
 
239
+ **Built for [MCP's 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday)** 🎂
240
 
241
+ *Celebrating one year of Model Context Protocol!*
app.py CHANGED
@@ -14,6 +14,7 @@ import os
14
  import json
15
  import tempfile
16
  import math
 
17
 
18
  # Set Gradio temp directory to current directory
19
  GRADIO_TEMP_DIR = os.path.join(os.getcwd(), "gradio_temp")
@@ -34,6 +35,96 @@ except Exception as e:
34
  print(f"Modal not available: {e}")
35
 
36
 
37
  def file_to_base64(file_path: str) -> str:
38
  """Convert file path to base64 string"""
39
  if not file_path:
@@ -1104,6 +1195,138 @@ footer,
1104
  margin-left: 6px;
1105
  }
1106
 
1107
  /* ===== CARD STYLES ===== */
1108
  .card {
1109
  background: rgba(15, 15, 35, 0.8);
@@ -1934,7 +2157,7 @@ with gr.Blocks() as demo:
1934
  """)
1935
 
1936
  # ==================== HEADER (FLOATING) ====================
1937
- gr.HTML("""
1938
  <div class="header-main">
1939
  <div class="header-left">
1940
  <span class="header-icon">
@@ -1971,6 +2194,35 @@ with gr.Blocks() as demo:
1971
  <span class="header-subtitle">MCP Server</span>
1972
  </div>
1973
  </div>
1974
  </div>
1975
  """)
1976
 
 
14
  import json
15
  import tempfile
16
  import math
17
+ import re
18
 
19
  # Set Gradio temp directory to current directory
20
  GRADIO_TEMP_DIR = os.path.join(os.getcwd(), "gradio_temp")
 
35
  print(f"Modal not available: {e}")
36
 
37
 
38
+ # Load README.md and convert to HTML
39
+ def load_readme_as_html():
40
+     """Load README.md and convert markdown to HTML"""
41
+     try:
42
+         with open("README.md", "r", encoding="utf-8") as f:
43
+             content = f.read()
44
+
45
+         # Remove YAML front matter
46
+         content = re.sub(r'^---\n.*?\n---\n', '', content, flags=re.DOTALL)
47
+
48
+         html = content
49
+
50
+         # Headers
51
+         html = re.sub(r'^### (.+)$', r'<h3>\1</h3>', html, flags=re.MULTILINE)
52
+         html = re.sub(r'^## (.+)$', r'<h2>\1</h2>', html, flags=re.MULTILINE)
53
+         html = re.sub(r'^# (.+)$', r'<h1>\1</h1>', html, flags=re.MULTILINE)
54
+
55
+         # Code blocks
56
+         html = re.sub(r'```(\w*)\n(.*?)```', r'<pre><code>\2</code></pre>', html, flags=re.DOTALL)
57
+
58
+         # Inline code
59
+         html = re.sub(r'`([^`]+)`', r'<code>\1</code>', html)
60
+
61
+         # Bold
62
+         html = re.sub(r'\*\*(.+?)\*\*', r'<strong>\1</strong>', html)
63
+
64
+         # Links
65
+         html = re.sub(r'\[([^\]]+)\]\(([^)]+)\)', r'<a href="\2" target="_blank">\1</a>', html)
66
+
67
+         # Tables
68
+         lines = html.split('\n')
69
+         in_table = False
70
+         table_html = []
71
+         new_lines = []
72
+
73
+         for line in lines:
74
+             if '|' in line and line.strip().startswith('|'):
75
+                 if not in_table:
76
+                     in_table = True
77
+                     table_html = ['<table>']
78
+
79
+                 if re.match(r'^\|[\s\-:|]+\|$', line.strip()):
80
+                     continue
81
+
82
+                 cells = [c.strip() for c in line.strip().split('|')[1:-1]]
83
+                 if len(table_html) == 1:
84
+                     table_html.append('<thead><tr>')
85
+                     for cell in cells:
86
+                         table_html.append(f'<th>{cell}</th>')
87
+                     table_html.append('</tr></thead><tbody>')
88
+                 else:
89
+                     table_html.append('<tr>')
90
+                     for cell in cells:
91
+                         table_html.append(f'<td>{cell}</td>')
92
+                     table_html.append('</tr>')
93
+             else:
94
+                 if in_table:
95
+                     table_html.append('</tbody></table>')
96
+                     new_lines.append('\n'.join(table_html))
97
+                     table_html = []
98
+                     in_table = False
99
+                 new_lines.append(line)
100
+
101
+         if in_table:
102
+             table_html.append('</tbody></table>')
103
+             new_lines.append('\n'.join(table_html))
104
+
105
+         html = '\n'.join(new_lines)
106
+
107
+         # Lists
108
+         html = re.sub(r'^- (.+)$', r'<li>\1</li>', html, flags=re.MULTILINE)
109
+         html = re.sub(r'(<li>.*</li>\n?)+', r'<ul>\g<0></ul>', html)
110
+
111
+         # Paragraphs
112
+         lines = html.split('\n')
113
+         result = []
114
+         for line in lines:
115
+             stripped = line.strip()
116
+             if stripped and not stripped.startswith('<') and not stripped.startswith('```'):
117
+                 result.append(f'<p>{stripped}</p>')
118
+             else:
119
+                 result.append(line)
120
+
121
+         return '\n'.join(result)
122
+     except Exception as e:
123
+         return f"<p>Error loading README: {e}</p>"
124
+
125
+ readme_html = load_readme_as_html()
126
+
127
+
128
  def file_to_base64(file_path: str) -> str:
129
  """Convert file path to base64 string"""
130
  if not file_path:
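The first two stages of `load_readme_as_html` (front-matter removal and header conversion) can be tried in isolation. The sketch below reuses the same regular expressions on a made-up sample string; the helper names `strip_front_matter` and `convert_headers` and the `SAMPLE` text are illustrative only and are not part of `app.py`.

```python
import re

# Hypothetical sample input; the real function reads README.md from disk.
SAMPLE = """---
title: VoiceKit MCP
sdk: gradio
---

# VoiceKit MCP

## MCP Tools

---

Body text.
"""


def strip_front_matter(text: str) -> str:
    """Drop a leading YAML block delimited by '---' lines.

    Without re.MULTILINE, '^' only matches at the start of the string,
    so later '---' horizontal rules in the body are left alone.
    """
    return re.sub(r'^---\n.*?\n---\n', '', text, count=1, flags=re.DOTALL)


def convert_headers(text: str) -> str:
    """Turn '#', '##', '###' lines into <h1>..<h3>, deepest level first."""
    for level in (3, 2, 1):
        pattern = '^' + '#' * level + r' (.+)$'
        replacement = r'<h%d>\1</h%d>' % (level, level)
        text = re.sub(pattern, replacement, text, flags=re.MULTILINE)
    return text


body = strip_front_matter(SAMPLE)
html = convert_headers(body)
```

Here `html` keeps the `---` rule inside the body while the `title:` and `sdk:` lines are gone, which is the behavior the non-greedy `.*?` with `re.DOTALL` is meant to give.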
 
1195
  margin-left: 6px;
1196
  }
1197
 
1198
+ /* ===== DOCS BUTTON ===== */
1199
+ .docs-button {
1200
+ display: flex;
1201
+ align-items: center;
1202
+ gap: 8px;
1203
+ padding: 10px 20px;
1204
+ background: linear-gradient(135deg, rgba(124, 58, 237, 0.3), rgba(99, 102, 241, 0.3));
1205
+ border: 1px solid rgba(124, 58, 237, 0.5);
1206
+ border-radius: 12px;
1207
+ color: #e0e7ff;
1208
+ font-size: 14px;
1209
+ font-weight: 600;
1210
+ cursor: pointer;
1211
+ transition: all 0.3s ease;
1212
+ text-transform: uppercase;
1213
+ letter-spacing: 0.5px;
1214
+ }
1215
+
1216
+ .docs-button:hover {
1217
+ background: linear-gradient(135deg, rgba(124, 58, 237, 0.5), rgba(99, 102, 241, 0.5));
1218
+ border-color: rgba(124, 58, 237, 0.8);
1219
+ transform: translateY(-2px);
1220
+ box-shadow: 0 4px 20px rgba(124, 58, 237, 0.4);
1221
+ }
1222
+
1223
+ .docs-button svg {
1224
+ width: 18px;
1225
+ height: 18px;
1226
+ }
1227
+
1228
+ /* ===== DOCS MODAL ===== */
1229
+ .docs-modal-overlay {
1230
+ display: none;
1231
+ position: fixed !important;
1232
+ top: 0 !important;
1233
+ left: 0 !important;
1234
+ right: 0 !important;
1235
+ bottom: 0 !important;
1236
+ width: 100vw !important;
1237
+ height: 100vh !important;
1238
+ background: rgba(0, 0, 0, 0.85) !important;
1239
+ backdrop-filter: blur(10px) !important;
1240
+ z-index: 99999 !important;
1241
+ justify-content: center !important;
1242
+ align-items: flex-start !important;
1243
+ padding: 10px 20px !important;
1244
+ box-sizing: border-box !important;
1245
+ }
1246
+
1247
+ .docs-modal-overlay.active {
1248
+ display: flex !important;
1249
+ }
1250
+
1251
+ .docs-modal {
1252
+ background: #0d0d1a !important;
1253
+ border: 2px solid #7c3aed !important;
1254
+ border-radius: 20px !important;
1255
+ width: calc(100vw - 40px) !important;
1256
+ max-width: 1800px !important;
1257
+ height: auto !important;
1258
+ max-height: 80vh !important;
1259
+ overflow: hidden !important;
1260
+ box-shadow: 0 25px 80px rgba(0, 0, 0, 0.9) !important;
1261
+ margin: 0 auto !important;
1262
+ position: relative !important;
1263
+ top: 20px !important;
1264
+ }
1265
+
1266
+ .docs-modal-header {
1267
+ display: flex !important;
1268
+ justify-content: space-between !important;
1269
+ align-items: center !important;
1270
+ padding: 20px 24px !important;
1271
+ border-bottom: 2px solid #7c3aed !important;
1272
+ background: #1a1a2e !important;
1273
+ }
1274
+
1275
+ .docs-modal-title {
1276
+ font-size: 20px;
1277
+ font-weight: 700;
1278
+ color: #e0e7ff;
1279
+ display: flex;
1280
+ align-items: center;
1281
+ gap: 10px;
1282
+ }
1283
+
1284
+ .docs-modal-close {
1285
+ background: rgba(124, 58, 237, 0.3);
1286
+ border: 2px solid rgba(124, 58, 237, 0.5);
1287
+ border-radius: 12px;
1288
+ color: #e0e7ff;
1289
+ font-size: 28px;
1290
+ font-weight: 300;
1291
+ cursor: pointer;
1292
+ padding: 4px 14px;
1293
+ line-height: 1;
1294
+ transition: all 0.2s;
1295
+ }
1296
+
1297
+ .docs-modal-close:hover {
1298
+ background: rgba(124, 58, 237, 0.4);
1299
+ border-color: rgba(124, 58, 237, 0.6);
1300
+ }
1301
+
1302
+ .docs-modal-content {
1303
+ padding: 24px !important;
1304
+ overflow-y: auto !important;
1305
+ max-height: calc(80vh - 80px) !important;
1306
+ color: #c7d2fe !important;
1307
+ font-size: 15px !important;
1308
+ line-height: 1.7 !important;
1309
+ background: #0d0d1a !important;
1310
+ }
1311
+
1312
+ .docs-modal-content h1 { font-size: 28px; color: #e0e7ff; margin: 0 0 16px 0; padding-bottom: 12px; border-bottom: 2px solid rgba(124, 58, 237, 0.3); }
1313
+ .docs-modal-content h2 { font-size: 22px; color: #e0e7ff; margin: 24px 0 12px 0; }
1314
+ .docs-modal-content h3 { font-size: 18px; color: #a5b4fc; margin: 20px 0 10px 0; }
1315
+ .docs-modal-content p { margin: 12px 0; }
1316
+ .docs-modal-content ul, .docs-modal-content ol { margin: 12px 0; padding-left: 24px; }
1317
+ .docs-modal-content li { margin: 6px 0; }
1318
+ .docs-modal-content code { background: rgba(124, 58, 237, 0.2); padding: 2px 6px; border-radius: 4px; font-family: 'SF Mono', 'Monaco', 'Consolas', monospace; font-size: 13px; color: #c4b5fd; }
1319
+ .docs-modal-content pre { background: rgba(0, 0, 0, 0.4); border: 1px solid rgba(124, 58, 237, 0.2); border-radius: 12px; padding: 16px; overflow-x: auto; margin: 16px 0; }
1320
+ .docs-modal-content pre code { background: transparent; padding: 0; color: #a5b4fc; }
1321
+ .docs-modal-content table { width: 100%; border-collapse: collapse; margin: 16px 0; }
1322
+ .docs-modal-content th, .docs-modal-content td { padding: 10px 12px; text-align: left; border: 1px solid rgba(124, 58, 237, 0.2); }
1323
+ .docs-modal-content th { background: rgba(124, 58, 237, 0.15); color: #e0e7ff; font-weight: 600; }
1324
+ .docs-modal-content td { color: #c7d2fe; }
1325
+ .docs-modal-content a { color: #a78bfa; text-decoration: none; }
1326
+ .docs-modal-content a:hover { text-decoration: underline; }
1327
+ .docs-modal-content strong { color: #e0e7ff; }
1328
+ .docs-modal-content img { max-width: 100%; height: auto; border-radius: 8px; margin: 12px 0; }
1329
+
1330
  /* ===== CARD STYLES ===== */
1331
  .card {
1332
  background: rgba(15, 15, 35, 0.8);
 
2157
  """)
2158
 
2159
  # ==================== HEADER (FLOATING) ====================
2160
+ gr.HTML(f"""
2161
  <div class="header-main">
2162
  <div class="header-left">
2163
  <span class="header-icon">
 
2194
  <span class="header-subtitle">MCP Server</span>
2195
  </div>
2196
  </div>
2197
+ <button class="docs-button" onclick="document.getElementById('docsModal').classList.add('active')">
2198
+ <svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
2199
+ <path d="M14 2H6a2 2 0 0 0-2 2v16a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V8z"/>
2200
+ <polyline points="14 2 14 8 20 8"/>
2201
+ <line x1="16" y1="13" x2="8" y2="13"/>
2202
+ <line x1="16" y1="17" x2="8" y2="17"/>
2203
+ <polyline points="10 9 9 9 8 9"/>
2204
+ </svg>
2205
+ DOCS
2206
+ </button>
2207
+ </div>
2208
+
2209
+ <!-- DOCS Modal -->
2210
+ <div id="docsModal" class="docs-modal-overlay" onclick="if(event.target === this) this.classList.remove('active')">
2211
+ <div class="docs-modal">
2212
+ <div class="docs-modal-header">
2213
+ <div class="docs-modal-title">
2214
+ <svg width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="#a78bfa" stroke-width="2">
2215
+ <path d="M14 2H6a2 2 0 0 0-2 2v16a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V8z"/>
2216
+ <polyline points="14 2 14 8 20 8"/>
2217
+ </svg>
2218
+ Documentation
2219
+ </div>
2220
+ <button class="docs-modal-close" onclick="document.getElementById('docsModal').classList.remove('active')">&times;</button>
2221
+ </div>
2222
+ <div class="docs-modal-content">
2223
+ {readme_html}
2224
+ </div>
2225
+ </div>
2226
  </div>
2227
  """)
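This hunk switches the header from `gr.HTML("""` to `gr.HTML(f"""` so that `{readme_html}` is interpolated. In a Python f-string, any literal `{` or `}` in the template itself has to be doubled, while braces inside the interpolated value are inserted verbatim. The sketch below shows both cases with a stand-in value; the inline `<style>` rule is a hypothetical example, since the visible part of the header template does not show whether it contains literal braces.

```python
# Stand-in for the converted README; the real value comes from load_readme_as_html().
readme_html = "<p>docs body with {braces}</p>"

# Literal CSS braces in the template are doubled; {readme_html} is substituted.
page = f"""
<style>.docs-modal {{ max-height: 80vh; }}</style>
<div class="docs-modal-content">
{readme_html}
</div>
"""
```

After formatting, `page` contains a single pair of braces in the CSS rule, and the `{braces}` text from the README value is passed through unchanged.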
2228
 
icons/analyze_acoustics.svg ADDED
icons/extract_embedding.svg ADDED
icons/grade_voice.svg ADDED
icons/isolate_voice.svg ADDED
icons/match_voice.svg ADDED
icons/transcribe_audio.svg ADDED