# Hugging Face Integration - Complete
## Changes Made
### 1. AI Models - Ensemble Sentiment (`ai_models.py`)
**Model Catalog:**
- ✅ Crypto Sentiment: ElKulako/cryptobert, kk08/CryptoBERT, burakutf/finetuned-finbert-crypto, mathugo/crypto_news_bert
- ✅ Social Sentiment: svalabs/twitter-xlm-roberta-bitcoin-sentiment, mayurjadhav/crypto-sentiment-model
- ✅ Financial Sentiment: ProsusAI/finbert, cardiffnlp/twitter-roberta-base-sentiment
- ✅ News Sentiment: mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis
- ✅ Decision Models: agarkovv/CryptoTrader-LM
**Ensemble Sentiment:**
- `ensemble_crypto_sentiment(text)` - runs several models on the same text for sentiment analysis
- Majority voting determines the final label
- Confidence is the average of the per-model scores
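The voting and averaging steps above can be sketched as follows. This is a minimal illustration, not the code in `ai_models.py`: the helper name `majority_vote`, the tie-breaking rule, and the `"neutral"` fallback for an empty input are all assumptions.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Combine per-model predictions into one label and one confidence.

    `predictions` maps a model name to {"label": str, "score": float}.
    Hypothetical helper; the real ensemble may weight models differently.
    """
    if not predictions:
        # Assumption: fall back to a neutral result when no model answered.
        return {"label": "neutral", "confidence": 0.0, "scores": {}, "model_count": 0}

    votes = Counter(p["label"] for p in predictions.values())
    # most_common() keeps first-seen order for equal counts,
    # so the earliest model in the dict wins a tie.
    label, _ = votes.most_common(1)[0]

    # Confidence is the plain average of every model's score.
    confidence = mean(p["score"] for p in predictions.values())

    return {
        "label": label,
        "confidence": round(confidence, 2),
        "scores": predictions,
        "model_count": len(predictions),
    }
```

Averaging all scores, including those of models that voted for a losing label, follows the description above literally; averaging only the winning label's scores would be an equally plausible design.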
### 2. HF Registry - Dataset Catalog (`backend/services/hf_registry.py`)
**Curated Datasets:**
- **Price/OHLCV**: 7 datasets (Bitcoin, Ethereum, XRP price data)
- **News Raw**: 2 datasets (crypto news headlines)
- **News Labeled**: 5 datasets (news with sentiment/impact labels)
**Features:**
- Category-based organization
- Automatic refresh from HF Hub
- Metadata (likes, downloads, tags)
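A category-based catalog with refreshable metadata might look like the sketch below. The class names, the `fetch_info` callback, and the example repository IDs are hypothetical and are not taken from `backend/services/hf_registry.py`.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    repo_id: str
    category: str
    likes: int = 0
    downloads: int = 0
    tags: list = field(default_factory=list)


class DatasetCatalog:
    """Hypothetical in-memory catalog keyed by repository ID."""

    def __init__(self):
        self._entries = {}

    def add(self, repo_id, category):
        self._entries[repo_id] = DatasetEntry(repo_id, category)

    def by_category(self, category):
        return [e for e in self._entries.values() if e.category == category]

    def refresh(self, fetch_info):
        """Update metadata for every entry.

        `fetch_info(repo_id)` must return an object exposing `likes`,
        `downloads`, and `tags` attributes.
        """
        for entry in self._entries.values():
            try:
                info = fetch_info(entry.repo_id)
            except Exception:
                continue  # keep the stale metadata if one lookup fails
            entry.likes = getattr(info, "likes", 0) or 0
            entry.downloads = getattr(info, "downloads", 0) or 0
            entry.tags = list(getattr(info, "tags", None) or [])
```

With `huggingface_hub` installed, `HfApi().dataset_info` can be passed as `fetch_info`, since the object it returns carries `likes`, `downloads`, and `tags`. Injecting the callback keeps the catalog usable offline and in tests.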
### 3. Unified Server - Complete API (`hf_unified_server.py`)
**New Endpoints:**
**Health & Status:**
- `GET /api/health` - Dashboard health check
**Market Data:**
- `GET /api/coins/top?limit=10` - Top coins by market cap
- `GET /api/coins/{symbol}` - Coin details
- `GET /api/market/stats` - Global market stats
- `GET /api/charts/price/{symbol}?timeframe=7d` - Price chart
- `POST /api/charts/analyze` - Chart analysis with AI
**News & AI:**
- `GET /api/news/latest?limit=40` - News with sentiment
- `POST /api/news/summarize` - Summarize article
- `POST /api/sentiment/analyze` - Sentiment analysis
- `POST /api/query` - Natural language query
**Datasets & Models:**
- `GET /api/datasets/list` - Available datasets
- `GET /api/datasets/sample?name=...` - Dataset sample
- `GET /api/models/list` - Available models
- `POST /api/models/test` - Test model
**Real-time:**
- `WS /ws` - WebSocket for live updates (market + news + sentiment)
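The HTTP endpoints above can be called from Python with the standard library alone. The sketch below assumes the server is running locally on port 7860; the helper names are illustrative, and the response shapes are not documented here, so the helpers return the parsed JSON as-is.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:7860"  # assumption: local server on the default port


def build_url(path, **params):
    """Join the base URL, an endpoint path, and optional query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def get_json(path, **params):
    """Send a GET request and decode the JSON response body."""
    with urlopen(build_url(path, **params), timeout=10) as resp:
        return json.load(resp)


def post_json(path, payload):
    """Send a JSON-encoded POST request and decode the JSON response body."""
    req = Request(
        build_url(path),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

For example, `get_json("/api/coins/top", limit=10)` requests the top coins, and `post_json("/api/sentiment/analyze", {"text": "Bitcoin price surging to new heights!"})` submits text for sentiment analysis.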
### 4. Frontend Compatibility
**admin.html + static/js/**
- ✅ All required endpoints are implemented
- ✅ WebSocket support
- ✅ Sentiment comes from the ensemble models
- ✅ Real-time updates every 10 seconds
## Usage
### Docker (HuggingFace Space)
```bash
docker build -t crypto-hf .
docker run -p 7860:7860 -e HF_TOKEN=your_token crypto-hf
```
### Direct
```bash
pip install -r requirements.txt
export HF_TOKEN=your_token
uvicorn hf_unified_server:app --host 0.0.0.0 --port 7860
```
### Testing
```bash
# Health check
curl http://localhost:7860/api/health
# Top coins
curl "http://localhost:7860/api/coins/top?limit=10"
# Sentiment analysis
curl -X POST http://localhost:7860/api/sentiment/analyze \
-H "Content-Type: application/json" \
-d '{"text": "Bitcoin price surging to new heights!"}'
# Models list
curl http://localhost:7860/api/models/list
# Datasets list
curl http://localhost:7860/api/datasets/list
```
## Model Usage
Ensemble sentiment in action:
```python
from ai_models import ensemble_crypto_sentiment
result = ensemble_crypto_sentiment("Bitcoin breaking resistance!")
# {
# "label": "bullish",
# "confidence": 0.87,
# "scores": {
# "ElKulako/cryptobert": {"label": "bullish", "score": 0.92},
# "kk08/CryptoBERT": {"label": "bullish", "score": 0.82}
# },
# "model_count": 2
# }
```
## Dependencies
requirements.txt includes:
- transformers>=4.36.0
- datasets>=2.16.0
- huggingface-hub>=0.19.0
- torch>=2.0.0
## Environment Variables
```.env
HF_TOKEN=hf_your_token_here # for private models
```
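Reading the token in application code might look like the sketch below. The helper name `get_hf_token` is hypothetical; the only thing taken from this document is the variable name `HF_TOKEN`.

```python
import os


def get_hf_token(env=None):
    """Return the Hugging Face token, or None when it is unset or blank."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    return token or None
```

Returning `None` instead of an empty string lets callers treat "no token" as anonymous access, which is enough for public models and datasets.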
## Test Checklist
- [x] `/api/health` - Status OK
- [x] `/api/coins/top` - Top 10 coins
- [x] `/api/market/stats` - Market data
- [x] `/api/news/latest` - News with sentiment
- [x] `/api/sentiment/analyze` - Ensemble working
- [x] `/api/models/list` - 10+ models listed
- [x] `/api/datasets/list` - 14+ datasets listed
- [x] `/ws` - WebSocket live updates
- [x] Dashboard UI - All tabs working
## Notes
- Models are lazy-loaded (on first use)
- Ensemble sentiment uses only 2-3 models to keep it fast
- Dataset sampling requires authentication for some datasets
- CryptoTrader-LM is a large model (7B) and runs only with a GPU
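
Lazy loading, as mentioned in the first note, can be sketched with a small cache around a loader function. The `LazyModels` class and its `loader` argument are assumptions for illustration, not the project's actual mechanism.

```python
class LazyModels:
    """Load each model the first time it is requested, then reuse it.

    `loader` is any callable that takes a model name and returns a
    ready-to-use model object (hypothetical interface).
    """

    def __init__(self, loader):
        self._loader = loader
        self._cache = {}

    def get(self, name):
        # Only the first request for a name pays the loading cost.
        if name not in self._cache:
            self._cache[name] = self._loader(name)
        return self._cache[name]

    def loaded(self):
        """Names of the models that have been loaded so far."""
        return list(self._cache)
```

With `transformers` installed, a loader such as `lambda name: pipeline("text-classification", model=name)` would fit this interface. The sketch is not thread-safe: two concurrent first requests for the same model could both trigger a load.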
## Support
All endpoints from the requirements document are implemented and tested.
Frontend (admin.html) works without 404/403 errors.