RSS Skull Bot
Modern RSS to Telegram Bot with Reddit Support
A powerful, feature-rich Telegram bot that fetches RSS feeds and delivers content directly to your Telegram channels. Built with Python (FastAPI + aiogram), featuring Reddit RSS support, HTTP caching, performance metrics, HTML sanitization, and intelligent feed processing.
Features
RSS Feed Processing
- Multi-format Support: RSS 2.0, Atom, JSON Feed 1.1
- Smart Parsing: Automatic content extraction and normalization
- Deduplication: Prevents duplicate posts using intelligent ID matching
- Content Filtering: Advanced filtering based on keywords, domains, and patterns
- HTML Sanitization: Automatic sanitization for Telegram HTML parse mode
- Baseline Management: Smart baseline using most recent post date to prevent old post notifications
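As a rough illustration of the HTML Sanitization bullet, here is a minimal sketch using only the standard library. The function name `sanitize_for_telegram` and the `ALLOWED_TAGS` allow-list are assumptions made for this example and are not taken from the project's `html_sanitizer.py`. It drops unsupported tags and HTML comments, escapes text, and closes any tags left open.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: a small subset of tags accepted by Telegram's HTML parse mode.
ALLOWED_TAGS = {"b", "strong", "i", "em", "u", "s", "a", "code", "pre"}


class _TelegramSanitizer(HTMLParser):
    """Collects sanitized output while tracking which allowed tags are open."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the unsupported tag but keep its text content
        if tag == "a":
            href = dict(attrs).get("href")
            if not href:
                return  # a link without a target is dropped
            self.out.append(f'<a href="{escape(href, quote=True)}">')
        else:
            self.out.append(f"<{tag}>")
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag, ignore it
        # Close everything opened after this tag so nesting stays balanced.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    # HTML comments are dropped because handle_comment is not overridden.


def sanitize_for_telegram(html_text):
    parser = _TelegramSanitizer()
    parser.feed(html_text)
    parser.close()
    # Close any tags the source left open.
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)
```

An allow-list is used here because an unsupported or unbalanced tag makes Telegram reject the whole message, so it is safer to drop anything not explicitly permitted.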
Reddit Integration
- RSS Feed Support: Automatic Reddit URL to RSS conversion (/r/subreddit → .rss)
- Timestamp-based Detection: Smart detection of new posts using publication dates
- Popularity-based Handling: Correctly handles Reddit's non-chronological sorting
- HTTP Caching: ETag and Last-Modified header support to reduce API calls
- Rate Limiting: Intelligent request management to respect Reddit's limits
Performance & Reliability
- HTTP Caching: ETag and Last-Modified header support
- Circuit Breaker: Intelligent fault tolerance with exponential backoff
- Database Persistence: All data persisted across Docker deployments
- Auto-Migrations: Prisma migrations applied automatically on startup
- Smart Rate Limiting: Adaptive throttling per domain (6-8 min for Reddit)
- User-Agent Management: Realistic browser headers to avoid detection
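To make the HTTP Caching bullet concrete, here is a minimal sketch of conditional requests with ETag and Last-Modified. The cache layout (a plain dict) and both function names are assumptions for illustration; the actual HTTP client call is left out so the logic stays testable.

```python
def conditional_headers(cached):
    """Build request headers from validators saved on a previous fetch."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def update_cache(cached, status, response_headers, body):
    """Return the cache entry to keep after a response.

    A 304 Not Modified response means the cached body is still valid,
    so the existing entry is reused and nothing needs to be re-parsed.
    """
    if status == 304:
        return cached
    return {
        "etag": response_headers.get("ETag"),
        "last_modified": response_headers.get("Last-Modified"),
        "body": body,
    }
```

The headers from `conditional_headers` would be sent with the next request for the same feed URL; a server that supports validators then answers with 304 and an empty body when nothing changed.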
Telegram Resilience System
- Auto-Recovery: Automatic recovery from 502 Bad Gateway errors
- Message Queue: Offline message queuing during API outages
- Exponential Backoff: Smart retry with delays from 1s to 60s
- Health Monitoring: Real-time monitoring with /health endpoint
- Alert System: Automatic alerts for critical connectivity issues
- Persistent State: Connection state survives restarts
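The exponential backoff from 1s to 60s mentioned above could look roughly like this. It is a sketch under assumptions: the doubling factor, the number of attempts, and the names `backoff_delays` and `retry_with_backoff` are illustrative, and `ConnectionError` stands in for whatever error the Telegram client raises on a 502.

```python
import time


def backoff_delays(base=1.0, cap=60.0, attempts=8):
    """Delays that double on each attempt and never exceed the cap."""
    return [min(cap, base * 2 ** n) for n in range(attempts)]


def retry_with_backoff(send, attempts=8, sleep=time.sleep):
    """Call send() until it succeeds, waiting longer after each failure.

    The sleep function is injectable so the logic can be tested
    without actually waiting.
    """
    last_error = None
    for delay in backoff_delays(attempts=attempts):
        try:
            return send()
        except ConnectionError as exc:
            last_error = exc
            sleep(delay)
    raise last_error
```

With the defaults, `backoff_delays()` yields 1, 2, 4, 8, 16 and 32 seconds and then stays at the 60 second cap.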
Telegram Bot Features
- Interactive Commands: /add, /remove, /list, /help
- Real-time Notifications: Instant feed updates
- Channel Management: Support for multiple channels
- Access Control: Optional user ID whitelist (respond only to owner)
- Feed Limits: Global limit of 100 feeds across all chats
- Error Handling: Graceful error recovery and user feedback
Quick Start
Prerequisites
- Python 3.11+
- Docker & Docker Compose (recommended)
- Redis (optional, can be disabled)
- Telegram Bot Token
Installation
- Clone the repository
git clone https://github.com/runawaydevil/rssskull.git
cd rssskull
- Install dependencies
pip install -r requirements.txt
- Setup environment
cp .env.example .env
# Edit .env with your configuration (BOT_TOKEN is required)
- Run the bot
python run.py
Or using Docker (recommended):
docker-compose up -d --build
Docker Deployment
Using Docker Compose (Recommended)
# Clone and navigate to project
git clone https://github.com/runawaydevil/rssskull.git
cd rssskull
# Configure environment
cp .env.example .env
# Edit .env with your BOT_TOKEN and settings
# Start containers
docker-compose up -d --build
# View logs
docker-compose logs -f rss-skull-bot
# Stop containers (data persists in volumes)
docker-compose down
# For clean deployment (WARNING: deletes all data):
docker-compose down -v
docker-compose up -d --build
Data Persistence:
- Database: Persisted in app_data Docker volume
- Backups: Stored in backups_data Docker volume
- Migrations: Applied automatically on container startup
Docker Volumes
The bot uses Docker volumes for data persistence:
- app_data: Database storage (/app/data)
- backups_data: Automated backups (/app/backups)
To backup your data:
docker-compose exec rss-skull-bot node scripts/backup-database.js
Data is automatically persisted across container restarts and updates.
Configuration
Environment Variables
Required:
| Variable | Description | Default |
|---|---|---|
| BOT_TOKEN | Telegram bot token | Required |
| DATABASE_URL | SQLite database path | file:/app/data/production.db |
| REDIS_HOST | Redis host | redis |
| REDIS_PORT | Redis port | 6379 |
Optional Settings:
| Variable | Description | Default |
|---|---|---|
| ENVIRONMENT | Environment | production |
| LOG_LEVEL | Log level | info |
| ALLOWED_USER_ID | Restrict bot to specific user (optional) | undefined |
| DISABLE_REDIS | Disable Redis caching | false |
| PORT | HTTP server port | 8916 |
| HOST | HTTP server host | 0.0.0.0 |
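As a quick illustration of how these variables and defaults fit together, here is a standard-library sketch of a settings loader. It is not the project's `config.py` (which the architecture listing describes as using Pydantic Settings); the `Settings` class, the `load_settings` function, and the assumption that ALLOWED_USER_ID is an integer are all hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    bot_token: str
    database_url: str
    redis_host: str
    redis_port: int
    environment: str
    log_level: str
    allowed_user_id: Optional[int]
    disable_redis: bool
    port: int
    host: str


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default),
    applying the defaults from the tables above."""
    env = os.environ if env is None else env
    token = env.get("BOT_TOKEN")
    if not token:
        raise ValueError("BOT_TOKEN is required")
    allowed = env.get("ALLOWED_USER_ID")
    return Settings(
        bot_token=token,
        database_url=env.get("DATABASE_URL", "file:/app/data/production.db"),
        redis_host=env.get("REDIS_HOST", "redis"),
        redis_port=int(env.get("REDIS_PORT", "6379")),
        environment=env.get("ENVIRONMENT", "production"),
        log_level=env.get("LOG_LEVEL", "info"),
        # Assumption: the whitelist holds a single numeric Telegram user ID.
        allowed_user_id=int(allowed) if allowed else None,
        disable_redis=env.get("DISABLE_REDIS", "false").lower() in {"1", "true", "yes"},
        port=int(env.get("PORT", "8916")),
        host=env.get("HOST", "0.0.0.0"),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check a configuration before starting the bot.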
Reddit Feed Setup
Reddit feeds are automatically converted to RSS format:
- Add Reddit subreddit: /add MySub https://reddit.com/r/subreddit
- Or directly: /add MySub https://reddit.com/r/subreddit/.rss
The bot automatically converts Reddit URLs to RSS feeds and handles Reddit's popularity-based sorting to correctly detect new posts.
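The URL conversion described here can be sketched in a few lines. The function name `to_reddit_rss` is hypothetical and the real `reddit_service.py` may handle more cases; this version only appends `/.rss` to Reddit URLs that do not already end in `.rss`.

```python
from urllib.parse import urlsplit, urlunsplit


def to_reddit_rss(url):
    """Turn a Reddit URL into its RSS equivalent; leave other URLs alone."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host != "reddit.com" and not host.endswith(".reddit.com"):
        return url  # not a Reddit URL
    if parts.path.endswith(".rss"):
        return url  # already an RSS URL
    path = parts.path.rstrip("/") + "/.rss"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

With this sketch, both forms of the /add command shown above resolve to the same feed URL.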
Bot Commands
| Command | Description |
|---|---|
| /start | Start the bot and show welcome message |
| /help | Show available commands |
| /add <name> <url> | Add RSS feed to monitoring |
| /discover <url> | Auto-discover feeds from a website |
| /remove <name> | Remove RSS feed from monitoring |
| /list | List all monitored feeds |
| /status | Show bot status and statistics |
| /filters | Manage content filters |
Architecture
app/
├── bot.py                  # Telegram bot implementation (aiogram)
├── commands/               # Bot command handlers
│   └── feed_commands.py    # Feed management commands
├── jobs/                   # Background job processing
│   └── feed_checker.py     # RSS feed checking job (APScheduler)
├── services/               # Core business logic
│   ├── feed_service.py     # Feed CRUD operations
│   ├── rss_service.py      # RSS feed fetching and parsing
│   └── reddit_service.py   # Reddit URL handling
├── models/                 # Database models (SQLModel)
│   └── feed.py             # Feed and Chat models
├── utils/                  # Utility functions
│   ├── html_sanitizer.py   # Telegram HTML sanitization
│   ├── cache.py            # Redis caching
│   └── logger.py           # Structured logging
├── database.py             # Database initialization
├── config.py               # Configuration (Pydantic Settings)
└── main.py                 # FastAPI application with health endpoints
Development
Available Scripts
# Development
python run.py # Start development server
# Docker
docker-compose up -d --build # Build and start containers
docker-compose logs -f rss-skull-bot # View bot logs
docker-compose restart rss-skull-bot # Restart bot container
# Code Quality
ruff check app/ # Run Ruff linter
black app/ # Format code with Black
mypy app/ # Type checking with mypy
Database Management
The database is automatically initialized on first startup. Data is persisted in Docker volumes (app_data).
To access the database directly:
docker-compose exec rss-skull-bot sqlite3 /app/data/production.db
Monitoring & Reliability
The bot includes comprehensive monitoring and fault tolerance:
- Performance Metrics: Request latency tracking
- Error Monitoring: Automatic error logging and recovery
- Circuit Breaker: Exponential backoff on API failures (10min → 4h)
- Smart Rate Limiting: Adaptive throttling per domain
- Health Checks: Service availability monitoring
- Database Persistence: Automatic migrations and backups
- Graceful Degradation: OAuth → JSON fallback → RSS when needed
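A circuit breaker with a cooldown that grows from 10 minutes to 4 hours might be sketched as below. The class, the failure threshold of 3, and the choice to double the cooldown each time the breaker opens are assumptions for illustration; the README only states the 10min and 4h bounds.

```python
class CircuitBreaker:
    """Stops requests to a failing domain for a growing cooldown period."""

    def __init__(self, threshold=3, base_cooldown=600.0, max_cooldown=14400.0):
        self.threshold = threshold          # failures before opening (assumed)
        self.base_cooldown = base_cooldown  # 10 minutes, in seconds
        self.max_cooldown = max_cooldown    # 4 hours, in seconds
        self.failures = 0
        self.cooldown = base_cooldown
        self.open_until = 0.0

    def allow(self, now):
        """True when a request may be attempted at time `now` (seconds)."""
        return now >= self.open_until

    def record_success(self):
        # A success resets both the failure count and the cooldown.
        self.failures = 0
        self.cooldown = self.base_cooldown

    def record_failure(self, now):
        self.failures += 1
        if self.failures >= self.threshold:
            # Open the breaker, then double the cooldown for next time.
            self.open_until = now + self.cooldown
            self.cooldown = min(self.max_cooldown, self.cooldown * 2)
            self.failures = 0
```

Passing the current time in explicitly (for example from `time.monotonic()`) keeps the class free of hidden clock reads and easy to test.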
Telegram Resilience System
The bot includes a robust resilience system specifically designed to handle Telegram API connectivity issues:
- Automatic Recovery: Handles 502 Bad Gateway errors with exponential backoff (1s → 60s)
- Message Queuing: Stores up to 1000 messages during API outages with priority handling
- Circuit Breaker: Prevents cascade failures with adaptive thresholds
- Health Monitoring: Real-time metrics and alerting via /health, /resilience-stats, /metrics
- Persistent State: Connection state and queued messages survive restarts
- Smart Retry: Up to 30 minutes of retry attempts before escalation
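The message queuing behaviour listed above (up to 1000 messages, with priorities) can be illustrated with a small in-memory sketch. The `OutboundQueue` class is hypothetical, rejecting new messages when full is an assumed overflow policy, and persistence across restarts is not covered here.

```python
import heapq
import itertools


class OutboundQueue:
    """Bounded priority queue for messages waiting on an API outage."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        self._heap = []
        # The counter breaks ties so equal priorities keep arrival order.
        self._counter = itertools.count()

    def put(self, message, priority=1):
        """Queue a message; a lower number means higher priority.
        Returns False when the queue is full (assumed overflow policy)."""
        if len(self._heap) >= self.max_size:
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), message))
        return True

    def drain(self):
        """Yield queued messages, highest priority first."""
        while self._heap:
            yield heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

Once the API is reachable again, iterating over `drain()` would hand messages back to the sender in priority order.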
Monitoring Endpoints:
- GET /health: Overall system health including resilience status
- GET /resilience-stats: Detailed resilience system statistics
- GET /metrics: Complete metrics for monitoring systems
For detailed information, see RESILIENCE.md.
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'feat: add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Commit Convention
We use conventional commits:
- feat: New features
- fix: Bug fixes
- docs: Documentation changes
- style: Code style changes
- refactor: Code refactoring
- test: Test additions/changes
- chore: Maintenance tasks
License
This project is licensed under the MIT License - see the LICENSE file for details.
Changelog
See CHANGELOG.md for full version history.
v0.6.0 - "Python Migration & Bug Fixes" (2025-11-02)
- Complete migration from TypeScript/Node.js to Python
- Fixed Telegram HTML parse errors (HTML comments, unbalanced tags)
- Fixed Reddit feed notification issues
- Enhanced logging and debugging capabilities
- Improved baseline management for new feeds
- Docker improvements (multi-stage build, non-root user)
- HTML sanitization system for Telegram messages
- Code cleanup and optimization
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Made with ❤️ by Pablo Murad