@@ -63,7 +63,7 @@ Test that everything works:
 # Check available scrapers
 newswatch --list_scrapers

-# Should show 60 stable scrapers
+# Should show 63 stable scrapers
 ```

 ## Your First Scraping Session
@@ -216,10 +216,17 @@ print(f"\nRecent articles (>= Jan 15): {len(recent)}")
 | `-k, --keywords` | Comma-separated search terms | `"bank,kredit,fintech"` |
 | `-sd, --start_date` | Start date (YYYY-MM-DD) | `"2025-01-01"` |
 | `-s, --scrapers` | Specific scrapers or "auto"/"all" | `"kompas,tempo"` |
-| `-of, --output_format` | Output format: csv, xlsx, or json | `"csv"` |
+| `-of, --output_format` | Output format: csv, xlsx, json, or jsonl | `"csv"` |
 | `-o, --output_path` | Custom output file path | `"news-watch-output.csv"` |
 | `-v, --verbose` | Show detailed progress | (flag only) |
 | `--list_scrapers` | Show available scrapers | (flag only) |
+| `--max-pages` | Max pages per scraper (latest mode) | `2` |
+| `--scraper-timeout` | Timeout per scraper in seconds | `30` |
+| `--time-range` | Filter by ISO8601 time range | `"2025-01-01T00:00:00/2025-01-31T23:59:59"` |
+| `--dedup-file` | Skip articles already in this file | `"previous-output.csv"` |
+| `--proxy` | Proxy URL for all requests | `"http://proxy:8080"` |
+| `--progress` | Show progress bar (requires tqdm) | (flag only) |
+| `--health-report` | Run health checks instead of scraping | (flag only) |

 ## Next Steps

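As a rough illustration of how the options in the table above might be combined, the sketch below assembles a `newswatch` argument list in Python and prints it as a shell-safe command line. It does not run the tool. The flag names and example values come from the table; the helper `build_newswatch_command`, the output filename `news-watch-output.jsonl`, and the assumption that these particular flags can be used together in one invocation are hypothetical and not confirmed by this diff.

```python
import shlex


def build_newswatch_command(keywords, start_date, scrapers="auto",
                            output_format="jsonl", output_path=None,
                            scraper_timeout=None, dedup_file=None):
    """Assemble a newswatch argument list (hypothetical helper).

    Only flags listed in the options table are used. Whether they may
    all be combined in a single run is an assumption.
    """
    cmd = [
        "newswatch",
        "-k", ",".join(keywords),   # comma-separated search terms
        "-sd", start_date,          # start date, YYYY-MM-DD
        "-s", scrapers,             # scraper names, or "auto"/"all"
        "-of", output_format,       # csv, xlsx, json, or jsonl
    ]
    # Optional flags are appended only when a value is supplied.
    if output_path is not None:
        cmd += ["-o", output_path]
    if scraper_timeout is not None:
        cmd += ["--scraper-timeout", str(scraper_timeout)]
    if dedup_file is not None:
        cmd += ["--dedup-file", dedup_file]
    return cmd


cmd = build_newswatch_command(
    keywords=["bank", "kredit", "fintech"],
    start_date="2025-01-01",
    scrapers="kompas,tempo",
    output_path="news-watch-output.jsonl",  # hypothetical filename
    scraper_timeout=30,
    dedup_file="previous-output.csv",
)

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string means it could be handed to `subprocess.run(cmd)` without shell quoting concerns, should you want to execute it from a script.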