The API to search, scrape, and interact with the web at scale. 🔥
Updated Jun 25, 2026 - TypeScript
Python scraper based on AI
Structured data gathering from any website using AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio Python SDK for intelligent web data gathering.
Crawl a website starting from a URL, find relevant pages, and extract data – all guided by your natural language prompt.
Open source web infrastructure for AI. Scrape, crawl, and automate the web, with clean Markdown output and browser sessions ready for your agents.
High-performance web crawler API optimized for LLMs. Turn any search or website into clean Markdown using remote browsers. A Firecrawl alternative.
Structured data gathering from any website using AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio JS SDK for intelligent web data gathering.
A simple proxy server to integrate crawl4ai with OpenWebUI
Build-time llms.txt, JSON-LD, markdown mirrors, AI crawler controls, and validation for Vite, Astro, and Next.js websites.
A sophisticated system that uses multiple AI agents to research, create, and polish video scripts for social media platforms. The system employs specialized agents for research, script writing, polishing, and evaluation to ensure high-quality, engaging content.
The official Node.js SDK for Spidra.
Machine-readable AI permissions for websites. A consolidated spec at /.well-known/ai-policy.json for declaring how AI agents may train on, search, or use your content.
Tool for fast detection of website/server AI crawler blocking policies (not robots.txt).
A collection-based format for serving clean, structured web content to AI training systems and search engines through pre-generated collections.
High-performance, zero-allocation HTTP User-Agent parser for Go — browser, OS, device, bot & AI crawler detection with Client Hints support
🤖 Generate high-quality social media posts effortlessly with this AI agent that researches, drafts, critiques, and finalizes content for you.
A powerful tool that crawls documentation websites and generates a clean, well-formatted Markdown document. Built with FastAPI, with support for multiple LLM providers (DeepSeek and Groq).
See which AI crawlers can read your site — GPTBot, ClaudeBot, PerplexityBot & 20 more. Curated, operator-sourced bot list + a zero-dep CLI and GitHub Action to audit robots.txt, test reachability, read access logs, and gate it all in CI.
Stack-agnostic web template + 2026 standards (BFSG/GDPR/CWV/CSP/AI-crawler) as three Claude Code skills: setup → design → audit.