LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows
Updated Jun 17, 2026 - Python
Turn Chaos Into Structure. A Type-Safe AI Agent that extracts valid JSON from unstructured data using PydanticAI, FastHTML, and Gemini 2.5.
Universal prompt library for structured outputs & ready-to-use content. Teachers get lesson plans, developers get reliable JSON/CSV. Works across GPT-4, Claude, Gemini.
Extractor for n8n workflow templates.
Fine-tune Qwen3-0.6B for resume parsing using LoRA
CourtListener scraper and legal data extraction API. Extract court opinions, cases from CourtListener with this Apify actor. Free tier available.
Structured JSON extraction from LLMs with validation, repair, and streaming.
AI-powered structured web scraper that visually builds JSON schemas and uses Gemini 2.5 Flash & Playwright to extract clean, validated JSON with >80% DOM noise reduction.
MyNavi scraper and Japan recruitment data extraction API. Extract job listings, JPY salaries, and corporate profiles from tenshoku.mynavi.jp with mobile WAF bypass. Free tier available.
Haraj scraper and Saudi Arabia classifieds data extraction API. Extract cars, real estate, electronics, prices, and locations from haraj.com.sa with high-performance GraphQL API. Free tier available.
ShopGoodwill scraper and Goodwill online auction data extraction API. Extract bids, buy-now prices, ending times, seller locations, and photos from shopgoodwill.com via private REST APIs. Free tier available.
Production-style fine-tuning project for schema-constrained JSON extraction using QLoRA + DPO, with reproducible evals, training curves, and vLLM benchmarks.
Apna.co scraper and Indian blue-collar job data extraction API. Extract job listings, salary ranges, recruiter WhatsApp and call preferences, company addresses, and coordinate mappings from apna.co with this Apify Actor. Free tier available.
HIJOBS scraper and Scotland Highlands & Islands regional job data extraction. Extract salaries, locations, apply emails, and full descriptions from hijobs.net without Cloudflare blocking. Free tier available.
arXiv scraper and research paper API. Extract titles, authors, abstracts, PDFs from arXiv with this Apify actor. Free tier available.
Google Shopping Ads scraper and e-commerce product data extraction API. Extract live paid listings, merchant domains, pricing, and discount stickers from Google Shopping with this Apify Actor. Free tier available.
Lianjia scraper and China real estate data extraction API. Extract resale housing listings, prices per sqm, and community data from 12 major Chinese cities with residential proxy rotation. Free tier available.
HigherEdJobs scraper and US academic recruitment data extraction API. Extract university job listings, salary ranges, disciplines, and application deadlines from higheredjobs.com. Free tier available.
Welcome to the Jungle scraper and job data extraction API. Extract job listings, salary, and company data from WTTJ with this Apify actor. Free tier available.
OpenSooq scraper and MENA classifieds data extraction API. Extract cars, real estate, electronics, prices, and locations from opensooq.com across 20 countries. Free tier available.