Automated AI blog pipeline that discovers Google Ads & PPC questions from Reddit, generates expert-grade articles with Claude, and deploys to Cloudflare Pages — twice daily, fully unattended.
This engine runs on a Cloudflare Worker, triggered twice daily by cron. Each run:
- Discovers trending PPC questions from Reddit via SearchAPI (SERP-based, no Reddit API required)
- Filters for quality — minimum engagement, blocklist, NSFW skip, duplicate detection
- Deduplicates using Vectorize semantic similarity to avoid publishing the same topic twice
- Generates a full 1,500-3,000 word expert article using Claude (Anthropic)
- Validates the HTML — JSON-LD schemas, GTM, meta tags, AI disclosure, Reddit attribution
- Publishes via GitHub API commit, triggering Cloudflare Pages deploy
- Verifies the live URL returns 200 before marking as published
- Notifies you via email + SMS (AT&T gateway, no Twilio needed)
- Pings IndexNow to get the post indexed within hours
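The quality gate in the second step can be pictured as a pure predicate over a discovered thread. The sketch below is illustrative only: the `ThreadCandidate` shape, the `minComments` threshold, and the blocklist contents are assumptions for the example, not values taken from this repository.

```typescript
// Hypothetical shape of a thread found via SERP; field names are assumptions.
interface ThreadCandidate {
  redditId: string;
  title: string;
  snippet: string;
  commentCount: number;
  nsfw: boolean;
}

interface GateOptions {
  minComments: number;   // assumed minimum-engagement threshold
  blocklist: string[];   // lowercase terms that disqualify a thread
  seenIds: Set<string>;  // thread IDs already tracked (exact-duplicate check)
}

type GateResult = { pass: true } | { pass: false; reason: string };

// Apply the cheap checks first and report why a thread was rejected.
function checkQuality(thread: ThreadCandidate, opts: GateOptions): GateResult {
  if (thread.nsfw) return { pass: false, reason: "nsfw" };
  if (opts.seenIds.has(thread.redditId)) return { pass: false, reason: "duplicate" };
  if (thread.commentCount < opts.minComments) {
    return { pass: false, reason: "low_engagement" };
  }
  const text = `${thread.title} ${thread.snippet}`.toLowerCase();
  const hit = opts.blocklist.find((term) => text.includes(term));
  if (hit !== undefined) return { pass: false, reason: `blocklist:${hit}` };
  return { pass: true };
}
```

Returning a reason rather than a bare boolean makes it straightforward to log why a thread ended up skipped.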
Every generated article includes full SEO structured data (BlogPosting, QAPage, BreadcrumbList), E-E-A-T author signals, AI disclosure, and a link back to the source Reddit discussion.
This project exists because of the incredible PPC practitioner communities on Reddit, LinkedIn, Quora, and beyond. Every article generated by this engine credits its source discussion and links back to the original thread.
| Subreddit | Focus | Link |
|---|---|---|
| r/PPC | Pay-per-click advertising, Google Ads, Meta Ads | reddit.com/r/PPC |
| r/googleads | Google Ads-specific questions and strategies | reddit.com/r/googleads |
| r/digital_marketing | Broad digital marketing including paid media | reddit.com/r/digital_marketing |
| r/adwords | Legacy Google AdWords community (still active) | reddit.com/r/adwords |
| r/marketing | General marketing strategy and industry discussion | reddit.com/r/marketing |
| r/SEO | Search engine optimization (cross-channel context) | reddit.com/r/SEO |
| r/DigitalMarketing | Digital strategy, analytics, paid media | reddit.com/r/DigitalMarketing |
| r/FacebookAds | Meta/Facebook advertising | reddit.com/r/FacebookAds |
| r/AmazonSeller | Amazon PPC and marketplace advertising | reddit.com/r/AmazonSeller |
- Should I use broad match or exact match? — The perennial match type debate
- How much should I spend on Google Ads for a small business? — Budget sizing questions
- Performance Max vs Search campaigns — Campaign type selection
- My CPC keeps going up, what do I do? — Cost optimization
- Is Google Ads worth it for a new business? — ROI justification
- Smart Bidding isn't working for me — Bidding strategy troubleshooting
- How to structure a Google Ads account? — Account organization
- Google Ads Community on LinkedIn — Active professionals group
- PPC Chat Community — Weekly Twitter/X chat community
- Digital Marketing Professionals — Broad digital marketing group
- Paid Search Association — Industry organization
- Google Ads Help Community — Official Google forum
- Google Ads (Quora) — Q&A on Google Ads strategy
- PPC Advertising (Quora) — Pay-per-click discussions
- Digital Marketing (Quora) — Broad marketing Q&A
- SEM (Quora) — Search engine marketing
| Project | Description | Link |
|---|---|---|
| Google Ads MCP | Python MCP server with 29 tools for Google Ads API | github.com/itallstartedwithaidea/google-ads-mcp |
| Google Ads Gemini Extension | Gemini CLI extension with 22 tools | github.com/itallstartedwithaidea/google-ads-gemini-extension |
| Google Ads Skills | Anthropic Claude Agent Skills for Google Ads | github.com/itallstartedwithaidea/google-ads-skills |
| Google Ads API Agent | Full Python agent with 28 actions | github.com/itallstartedwithaidea/google-ads-api-agent |
| GoogleAdsAgent.ai | The complete website and tools | github.com/itallstartedwithaidea/googleadsagent-site |
                         ┌─────────────────┐
                         │ Cron Scheduler  │
                         │  7am + 7pm ET   │
                         └────────┬────────┘
                                  │
                                  ▼
┌───────────────────────────────────────────────────────────────────┐
│                        Blog Engine Worker                         │
│                                                                   │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────────────┐│
│  │ Discover │──▶│  Filter  │──▶│ Generate │──▶│     Validate     ││
│  │  (SERP)  │   │ & Dedupe │   │ (Claude) │   │ (schema, links)  ││
│  └──────────┘   └──────────┘   └──────────┘   └────────┬─────────┘│
│       │              │              │                  │          │
│       ▼              ▼              ▼                  ▼          │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────────────┐│
│  │SearchAPI │   │Vectorize │   │  Claude  │   │    GitHub API    ││
│  │          │   │ D1 (DB)  │   │   API    │   │  Commit + Push   ││
│  └──────────┘   └──────────┘   └──────────┘   └────────┬─────────┘│
│                                                        │          │
└────────────────────────────────────────────────────────┼──────────┘
                                                         │
                         ┌───────────────────────────────┼──────────┐
                         │                               ▼          │
                         │ ┌──────────┐   ┌──────────────────────┐  │
                         │ │  Verify  │──▶│ Notify (Email + SMS) │  │
                         │ │ (200 OK) │   │    IndexNow Ping     │  │
                         │ └──────────┘   └──────────────────────┘  │
                         │         Cloudflare Pages Deploy          │
                         └──────────────────────────────────────────┘
Every post flows through these statuses:
discovered → queued → generating → draft_ready → published
    │                     │
    ▼                     ▼
 skipped                failed
(filtered out)     (retryable error)
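One way to enforce this lifecycle in code is an explicit transition table. The sketch below assumes that `skipped` branches off `discovered`, that `failed` branches off `generating`, and that a failed post can be re-queued because the error is described as retryable. Those edges are a reading of the diagram, not confirmed behavior of `db.ts`.

```typescript
type PostStatus =
  | "discovered"
  | "queued"
  | "generating"
  | "draft_ready"
  | "published"
  | "skipped"
  | "failed";

// Assumed transition table (see the caveats in the paragraph above).
const TRANSITIONS: Record<PostStatus, readonly PostStatus[]> = {
  discovered: ["queued", "skipped"],
  queued: ["generating"],
  generating: ["draft_ready", "failed"],
  draft_ready: ["published"],
  published: [],
  skipped: [],
  failed: ["queued"],
};

function canTransition(from: PostStatus, to: PostStatus): boolean {
  return TRANSITIONS[from].includes(to);
}

// Return the new status, or throw if the move is not in the table.
function advance(from: PostStatus, to: PostStatus): PostStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal status transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping the table in one place means an unexpected jump such as `discovered` to `published` fails loudly instead of silently corrupting the tracker.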
| Service | Purpose |
|---|---|
| Workers | Blog engine runtime (cron-triggered) |
| D1 | Post tracker database (SQLite) — status, budgets, run logs |
| KV | Blog manifest for dynamic index page |
| R2 | Hero image storage |
| Workers AI | Embeddings (bge-base-en-v1.5) + image generation (Stable Diffusion XL) |
| Vectorize | Semantic dedup index (cosine similarity on blog-topics) |
| Pages | Static site hosting + Functions |
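Vectorize computes similarity on the server, but the comparison behind the semantic dedup is simple to state. The sketch below shows plain cosine similarity over two embedding vectors together with a threshold check. The function names are hypothetical, and the 0.85 default mirrors the threshold this README lists for duplicate protection.

```typescript
// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A candidate counts as a duplicate if any stored topic is at least `threshold` similar.
function isSemanticDuplicate(
  candidate: number[],
  existing: number[][],
  threshold = 0.85,
): boolean {
  return existing.some((v) => cosineSimilarity(candidate, v) >= threshold);
}
```

In the deployed pipeline the vectors would be 768-dimensional embeddings, and the nearest-neighbour search would be delegated to the index rather than done in a loop.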
- Node.js 20+
- Wrangler CLI (`npm install -g wrangler`)
- Cloudflare account with Workers paid plan
- Anthropic API key
- SearchAPI key
- GitHub personal access token (repo scope)
- Google OAuth via Buddy (for email/SMS notifications — uses Gmail API, no extra service)
git clone https://github.com/itallstartedwithaidea/reddit.git
cd reddit
npm install
# Create D1 database
wrangler d1 create blog-tracker
# Copy the database_id into wrangler.toml
# Create KV namespace
wrangler kv namespace create BLOG_DATA
# Copy the id into wrangler.toml
# Create R2 bucket
wrangler r2 bucket create blog-assets
# Create Vectorize index
wrangler vectorize create blog-topics --dimensions=768 --metric=cosine
Update the binding IDs in wrangler.toml with the values from Step 2:
[[d1_databases]]
binding = "DB"
database_name = "blog-tracker"
database_id = "YOUR_D1_ID_HERE"
[[kv_namespaces]]
binding = "BLOG_DATA"
id = "YOUR_KV_ID_HERE"
wrangler secret put ANTHROPIC_API_KEY
# Paste your Anthropic API key
wrangler secret put SEARCHAPI_KEY
# Paste your SearchAPI key
wrangler secret put GITHUB_TOKEN
# Paste your GitHub personal access token
# Notifications use Gmail API via the admin's Google OAuth session
# No additional API keys needed for email/SMS
wrangler secret put EMAIL_TO
# Enter: your@email.com
wrangler secret put SMS_GATEWAY_TO
# Enter: your 10-digit phone number (e.g. 5551234567)
# SMS is sent via AT&T gateway: number@txt.att.net
# For other carriers:
# T-Mobile: number@tmomail.net
# Verizon: number@vtext.com
# Sprint: number@messaging.sprintpcs.com
wrangler secret put CRON_SECRET
# Enter a random secret string for auth
wrangler d1 execute blog-tracker --file=./schema.sql
wrangler deploy
curl -X POST "https://your-worker.workers.dev/run?key=YOUR_CRON_SECRET&dry_run=true"
curl -X POST "https://your-worker.workers.dev/run?key=YOUR_CRON_SECRET"
reddit/
├── src/
│ ├── index.ts # Worker entry — cron handler, HTTP routes, pipeline orchestration
│ ├── types.ts # TypeScript interfaces for all data types
│ ├── db.ts # D1 database operations — posts, budget, run logs
│ ├── ingest.ts # Reddit thread discovery via SearchAPI SERP
│ ├── gates.ts # Quality filters, blocklist, Vectorize semantic dedup
│ ├── topics.ts # Topic classification into 12 PPC clusters
│ ├── generate.ts # Claude article generation + related post linking
│ ├── template.ts # Full HTML template (matches googleadsagent.ai design)
│ ├── images.ts # Workers AI hero image generation (Stable Diffusion XL)
│ ├── validate.ts # Post-generation validation (10+ checks)
│ ├── publish.ts # GitHub API commit + deploy verification
│ ├── notify.ts # Email + SMS notifications + IndexNow
│ └── budget.ts # Cost tracking, daily/monthly caps
├── schema.sql # D1 database schema
├── wrangler.toml # Cloudflare Worker configuration
├── package.json
├── tsconfig.json
└── README.md
| Variable | Default | Description |
|---|---|---|
| `MAX_POSTS_PER_RUN` | `2` | Maximum posts to publish per cron run |
| `MAX_DAILY_GENERATIONS` | `4` | Maximum posts per day (2 runs x 2 posts) |
| `MAX_MONTHLY_USD` | `50` | Monthly budget cap (Claude + SearchAPI costs) |
| `SITE_URL` | `https://googleadsagent.ai` | Your site's base URL |
| `GITHUB_REPO` | `itallstartedwithaidea/googleadsagent-site` | GitHub repo for git commits |
| `GITHUB_BRANCH` | `main` | Branch to commit to |
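The three numeric caps are described elsewhere in this README as failing closed. A minimal sketch of what that could look like follows, assuming the values arrive as strings the way Worker environment variables do. `parseCap`, `loadLimits`, `mayGenerate`, and the `BudgetUsage` shape are hypothetical names, not the contents of `budget.ts`.

```typescript
interface BudgetLimits {
  maxPostsPerRun: number;
  maxDailyGenerations: number;
  maxMonthlyUsd: number;
}

// Hypothetical usage counters; the real tracker lives in D1.
interface BudgetUsage {
  postsThisRun: number;
  generationsToday: number;
  spentThisMonthUsd: number;
}

// Return null when a cap is missing or is not a positive finite number.
function parseCap(raw: string | undefined): number | null {
  if (raw === undefined) return null;
  const n = Number(raw);
  return Number.isFinite(n) && n > 0 ? n : null;
}

function loadLimits(vars: Record<string, string | undefined>): BudgetLimits | null {
  const perRun = parseCap(vars["MAX_POSTS_PER_RUN"]);
  const perDay = parseCap(vars["MAX_DAILY_GENERATIONS"]);
  const perMonth = parseCap(vars["MAX_MONTHLY_USD"]);
  if (perRun === null || perDay === null || perMonth === null) return null;
  return { maxPostsPerRun: perRun, maxDailyGenerations: perDay, maxMonthlyUsd: perMonth };
}

// Fail closed: unreadable limits mean no generation at all.
function mayGenerate(limits: BudgetLimits | null, usage: BudgetUsage): boolean {
  if (limits === null) return false;
  return (
    usage.postsThisRun < limits.maxPostsPerRun &&
    usage.generationsToday < limits.maxDailyGenerations &&
    usage.spentThisMonthUsd < limits.maxMonthlyUsd
  );
}
```

The point of the `null` path is that a typo in configuration stops the pipeline instead of removing the spending limit.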
Every article is classified into one of 12 PPC topic clusters:
| Cluster | Keywords |
|---|---|
| Bidding | bid, tCPA, tROAS, smart bidding, manual CPC |
| Creative | RSA, headlines, ad copy, extensions, sitelinks |
| Audiences | targeting, remarketing, custom segments, broad match |
| Measurement | conversion tracking, GA4, attribution, enhanced conversions |
| Automation | scripts, rules, API, AI Max, optimization score |
| Policy | disapproved, suspended, trademark, appeal |
| Shopping | shopping, merchant center, PMax, product feed |
| Video | YouTube, demand gen, bumper, TrueView |
| Local | local campaigns, GMB, location extensions |
| Budget | CPC, CPA, ROAS, daily budget, pacing |
| Account Structure | ad groups, naming conventions, SKAG, Hagakure |
| General | Catch-all for broad strategy questions |
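A keyword table like this lends itself to a simple scoring classifier. The sketch below counts keyword hits per cluster in a lowercased title, picks the cluster with the most hits (earlier rows win ties), and falls back to General. This is an assumed approach for illustration; the actual logic in `topics.ts` may differ. Plain substring matching is crude, since short keywords such as "api" can match inside unrelated words.

```typescript
// Keywords copied from the table above, lowercased for matching.
const CLUSTERS: ReadonlyArray<{ cluster: string; keywords: string[] }> = [
  { cluster: "Bidding", keywords: ["bid", "tcpa", "troas", "smart bidding", "manual cpc"] },
  { cluster: "Creative", keywords: ["rsa", "headlines", "ad copy", "extensions", "sitelinks"] },
  { cluster: "Audiences", keywords: ["targeting", "remarketing", "custom segments", "broad match"] },
  { cluster: "Measurement", keywords: ["conversion tracking", "ga4", "attribution", "enhanced conversions"] },
  { cluster: "Automation", keywords: ["scripts", "rules", "api", "ai max", "optimization score"] },
  { cluster: "Policy", keywords: ["disapproved", "suspended", "trademark", "appeal"] },
  { cluster: "Shopping", keywords: ["shopping", "merchant center", "pmax", "product feed"] },
  { cluster: "Video", keywords: ["youtube", "demand gen", "bumper", "trueview"] },
  { cluster: "Local", keywords: ["local campaigns", "gmb", "location extensions"] },
  { cluster: "Budget", keywords: ["cpc", "cpa", "roas", "daily budget", "pacing"] },
  { cluster: "Account Structure", keywords: ["ad groups", "naming conventions", "skag", "hagakure"] },
];

// Pick the cluster with the most keyword hits; "General" when nothing matches.
function classifyTopic(title: string): string {
  const text = title.toLowerCase();
  let best = "General";
  let bestHits = 0;
  for (const { cluster, keywords } of CLUSTERS) {
    const hits = keywords.filter((k) => text.includes(k)).length;
    if (hits > bestHits) {
      best = cluster;
      bestHits = hits;
    }
  }
  return best;
}
```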
Each article is generated by Claude with a system prompt that:
- Sets the voice as John Williams, Senior Paid Media Specialist ($350M+ managed)
- Provides the Reddit thread context (title, subreddit, snippet)
- Requires 1,500-3,000 words of substantive, actionable content
- Enforces callout boxes, comparison tables, stat cards
- Requires specific benchmarks and real campaign data ranges
- Prohibits fabricated Redditor quotes
- Mandates a "Bottom Line" section with numbered action items
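To make the shape of such a prompt concrete, here is a hedged sketch of a prompt builder. The wording of each rule is paraphrased from the list above and is not the prompt text used by `generate.ts`. `ThreadContext` and `buildSystemPrompt` are hypothetical names.

```typescript
// Hypothetical input: the Reddit thread context handed to the model.
interface ThreadContext {
  title: string;
  subreddit: string;
  snippet: string;
}

// Assemble persona, source context, and numbered requirements into one string.
function buildSystemPrompt(ctx: ThreadContext): string {
  const rules = [
    "Write 1,500-3,000 words of substantive, actionable content.",
    "Use callout boxes, comparison tables, and stat cards.",
    "Give specific benchmarks and realistic campaign data ranges.",
    "Never fabricate quotes from Reddit users.",
    'End with a "Bottom Line" section containing numbered action items.',
  ];
  return [
    "You are John Williams, Senior Paid Media Specialist ($350M+ managed).",
    "",
    "Source thread:",
    `- Title: ${ctx.title}`,
    `- Subreddit: r/${ctx.subreddit}`,
    `- Snippet: ${ctx.snippet}`,
    "",
    "Requirements:",
    ...rules.map((rule, i) => `${i + 1}. ${rule}`),
  ].join("\n");
}
```

Keeping the rules in an array makes it easy to add or reorder requirements without touching the rest of the template.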
Every article includes:
- 3 JSON-LD blocks: `BlogPosting` (with full author entity + sameAs), `BreadcrumbList`, `QAPage`
- Full meta tags: og:, twitter:, canonical, robots
- GTM tracking: GTM-NR7F6P92
- AI disclosure: Visible block crediting Reddit source + AI assistance
- Author E-E-A-T signals: Person schema with name, jobTitle, sameAs links
- Responsive design: Dark theme matching googleadsagent.ai
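As an illustration of the structured-data requirement, the sketch below builds a `BlogPosting` JSON-LD script tag with a `Person` author entity. The `/blog/<slug>` URL layout, the interface names, and the chosen subset of schema.org properties are assumptions made for the example. The real template in `template.ts` may emit more fields.

```typescript
interface ArticleMeta {
  title: string;
  slug: string;
  description: string;
  datePublished: string; // ISO 8601 date
}

interface AuthorInfo {
  fullName: string;
  jobTitle: string;
  sameAs: string[]; // profile URLs used as E-E-A-T signals
}

function buildBlogPostingJsonLd(
  siteUrl: string,
  article: ArticleMeta,
  author: AuthorInfo,
): string {
  // Assumed URL layout: <site>/blog/<slug>, with no .html suffix.
  const url = `${siteUrl.replace(/\/+$/, "")}/blog/${article.slug}`;
  const data = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    headline: article.title,
    description: article.description,
    datePublished: article.datePublished,
    mainEntityOfPage: url,
    author: {
      "@type": "Person",
      name: author.fullName,
      jobTitle: author.jobTitle,
      sameAs: author.sameAs,
    },
  };
  // Escape "<" so the JSON can never close the surrounding script tag early.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

The `BreadcrumbList` and `QAPage` blocks would follow the same pattern with their own `@type` and properties.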
Before publishing, every article is validated for:
- Word count within bounds (800-5,000)
- GTM snippet present
- BlogPosting JSON-LD present and valid JSON
- BreadcrumbList JSON-LD present
- QAPage JSON-LD present
- Reddit source URL in article body
- AI disclosure block present
- Canonical URL has no `.html` suffix
- og:image and twitter:image tags present
- Author name present
- Shared scripts (site-search, cookie-consent, chat-widget)
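A few of these checks can be sketched as string-level tests over the generated HTML. The code below covers only word count, GTM presence, the three JSON-LD types, and the canonical suffix. The function names are hypothetical, and the regular expressions assume a specific attribute order (`rel` before `href`, a bare `type` attribute on the script tag), which real markup may not follow.

```typescript
interface ValidationIssue {
  check: string;
  detail: string;
}

// Count visible words after dropping script bodies and tags (a rough approximation).
function countWords(html: string): number {
  const text = html
    .replace(/<script[\s\S]*?<\/script>/gi, " ")
    .replace(/<[^>]+>/g, " ");
  return text.split(/\s+/).filter((w) => w.length > 0).length;
}

// Collect the @type of every JSON-LD block and count blocks that fail to parse.
function readJsonLd(html: string): { types: string[]; invalid: number } {
  const re = /<script type="application\/ld\+json">([\s\S]*?)<\/script>/g;
  const types: string[] = [];
  let invalid = 0;
  let match: RegExpExecArray | null;
  while ((match = re.exec(html)) !== null) {
    try {
      const parsed = JSON.parse(match[1] ?? "") as Record<string, unknown>;
      types.push(String(parsed["@type"]));
    } catch {
      invalid++;
    }
  }
  return { types, invalid };
}

// Return every failed check; an empty array means the article may be published.
function validateArticle(html: string, gtmId: string): ValidationIssue[] {
  const issues: ValidationIssue[] = [];
  const words = countWords(html);
  if (words < 800 || words > 5000) {
    issues.push({ check: "word_count", detail: `${words} words` });
  }
  if (!html.includes(gtmId)) issues.push({ check: "gtm", detail: `${gtmId} not found` });
  const { types, invalid } = readJsonLd(html);
  if (invalid > 0) issues.push({ check: "json_ld_parse", detail: `${invalid} invalid block(s)` });
  for (const required of ["BlogPosting", "BreadcrumbList", "QAPage"]) {
    if (!types.includes(required)) {
      issues.push({ check: "json_ld", detail: `${required} missing` });
    }
  }
  const canonical = /<link rel="canonical" href="([^"]+)"/.exec(html)?.[1];
  if (canonical === undefined) {
    issues.push({ check: "canonical", detail: "tag missing" });
  } else if (canonical.endsWith(".html")) {
    issues.push({ check: "canonical", detail: "has .html suffix" });
  }
  return issues;
}
```

Collecting all issues instead of stopping at the first makes the failure log more useful when a template change breaks several checks at once.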
From: blog-engine@googleadsagent.ai
To: your@email.com
Subject: New post live: {title}
Body: {title}\n{live URL}
To: 5551234567@txt.att.net
Body: New post: {title} {url}
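The SMS path works by emailing a carrier's email-to-text address. A small sketch of building that address and the message body follows, using the gateway domains listed in the setup comments. The helper names are hypothetical. Carrier gateways change over time, and whether each of these domains still accepts mail is not verified here.

```typescript
// Gateway domains as listed in the setup section of this README.
const SMS_GATEWAYS = {
  att: "txt.att.net",
  tmobile: "tmomail.net",
  verizon: "vtext.com",
  sprint: "messaging.sprintpcs.com",
} as const;

type Carrier = keyof typeof SMS_GATEWAYS;

// Turn a 10-digit number (with or without punctuation) into a gateway address.
function smsGatewayAddress(phone: string, carrier: Carrier = "att"): string {
  const digits = phone.replace(/\D/g, "");
  if (digits.length !== 10) {
    throw new Error("Expected a 10-digit phone number");
  }
  return `${digits}@${SMS_GATEWAYS[carrier]}`;
}

// Body format shown above: "New post: {title} {url}".
function smsBody(title: string, url: string): string {
  return `New post: ${title} ${url}`;
}
```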
- 80% monthly budget: Email + SMS warning
- 100% monthly budget: Email + SMS, pipeline pauses until next month
- Pipeline failures: Email only (no SMS to avoid waking you up)
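The alert rules above reduce to two small decisions: which budget level has been reached, and which channels an event should use. The sketch below is one possible encoding. The type and function names are hypothetical, and treating a non-positive cap as "paused" is an assumption in keeping with the fail-closed behavior described for the budget caps.

```typescript
type BudgetAlertLevel = "none" | "warning" | "paused";

// Map month-to-date spend against the cap to an alert level.
function budgetAlertLevel(spentUsd: number, capUsd: number): BudgetAlertLevel {
  // Assumed fail-closed behavior: an unusable cap or spend value pauses the pipeline.
  if (!(capUsd > 0) || !Number.isFinite(spentUsd)) return "paused";
  const ratio = spentUsd / capUsd;
  if (ratio >= 1) return "paused";
  if (ratio >= 0.8) return "warning";
  return "none";
}

type PipelineEvent =
  | "post_published"
  | "budget_warning"
  | "budget_paused"
  | "pipeline_failure";

// Failures go to email only; everything else goes to email and SMS.
function channelsFor(kind: PipelineEvent): Array<"email" | "sms"> {
  return kind === "pipeline_failure" ? ["email"] : ["email", "sms"];
}
```

With the default cap of 50 USD, the warning would fire at 40 USD of spend and the pause at 50 USD.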
| Risk | Protection |
|---|---|
| Duplicate posts | D1 unique constraint on reddit_id + Vectorize cosine similarity (0.85 threshold) |
| 404 after publish | 5-retry verification (GET with cache bypass, checks 200 + content > 1KB) |
| Runaway costs | MAX_POSTS_PER_RUN, MAX_DAILY_GENERATIONS, MAX_MONTHLY_USD — all fail closed |
| Stale generating state | Auto-reset posts stuck in "generating" for > 2 hours on next run |
| Toxic content | Blocklist + minimum quality thresholds |
| Reddit TOS | Read-only SERP discovery (no Reddit API calls), attribution in every post |
| Bad HTML | 10+ validation checks before any publish attempt |
| API outages | Exponential backoff, bounded retries, clean failure logging |
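The publish verification and the retry policy can be combined into one bounded loop. The sketch below makes up to five attempts, accepts only a 200 response with more than 1 KB of content, and doubles the wait between attempts. Combining the two rows this way is an assumption. The probe is injected as a function so the logic can be exercised without network access, and `ProbeResult`, `verifyLive`, and the delay values are hypothetical.

```typescript
interface ProbeResult {
  httpStatus: number;
  bodyBytes: number;
}

// Retry a probe with exponential backoff; resolve true on the first healthy response.
async function verifyLive(
  probe: () => Promise<ProbeResult>,
  maxAttempts = 5,
  baseDelayMs = 1000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const res = await probe();
      if (res.httpStatus === 200 && res.bodyBytes > 1024) return true;
    } catch {
      // A network error counts as a failed attempt and falls through to the backoff.
    }
    // Wait 1x, 2x, 4x, ... the base delay, but not after the final attempt.
    if (attempt < maxAttempts - 1) {
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  return false;
}
```

In a Worker, the probe would wrap a `fetch` of the live URL with cache bypass. A `false` result leaves the post unpublished in the tracker rather than recording a URL that returns 404.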
These posts were generated by this engine and are live on googleadsagent.ai:
- Should I Use Broad Match or Exact Match in Google Ads in 2026?
- How Much Should I Spend on Google Ads? The Small Business Reality Check
- Performance Max vs Search Campaigns: When to Use Which
- Reddit: This engine uses SearchAPI (Google SERP) for thread discovery — it does NOT call the Reddit API directly. All articles link back to the source thread and attribute content to the community discussion. No Reddit user content is copied verbatim.
- AI Disclosure: Every article contains a visible AI disclosure block. No content is presented as human-only authored.
- Copyright: Articles are original AI-generated analysis inspired by community questions. They do not reproduce Reddit posts or comments.
- Privacy: No Reddit usernames, IPs, or personal data are stored. Only thread IDs and URLs are tracked.
John Williams — Senior Paid Media Specialist, $350M+ Managed
MIT License — see LICENSE for details.