CONTAK Browser Skills for AI Agents

Two drop-in skills that let an AI agent control a browser:

Headless screenshot -- quick PNG of a URL via a Browserless Chromium service. No login, no session, no GUI. One-shot.
Interactive browser -- full persistent browser via KASM Chrome + Chrome DevTools Protocol. Real cookies, real session state, real synthetic events. Use this when a flow needs login state, captchas, or human-style interaction.

Use one, use both, or use neither -- the skills are independent.

What's in here

.
+-- README.md                                      <- you are here
+-- assets/
|   \-- cover.png
+-- skills/
|   +-- browser-screenshot/SKILL.md                <- drop-in slash-skill, headless
|   \-- kasm-browser/SKILL.md                      <- drop-in slash-skill, interactive
\-- sops/
    +-- SOP_Headless_Browser_Screenshot.md         <- full setup + cross-host + troubleshooting
    \-- SOP_Interactive_Browser_via_KASM_CDP.md    <- same, for the interactive case (incl. Human-Speed Rule)

Quick start

For the headless screenshot skill

Start the Browserless container:

docker run -d --name browser-headless --restart unless-stopped \
  -p 30001:3000 \
  ghcr.io/browserless/chromium:latest

Take a test screenshot:

curl -s -X POST http://localhost:30001/screenshot \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com","options":{"type":"png"},"viewport":{"width":1920,"height":1080}}' \
  -o /tmp/screenshot.png

Then drop skills/browser-screenshot/SKILL.md into your agent's skill directory.
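If an agent will call the screenshot endpoint repeatedly, the curl call above can be wrapped in two small helpers. This is a minimal sketch, not part of the shipped skill: the function names build_screenshot_payload and is_png are hypothetical, and the request body simply mirrors the one in the curl example. The signature check is there because a failed request can still leave a non-image file on disk.

```shell
# Build the JSON body used by the /screenshot call in the quick start.
# Assumption: the URL contains no double quotes or backslashes (it is not JSON-escaped here).
build_screenshot_payload() {
  local url="$1" width="${2:-1920}" height="${3:-1080}"
  printf '{"url":"%s","options":{"type":"png"},"viewport":{"width":%d,"height":%d}}' \
    "$url" "$width" "$height"
}

# Return 0 if the file starts with the 8-byte PNG signature, non-zero otherwise.
is_png() {
  local sig
  sig="$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')"
  [ "$sig" = "89504e470d0a1a0a" ]
}

# Example use (needs the container from the quick start to be running):
#   curl -s -X POST http://localhost:30001/screenshot \
#     -H 'Content-Type: application/json' \
#     -d "$(build_screenshot_payload https://example.com)" \
#     -o /tmp/screenshot.png && is_png /tmp/screenshot.png
```

Defining the helpers as functions keeps them free of side effects until called, so they can be sourced into an existing script.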

For the interactive KASM skill

Start the KASM Chrome container with remote debugging enabled:

docker run -d --name kasm-chrome --restart unless-stopped \
  -e CHROME_CLI="--remote-debugging-port=9222 --remote-debugging-address=0.0.0.0" \
  -p 6901:6901 \
  kasmweb/chrome:latest

Confirm that the DevTools endpoint responds from inside the container:

docker exec kasm-chrome curl -s http://localhost:9222/json/version

Then drop skills/kasm-browser/SKILL.md into your agent's skill directory.
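The /json/version response is a small JSON object that includes a webSocketDebuggerUrl field, which is the address a CDP client connects to. The sketch below pulls that field out with sed so the check works without jq; the function name cdp_ws_url is hypothetical, and the parsing assumes the field value contains no escaped quotes.

```shell
# Read a /json/version response on stdin and print the browser-level
# WebSocket URL. Returns non-zero if the field is missing, which usually
# means Chrome was not started with remote debugging enabled.
cdp_ws_url() {
  local url
  url="$(sed -n 's/.*"webSocketDebuggerUrl"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)"
  [ -n "$url" ] && printf '%s\n' "$url"
}

# Example use (needs the kasm-chrome container from the quick start):
#   docker exec kasm-chrome curl -s http://localhost:9222/json/version | cdp_ws_url
```

An empty result with a non-zero exit status is a quick signal to revisit the container's launch flags before debugging the skill itself.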

Read the matching SOP in sops/ for the full setup, cross-host setup, common recipes, and troubleshooting.

How to adapt to your project

The two skills are intentionally generic. To turn them into a team-specific SOP:

Replace placeholders (<host>, <port>, <container>, <kasm-host>, <local-path>, <target-url>) with your environment's values.
Add team-specific context -- who maintains the container, escalation contacts, common URLs you screenshot or automate against.
If you have policies on which sites are OK to automate (Terms-of-Service, allowlists), document them at the top.
Add representative examples drawn from your real workflows.
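The placeholder replacement in the first step can be scripted. The following is a rough sketch, assuming the placeholders appear literally in the file as listed above; the SKILL_* variable names and both function names are hypothetical. Any variable left unset leaves its placeholder untouched, so the filter can be run more than once.

```shell
# Escape characters that are special in a sed replacement (&, \, and the | delimiter).
sed_escape() {
  printf '%s' "$1" | sed 's/[&|\\]/\\&/g'
}

# Filter stdin to stdout, substituting each placeholder with its SKILL_* value.
fill_placeholders() {
  sed \
    -e "s|<host>|$(sed_escape "${SKILL_HOST:-<host>}")|g" \
    -e "s|<port>|$(sed_escape "${SKILL_PORT:-<port>}")|g" \
    -e "s|<container>|$(sed_escape "${SKILL_CONTAINER:-<container>}")|g" \
    -e "s|<kasm-host>|$(sed_escape "${SKILL_KASM_HOST:-<kasm-host>}")|g" \
    -e "s|<local-path>|$(sed_escape "${SKILL_LOCAL_PATH:-<local-path>}")|g" \
    -e "s|<target-url>|$(sed_escape "${SKILL_TARGET_URL:-<target-url>}")|g"
}

# Example use (the output file name is illustrative):
#   SKILL_HOST=10.0.0.5 SKILL_PORT=30001 \
#     fill_placeholders < skills/browser-screenshot/SKILL.md > SKILL.team.md
```

Writing to a new file rather than editing in place keeps the generic skill available for comparison when the upstream copy changes.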

Each SOP file in sops/ has a dedicated chapter for this -- see "Generating an SOP from This Skill."

Choosing between the two

Use case                                       Skill
Single screenshot of a static URL, no login    browser-screenshot (headless)
Visual smoke-test of your own dashboard        browser-screenshot
Login flow with captcha or 2FA                 kasm-browser (interactive)
Multi-step flow where session cookies matter   kasm-browser
React app needing real synthetic events        kasm-browser
Scraping behind a logged-in UI                 kasm-browser

Responsible use

Always check the target site's Terms of Service before automating against it.
The skills are intended for sites you legitimately have access to -- your own dashboards, vendors where you have an account, services you operate.
Apply the Human-Speed Rule (described in the KASM SOP, chapter 4) any time you drive a real browser. Bot-detection walls are real, and slow pacing is the cheapest way through them.
Never log secrets, OAuth tokens, session cookies, or passwords into chat surfaces or persistent logs.
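The exact Human-Speed Rule lives in the KASM SOP and is not reproduced here. As an illustration of the general idea of slow, irregular pacing, the sketch below pauses for a random interval between browser actions. The function names and the default range of 400 to 1500 ms are assumptions, not values taken from the SOP.

```shell
# Print a random delay in milliseconds between min and max (inclusive).
# Uses two bytes from /dev/urandom, so ranges wider than 65536 ms are not fully covered.
human_delay_ms() {
  local min="${1:-400}" max="${2:-1500}" n
  n="$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')"
  echo $(( min + n % (max - min + 1) ))
}

# Sleep for a random human-like interval.
# Assumption: sleep accepts fractional seconds (true for GNU and BSD sleep, not guaranteed by POSIX).
human_pause() {
  local ms
  ms="$(human_delay_ms "$@")"
  sleep "$(printf '%d.%03d' $(( ms / 1000 )) $(( ms % 1000 )))"
}

# Example use: call human_pause between clicks or keystrokes,
# or human_pause 2000 5000 before submitting a form.
```

Splitting the delay calculation from the sleep makes the range easy to check without waiting, and lets the SOP's real values be dropped in once known.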

License

MIT -- use it, fork it, ship it. See LICENSE for the full text.

Credits

These skills were extracted from the internal browser-tooling SOPs of the CONTAK fleet -- a heterogeneous AI agent system running on a single home server. The original docs were heavily fleet-specific; this public version is scrubbed and made generic so anyone with a similar use case can drop the skills in.

If you build something cool with these, drop a note. Issues + PRs welcome.
