User-visible mirror URL surfaces leak localhost or placeholder domains #13

Description

@hqhq1025

Problem

Some user-visible URL surfaces in WebHarbor mirror sites expose local mirror addresses or placeholder domains instead of realistic upstream URLs.

Observed examples:

  • Google Map place detail share box can show http://localhost:40008/place/<slug> instead of a real Google Maps URL.
  • Google Map seeded place websites can use https://example.com/<slug> placeholders.
  • Booking and Allrecipes save/return forms can serialize absolute local request.url values into hidden next inputs.
  • BBC News article share copies window.location.href, which is the local mirror URL.
  • GitHub external-host recovery redirects to a fixed http://localhost:40006... URL, which breaks whenever the mirror runs outside the default local port layout.

Expected

Benchmark/runtime entry URLs should remain local in sites/*/tasks.jsonl, README, and Docker docs, but user-visible share/copy/return/external-link surfaces should not leak local mirror hosts or placeholder domains.
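As a rough illustration of the allowed vs. forbidden distinction, the sketch below scans rendered page text for URLs whose host is a local mirror address or a placeholder domain. The `FORBIDDEN_HOSTS` set and the `find_leaked_urls` helper are hypothetical names chosen for this example, not part of the WebHarbor codebase. A real check would run only over user-visible surfaces, not over sites/*/tasks.jsonl, README, or Docker docs.

```python
import re
from urllib.parse import urlsplit

# Hypothetical deny-list: hosts that should not appear in user-visible surfaces.
FORBIDDEN_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "example.com"}

# Rough URL matcher: stops at whitespace, quotes, and angle brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def find_leaked_urls(rendered_text: str) -> list[str]:
    """Return URLs in rendered_text that point at a local or placeholder host."""
    leaked = []
    for url in URL_RE.findall(rendered_text):
        try:
            host = urlsplit(url).hostname or ""
        except ValueError:
            # Malformed URL (e.g. a broken IPv6 literal); skip rather than crash.
            continue
        if host in FORBIDDEN_HOSTS or host.endswith(".example.com"):
            leaked.append(url)
    return leaked
```

Calling `find_leaked_urls` on the HTML of a share box would flag a value such as http://localhost:40008/place/<slug> while leaving a realistic upstream URL untouched.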

Impact

This reduces realism for web-agent tasks that inspect or copy sharing links, and it makes some post-action redirect paths depend on host-derived absolute URLs rather than stable relative paths.

Proposed Fix

  • Use realistic upstream URLs for share/copy surfaces.
  • Use relative paths for hidden next inputs and validate next redirects.
  • Keep benchmark localhost URLs only in runtime/task configuration.
  • Add a regression check documenting allowed vs forbidden URL patterns.
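The second bullet could look roughly like the following. This is a minimal standard-library sketch, not the code from the PR. `to_relative_next` and `safe_next` are hypothetical helper names, and the fallback path "/" is an assumption. The first function reduces an absolute URL (such as a `request.url` value) to a host-independent path before it is written into a hidden next input. The second rejects any submitted next value that is not a plain same-site path.

```python
from typing import Optional
from urllib.parse import urlsplit


def to_relative_next(url: str) -> str:
    """Drop scheme and host, keeping only the path and query string."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def safe_next(candidate: Optional[str], default: str = "/") -> str:
    """Return candidate only if it is a same-site relative path, else default."""
    if not candidate:
        return default
    # Reject backslashes and control characters, which some browsers
    # normalize in ways that can turn a path into an external URL.
    if "\\" in candidate or any(ord(c) < 32 for c in candidate):
        return default
    parts = urlsplit(candidate)
    # Any scheme or host means the value is not a relative path.
    if parts.scheme or parts.netloc:
        return default
    # Require a single leading slash; "//host/path" is protocol-relative.
    if not candidate.startswith("/") or candidate.startswith("//"):
        return default
    return candidate
```

With these helpers, a form template would emit `to_relative_next(...)` into the hidden input, and the handler would redirect to `safe_next(...)` of the submitted value, so the redirect no longer depends on which host or port the mirror happens to be served from.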

A PR with the fix is available in #7.
