Track: Track1; Team name: PushparajD; Model: Graph Attention with self/neighbour-separated attention #355

Open
Pdevadiga45 wants to merge 6 commits into geometric-intelligence:main from Pdevadiga45:track1-gate

Pdevadiga45 commented Jun 14, 2026

Checklist

  • My pull request has a clear and explanatory title.
  • My pull request passes the Linting test.
  • I added appropriate unit tests and I made sure the code passes all unit tests.
  • My PR follows PEP8 guidelines.
  • My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
  • I linked to issues and PRs that are relevant to this PR.

Description

Adds GATE (Graph Attention with self/neighbour-separated attention) as a Track 1 graph backbone.

GATE is GATv2 with one targeted change (Eq. 4): the attention logit uses one learnable vector for a node's self-loop (a_t) and a separate one for its neighbours (a_s). A node can therefore parameterize its own self-attention independently and suppress aggregation from unrelated ("intrusive") neighbours. This is what makes it robust on heterophilic graphs, the axis that GraphUniverse sweeps.
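A minimal single-head sketch of the separated logit, in plain Python for illustration only. It assumes a GATv2-style form, `a^T LeakyReLU(W_src h_u + W_dst h_v)`, with a slope of 0.2. The function and variable names (`gate_logit`, `W_src`, `W_dst`) are hypothetical and are not taken from `gate.py`. The only point is that the vector `a` is switched between `a_s` and `a_t` depending on whether the edge is a self-loop.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU on a scalar (slope 0.2 is an assumed default)."""
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gate_logit(h_src, h_dst, W_src, W_dst, a_s, a_t, is_self_loop):
    """Single-head attention logit for one edge src -> dst."""
    z = [
        leaky_relu(p + q)
        for p, q in zip(matvec(W_src, h_src), matvec(W_dst, h_dst))
    ]
    # The one change relative to a shared-vector logit:
    # self-loops use a_t, all other edges use a_s.
    a = a_t if is_self_loop else a_s
    return sum(ai * zi for ai, zi in zip(a, z))


def softmax(logits):
    """Numerically stable softmax over a node's incoming edges."""
    m = max(logits)
    exps = [math.exp(e - m) for e in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: node v with two neighbours, identity weight matrices.
identity = [[1.0, 0.0], [0.0, 1.0]]
h_v = [1.0, 1.0]
neighbours = [[0.5, -1.0], [2.0, 0.3]]
a_s = [0.0, 0.0]  # neighbour vector: contributes nothing to the logit
a_t = [5.0, 5.0]  # self-loop vector: strongly favours the node itself

logits = [gate_logit(h_v, h_v, identity, identity, a_s, a_t, True)]
logits += [
    gate_logit(h_u, h_v, identity, identity, a_s, a_t, False)
    for h_u in neighbours
]
weights = softmax(logits)  # weights[0] is the self-loop weight
```

In this toy setting almost all attention mass lands on the self-loop, which is the "keep neighbours out" behaviour described above. Passing the same vector for `a_s` and `a_t` makes the self-loop flag irrelevant, giving a shared-vector logit.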

Files

  • topobench/nn/backbones/graph/gate.py: GATEConv (the attention layer) and GATE (stacked backbone). Docstrings cite the paper's Eq. 1/2/4.
  • configs/model/graph/gate.yaml: Hydra config on GNNWrapper + NoReadOut; one config serves both challenge tasks.
  • test/nn/backbones/graph/test_gate.py: 10 tests, 100% backbone coverage.
  • test/pipeline/test_pipeline.py: registers graph/gate for the CI MUTAG integration test.
  • 2026_tdl_challenge/outputs/.../results.json: GraphUniverse grid output.

Fidelity
The reference repo doesn't run under modern PyG (it ships a modified, older-PyG MessagePassing base), so I implemented a clean standalone version directly from the paper's equations, and validated it three ways:

  1. an independent dense reimplementation of the GATE update (parity across configurations);
  2. reduction to PyG's official GATv2Conv: in the shared-attention special case our layer matches PyG bit-for-bit (external reference for the routing/softmax/aggregation);
  3. a property test of Thm. 4.3 - with zero-initialized attention the layer is exactly uniform mean-aggregation at init.
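The third check can be illustrated with a small standalone sketch. This is not the test shipped in `test_gate.py`. It is a plain-Python stand-in with hypothetical helper names, assuming the logit is a dot product between an attention vector and LeakyReLU-activated features. With an all-zero attention vector every logit is 0, the softmax is uniform, and the weighted sum equals the plain mean of the messages.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(e - m) for e in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(weights, messages):
    """Attention-weighted sum of message vectors."""
    dim = len(messages[0])
    return [
        sum(w * msg[i] for w, msg in zip(weights, messages))
        for i in range(dim)
    ]


def mean(messages):
    """Plain uniform mean of message vectors."""
    n, dim = len(messages), len(messages[0])
    return [sum(msg[i] for msg in messages) / n for i in range(dim)]


random.seed(0)
dim, n_edges = 4, 5
# Random messages arriving at one node (self-loop included among them).
messages = [
    [random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_edges)
]

# Zero-initialized attention vector: every logit is 0, whatever the features.
zero_attention = [0.0] * dim
scores = [
    sum(a * max(x, 0.2 * x) for a, x in zip(zero_attention, msg))
    for msg in messages
]

weights = softmax(scores)
aggregated = aggregate(weights, messages)
expected = mean(messages)
```

Because the weights are independent of the features at initialization, the same argument applies whether the zero vector is used for self-loops, for neighbours, or for both.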

I follow the paper's GATE (Eq. 1/2/4), not the reference repo's optional omega gate or separate self-loop value transform; neither is part of the published model (the paper adds only the d-dimensional a_t). Init follows the paper: zero attention (Thm. 4.3) plus random-orthogonal weights.

Initialization
I apply the paper's prescription where it bears on the GATE mechanism: zero attention vectors (Thm. 4.3 - no initial inductive bias, so the layer starts as uniform mean-aggregation) and random-orthogonal weight matrices. I deliberately do not reproduce the paper's full looks-linear (channel-mirroring) weight construction: zero-attention already delivers the at-init uniform-aggregation property that looks-linear is there to support, and orthogonal init captures the random-orthogonal specification it builds on. The mirroring would add implementation surface without changing the attention mechanism this PR contributes. This is documented in the module docstring.

TopoBench integration

  • Backbone returns node embeddings; the readout owns the classification head.
  • forward(x, edge_index, edge_weight=None, **kwargs) accepts the GNNWrapper arguments; hidden width matches the encoder so the wrapper's residual is consistent.
  • The config uses the fully-qualified _target_ (topobench.nn.backbones.graph.gate.GATE): the backbone auto-discovery loads files under a synthetic module name, which otherwise breaks PyG MessagePassing's inspector.
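The difference between the two loading routes in the last bullet can be sketched with the standard library alone. This is an illustration under assumptions, not TopoBench's or Hydra's actual code: `resolve_target` is a simplified stand-in for resolving a dotted `_target_` path, and `load_under_synthetic_name` mimics a discovery mechanism that executes a file under a made-up module name. Both helper names and the toy file are hypothetical.

```python
import importlib
import importlib.util
import tempfile
from pathlib import Path


def resolve_target(dotted_path):
    """Import 'pkg.module.Name' and return the attribute Name."""
    module_name, _, attr = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def load_under_synthetic_name(file_path, synthetic_name):
    """Execute a source file as a module registered under an arbitrary name."""
    spec = importlib.util.spec_from_file_location(synthetic_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Route 1: a fully-qualified path resolves through the normal import system,
# so the class reports its real module.
fraction_cls = resolve_target("fractions.Fraction")

# Route 2: a file executed under a synthetic name yields classes whose
# __module__ is that synthetic name, not an importable package path.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy_backbone.py"
    path.write_text("class Toy:\n    pass\n")
    synthetic = load_under_synthetic_name(path, "discovered_module_0")
    synthetic_module_name = synthetic.Toy.__module__
```

Any tooling that inspects a class through its `__module__` sees a different name in the second route, which is consistent with the breakage described above; the exact failure inside PyG's inspector is not reproduced here.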

Cost. Attention is O(E·H·d); the benchmarked config (hidden 64, 4 heads, 2 layers) has 16,896 trainable parameters.
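As a sanity check on the parameter figure, the arithmetic below shows one decomposition that reproduces 16,896. The breakdown is an assumption made for illustration: two 64×64 linear maps with biases per layer, plus two per-head attention vectors (a_s and a_t). The actual module may allocate its parameters differently.

```python
hidden, heads, layers = 64, 4, 2
head_dim = hidden // heads  # 16 channels per head

# Assumed: each linear map has a weight matrix and a bias vector.
linear_params = hidden * hidden + hidden  # 4096 + 64

# Assumed: one attention vector of size heads * head_dim each for a_s and a_t.
attention_params = heads * head_dim  # 64

per_layer = 2 * linear_params + 2 * attention_params
total = layers * per_layer
```

Under these assumptions the two attention vectors account for only 128 parameters per layer, so the separated self-loop vector adds little on top of a shared-vector layer of the same width.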

Results (72 runs, seeds 42/43/44). In-distribution community-detection accuracy 0.31–0.69 (mean 0.45; chance ≈ 0.05 over 20 communities); triangle-count MSE/triangles 0.015–3.80, all finite. Per-setting/per-seed/OOD values are in results.json. Run on CPU (no CUDA device) under WANDB_MODE=offline, so the optional W&B fields are empty; metrics are seeded and device-independent.

On results.json generation. The shipped evaluation notebook can't run as-is: the hash stored in its integrity-check cell doesn't match its own cells, so it aborts. Without modifying the notebook or utils.py, I called the functions it wraps (run_challenge_grid and save_challenge_artifacts) directly; these run the identical pipeline.

Issue

Track 1 entry for the TDL Challenge 2026.

Additional context

Python 3.11, torch 2.3.0. pre-commit (ruff-format, ruff, numpydoc-validation, standard hooks) passes; 10 unit tests at 100% backbone coverage; MUTAG pipeline test passes.

@gbg141 gbg141 added the track-1-gnn 2026 Topological Deep Learning Challenge -- Track 1 GNNs label Jun 15, 2026