cpu loop R72: 2D gemm_bias batch2 family (294→298) #68

Merged: chenxingqiang merged 1 commit into cursor/cpu-loop-r71-layernorm-gated-batch2-78f5 from cursor/cpu-loop-r72-gemm-bias-batch2-78f5 on Jun 30, 2026

chenxingqiang (Owner) commented Jun 27, 2026

Summary

CPU optimization loop R72 — closes batch=2 inference naming for the full 2D gemm_bias compound family.

Inventory: 294 → 298 (+4 builders)

New builders

| Builder | Contract |
|---|---|
| gemm_bias_batch2 | 2D [M,K] @ [K,N] + [1,N] bias |
| gemm_bias_relu_batch2 | same + ReLU |
| gemm_bias_gelu_batch2 | same + GELU |
| gemm_bias_silu_batch2 | same + SiLU |
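
As a rough illustration of what these four contracts compute, here is a minimal pure-Python sketch. It is not the PR's code: the helper `gemm_bias` and the toy inputs are hypothetical, the erf form of GELU is an assumption (the PR does not say which GELU variant is used), and the float16 cast applied by the real references is left out.

```python
import math


def relu(v):
    return max(v, 0.0)


def gelu(v):
    # Exact (erf) form of GELU; assumed here, the PR does not state the variant.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def silu(v):
    return v / (1.0 + math.exp(-v))


def gemm_bias(x, w, bias, activation=None):
    """Compute activation(x @ w + bias) for x: [M,K], w: [K,N], bias: [1,N]."""
    m, k, n = len(x), len(w), len(w[0])
    assert all(len(row) == k for row in x), "inner dimensions must match"
    assert len(bias) == 1 and len(bias[0]) == n, "bias must have shape [1,N]"
    out = []
    for i in range(m):
        row = []
        for j in range(n):
            # Dot product of row i with column j, then the broadcast bias.
            acc = sum(x[i][p] * w[p][j] for p in range(k)) + bias[0][j]
            row.append(activation(acc) if activation else acc)
        out.append(row)
    return out


# Toy inputs (hypothetical): M=2, K=2, N=3.
x = [[1.0, 2.0], [-3.0, 0.5]]
w = [[1.0, 0.0, -1.0], [0.5, 1.0, 2.0]]
bias = [[0.1, -0.2, 0.3]]

plain = gemm_bias(x, w, bias)            # gemm_bias_batch2 semantics
relu_out = gemm_bias(x, w, bias, relu)   # gemm_bias_relu_batch2 semantics
gelu_out = gemm_bias(x, w, bias, gelu)   # gemm_bias_gelu_batch2 semantics
silu_out = gemm_bias(x, w, bias, silu)   # gemm_bias_silu_batch2 semantics
```

The single `[1,N]` bias row is added to every output row, which is the broadcast the `+ [1,N] bias` notation implies.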

Verification

make test-cpu-value-verify
# 298 passed, 1 skipped

Stacked on R71 (cursor/cpu-loop-r71-layernorm-gated-batch2-78f5).

R73 candidates

  • gemm_gelu_batch2 / gemm_relu_batch2 / gemm_silu_batch2
  • rms_norm_linear_batch2 + GELU/ReLU/SiLU

Summary by Sourcery

Add batch=2 coverage for the 2D gemm_bias operator family and update CPU certification inventory counts accordingly.

New Features:

  • Introduce batch=2 builders for 2D gemm_bias with plain, ReLU, GELU, and SiLU variants.

Documentation:

  • Extend AGENTS.md with a new Loop R72 entry documenting the 2D gemm_bias batch=2 contracts and verification status.

Tests:

  • Register the new gemm_bias batch=2 builders in parity and inventory tests and increase planned value-verify count from 294 to 298.

Add four value-verify builders:
- gemm_bias_batch2
- gemm_bias_relu_batch2
- gemm_bias_gelu_batch2
- gemm_bias_silu_batch2

Register in CUSTOMIZED_OP_BUILDERS, parity list, and inventory gates.

Co-authored-by: Johnson.Chen <joy6677@outlook.com>

sourcery-ai Bot commented Jun 27, 2026

Reviewer's Guide

Add batch=2 2D gemm_bias CPU builders (with ReLU/GELU/SiLU variants), register them in parity/inventory tests, and bump the planned value-verify inventory count from 294 to 298, documenting Loop R72 in AGENTS.md.

Flow diagram for new 2D gemm_bias batch2 CPU builders and activation variants

flowchart LR
    gemm_bias_batch2["gemm_bias_batch2\n[M,K] @ [K,N] + [1,N] bias"]

    gemm_bias_relu_batch2["gemm_bias_relu_batch2\nGEMM + bias + ReLU"]
    gemm_bias_gelu_batch2["gemm_bias_gelu_batch2\nGEMM + bias + GELU"]
    gemm_bias_silu_batch2["gemm_bias_silu_batch2\nGEMM + bias + SiLU"]

    gemm_bias_batch2 --> gemm_bias_relu_batch2
    gemm_bias_batch2 --> gemm_bias_gelu_batch2
    gemm_bias_batch2 --> gemm_bias_silu_batch2

File-Level Changes

Change: Introduce 2D gemm_bias batch=2 builders and reference implementations for plain and activated variants.
Details:
  • Add builder functions that construct kernel graphs for gemm_bias_batch2 and activation variants using float16 inputs with shapes [8,32] @ [32,16] + [1,16].
  • Define torch-based reference computations (matmul + bias, then optional ReLU/GELU/SiLU) cast back to float16 for each builder.
  • Return graph, inputs, and reference tensors from each builder for use in value-verify tests.
Files:
  • tests/integration/cpu_op_builders.py

Change: Wire new batch=2 gemm_bias builders into general ML parity and builder registries.
Details:
  • Register the new gemm_bias_batch2 and its ReLU/GELU/SiLU variants in the cpu_op_builders registry dict used by tests.
  • Extend the list of builder keys used in general ML PyTorch parity tests to include the new batch=2 gemm_bias contracts.
Files:
  • tests/integration/cpu_op_builders.py
  • tests/integration/test_cpu_general_ml_pytorch_parity.py

Change: Update planned CPU value-verify certification counts from 294 to 298 to account for the new builders.
Details:
  • Change assertions that planned_value_verify_count() equals 294 to expect 298 in certification profile and inventory tests.
  • Adjust mocked certification profiles to expect 298 passed value-verify tests instead of 294, keeping skipped and failed counts consistent.
  • Update cert_profile_from_stages calls to pass planned_value_verify=298 so alignment checks remain valid.
Files:
  • tests/integration/test_cpu_certification_profile.py
  • tests/integration/test_cpu_inventory_planned.py

Change: Document Loop R72 and finalize Loop R71 metadata in AGENTS.md.
Files:
  • AGENTS.md
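
The registration and count changes above amount to an inventory gate: every new builder must be in the registry, and the certification profile must report as many passes as there are planned builders. The sketch below shows the idea on a toy registry. Everything in it is a hypothetical stand-in (the placeholder entry, `planned_count`, `missing_builders`, `profile_aligned`); it assumes the planned count can be derived from the registry size, which the real `planned_value_verify_count()` may or may not do.

```python
def _stub_builder():
    """Placeholder builder; the real builders return graph, inputs, reference."""
    return None


# Toy stand-in for CUSTOMIZED_OP_BUILDERS: one placeholder for the earlier
# entries plus the four builders added in R72.
CUSTOMIZED_OP_BUILDERS = {
    "earlier_builder_placeholder": _stub_builder,
    "gemm_bias_batch2": _stub_builder,
    "gemm_bias_relu_batch2": _stub_builder,
    "gemm_bias_gelu_batch2": _stub_builder,
    "gemm_bias_silu_batch2": _stub_builder,
}

R72_BUILDERS = (
    "gemm_bias_batch2",
    "gemm_bias_relu_batch2",
    "gemm_bias_gelu_batch2",
    "gemm_bias_silu_batch2",
)


def planned_count(registry):
    """Planned value-verify count, assumed here to equal the registry size."""
    return len(registry)


def missing_builders(registry, required):
    """Return the required builder names that are not registered, sorted."""
    return sorted(set(required) - set(registry))


def profile_aligned(registry, passed, failed=0):
    """True when nothing failed and every planned builder passed."""
    return failed == 0 and passed == planned_count(registry)


# Gate checks in the spirit of the inventory and certification tests.
assert missing_builders(CUSTOMIZED_OP_BUILDERS, R72_BUILDERS) == []
assert profile_aligned(CUSTOMIZED_OP_BUILDERS, passed=5)
```

With a gate of this shape, adding a builder changes the expected count in one place instead of in each test that currently asserts the literal.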

@chenxingqiang chenxingqiang marked this pull request as ready for review June 30, 2026 01:42
@chenxingqiang chenxingqiang merged commit febf5b7 into cursor/cpu-loop-r71-layernorm-gated-batch2-78f5 Jun 30, 2026
5 of 12 checks passed

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high-level feedback:

  • The four new build_gemm_bias_*_batch2 helpers duplicate most of their setup logic; consider factoring out a shared builder factory that takes the activation op as a parameter to reduce repetition and potential for inconsistencies.
  • The planned value-verify count 298 is now hardcoded in multiple tests; it may be more robust to source this from a single constant or helper to avoid future mismatches when the inventory changes again.
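
One way the first suggestion could look is sketched below: a factory that takes the activation as a parameter and returns a builder. This is an assumption-laden illustration, not the repository's code. The factory name `make_gemm_bias_batch2_builder`, the list-of-op-names "graph", and the pure-Python reference are all hypothetical; only the shapes [8,32] @ [32,16] + [1,16] and the (graph, inputs, reference) return contract come from the reviewer's guide, and the float16 cast is omitted.

```python
import math
import random

# Activation table shared by all variants; None means "no activation".
ACTIVATIONS = {
    None: lambda v: v,
    "relu": lambda v: max(v, 0.0),
    "gelu": lambda v: 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0))),
    "silu": lambda v: v / (1.0 + math.exp(-v)),
}


def make_gemm_bias_batch2_builder(activation=None, m=8, k=32, n=16, seed=0):
    """Return a builder producing (graph, inputs, reference) for one variant."""
    act = ACTIVATIONS[activation]

    def build():
        # Shared setup: a fixed seed keeps inputs identical across variants.
        rng = random.Random(seed)
        x = [[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(m)]
        w = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(k)]
        bias = [[rng.uniform(-1.0, 1.0) for _ in range(n)]]
        # Hypothetical graph description: an ordered list of op names.
        graph = ["matmul", "add_bias"] + ([activation] if activation else [])
        # Reference: matmul + broadcast bias, then the optional activation.
        reference = [
            [
                act(sum(x[i][p] * w[p][j] for p in range(k)) + bias[0][j])
                for j in range(n)
            ]
            for i in range(m)
        ]
        return graph, (x, w, bias), reference

    return build


# The four R72 builders differ only in the activation argument.
BUILDERS = {
    "gemm_bias_batch2": make_gemm_bias_batch2_builder(None),
    "gemm_bias_relu_batch2": make_gemm_bias_batch2_builder("relu"),
    "gemm_bias_gelu_batch2": make_gemm_bias_batch2_builder("gelu"),
    "gemm_bias_silu_batch2": make_gemm_bias_batch2_builder("silu"),
}
```

Because the setup lives in one closure, a change to shapes or input generation applies to all four variants at once, which is the inconsistency risk the review comment points at.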