FEAT: Add TokenBijectionConverter #2023

@sajisanchu1913-source

Following up on #1942, which added LetterBijectionConverter and DigitBijectionConverter.

The bijection attack paper (arXiv:2410.01294) describes a third bijection type where each English letter maps to a randomly sampled distinct token from the target model's tokenizer vocabulary. This mode was discussed during the #1942 review and agreed to be tracked as a separate follow-up.

The abstract base class introduced in #1942 makes this straightforward to add as a new subclass:

    class TokenBijectionConverter(BijectionConverter):
        def __init__(self, *, tokenizer, mapping=None, seed=None):
            ...

        def _generate_mapping(self, rng):
            # sample 26 distinct tokens from tokenizer vocabulary
            # map each letter to a token string
            ...

The main consideration is the tokenizer dependency, likely Hugging Face tokenizers. Happy to take this on once #1942 lands.
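
To make the sampling step concrete, here is a minimal standalone sketch of what `_generate_mapping` might do. It is not tied to the PyRIT base class from #1942: the helper names `sample_token_mapping` and `encode`, the toy vocabulary, and the space separator between tokens are all assumptions for illustration, not something taken from the paper or the existing converters.

```python
import random
import string


def sample_token_mapping(vocab, rng):
    """Map each lowercase ASCII letter to a distinct token string sampled from vocab.

    `vocab` is any iterable of token strings (hypothetical helper, not PyRIT API).
    """
    # De-duplicate and sort so the same seed gives the same mapping
    # regardless of the iteration order of `vocab`.
    candidates = sorted({tok for tok in vocab if tok.strip()})
    if len(candidates) < len(string.ascii_lowercase):
        raise ValueError("vocabulary needs at least 26 usable tokens")
    # random.Random.sample draws without replacement, so tokens are distinct.
    tokens = rng.sample(candidates, k=len(string.ascii_lowercase))
    return dict(zip(string.ascii_lowercase, tokens))


def encode(text, mapping, sep=" "):
    """Replace each mapped letter with its token; pass other characters through.

    The separator is an assumption made here so token boundaries stay visible.
    """
    return sep.join(mapping.get(ch, ch) for ch in text.lower())


# Toy vocabulary standing in for a real tokenizer's vocabulary. With a
# Hugging Face tokenizer, `tokenizer.get_vocab().keys()` could be passed instead.
toy_vocab = [f"tok{i}" for i in range(100)]
mapping = sample_token_mapping(toy_vocab, random.Random(0))
encoded = encode("hi there", mapping)
```

Passing the vocabulary in as plain strings would keep the sampling logic testable without loading a tokenizer, leaving the tokenizer dependency confined to the constructor.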

Metadata

Labels

converters: Related to PyRIT converters
enhancement: New feature or request