Can a Transformer learn to solve LTLf synthesis, a task normally handled by symbolic tools? This repository contains the complete pipeline used to find out: dataset generation, model training, and model-checked evaluation.
The code accompanies my MSc dissertation at the University of Oxford (2024), supervised by Giuseppe De Giacomo, Antonio Di Stasio, and Eleonora Giunchiglia, and the NeSy 2026 paper Can Transformers Perform LTLf Strategic Reasoning? Sometimes (see Citation).
The task: LTLf synthesis: given a specification in Linear Temporal Logic over finite traces, construct a strategy (an AIGER circuit) that satisfies it, or determine that none exists (the specification is unrealizable). Symbolic tools such as Strix already solve this problem, but suffer from state explosion as specifications grow.
The idea: Train an encoder-decoder Transformer to do it instead: TLSF specification in, realizability verdict and AIGER circuit out.
The pipeline:
- Dataset generation: builds LTLf specifications by conjoining randomly instantiated basic cases, and labels each one with a classical tool (Strix or Spot's
ltlsynt): a circuit if realizable,UNREALIZABLEotherwise. - Training: encoder-decoder Transformer on (specification, circuit) pairs, with a custom trie-based tokenizer and optional tree positional encodings derived from the formula's AST.
- Evaluation: constrained beam search (candidates violating the AIGER format are pruned), with one model predicting realizability and a second, trained on realizable instances only, generating the circuit. Since no LTLf model checker for AIGER circuits exists, each prediction is verified against the equivalent LTL problem with nuXmv. This way the model gets credit for any correct circuit, not just an exact match with Strix's.
| Path | Description |
|---|---|
dataset_generation/ |
Dataset generation and synthesis automation (Strix/Spot) |
transformer/ |
Model, tokenizer, training and evaluation scripts, config.yaml |
transformer/sample_data/ |
A few sample TLSF inputs and AIGER outputs |
model_checking/ |
Semantic verification of generated circuits with nuXmv |
helpers/ |
AIGER format constraints for beam search, sequence generation strategies |
basic_cases_ltlf.txt |
Basic LTLf formulas used to generate specifications (see Dataset) |
Install the Python dependencies:
pip install -r requirements.txtTraining and evaluation log to Weights & Biases, so a wandb account (or wandb offline) is needed.
External tools (only needed for the stage that uses them):
| Tool | Used for | Expected location |
|---|---|---|
| Strix | Synthesis (dataset labels) | ~/Strix/bin/strix (see strix_tlsf.sh) |
| syfco | TLSF conversion | on PATH |
Spot (ltlfilt, ltlsynt) |
LTLf→LTL translation, formula stats, alternative synthesizer | on PATH |
nuXmv, smvtoaig, ltl2smv, combine-aiger, meyerphi-syfco |
Model checking during evaluation | ./synt-post-processor/ relative to the working directory |
cd dataset_generation
python data_generation_and_synthesis_automation.py <formulas_file> <num_iterations> [strix|spot]
# e.g.
python data_generation_and_synthesis_automation.py ../basic_cases_ltlf.txt 1000 strix<formulas_file> is the basic-cases file (basic_cases_ltlf.txt); each of the <num_iterations> iterations builds specifications with an increasing number of conjuncts; the synthesis tool defaults to Strix. Generation parameters (number of input/output variables, synthesis timeout, CPUs) are set in dataset_generation/config.py.
The output is a dataset folder next to the formulas file, containing the TLSF inputs, their LTL transformations, the synthesis outcomes, and a summary stats file.
Set the data directories and hyperparameters in transformer/config.yaml, then:
cd transformer
python main.py <config_path> <experiment_name>
# e.g.
python main.py config.yaml tree_pe_experimentCheckpoints are written to mixed_experiments__<experiment_name>__models/, together with a copy of the config file so each experiment stays reproducible.
Set the evaluate section of transformer/config.yaml (test data directories and the paths of the two trained models), then:
cd transformer
python evaluate.py <config_path> <run_name>
# e.g.
python evaluate.py config.yaml tree_pe_evaluationPer-instance metrics and totals are written to evaluation_results/<run_name>_results.csv, with periodic snapshots saved during the run. Model checking requires the synt-post-processor/ tools (see Setup).
basic_cases_ltlf.txt contains basic LTLf formulas from the Random family (Lydia and Syft folders), extracted with extract_basic_cases.py. Only the formula structure matters: variables are placeholders that get randomly instantiated during generation. The extraction follows the way the original formulas were created (conjunctions of basic cases), and the formulas are ordered by ascending length.
A citation will be added here once the paper is published.
See LICENSE.