docs: Evaluation module demo and documentation #194
Co-authored-by: Jamie Milsom <jamie.milsom@ons.gov.uk>
I've applied additional changes based on feedback. I've set all the VectorStores instantiations in the eval notebook to use |
jamie-ons
left a comment
Yes, had a chat to luke and skip_save is a better idea.
Have had another look at this and it all looks good now :)
lukeroantreeONS
left a comment
This looks good - noted a few small tweaks to be made before merging
| "source": [ | ||
| "To begin, we're going to create 2 VectorStores from our `fake_soc_dataset.csv` file which contains <b>mock</b> SOC survey responses and their corresponding occupation codes. One VectorStore will be built from the full dataset, and the second one will be built from half the dataset. \n", | ||
| "\n", | ||
| "Since the second VectorStore will contain only have the training data, we can reason that this lack of coverage will showcase poorer performance against a evaluation dataset that assesses the full coverage of the training data." |
"only have half of the training data"
"will not perform as well as the full dataset due to lack of coverage"
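For anyone following the thread, here is a rough sketch of the coverage argument in that cell, using only pandas. It is not the notebook's code: the column names and rows below are hypothetical stand-ins for `fake_soc_dataset.csv`, and building the actual VectorStores is left out because their constructor arguments are not shown in this conversation.

```python
import pandas as pd

# Hypothetical stand-in for fake_soc_dataset.csv. The real column names
# and contents may differ; the codes here are placeholders, not SOC codes.
full_df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "text": [
            "primary school teacher",
            "bus driver",
            "staff nurse",
            "software developer",
        ],
        "code": ["A", "B", "C", "D"],
    }
)

# Keep only the first half of the rows for the reduced-coverage store.
half_df = full_df.iloc[: len(full_df) // 2]

# Codes that appear in the full data but are absent from the half.
# Evaluation queries whose ground truth is one of these codes cannot be
# matched correctly by a store built from half_df.
missing_codes = sorted(set(full_df["code"]) - set(half_df["code"]))
print(missing_codes)
```

Each frame could then be written out with `to_csv` and used as the source for one VectorStore.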
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "If the code cells in this section ran successfully then we now have 2 VectorStores we can use to evaluate with the Evaluation module." |
Suggest cutting this line - it makes it seem like we're not confident it works
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "The results object is a dataframe with provided VectorStore names as the row indexes, and each column is associated with a give metric. We should also see the results have been saved to a CSV file in the specified directory." |
"saved to the specified output CSV file"
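As a reference for the shape described in that cell, a minimal pandas sketch follows. The store names, metric names, values, and output path are all made up for illustration; they are not output from the Evaluation module.

```python
import os
import tempfile

import pandas as pd

# Dummy results: one row per VectorStore name, one column per metric.
# Names and numbers are placeholders, not real evaluation output.
results = pd.DataFrame(
    {"metric_a": [1.0, 0.5], "metric_b": [0.75, 0.25]},
    index=["full_store", "half_store"],
)

# Look up a single metric for a single store by row label and column name.
half_metric_a = results.loc["half_store", "metric_a"]

# Save to a CSV file and read it back, keeping the store names as the index.
out_path = os.path.join(tempfile.mkdtemp(), "results.csv")
results.to_csv(out_path)
reloaded = pd.read_csv(out_path, index_col=0)
```

Reading the file back with `index_col=0` restores the store names as row labels, which keeps the saved CSV consistent with the in-memory dataframe.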

✨ Summary
These changes add documentation for the recent Evaluation module #171 #172. Includes a new demo notebook evaluation_workflow_demo.ipynb walking users through the important concepts and code of the evaluation module. It also contains a new demo file of ground truth queries that work with the existing fake_soc_dataset.csv data, used to showcase evaluation. It also has minor updates to the demo readme file.
📜 Changes Introduced
fake_soc_dataset.csv data
✅ Checklist
terraform fmt & terraform validate)
🔍 How to Test