The idea:

This might require some changes to the ReferenceFinder class, but honestly, building a prototype in a notebook first is probably the right way. The first step is to get a proper dataset of FIPS algorithms together with the certificates that mention each algorithm. It should be split by the source of the algorithm mention (web and PDF). The PDF mentions should be cleaned first, however, as the raw ones may contain a lot of noise.
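
A rough notebook-style sketch of that first step could look like the following. Everything here is an assumption for illustration: the record layout (`cert_id`, `web_algos`, `pdf_algos`), the helper names, and especially the cleaning rule, which simply keeps PDF matches that look like `#1234` or `AES #A890` and drops the rest. The real cleaning would have to be tuned against the actual extracted data.

```python
import re
from collections import defaultdict
from typing import Optional

# Hypothetical cleaning rule: an optional algorithm name, a '#', and a short
# certificate number (optionally prefixed by one letter). Not the real rule.
PDF_MENTION = re.compile(r"\s*(?:[A-Za-z-]+\s*)?#\s*([A-Z]?\d{1,5})\s*")


def clean_pdf_mention(raw: str) -> Optional[str]:
    """Return the algorithm number from a raw PDF match, or None if it looks like noise."""
    match = PDF_MENTION.fullmatch(raw)
    return match.group(1) if match else None


def build_algorithm_dataset(certs):
    """Map each algorithm number to the certificates mentioning it, split by source."""
    dataset = defaultdict(lambda: {"web": set(), "pdf": set()})
    for cert in certs:
        # Web mentions are assumed to be clean algorithm numbers already.
        for algo in cert.get("web_algos", []):
            dataset[algo]["web"].add(cert["cert_id"])
        # PDF mentions are raw strings and get cleaned before use.
        for raw in cert.get("pdf_algos", []):
            algo = clean_pdf_mention(raw)
            if algo is not None:
                dataset[algo]["pdf"].add(cert["cert_id"])
    return dict(dataset)


# Made-up sample records, only to show the shape of the input and output.
sample_certs = [
    {
        "cert_id": "1001",
        "web_algos": ["1234", "567"],
        "pdf_algos": ["AES #1234", "# 567", "#12345678901", "page 3"],
    },
    {"cert_id": "1002", "web_algos": ["1234"], "pdf_algos": ["SHS #A890"]},
]

dataset = build_algorithm_dataset(sample_certs)
```

Keeping the web and PDF sets separate per algorithm makes it easy to compare the two sources later, for example to see which certificates mention an algorithm only in the PDF.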