DNA alignment has been a killer app for the FM-index, but aligning DNA reads against a single genome can bias research results and medical diagnoses. In the past few years, we have found ways to FM-index datasets of thousands of genomes, but researchers want the results expressed in terms of compact representations called pangenome graphs.
Hundreds of matches in the dataset may correspond to only one or two matches in the graph. Given a read, therefore, we would like to find which parts of it match well and where they match in the graph, in time that depends on the length of the read and the number of matches in the graph but not on the number of matches in the dataset. We are now closing in on that goal; this talk will give a high-level view of the challenges and some potential solutions.