MS-BioGraphs MSA50

Dataset

Description

MS-BioGraphs is a family of sequence similarity graphs created by computing all-to-all similarity between 1.7 billion protein sequences in the Metaclust dataset. Graphs in the family have up to 2.5 trillion edges and are published in the compressed WebGraph format.
Date made available: Oct 2023
Publisher: Queen's University Belfast
Date of data production: 2022 -
