The Cross-Beta DB is a database dedicated to compiling naturally occurring
cross-beta-forming amyloids. All data included are experimentally validated for cross-beta
structure formation. The database primarily serves as a resource for training and
benchmarking new amyloid prediction models (see Cross-Beta predictor).
The database also includes experimental conditions and additional information.
All entries can be downloaded individually or in groups. The benchmark set and other
database versions are available for download in the
"Download" section.
A full description of all variables in the database, which become accessible when one or
more entries are downloaded, is available here.
For detailed instructions on using the database interface and its features, refer to the
"Tutorial" section.
The importance of protein amyloidogenesis, associated with various diseases and functional
roles, has driven the creation of computational predictors of amyloidogenicity.
The accuracy of these predictors, particularly those utilizing artificial intelligence
technologies, depends heavily on the quality of the underlying data. We built Cross-Beta DB, a
database containing high-quality data on known cross-β amyloids formed under natural
conditions. We used it to train and benchmark several machine-learning (ML) algorithms to
predict the amyloid-forming potential of proteins. We developed the Cross-Beta predictor
using an Extra Trees ML algorithm; it outperforms existing amyloid predictors, achieving
the highest F1 score (0.852) and accuracy (0.844).
The development of Cross-Beta DB and the new ML-based Cross-Beta predictor
may enable the creation of personalized risk profiles for neurodegenerative diseases
and other amyloidoses, especially as genome sequencing becomes more affordable.
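
For readers less familiar with the two benchmark metrics quoted above, the following is a minimal sketch of how accuracy and F1 score are computed for a binary amyloid / non-amyloid classifier. The labels are invented toy values, not entries from Cross-Beta DB, and the function names are hypothetical helpers chosen for illustration only; this is not the evaluation code used for the Cross-Beta predictor.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes for binary labels (1 = amyloid, 0 = non-amyloid)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that are correct."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, _tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (assumed for illustration): 4 amyloid and 4 non-amyloid sequences.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred))
```

Accuracy treats both classes equally, while the F1 score focuses on the positive (amyloid) class, which is one reason the two are commonly reported together when benchmarking predictors.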