Introduction
Recent advances in epitranscriptomics have unveiled functional associations between RNA modifications and multiple human diseases, but distinguishing functional or disease-related SNVs from the majority of ‘silent’ variants remains a major challenge. We previously developed the RMDisease database to unveil associations between genetic variants and RNA modifications in human disease pathogenesis. In this work, we present RMDisease v2.0, an updated database with expanded coverage. Using deep learning models and 873,819 experimentally validated RNA modification sites, we identified a total of 1,366,252 RNA modification (RM)-associated variants that may add or remove a modification site for 16 different types of RNA modification (m6A, m5C, m1A, m5U, Ψ, m6Am, m7G, A-to-I, ac4C, Am, Cm, Um, Gm, hm5C, D and f5C) in 20 organisms (human, mouse, rat, zebrafish, maize, fruit fly, yeast, fission yeast, Arabidopsis, rice, chicken, goat, sheep, pig, cow, rhesus monkey, tomato, chimpanzee, green monkey and COVID-19). Among them, 14,749 disease-associated and 2,441 trait-associated genetic variants may function by perturbing epitranscriptomic markers. RMDisease v2.0 should serve as a useful resource for studying the genetic drivers of phenotypes that lie within the epitranscriptome layer circuitry.
16 RNA modification types from 20 different species
In RMDisease v2.0, we collected the epitranscriptome profiles of 16 types of RNA modifications from 20 species: m6A (589,290 sites), m5C (150,412), m1A (32,758), m5U (3,696), Ψ (7,032), m6Am (2,447), m7G (9,951), A-to-I (52,760), ac4C (14,266), Am (1,591), Cm (1,878), Um (2,253), Gm (1,471), hm5C (1,759), D (371) and f5C (1,892). These modification sites were derived from 679 high-throughput sequencing samples generated by 21 sequencing techniques.
RM-associated variants
RMDisease v2.0 contains a total of 1,366,252 genetic variants that may affect (add or remove) various types of RNA modification in multiple species. Compared with our previous version, this represents a six-fold increase in RM-associated variants, as well as a significant expansion in covered species (from human only to 20 species) and types of RNA modification (from eight to 16). Specifically, RMDisease v2.0 hosts RM-associated variants related to m6A (833,196), m5C (72,484), m1A (97,104), m5U (14,586), Ψ (84,950), m6Am (15,436), m7G (24,049), A-to-I (71,367), ac4C (45,891), Am (21,806), Cm (24,437), Um (37,313), Gm (19,623), hm5C (49), D (17) and f5C (3,944). By species, the variants are distributed across human (732,418), mouse (227,739), rat (1,752), zebrafish (11,752), maize (1,322), fruit fly (208), yeast (27,533), fission yeast (17), Arabidopsis (144,198), rice (10,438), chicken (14,679), goat (7,860), sheep (18,439), pig (25,484), cow (64,275), rhesus (2,442), tomato (64,830), chimpanzee (167), green monkey (137) and COVID-19 (10,562).
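The two breakdowns of RM-associated variants (by modification type and by species) should each add up to the stated total of 1,366,252. The short sketch below performs that cross-check. The counts are copied from the text; the variable names are illustrative only and are not part of RMDisease.

```python
# Variant counts per modification type, as listed in the text.
by_modification = {
    "m6A": 833_196, "m5C": 72_484, "m1A": 97_104, "m5U": 14_586,
    "Psi": 84_950, "m6Am": 15_436, "m7G": 24_049, "A-to-I": 71_367,
    "ac4C": 45_891, "Am": 21_806, "Cm": 24_437, "Um": 37_313,
    "Gm": 19_623, "hm5C": 49, "D": 17, "f5C": 3_944,
}

# Variant counts per species, as listed in the text.
by_species = {
    "human": 732_418, "mouse": 227_739, "rat": 1_752, "zebrafish": 11_752,
    "maize": 1_322, "fruit fly": 208, "yeast": 27_533, "fission yeast": 17,
    "Arabidopsis": 144_198, "rice": 10_438, "chicken": 14_679, "goat": 7_860,
    "sheep": 18_439, "pig": 25_484, "cow": 64_275, "rhesus": 2_442,
    "tomato": 64_830, "chimpanzee": 167, "green monkey": 137,
    "COVID-19": 10_562,
}

STATED_TOTAL = 1_366_252  # total number of RM-associated variants in the text

mod_total = sum(by_modification.values())
species_total = sum(by_species.values())

print(f"by modification: {mod_total:,} (matches: {mod_total == STATED_TOTAL})")
print(f"by species:      {species_total:,} (matches: {species_total == STATED_TOTAL})")
```

Both sums equal 1,366,252, so the per-type and per-species figures are consistent with the overall count.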
[1] Li, Xiaoyu, Xushen Xiong, and Chengqi Yi. "Epitranscriptome sequencing technologies: decoding RNA modifications." Nature methods 14.1 (2017): 23-31.

Full name | Abbreviation |
---|---|
N6-methyladenosine | m6A |
N1-methyladenosine | m1A |
Pseudouridine | Ψ |
5-methylcytosine | m5C |
5-methyluridine | m5U |
N7-methylguanosine | m7G |
N6, 2′-O-dimethyladenosine | m6Am |
Adenosine-to-inosine | A-to-I |
N4-acetylcytidine | ac4C |
2′-O-methyladenosine | Am |
2′-O-methylcytidine | Cm |
2′-O-methyluridine | Um |
2′-O-methylguanosine | Gm |
5-hydroxymethylcytosine | hm5C |
Dihydrouridine | D |
5-formylcytidine | f5C |

Scientific name | Common name |
---|---|
Homo sapiens | Human |
Mus musculus | Mouse |
Rattus norvegicus | Rat |
Danio rerio | Zebrafish |
Zea mays | Maize |
Drosophila melanogaster | Fruit fly |
Saccharomyces cerevisiae | Yeast |
Schizosaccharomyces pombe | Fission yeast |
Oryza sativa | Rice |
Gallus gallus | Chicken |
Capra hircus | Goat |
Ovis aries | Sheep |
Sus scrofa | Pig |
Bos taurus | Cow |
Macaca mulatta | Rhesus |
Solanum lycopersicum | Tomato |
Pan troglodytes | Chimpanzee |
Chlorocebus aethiops | Green monkey |
SARS-CoV-2 | COVID-19 |
Arabidopsis thaliana | Arabidopsis |
User Guide
RMDisease v2.0 provides a browse page for 16 types of RNA modifications (m6A, m5C, m1A, m5U, Ψ, m6Am, m7G, A-to-I, ac4C, Am, Cm, Um, Gm, hm5C, D and f5C) in 20 species (human, mouse, rat, zebrafish, maize, fly, yeast, fission yeast, Arabidopsis, rice, chicken, goat, sheep, pig, cow, rhesus, tomato, chimpanzee, green monkey and COVID-19). The page lists the modification sites in detail. Users can filter the results by Gene Type. Users can also filter the results by annotation: whether the site lies in a protein-binding region, miRNA target site or splicing site; whether it is linked to a known disease or phenotype integrated from GWAS or ClinVar; and the type of trait. Because the available annotations differ between species, the filter columns in the second row change with the selected species.
Scroll the table to the right and click the button to open JBrowse, which shows SNP sites, genes and the different modification sites in their genomic context.
Click on an RM ID to explore the details of a modification site, including its genetic variants, RNA-binding protein information, miRNA targets, splicing sites, and ClinVar and GWAS annotations.
Statistic charts
The pie charts show the percentages of modification types, species and modification statuses. The bar charts illustrate the numbers of detected sites, genes, RBPs, miRNA targets, splicing sites, and GWAS and ClinVar annotations.
# Search
RMDisease v2.0 provides a combined selection function for querying the data of interest. The first option selects the species of interest, e.g. Homo sapiens(human). The second option determines the search mode: by Gene, Region, Rs ID, Disease or Trait.
Example: after selecting Gene and Homo sapiens(human), users can type any letters to open the matching drop-down list and then select the gene of interest. Alternatively, type a keyword directly, and the query returns all matching human genes in RMDisease v2.0.
The results page then lists all human m6A records for the selected gene.
# Download
All data used in RMDisease v2.0 are available for download. Users can download the data in batches, or view and download them through the API.
API (application programming interface)
This set of APIs lets users retrieve the information they need programmatically.
parameter | description | example |
---|---|---|
&modification= | *essential parameter. The modification type, one of m6A, m5C, m1A, m5U, psi, m6Am, m7G, atoi, ac4C, Am, Cm, Um, Gm, hm5C, D and f5C | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow |
&species= | *essential parameter. The species, one of human, mouse, rat, zebrafish, maize, fruit fly, yeast, fission yeast, Arabidopsis, rice, chicken, goat, sheep, pig, cow, rhesus monkey, tomato, chimpanzee, green monkey and covid19 | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow |
&gene_type= | filters by the Gene Type [protein_coding, lincRNA, miRNA, snoRNA, snRNA, pseudogene, other] | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow&gene_type=protein_coding |
&m6a_status= | filters by the modification status | https://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow&m6a_status=cow m6a-gain variant |
&confidence_level= | filters by the confidence level [low, medium, high] | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow&confidence_level=low |
&RM_ID= | filters by the RM ID | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow&RM_ID=cow_m6a_associatedSNP_1 |
- I want to retrieve all rat m6A information where the gene type is snoRNA, the modification status is gain variant and the confidence level is low.
http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=rat&gene_type=snoRNA&m6a_status=rat m6a-gain variant&confidence_level=low
- I want to retrieve all human m1A information where the gene type is miRNA and the confidence level is high.
http://www.rnamd.org/rmdisease2/api.php?modification=m1a&species=human&gene_type=miRNA&confidence_level=high
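The query URLs above can also be assembled in a script. The following is a minimal sketch using only the Python standard library. The helper names `build_query_url` and `fetch` are hypothetical and not part of RMDisease; the base URL and parameter names are taken from the table and examples above. The sketch assumes the server accepts percent-encoded spaces (`%20`) in values such as the modification status, which is how a browser would normally send them. The response format is not described in this guide, so `fetch` simply returns the raw text.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Base URL as it appears in the examples of this guide.
BASE_URL = "http://www.rnamd.org/rmdisease2/api.php"


def build_query_url(modification, species, **filters):
    """Build an API URL (hypothetical helper).

    `modification` and `species` are the two essential parameters.
    Optional filters from the table: gene_type, m6a_status,
    confidence_level, RM_ID.
    """
    if not modification or not species:
        raise ValueError("modification and species are required")
    params = {"modification": modification, "species": species}
    # Keep only the filters that were actually supplied.
    params.update({k: v for k, v in filters.items() if v is not None})
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


def fetch(url, timeout=30):
    """Return the raw response body as text (format not documented here)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


# The two example queries from this guide.
rat_url = build_query_url(
    "m6a", "rat",
    gene_type="snoRNA",
    m6a_status="rat m6a-gain variant",
    confidence_level="low",
)
human_url = build_query_url(
    "m1a", "human", gene_type="miRNA", confidence_level="high"
)

print(rat_url)
print(human_url)
```

Calling `fetch(human_url)` would then download the matching records. Building the URL from a dictionary keeps the essential parameters explicit and avoids hand-escaping values that contain spaces.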
Related Works
- MODOMICS
MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, the location of modified residues in RNA sequences, and RNA-modifying enzymes.
- Met-DB v2.0
Met-DB v2.0 is the significantly improved second version of Met-DB, entirely redesigned to focus on elucidating context-specific m6A functions. It offers a major increase in context-specific m6A peaks and single-base sites, predicted from 185 samples for 7 species from 26 independent studies.
- RMBase v2.0
RMBase v2.0 is a comprehensive database that integrates epitranscriptome sequencing data for the exploration of post-transcriptional modifications of RNAs and their relationships with miRNA binding events, disease-related single-nucleotide polymorphisms (SNPs) and RNA-binding proteins (RBPs).
- CVm6A
CVm6A identified 340,950 and 179,201 m6A peaks from 23 human and eight mouse cell lines, respectively, and classified them according to subcellular components, gene regions and relevance to cancers.
- REPIC
The REPIC (RNA Epitranscriptome Collection) database records about 10 million peaks called from publicly available m6A-seq and MeRIP-seq data using a unified pipeline. These data were collected from 672 samples of 49 studies, covering 61 cell lines or tissues in 11 organisms.
- RNAmod
RNAmod is an interactive, one-stop, web-based platform for the automated analysis, annotation, and visualization of mRNA modifications in 21 species.
- m6AVar
m6AVar is a comprehensive database of m6A-associated variants that potentially influence m6A modification, helping to interpret variants in terms of m6A function.