RMDisease V2.0: an updated database of genetic variants that affect RNA modifications with disease and trait implication

Introduction

Recent advances in epitranscriptomics have unveiled functional associations between RNA modifications and multiple human diseases. However, distinguishing the functional or disease-related SNVs from the majority of ‘silent’ variants remains a major challenge. We previously developed the RMDisease database to unveil the association between genetic variants and RNA modifications in human disease pathogenesis. In this work, we present RMDisease v2.0, an updated database with expanded coverage. Using deep learning models and 873,819 experimentally validated RNA modification sites, we identified a total of 1,366,252 RNA modification (RM)-associated variants that may add or remove a modification site for 16 different types of RNA modification (m6A, m5C, m1A, m5U, Ψ, m6Am, m7G, A-to-I, ac4C, Am, Cm, Um, Gm, hm5C, D and f5C) in 20 organisms (human, mouse, rat, zebrafish, maize, fruit fly, yeast, fission yeast, Arabidopsis, rice, chicken, goat, sheep, pig, cow, rhesus monkey, tomato, chimpanzee, green monkey and COVID-19). Among them, 14,749 disease-associated and 2,441 trait-associated genetic variants may function by perturbing epitranscriptomic markers. RMDisease v2.0 should serve as a useful resource for studying the genetic drivers of phenotypes that lie within the epitranscriptome layer circuitry.


16 RNA modification types from 20 different species

Epitranscriptome datasets

In RMDisease v2.0, we collected the epitranscriptome profiles of 16 types of RNA modifications from 20 species: m6A (589,290 sites), m5C (150,412), m1A (32,758), m5U (3,696), Ψ (7,032), m6Am (2,447), m7G (9,951), A-to-I (52,760), ac4C (14,266), Am (1,591), Cm (1,878), Um (2,253), Gm (1,471), hm5C (1,759), D (371) and f5C (1,892). These modification sites were derived from 679 high-throughput sequencing samples generated with 21 sequencing techniques.

RM-associated variants

RMDisease v2.0 contains a total of 1,366,252 genetic variants that may affect (add or remove) various types of RNA modification in multiple species. Compared with our previous version, this represents a six-fold increase in RM-associated variants, as well as a significant expansion in the species covered (from human only to 20 species) and in the types of RNA modification (from eight to 16). Specifically, RMDisease v2.0 hosts RM-associated variants related to m6A (833,196), m5C (72,484), m1A (97,104), m5U (14,586), Ψ (84,950), m6Am (15,436), m7G (24,049), A-to-I (71,367), ac4C (45,891), Am (21,806), Cm (24,437), Um (37,313), Gm (19,623), hm5C (49), D (17) and f5C (3,944). By species, the variants are distributed as follows: human (732,418), mouse (227,739), rat (1,752), zebrafish (11,752), maize (1,322), fruit fly (208), yeast (27,533), fission yeast (17), Arabidopsis (144,198), rice (10,438), chicken (14,679), goat (7,860), sheep (18,439), pig (25,484), cow (64,275), rhesus monkey (2,442), tomato (64,830), chimpanzee (167), green monkey (137) and COVID-19 (10,562).
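As a quick consistency check, the per-modification and per-species counts quoted above can each be summed and compared with the stated total of 1,366,252 variants. The following is a minimal sketch; the dictionary and function names are illustrative and are not part of RMDisease.

```python
# Counts copied from the paragraph above; the names are illustrative only.
STATED_TOTAL = 1_366_252

VARIANTS_BY_MODIFICATION = {
    "m6A": 833_196, "m5C": 72_484, "m1A": 97_104, "m5U": 14_586,
    "Ψ": 84_950, "m6Am": 15_436, "m7G": 24_049, "A-to-I": 71_367,
    "ac4C": 45_891, "Am": 21_806, "Cm": 24_437, "Um": 37_313,
    "Gm": 19_623, "hm5C": 49, "D": 17, "f5C": 3_944,
}

VARIANTS_BY_SPECIES = {
    "human": 732_418, "mouse": 227_739, "rat": 1_752, "zebrafish": 11_752,
    "maize": 1_322, "fruit fly": 208, "yeast": 27_533, "fission yeast": 17,
    "Arabidopsis": 144_198, "rice": 10_438, "chicken": 14_679,
    "goat": 7_860, "sheep": 18_439, "pig": 25_484, "cow": 64_275,
    "rhesus monkey": 2_442, "tomato": 64_830, "chimpanzee": 167,
    "green monkey": 137, "COVID-19": 10_562,
}


def matches_total(counts, total=STATED_TOTAL):
    """Return True if the counts in the breakdown add up to the total."""
    return sum(counts.values()) == total


if __name__ == "__main__":
    for label, counts in [("modification", VARIANTS_BY_MODIFICATION),
                          ("species", VARIANTS_BY_SPECIES)]:
        print(f"by {label}: {len(counts)} groups, "
              f"sum = {sum(counts.values()):,}, "
              f"matches total: {matches_total(counts)}")
```

Both breakdowns add up to the stated total, so each variant is counted once per modification type and once per species.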

[1] Li, Xiaoyu, Xushen Xiong, and Chengqi Yi. "Epitranscriptome sequencing technologies: decoding RNA modifications." Nature Methods 14.1 (2017): 23-31.

| Full name | Abbreviation |
| --- | --- |
| N6-methyladenosine | m6A |
| N1-methyladenosine | m1A |
| Pseudouridine | Ψ |
| 5-methylcytosine | m5C |
| 5-methyluridine | m5U |
| N7-methylguanosine | m7G |
| N6, 2′-O-dimethyladenosine | m6Am |
| Adenosine-to-inosine | A-to-I |
| N4-acetylcytidine | ac4C |
| 2′-O-methyladenosine | Am |
| 2′-O-methylcytidine | Cm |
| 2′-O-methyluridine | Um |
| 2′-O-methylguanosine | Gm |
| 5-hydroxymethylcytosine | hm5C |
| Dihydrouridine | D |
| 5-formylcytidine | f5C |

| Binomial nomenclature | Name |
| --- | --- |
| Homo sapiens | Human |
| Mus musculus | Mouse |
| Rattus norvegicus | Rat |
| Danio rerio | Zebrafish |
| Zea mays | Maize |
| Drosophila melanogaster | Fruit fly |
| Saccharomyces cerevisiae | Yeast |
| Schizosaccharomyces pombe | Fission yeast |
| Oryza sativa | Rice |
| Gallus gallus | Chicken |
| Capra hircus | Goat |
| Ovis aries | Sheep |
| Sus scrofa | Pig |
| Bos taurus | Cow |
| Macaca mulatta | Rhesus monkey |
| Solanum lycopersicum | Tomato |
| Pan troglodytes | Chimpanzee |
| Chlorocebus aethiops | Green monkey |
| SARS-CoV-2 | COVID-19 |
| Arabidopsis thaliana | Arabidopsis |

User Guide

# Table

RMDisease V2.0 provides a browse page for 16 types of RNA modifications (m6A, m5C, m1A, m5U, Ψ, m6Am, m7G, A-to-I, ac4C, Am, Cm, Um, Gm, hm5C, D and f5C) in 20 species (human, mouse, rat, zebrafish, maize, fruit fly, yeast, fission yeast, Arabidopsis, rice, chicken, goat, sheep, pig, cow, rhesus monkey, tomato, chimpanzee, green monkey and COVID-19). The page lists detailed information for each modified site. Users can filter the results by Gene Type. They can also filter by annotation: whether the site lies in a protein-binding region, a miRNA target site or a splicing site, whether it has a known disease or phenotype linkage integrated from GWAS or ClinVar, and the type of trait. Because the available annotations differ between species, the filter columns in the second row change with the selected species.

After scrolling the table to the right, click the button to open the JBrowse view, which shows SNP sites, genes, and the different modification sites in their genomic context.

Click on an RM ID to explore more details of a modification site, including its genetic variants, RNA-binding protein information, miRNA targets, splicing, and ClinVar and GWAS annotations.

Statistic charts

The pie charts show the percentages of modification types, species and modification status. The bar chart illustrates the numbers of detected sites, genes, RBPs, miRNA targets, splicing sites, and GWAS and ClinVar annotations.

# Search

RMDisease V2.0 provides a combined set of options for querying the data of interest. The first option selects the species of interest, e.g. Homo sapiens (human). The second determines the search approach, i.e. searching by Gene, Region, Rs ID, Disease or Trait.

Example:

After selecting Gene and Homo sapiens (human), users can type any letters to trigger the drop-down box and then select the gene of interest. Alternatively, users can type a term directly, and the query returns all matching genes for human in RMDisease V2.0.

The search results list all information available for the selected gene, here for m6A in human.

# Download

The data used on this website are available for download. Users can download them in batches, and can also view and download data through the API.


API (application programming interface)

This set of APIs lets users retrieve the information they want programmatically.

| Parameter | Description | Example |
| --- | --- | --- |
| &modification= | Required. One of m6A, m5C, m1A, m5U, psi, m6Am, m7G, atoi, ac4C, Am, Cm, Um, Gm, hm5C, D and f5C | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow |
| &species= | Required. One of human, mouse, rat, zebrafish, maize, fruit fly, yeast, fission yeast, Arabidopsis, rice, chicken, goat, sheep, pig, cow, rhesus monkey, tomato, chimpanzee, green monkey and covid19 | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow |
| &gene_type= | Filters by the Gene Type [protein_coding, lincRNA, miRNA, snoRNA, snRNA, pseudogene, other] | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow&gene_type=protein_coding |
| &m6a_status= | Filters by the modification status | https://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow&m6a_status=cow m6a-gain variant |
| &confidence_level= | Filters by the confidence level [low, medium, high] | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow&confidence_level=low |
| &RM_ID= | Filters by the RM ID | http://www.rnamd.org/rmdisease2/api.php?modification=m6a&species=cow&RM_ID=cow_m6a_associatedSNP_1 |

Examples: Click on a link to get the results.
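The query URLs in the table can also be assembled in a script. The following Python sketch builds such a URL from the two required parameters and the optional filters listed above. The helper name `build_api_url` is hypothetical and not part of RMDisease, and the sketch makes no assumption about the format of the response, so it only constructs the URL and does not send a request.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the example links in the parameter table.
API_BASE = "http://www.rnamd.org/rmdisease2/api.php"

# Optional filters, in the order the parameter table lists them.
OPTIONAL_FILTERS = ("gene_type", "m6a_status", "confidence_level", "RM_ID")


def build_api_url(modification, species, **filters):
    """Return a query URL for a modification, a species and optional filters.

    Hypothetical helper: parameter values are passed through unchanged and
    are not validated against what the server actually accepts.
    """
    if not modification or not species:
        raise ValueError("modification and species are both required")
    unknown = set(filters) - set(OPTIONAL_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")

    params = {"modification": modification, "species": species}
    for name in OPTIONAL_FILTERS:
        if name in filters:
            params[name] = filters[name]
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return f"{API_BASE}?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    print(build_api_url("m6a", "cow"))
    print(build_api_url("m6a", "cow", gene_type="protein_coding"))
    print(build_api_url("m6a", "cow", RM_ID="cow_m6a_associatedSNP_1"))
```

The resulting URL can be opened in a browser or passed to any HTTP client. Percent-encoding matters mainly for values that contain spaces, such as the modification status in the table.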

Related Works

  • MODOMICS
    MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, the location of modified residues in RNA sequences, and RNA-modifying enzymes.

  • Met-DB v2.0
    Met-DB v2.0 is the significantly improved second version of Met-DB, entirely redesigned to focus more on elucidating context-specific m6A functions. Met-DB v2.0 has a major increase in context-specific m6A peaks and single-base sites predicted from 185 samples for 7 species from 26 independent studies.

  • RMBase v2.0
    RMBase v2.0 is a comprehensive database that integrates epitranscriptome sequencing data for the exploration of post-transcriptional modifications of RNAs and their relationships with miRNA binding events, disease-related single-nucleotide polymorphisms (SNPs) and RNA-binding proteins (RBPs).

  • CVm6A
    CVm6A identified 340,950 and 179,201 m6A peaks from 23 human and eight mouse cell lines respectively, and classified them according to subcellular components, gene regions and relevance to cancers.

  • REPIC
    The REPIC (RNA Epitranscriptome Collection) database records about 10 million peaks called from publicly available m6A-seq and MeRIP-seq data using a unified pipeline. These data were collected from 672 samples of 49 studies, covering 61 cell lines or tissues in 11 organisms.

  • RNAmod
    RNAmod is an interactive, one-stop, web-based platform for the automated analysis, annotation, and visualization of mRNA modifications in 21 species.

  • m6AVar
    m6AVar is a comprehensive database of m6A-associated variants that potentially influence m6A modification, which will help to interpret variants by m6A function.