This guide offers a quick start on accessing the data sets provided by the m6AConquer database. The results provided by m6AConquer consist of two parts: the multi-omics data framework and the IDR analysis-based integration results. We show how to load and extract data from the downloaded files, one part at a time.

Multi-Omics data framework

m6AConquer provides MultiAssayExperiment (multi-omics), SummarizedExperiment (single-omics), and RaggedExperiment (single-omics) files for sharing quantitative m6A data. The MultiAssayExperiment (MAE) contains four types of omics data: m6A methylome, transcriptomics, genetic variants, and alternative splicing data. You are free to use any of them.

To access the site calling results in MAE file format, download the corresponding R object from the MultiAssayExperiment column. As an illustration, here we use the site calling result from eTAM-seq data. First, load the SummarizedExperiment, MultiAssayExperiment, and RaggedExperiment packages and read the file into R:

suppressWarnings(suppressPackageStartupMessages(library(SummarizedExperiment)))
suppressWarnings(suppressPackageStartupMessages(library(MultiAssayExperiment)))
suppressWarnings(suppressPackageStartupMessages(library(RaggedExperiment)))
MAE <- readRDS("eTAM_hg38_WT_MAE.rds")
MAE
## A MultiAssayExperiment object of 4 listed
##  experiments with user-defined names and respective classes.
##  Containing an ExperimentList class object of length 4:
##  [1] m6AOmics: RangedSummarizedExperiment with 2788705 rows and 4 columns
##  [2] TranscriptOmics: RangedSummarizedExperiment with 58650 rows and 4 columns
##  [3] SplicingEvents: RangedSummarizedExperiment with 671257 rows and 4 columns
##  [4] GeneticVariation: RaggedExperiment with 46 rows and 4 columns
## Functionality:
##  experiments() - obtain the ExperimentList instance
##  colData() - the primary/phenotype DataFrame
##  sampleMap() - the sample coordination DataFrame
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment
##  *Format() - convert into a long or wide DataFrame
##  assays() - convert ExperimentList to a SimpleList of matrices
##  exportClass() - save data to flat files

The MAE object contains all four types of omics data, which can be listed with experiments():

experiments(MAE)
## ExperimentList class object of length 4:
##  [1] m6AOmics: RangedSummarizedExperiment with 2788705 rows and 4 columns
##  [2] TranscriptOmics: RangedSummarizedExperiment with 58650 rows and 4 columns
##  [3] SplicingEvents: RangedSummarizedExperiment with 671257 rows and 4 columns
##  [4] GeneticVariation: RaggedExperiment with 46 rows and 4 columns
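
As a shortcut, `[[` on an MAE returns a single experiment and `names()` lists the experiment names. The sketch below builds a small toy MAE from two made-up matrices (hypothetical values, not m6AConquer data) so that it runs without the downloaded file; with the real object the equivalent calls would be `names(MAE)` and `MAE[["m6AOmics"]]`.

```r
suppressPackageStartupMessages(library(MultiAssayExperiment))

# Toy stand-ins for two experiments sharing the same two samples (invented values)
m6a_toy <- matrix(1:6, nrow = 3,
                  dimnames = list(paste0("site", 1:3), c("S1", "S2")))
expr_toy <- matrix(c(10, 0, 5, 7), nrow = 2,
                   dimnames = list(c("geneA", "geneB"), c("S1", "S2")))

toy_mae <- MultiAssayExperiment(
  experiments = ExperimentList(list(m6AOmics = m6a_toy, TranscriptOmics = expr_toy)),
  colData = DataFrame(Treatment = c("WT", "WT"), row.names = c("S1", "S2"))
)

exp_names <- names(toy_mae)         # names of the experiments
one_exp <- toy_mae[["m6AOmics"]]    # same object as experiments(toy_mae)[["m6AOmics"]]
```

Because the sample names of both toy matrices match the row names of `colData`, the constructor can derive the sample map on its own.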

Assay Data

To access the m6A methylome data, you may extract the m6AOmics experiment:

m6AOmics <- experiments(MAE)[["m6AOmics"]]
m6AOmics
## class: RangedSummarizedExperiment 
## dim: 2788705 4 
## metadata(1): OmixM6A_para
## assays(4): m6A Total m6ASiteProb AdjPvalue
## rownames: NULL
## rowData names(2): high_confidence_union high_confidence_specific
## colnames(4): SRR21070403 SRR21070404 SRR21070405 SRR21070406
## colData names(10): SampleID Title ... CurationDate BioSample

This is basically a SummarizedExperiment object with four assay matrices:

  • m6A: m6A read counts

  • Total: Total read counts

  • m6ASiteProb: m6A site probabilities as normalized m6A levels

  • AdjPvalue: Benjamini-Hochberg corrected p-values (FDR) returned by sample-specific site calling

For each m6A profiling technique, the m6A read counts and total read counts were extracted as follows:

Technique | Category | m6A read counts | Total read counts
GLORI | Chemical-assisted | Counts of unconverted A bases at candidate sites | Total read coverages covering the candidate sites
DART-seq | Enzyme-assisted | Counts of U bases converted from C bases at one base downstream of candidate sites | Total read coverages covering the candidate sites
eTAM-seq | Enzyme-assisted | Counts of unconverted A bases at candidate sites | Total read coverages covering the candidate sites
m6A-REF-seq | Enzyme-assisted | Counts of undigested ACA motifs covering the candidate sites with the first A base | Counts of digested and undigested ACA motifs covering the candidate sites with the first A base
m6A-SAC-seq | Enzyme-assisted | Counts of bases mismatching the candidate sites | Total read coverages covering the candidate sites
MAZTER-seq | Enzyme-assisted | 3’ read-end coverages at the first A base of the ACA motif covering the candidate sites | Total base coverages at the first A base of the ACA motif covering the candidate sites
scDART-seq | Enzyme-assisted | Counts of U bases converted from C bases at one base downstream of candidate sites | Total read coverages covering the candidate sites
MeRIP-seq | Antibody-assisted | Counts of IP reads covering the candidate sites | Counts of both IP and input reads covering the candidate sites
m6ACE-seq | Antibody-assisted | Counts of digested reads covering the candidate sites | Counts of both digested and input reads covering the candidate sites
Oxford-nanopore | Direct RNA sequencing | Counts of m6A reads identified by m6Anet covering the candidate sites | Total read coverages covering the candidate sites

If you downloaded the single-omics SE objects from the corresponding columns instead, the same code used on the MAE experiments applies to them.

Each of these assays can be accessed as follows:

## Here, we keep the sites whose m6A read counts summed across all samples exceed 30
index <- rowSums(assays(m6AOmics)[["m6A"]]) > 30
head(assays(m6AOmics)[["m6A"]][index,])
##      SRR21070403 SRR21070404 SRR21070405 SRR21070406
## [1,]          23           9          12          14
## [2,]          32          19          27          37
## [3,]          11          15          23          26
## [4,]          15          20          20          27
## [5,]           8          13          23          24
## [6,]          36          54          52          90
head(assays(m6AOmics)[["Total"]][index,])
##      SRR21070403 SRR21070404 SRR21070405 SRR21070406
## [1,]         925         614         924        2024
## [2,]        2473        3693        4103        7044
## [3,]        3158        4628        5032        8219
## [4,]        3222        4795        5035        8236
## [5,]        3285        4790        5042        8168
## [6,]        3277        4740        4900        7884
head(assays(m6AOmics)[["m6ASiteProb"]][index,])
##      SRR21070403 SRR21070404 SRR21070405 SRR21070406
## [1,]   0.1306800   0.1318108   0.1334797   0.1410390
## [2,]   0.1199664   0.1549337   0.1458592   0.1509339
## [3,]   0.1795600   0.1844007   0.1623979   0.1790439
## [4,]   0.1591554   0.1669527   0.1702940   0.1767272
## [5,]   0.2101689   0.1984438   0.1625029   0.1839964
## [6,]   0.1223861   0.1285718   0.1330076   0.1292391
head(assays(m6AOmics)[["AdjPvalue"]][index,])
##      SRR21070403 SRR21070404 SRR21070405 SRR21070406
## [1,]           1           1           1           1
## [2,]           1           1           1           1
## [3,]           1           1           1           1
## [4,]           1           1           1           1
## [5,]           1           1           1           1
## [6,]           1           1           1           1
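
The count assays can be combined into a raw methylation ratio, and the adjusted p-values can be thresholded to pick called sites. The sketch below uses small matrices copied from the first three rows and first two samples printed above, so it runs without the downloaded file; with the real object they would come from `assays(m6AOmics)`. The 0.05 FDR cutoff is an assumption chosen for illustration, and the ratio is only meaningful for techniques whose m6A counts are a subset of the total counts.

```r
# Toy matrices copied from the printed output (first three sites, first two samples)
samples <- c("SRR21070403", "SRR21070404")
m6A <- matrix(c(23, 32, 11, 9, 19, 15), ncol = 2, dimnames = list(NULL, samples))
Total <- matrix(c(925, 2473, 3158, 614, 3693, 4628), ncol = 2,
                dimnames = list(NULL, samples))
AdjP <- matrix(1, nrow = 3, ncol = 2, dimnames = list(NULL, samples))

# Raw methylation ratio per site and sample; sites without coverage become NA
ratio <- m6A / Total
ratio[Total == 0] <- NA

# Sites called as m6A in at least one sample (FDR < 0.05 is a hypothetical cutoff)
called <- rowSums(AdjP < 0.05, na.rm = TRUE) >= 1
```

In these rows every adjusted p-value is 1, so no site is flagged in `called`.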

To access the expression levels of eTAM-seq sequencing data, you may extract the TranscriptOmics experiment:

TranscriptOmics <- experiments(MAE)[["TranscriptOmics"]]
TranscriptOmics
## class: RangedSummarizedExperiment 
## dim: 58650 4 
## metadata(0):
## assays(2): ReadCounts RPKM
## rownames(58650): ENSG00000223972 ENSG00000227232 ... ENSG00000231514
##   ENSG00000235857
## rowData names(6): gene_id gene_name ... symbol entrezid
## colnames(4): SRR21070403 SRR21070404 SRR21070405 SRR21070406
## colData names(10): SampleID Title ... CurationDate BioSample

This SE contains only two assays: the raw read counts for all genes and the normalized RPKM values:

head(assays(TranscriptOmics)[["ReadCounts"]])
##                 SRR21070403 SRR21070404 SRR21070405 SRR21070406
## ENSG00000223972           0           0           0           0
## ENSG00000227232         164         105         118         162
## ENSG00000278267           0           0           0           0
## ENSG00000243485           0           0           0           0
## ENSG00000237613           0           0           0           0
## ENSG00000268020           0           0           0           0
head(assays(TranscriptOmics)[["RPKM"]])
##                 SRR21070403 SRR21070404 SRR21070405 SRR21070406
## ENSG00000223972    0.000000    0.000000    0.000000    0.000000
## ENSG00000227232    3.259619    2.344645    2.467891    2.497534
## ENSG00000278267    0.000000    0.000000    0.000000    0.000000
## ENSG00000243485    0.000000    0.000000    0.000000    0.000000
## ENSG00000237613    0.000000    0.000000    0.000000    0.000000
## ENSG00000268020    0.000000    0.000000    0.000000    0.000000
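
A common next step is to keep expressed genes and log-transform their RPKM values. The sketch below uses a toy matrix copied from three of the genes and two of the samples printed above; with the real object it would be `assays(TranscriptOmics)[["RPKM"]]`. The RPKM >= 1 cutoff and the pseudocount of 1 are assumptions for illustration, not settings used by m6AConquer.

```r
# Toy RPKM matrix copied from the printed output
rpkm <- matrix(
  c(0, 3.259619, 0, 0, 2.344645, 0),
  ncol = 2,
  dimnames = list(c("ENSG00000223972", "ENSG00000227232", "ENSG00000278267"),
                  c("SRR21070403", "SRR21070404"))
)

# Keep genes with RPKM >= 1 in every sample (hypothetical cutoff)
expressed <- rowSums(rpkm >= 1) == ncol(rpkm)

# log2 transform with a pseudocount of 1; drop = FALSE keeps the matrix shape
log_rpkm <- log2(rpkm[expressed, , drop = FALSE] + 1)
```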

Similarly, to access the alternative splicing data and genetic variant data, extract the SplicingEvents and GeneticVariation experiments, respectively:

SplicingEvents <- experiments(MAE)[["SplicingEvents"]]
SplicingEvents
## class: RangedSummarizedExperiment 
## dim: 671257 4 
## metadata(0):
## assays(6): ReadCounts SE ... A5SS A3SS
## rownames(671257): ENSG00000290825:E001 ENSG00000290825:E002 ...
##   ENSG00000210194:E001 ENSG00000210196:E001
## rowData names(7): tx_id tx_name ... exon_rank exonic_part
## colnames(4): SRR21070403 SRR21070404 SRR21070405 SRR21070406
## colData names(10): SampleID Title ... CurationDate BioSample
GeneticVariation <- experiments(MAE)[["GeneticVariation"]]
GeneticVariation
## class: RaggedExperiment 
## dim: 46 4 
## assays(10): paramRangeID REF ... GQ PL
## rownames(46): rs1279138
##   chr12:6770904_TCCCTTTTTGTATAATTTAATAAAGAAATGGTCGCGCTTCTGTTTTTAACCTGTCTCCTGCTTTCCCGGGGGCTCCAGTCAGTGCGACAAAAGGGTAGAGAGGGAGGAGGGTGGCTGACCTCCCATTCTGCCAGGA/T
##   ... rs10096642 rs6478689
## colnames(4): SRR21070403 SRR21070404 SRR21070405 SRR21070406
## colData names(10): SampleID Title ... CurationDate BioSample
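
Unlike a SummarizedExperiment, a RaggedExperiment stores a separate set of ranges per sample, so its assays have to be converted to a matrix before use. The sketch below shows two of the conversion functions exported by the RaggedExperiment package on a toy object; the variant positions and GQ values are invented for illustration, and with the real object the first argument would be `GeneticVariation`.

```r
suppressPackageStartupMessages(library(RaggedExperiment))

# Two hypothetical samples; the variant at chr1:100 is shared between them
s1 <- GRanges(seqnames = c("chr1", "chr1"),
              ranges = IRanges(start = c(100, 200), width = 1),
              GQ = c(99L, 50L))
s2 <- GRanges(seqnames = c("chr1", "chr2"),
              ranges = IRanges(start = c(100, 300), width = 1),
              GQ = c(80L, 60L))
toy_re <- RaggedExperiment(S1 = s1, S2 = s2)

# One row per range per sample; cells for other samples are NA
sparse_gq <- sparseAssay(toy_re, "GQ")

# Identical ranges are collapsed into a single row
compact_gq <- compactAssay(toy_re, "GQ")
```

`compactAssay()` is usually the more convenient form for variant data, since a variant observed in several samples ends up on one row.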

Rowrange data

To access the consistent set of genomic ranges used to count m6A and total read coverages, extract the row ranges from m6AOmics:

rowRanges(m6AOmics)
## GRanges object with 2788705 ranges and 2 metadata columns:
##             seqnames    ranges strand | high_confidence_union
##                <Rle> <IRanges>  <Rle> |             <numeric>
##         [1]     chr1     11873      + |                     0
##         [2]     chr1     11896      + |                     0
##         [3]     chr1     11940      + |                     0
##         [4]     chr1     12017      + |                     0
##         [5]     chr1     12147      + |                     0
##         ...      ...       ...    ... .                   ...
##   [2788701]     chrM     14324      - |                     0
##   [2788702]     chrM     15023      - |                     0
##   [2788703]     chrM     15413      - |                     0
##   [2788704]     chrM     15639      - |                     0
##   [2788705]     chrM     15968      - |                     0
##             high_confidence_specific
##                            <numeric>
##         [1]                        0
##         [2]                        0
##         [3]                        0
##         [4]                        0
##         [5]                        0
##         ...                      ...
##   [2788701]                        0
##   [2788702]                        0
##   [2788703]                        0
##   [2788704]                        0
##   [2788705]                        0
##   -------
##   seqinfo: 25 sequences (1 circular) from hg38 genome

The metadata columns in the row ranges of m6AOmics are dummy variables indicating whether each site is supported by the integration of orthogonal techniques. high_confidence_union indicates whether the site is supported by the integration of any orthogonal technique pair; these are the high-confidence m6A sites. high_confidence_specific indicates whether the site is supported by the integration of any orthogonal technique pair that includes the current technique.
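
These two flags can be used to subset the experiment to high-confidence sites. The sketch below builds a toy RangedSummarizedExperiment with three sites; the positions follow the printed output, but the flag values and counts are invented for illustration. With the real object, replace `toy_se` with `m6AOmics`.

```r
suppressPackageStartupMessages(library(SummarizedExperiment))

# Toy row ranges with invented flag values
toy_ranges <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(start = c(11873, 11896, 11940), width = 1),
  strand = "+",
  high_confidence_union = c(1, 0, 1),
  high_confidence_specific = c(1, 0, 0)
)
toy_se <- SummarizedExperiment(
  assays = list(m6A = matrix(1:6, nrow = 3, dimnames = list(NULL, c("S1", "S2")))),
  rowRanges = toy_ranges
)

# Sites supported by any orthogonal technique pair
hc_union <- toy_se[rowData(toy_se)$high_confidence_union == 1, ]

# Sites supported by a pair that includes the current technique
hc_specific <- toy_se[rowData(toy_se)$high_confidence_specific == 1, ]
```

Subsetting the SE rather than a single matrix keeps all assays and the row ranges aligned.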

Similarly, to access the genomic locations of all genes and exonic parts, extract the row ranges from TranscriptOmics and SplicingEvents, respectively:

rowRanges(TranscriptOmics)
## GRanges object with 58650 ranges and 6 metadata columns:
##                   seqnames            ranges strand |         gene_id
##                      <Rle>         <IRanges>  <Rle> |     <character>
##   ENSG00000223972     chr1       11869-14409      + | ENSG00000223972
##   ENSG00000227232     chr1       14404-29570      - | ENSG00000227232
##   ENSG00000278267     chr1       17369-17436      - | ENSG00000278267
##   ENSG00000243485     chr1       29554-31109      + | ENSG00000243485
##   ENSG00000237613     chr1       34554-36081      - | ENSG00000237613
##               ...      ...               ...    ... .             ...
##   ENSG00000224240     chrY 26549425-26549743      + | ENSG00000224240
##   ENSG00000227629     chrY 26586642-26591601      - | ENSG00000227629
##   ENSG00000237917     chrY 26594851-26634652      - | ENSG00000237917
##   ENSG00000231514     chrY 26626520-26627159      - | ENSG00000231514
##   ENSG00000235857     chrY 56855244-56855488      + | ENSG00000235857
##                     gene_name           gene_biotype seq_coord_system
##                   <character>            <character>      <character>
##   ENSG00000223972     DDX11L1 transcribed_unproces..       chromosome
##   ENSG00000227232      WASH7P unprocessed_pseudogene       chromosome
##   ENSG00000278267   MIR6859-1                  miRNA       chromosome
##   ENSG00000243485   MIR1302-2                lincRNA       chromosome
##   ENSG00000237613     FAM138A                lincRNA       chromosome
##               ...         ...                    ...              ...
##   ENSG00000224240     CYCSP49   processed_pseudogene       chromosome
##   ENSG00000227629  SLC25A15P1 unprocessed_pseudogene       chromosome
##   ENSG00000237917     PARP4P1 unprocessed_pseudogene       chromosome
##   ENSG00000231514     FAM58CP   processed_pseudogene       chromosome
##   ENSG00000235857     CTBP2P1   processed_pseudogene       chromosome
##                        symbol                       entrezid
##                   <character>                         <list>
##   ENSG00000223972     DDX11L1 100287596,100287102,727856,...
##   ENSG00000227232      WASH7P                           <NA>
##   ENSG00000278267   MIR6859-1                      102466751
##   ENSG00000243485   MIR1302-2            105376912,100302278
##   ENSG00000237613     FAM138A           654835,645520,641702
##               ...         ...                            ...
##   ENSG00000224240     CYCSP49                           <NA>
##   ENSG00000227629  SLC25A15P1                           <NA>
##   ENSG00000237917     PARP4P1                           <NA>
##   ENSG00000231514     FAM58CP                           <NA>
##   ENSG00000235857     CTBP2P1                           <NA>
##   -------
##   seqinfo: 25 sequences from hg38 genome
rowRanges(SplicingEvents)
## GRanges object with 671257 ranges and 7 metadata columns:
##                        seqnames      ranges strand |         tx_id
##                           <Rle>   <IRanges>  <Rle> | <IntegerList>
##   ENSG00000290825:E001     chr1 11869-12009      + |             1
##   ENSG00000290825:E002     chr1 12058-12178      + |             1
##   ENSG00000290825:E003     chr1 12698-12721      + |             1
##   ENSG00000223972:E001     chr1 12975-13052      + |             2
##   ENSG00000290825:E004     chr1 13375-13452      + |             1
##                    ...      ...         ...    ... .           ...
##   ENSG00000210144:E001     chrM   5827-5891      - |        252831
##   ENSG00000210151:E001     chrM   7446-7514      - |        252832
##   ENSG00000198695:E001     chrM 14149-14673      - |        252833
##   ENSG00000210194:E001     chrM 14674-14742      - |        252834
##   ENSG00000210196:E001     chrM 15956-16023      - |        252835
##                                tx_name         gene_id       exon_id
##                        <CharacterList>     <character> <IntegerList>
##   ENSG00000290825:E001 ENST00000456328 ENSG00000290825             1
##   ENSG00000290825:E002 ENST00000456328 ENSG00000290825             1
##   ENSG00000290825:E003 ENST00000456328 ENSG00000290825             5
##   ENSG00000223972:E001 ENST00000450305 ENSG00000223972             6
##   ENSG00000290825:E004 ENST00000456328 ENSG00000290825             8
##                    ...             ...             ...           ...
##   ENSG00000210144:E001 ENST00000387409 ENSG00000210144        797948
##   ENSG00000210151:E001 ENST00000387416 ENSG00000210151        797949
##   ENSG00000198695:E001 ENST00000361681 ENSG00000198695        797950
##   ENSG00000210194:E001 ENST00000387459 ENSG00000210194        797951
##   ENSG00000210196:E001 ENST00000387461 ENSG00000210196        797952
##                              exon_name     exon_rank exonic_part
##                        <CharacterList> <IntegerList>   <integer>
##   ENSG00000290825:E001 ENSE00002234944             1           1
##   ENSG00000290825:E002 ENSE00002234944             1           2
##   ENSG00000290825:E003 ENSE00003582793             2           3
##   ENSG00000223972:E001 ENSE00001799933             4           1
##   ENSG00000290825:E004 ENSE00002312635             3           4
##                    ...             ...           ...         ...
##   ENSG00000210144:E001 ENSE00001544488             1           1
##   ENSG00000210151:E001 ENSE00001544487             1           1
##   ENSG00000198695:E001 ENSE00001434974             1           1
##   ENSG00000210194:E001 ENSE00001544476             1           1
##   ENSG00000210196:E001 ENSE00001544473             1           1
##   -------
##   seqinfo: 25 sequences (1 circular) from an unspecified genome; no seqlengths
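
Because all experiments carry genomic ranges, sites can be assigned to genes with the overlap functions from GenomicRanges. The sketch below uses the first five site positions and the first two gene ranges printed above as toy data; with the real objects the inputs would be `rowRanges(m6AOmics)` and `rowRanges(TranscriptOmics)`.

```r
suppressPackageStartupMessages(library(GenomicRanges))

# First five m6A candidate sites and first two genes from the printed output
sites <- GRanges("chr1",
                 IRanges(start = c(11873, 11896, 11940, 12017, 12147), width = 1),
                 strand = "+")
genes <- GRanges("chr1",
                 IRanges(start = c(11869, 14404), end = c(14409, 29570)),
                 strand = c("+", "-"))
names(genes) <- c("ENSG00000223972", "ENSG00000227232")

# Overlaps are strand-aware by default
hits <- findOverlaps(sites, genes)
site_gene <- names(genes)[subjectHits(hits)]   # gene ID of each overlapping site
sites_per_gene <- countOverlaps(genes, sites)  # number of sites within each gene
```

All five sites fall inside ENSG00000223972 on the plus strand, and none overlaps the minus-strand gene ENSG00000227232.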

Sample annotation

The sample annotations supplied by the data generators are also stored in the MAE. For most sequencing techniques, the sample annotations are identical across all experiments. For antibody-assisted techniques, the sample annotations for the m6A methylome data come from the IP samples, and those for the transcriptomic data come from the input samples. Accessing the sample annotation from the MAE object therefore gives a complete annotation table covering both IP and input samples. Here, eTAM-seq is an enzyme-assisted technique, so the sample annotations are the same for every experiment.

head(colData(MAE))
## DataFrame with 4 rows and 10 columns
##                SampleID                  Title SourceDatabase TissueOrCellLine
##             <character>            <character>    <character>      <character>
## SRR21070403 SRR21070403 HeLa_polyA_WT_FTO-_r..            GEO             HeLa
## SRR21070404 SRR21070404 HeLa_polyA_WT_FTO-_r..            GEO             HeLa
## SRR21070405 SRR21070405 HeLa_polyA_WT_FTO-_r..            GEO             HeLa
## SRR21070406 SRR21070406 HeLa_polyA_WT_FTO-_r..            GEO             HeLa
##                 Organism   Treatment DetectionTechnique         DataProcessing
##              <character> <character>        <character>            <character>
## SRR21070403 Homo Sapiens          WT           eTAM-seq Cutadapt|Hisat3N|fin..
## SRR21070404 Homo Sapiens          WT           eTAM-seq Cutadapt|Hisat3N|fin..
## SRR21070405 Homo Sapiens          WT           eTAM-seq Cutadapt|Hisat3N|fin..
## SRR21070406 Homo Sapiens          WT           eTAM-seq Cutadapt|Hisat3N|fin..
##             CurationDate    BioSample
##              <character>  <character>
## SRR21070403 Oct 24, 2022 SAMN30324096
## SRR21070404 Oct 24, 2022 SAMN30324097
## SRR21070405 Oct 24, 2022 SAMN30324098
## SRR21070406 Oct 24, 2022 SAMN30324098
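
Sample annotations are most useful for selecting columns. The sketch below subsets a toy SummarizedExperiment by a colData column; the sample names, counts, and the "KO" condition are hypothetical, since the eTAM-seq file shown here contains only WT samples. With a real experiment such as `m6AOmics`, the same `$` accessor reads the annotation columns listed above.

```r
suppressPackageStartupMessages(library(SummarizedExperiment))

sample_ids <- c("S1", "S2", "S3", "S4")
toy_se <- SummarizedExperiment(
  assays = list(m6A = matrix(1:8, nrow = 2, dimnames = list(NULL, sample_ids))),
  colData = DataFrame(
    Treatment = c("WT", "WT", "KO", "KO"),   # "KO" is an invented second condition
    TissueOrCellLine = rep("HeLa", 4),
    row.names = sample_ids
  )
)

# `$` on an SE reads a colData column, so it can drive column subsetting
wt_only <- toy_se[, toy_se$Treatment == "WT"]

# Number of samples per condition
samples_per_treatment <- table(colData(toy_se)$Treatment)
```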

Metadata

Finally, the MultiAssayExperiment object also contains fitted parameters of the beta-binomial mixture models.

metadata(MAE)[[1]]
##             bg_proportion fg_proportion alpha_m6A_bg beta_m6A_bg alpha_m6A_fg
## SRR21070403     0.6259963     0.3740037    1.0403727    45.46696    0.3501398
## SRR21070404     0.6067907     0.3932093    0.9823051    50.64032    0.3159781
## SRR21070405     0.6110785     0.3889215    0.9205881    44.92017    0.3238199
## SRR21070406     0.6103329     0.3896671    0.9397112    45.87532    0.3478345
##             beta_m6A_fg
## SRR21070403   0.9485091
## SRR21070404   0.9112292
## SRR21070405   0.9308952
## SRR21070406   0.9743458

The same parameters can also be accessed through the metadata of the m6AOmics SE:

metadata(m6AOmics)[[1]]
##             bg_proportion fg_proportion alpha_m6A_bg beta_m6A_bg alpha_m6A_fg
## SRR21070403     0.6259963     0.3740037    1.0403727    45.46696    0.3501398
## SRR21070404     0.6067907     0.3932093    0.9823051    50.64032    0.3159781
## SRR21070405     0.6110785     0.3889215    0.9205881    44.92017    0.3238199
## SRR21070406     0.6103329     0.3896671    0.9397112    45.87532    0.3478345
##             beta_m6A_fg
## SRR21070403   0.9485091
## SRR21070404   0.9112292
## SRR21070405   0.9308952
## SRR21070406   0.9743458
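
To give a feel for what these parameters describe, the sketch below evaluates a standard two-component beta-binomial mixture: a background (bg) and a foreground (fg) component, each with its own mixing proportion and alpha/beta shape parameters. It computes the posterior probability that a site with k m6A reads out of n total reads belongs to the foreground component. Whether m6AConquer derives m6ASiteProb in exactly this way is an assumption; the helper names `dbetabinom_log` and `posterior_fg` are hypothetical and not part of any package.

```r
# Log-probability of k successes out of n under a beta-binomial distribution
dbetabinom_log <- function(k, n, alpha, beta) {
  lchoose(n, k) + lbeta(k + alpha, n - k + beta) - lbeta(alpha, beta)
}

# Posterior probability of the foreground component (assumed mixture form)
posterior_fg <- function(k, n, p) {
  log_fg <- log(p[["fg_proportion"]]) +
    dbetabinom_log(k, n, p[["alpha_m6A_fg"]], p[["beta_m6A_fg"]])
  log_bg <- log(p[["bg_proportion"]]) +
    dbetabinom_log(k, n, p[["alpha_m6A_bg"]], p[["beta_m6A_bg"]])
  1 / (1 + exp(log_bg - log_fg))
}

# Fitted parameters of sample SRR21070403, copied from the printed output
p <- c(bg_proportion = 0.6259963, fg_proportion = 0.3740037,
       alpha_m6A_bg = 1.0403727, beta_m6A_bg = 45.46696,
       alpha_m6A_fg = 0.3501398, beta_m6A_fg = 0.9485091)

post_low <- posterior_fg(k = 23, n = 925, p)    # counts of the first site shown earlier
post_high <- posterior_fg(k = 900, n = 925, p)  # hypothetical highly methylated site
```

Working on the log scale avoids numerical underflow at high coverage, where the individual probabilities become extremely small.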

IDR analysis-based integration result

m6AConquer provides both Genomic Ranges and CSV files for the IDR analysis-based integration results. Both file types contain the same content. Currently, we provide only the individual and merged integration results across orthogonal techniques. You are free to use either file type.

For each technique pair, we also provide two kinds of integration results: positive reproducible sites and all reproducible sites.

Genomic Ranges

To access the individual integration results in Genomic Ranges file format, download the R object from the corresponding column. As an illustration, here we use the positive integration results between GLORI and eTAM-seq. First, load the GenomicRanges package and read the file into R:

suppressWarnings(suppressPackageStartupMessages(library(GenomicRanges)))
gr <- readRDS("m6A_HighConfSites_eTAM_GLORI_hg38.rds")
head(gr)
## GRanges object with 6 ranges and 7 metadata columns:
##       seqnames    ranges strand | m6A_ratio_eTAM-seq   m6A_ratio_GLORI
##          <Rle> <IRanges>  <Rle> |        <character>       <character>
##   [1]     chr5 172339669      * |                  1 0.988571428571429
##   [2]    chr20  30512433      * |                  1 0.993157380254154
##   [3]     chr7   4764049      * |                  1                 1
##   [4]     chr9  84002004      * |                  1                 1
##   [5]     chr8  10725587      * |                  1 0.994117647058824
##   [6]    chr12  53945469      * |                  1 0.991087344028521
##       m6A_probability_eTAM-seq m6A_probability_GLORI Pvalue_adjusted_eTAM-seq
##                    <character>           <character>              <character>
##   [1]                        1                     1     6.59523340105527e-65
##   [2]                        1                     1      9.1957035297511e-40
##   [3]                        1                     1     4.90269299673246e-31
##   [4]                        1                     1     2.41502528942755e-58
##   [5]                        1                     1     9.10217850073897e-28
##   [6]                        1                     1     2.07642323706231e-47
##       Pvalue_adjusted_GLORI Irreproducible_discovery_rate
##                 <character>                   <character>
##   [1] 1.05596909776191e-180          3.64423036192107e-09
##   [2]                     0          3.64423036192107e-09
##   [3]  4.85119768337239e-39          3.64423036192107e-09
##   [4]  2.67073957390016e-27          3.64423036192107e-09
##   [5] 2.55277641665495e-178          3.64423036192107e-09
##   [6]                     0          3.64423036192107e-09
##   -------
##   seqinfo: 25 sequences from an unspecified genome; no seqlengths

The Genomic Ranges file gives the genomic locations of all positive reproducible sites identified by IDR between GLORI and eTAM-seq, 91979 sites in total. The metadata columns contain the m6A methylation ratios, posterior probabilities, and BH-corrected p-values modelled for each technique, along with the IDR value. Note that these columns are stored as character vectors, as the column types in the output show.
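
In the printed object the metadata columns have type character, and some column names contain a hyphen, so two small steps are needed before numeric filtering. The sketch below reproduces both on a toy GRanges built from the first three rows of the output; with the real object, replace `gr_toy` with `gr`. The 0.99 cutoff is an arbitrary value chosen for illustration.

```r
suppressPackageStartupMessages(library(GenomicRanges))

# Toy GRanges mirroring three rows of the printed output
gr_toy <- GRanges(
  seqnames = c("chr5", "chr20", "chr7"),
  ranges = IRanges(start = c(172339669, 30512433, 4764049), width = 1)
)
mcols(gr_toy) <- DataFrame(
  `m6A_ratio_eTAM-seq` = c("1", "1", "1"),
  m6A_ratio_GLORI = c("0.988571428571429", "0.993157380254154", "1"),
  check.names = FALSE   # keep the hyphenated column name as-is
)

# Names containing "-" must be accessed with [[ ]] (or backticks), not a bare $
etam_ratio <- as.numeric(mcols(gr_toy)[["m6A_ratio_eTAM-seq"]])
glori_ratio <- as.numeric(mcols(gr_toy)[["m6A_ratio_GLORI"]])

# Keep sites with a GLORI methylation ratio of at least 0.99 (hypothetical cutoff)
high_glori <- gr_toy[glori_ratio >= 0.99]
```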

To access the merged integration results, you may load the merged genomic ranges file into R:

gr_merged <- readRDS("m6A_HighConfSites_Combined_hg38.rds")
head(gr_merged)
## GRanges object with 6 ranges and 2 metadata columns:
##       seqnames    ranges strand | support_number      support_technique
##          <Rle> <IRanges>  <Rle> |      <integer>            <character>
##   [1]     chr1    139472      * |              1         eTAM-seq&GLORI
##   [2]     chr1    185038      * |              2 GLORI&m6A-SAC-seq, G..
##   [3]     chr1    258559      * |              1        GLORI&MeRIP-seq
##   [4]     chr1    348032      * |              1        GLORI&MeRIP-seq
##   [5]     chr1    348080      * |              1      GLORI&m6A-SAC-seq
##   [6]     chr1    826939      * |              2 GLORI&m6ACE-seq, GLO..
##   -------
##   seqinfo: 25 sequences from an unspecified genome; no seqlengths

The metadata columns of the merged results contain the number and names of the supporting orthogonal technique pairs. In total, 146250 reproducible sites are identified across all orthogonal technique pairs.
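
The support_technique column packs several technique pairs into one string, separated by ", ", with the two techniques of a pair joined by "&". The sketch below splits these strings using base R. The vector holds the full strings of the first six merged sites; with the real object it would be `gr_merged$support_technique`.

```r
# Full support_technique strings of the first six merged sites
support <- c("eTAM-seq&GLORI",
             "GLORI&m6A-SAC-seq, GLORI&m6ACE-seq",
             "GLORI&MeRIP-seq",
             "GLORI&MeRIP-seq",
             "GLORI&m6A-SAC-seq",
             "GLORI&m6ACE-seq, GLORI&MeRIP-seq")

# One character vector of technique pairs per site
pairs_per_site <- strsplit(support, ", ", fixed = TRUE)
n_pairs <- lengths(pairs_per_site)             # matches support_number for these rows

# Number of sites supported by each technique pair
pair_counts <- table(unlist(pairs_per_site))

# Individual techniques supporting each site
techniques_per_site <- lapply(pairs_per_site, function(p) {
  unique(unlist(strsplit(p, "&", fixed = TRUE)))
})
```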

CSV

To access the integration results through CSV tables, simply read the CSV tables into R:

csv <- read.csv("m6A_HighConfSites_eTAM_GLORI_hg38.csv", row.names = NULL)
head(csv)
##   seqnames     start       end width strand m6A_ratio_eTAM_seq m6A_ratio_GLORI
## 1     chr5 172339669 172339669     1      *                  1       0.9885714
## 2    chr20  30512433  30512433     1      *                  1       0.9931574
## 3     chr7   4764049   4764049     1      *                  1       1.0000000
## 4     chr9  84002004  84002004     1      *                  1       1.0000000
## 5     chr8  10725587  10725587     1      *                  1       0.9941176
## 6    chr12  53945469  53945469     1      *                  1       0.9910873
##   m6A_probability_eTAM_seq m6A_probability_GLORI Pvalue_adjusted_eTAM_seq
## 1                        1                     1             6.595233e-65
## 2                        1                     1             9.195704e-40
## 3                        1                     1             4.902693e-31
## 4                        1                     1             2.415025e-58
## 5                        1                     1             9.102179e-28
## 6                        1                     1             2.076423e-47
##   Pvalue_adjusted_GLORI Irreproducible_discovery_rate
## 1         1.055969e-180                   3.64423e-09
## 2          0.000000e+00                   3.64423e-09
## 3          4.851198e-39                   3.64423e-09
## 4          2.670740e-27                   3.64423e-09
## 5         2.552776e-178                   3.64423e-09
## 6          0.000000e+00                   3.64423e-09
csv_merged <- read.csv("m6A_HighConfSites_Combined_hg38.csv", row.names = NULL)
head(csv_merged)
##   seqnames  start    end width strand support_number
## 1     chr1 139472 139472     1      *              1
## 2     chr1 185038 185038     1      *              2
## 3     chr1 258559 258559     1      *              1
## 4     chr1 348032 348032     1      *              1
## 5     chr1 348080 348080     1      *              1
## 6     chr1 826939 826939     1      *              2
##                    support_technique
## 1                     eTAM-seq&GLORI
## 2 GLORI&m6A-SAC-seq, GLORI&m6ACE-seq
## 3                    GLORI&MeRIP-seq
## 4                    GLORI&MeRIP-seq
## 5                  GLORI&m6A-SAC-seq
## 6   GLORI&m6ACE-seq, GLORI&MeRIP-seq
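
If you start from the CSV tables but later need range operations such as overlaps, the data frame can be converted back into a GRanges object with `makeGRangesFromDataFrame()` from GenomicRanges. The sketch below uses a toy data frame copied from the first three rows of the merged table; with the real file, pass `csv_merged` instead.

```r
suppressPackageStartupMessages(library(GenomicRanges))

# Toy data frame mirroring the first three rows of the merged CSV table
csv_toy <- data.frame(
  seqnames = c("chr1", "chr1", "chr1"),
  start = c(139472, 185038, 258559),
  end = c(139472, 185038, 258559),
  width = c(1, 1, 1),
  strand = c("*", "*", "*"),
  support_number = c(1L, 2L, 1L),
  support_technique = c("eTAM-seq&GLORI",
                        "GLORI&m6A-SAC-seq, GLORI&m6ACE-seq",
                        "GLORI&MeRIP-seq")
)

# keep.extra.columns = TRUE carries the remaining columns over as metadata columns
gr_from_csv <- makeGRangesFromDataFrame(csv_toy, keep.extra.columns = TRUE)

# Sites supported by at least two technique pairs
multi_support <- gr_from_csv[gr_from_csv$support_number >= 2]
```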