Normalization for Illumina GoldenGate and Infinium methylation profiling arrays.
The tool accepts two input formats: a single FinalReport file, or two text files, one for samples and one for controls.
The “Final Report” format includes all the data in a single file.
The tool looks for “[Header]” in the file to decide whether this format is in use.
The data blocks “[Sample Methylation Profile]” and “[Control Probe Profile]” must be present in the “Final Report” format.
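The format check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the tool's actual code: the function name detect_input_format and the labels "finalreport" and "tabular" are hypothetical, and only the marker strings come from the description above.

```python
# Block names that the text above says a Final Report file must contain.
REQUIRED_BLOCKS = ("[Sample Methylation Profile]", "[Control Probe Profile]")


def detect_input_format(text):
    """Classify file contents as 'finalreport' or 'tabular' (hypothetical labels)."""
    lines = [line.strip() for line in text.splitlines()]

    # No "[Header]" marker: assume the plain tab-delimited format.
    if not any(line.startswith("[Header]") for line in lines):
        return "tabular"

    # A Final Report must carry both data blocks.
    missing = [
        block for block in REQUIRED_BLOCKS
        if not any(line.startswith(block) for line in lines)
    ]
    if missing:
        raise ValueError("Final Report is missing: " + ", ".join(missing))
    return "finalreport"
```

A file that contains “[Header]” but lacks one of the two data blocks raises an error in this sketch; how the real tool reacts in that case is not stated here.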
The second format can be a simple tab-delimited text file with headers from BeadStudio.
If this format is used, the sample data and the QC data need to be in separate files.
The filename must end with ".txt", and you should NOT use the Import tool
when bringing the data into Chipster: in the "Import files" window, change the action to "Import directly".
Note that a dot (.) must be used as the decimal separator, not a comma (,)!
This tool adjusts color balance and background, and normalizes Illumina methylation data provided
either in GenomeStudio FinalReport format or as two separate tab-separated files (one for samples, one for controls).
The Illumina platform shows a significant dye bias in the two channels and therefore
some normalization is required. Before normalization, some clean-up is done: samples
with an average detection p-value higher than the p-val parameter value (default: 0.05)
are left out of the analysis.
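The clean-up step can be illustrated with a short Python sketch. It assumes the detection p-values are available as a mapping from sample name to a list of per-probe values; the function name filter_samples and that data layout are hypothetical, and only the threshold rule and its default of 0.05 come from the text above.

```python
from statistics import mean


def filter_samples(detection_pvalues, p_val=0.05):
    """Split samples into (kept, dropped) by their average detection p-value.

    detection_pvalues: dict mapping sample name -> list of per-probe p-values.
    A sample is dropped only when its average is strictly higher than p_val.
    """
    kept, dropped = [], []
    for sample, pvals in detection_pvalues.items():
        if mean(pvals) > p_val:
            dropped.append(sample)
        else:
            kept.append(sample)
    return kept, dropped
```

For example, a sample whose probes have detection p-values of 0.2 and 0.1 averages 0.15 and would be left out at the default threshold.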
The function normalizeMethyLumiSet is used for normalization:
this function looks at the median intensities in the methylated and unmethylated channels
(each measured in one color on the GoldenGate platform) at very low and very high beta values
and sets these medians equal. Using the transformed unmethylated and methylated values,
new beta values are calculated using the ratio function
(the same as used by Illumina in the BeadStudio software).
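To make the idea concrete, here is a simplified Python sketch of this kind of channel balancing for a single sample. It is not the methylumi implementation. The beta cut-offs (0.2 and 0.8), the offset of 100 in the ratio, and the choice to rescale the methylated channel so that its median range matches that of the unmethylated channel are all assumptions made for illustration, and the function names are hypothetical.

```python
from statistics import median


def ratio_beta(meth, unmeth, offset=100.0):
    """Beta value as a ratio M / (M + U + offset); the offset of 100 is an assumption."""
    m = max(meth, 0.0)
    u = max(unmeth, 0.0)
    return m / (m + u + offset)


def balance_channels(meth, unmeth, beta_cuts=(0.2, 0.8), offset=100.0):
    """Rescale the methylated channel of one sample and recompute beta values.

    Returns (factor, scaled methylated intensities, new beta values).
    """
    betas = [ratio_beta(m, u, offset) for m, u in zip(meth, unmeth)]
    low = [i for i, b in enumerate(betas) if b < beta_cuts[0]]
    high = [i for i, b in enumerate(betas) if b > beta_cuts[1]]
    if not low or not high:
        raise ValueError("need probes at both very low and very high beta")

    # Median intensity of each channel at the two extremes of beta.
    unmeth_at_low = median(unmeth[i] for i in low)    # unmethylated signal is strong here
    unmeth_at_high = median(unmeth[i] for i in high)  # and weak here
    meth_at_low = median(meth[i] for i in low)
    meth_at_high = median(meth[i] for i in high)

    meth_range = meth_at_high - meth_at_low
    if meth_range == 0:
        raise ValueError("methylated channel has no dynamic range")

    # Scale the methylated channel so both channels span the same median range.
    factor = (unmeth_at_low - unmeth_at_high) / meth_range
    meth_scaled = [m * factor for m in meth]
    new_betas = [ratio_beta(m, u, offset) for m, u in zip(meth_scaled, unmeth)]
    return factor, meth_scaled, new_betas
```

In this sketch only the methylated channel is rescaled and the unmethylated channel is left as the reference; the opposite choice would work equally well for illustrating the principle.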
The output consists of four tab-delimited text files (normalized.tsv, unmethylated.tsv, methylated.tsv, phenodata.tsv) and a quality control plot (QC-plot.pdf). The file normalized.tsv contains normalized methylated / unmethylated ratios on log2-scale, and it is suitable for all further analyses. The files unmethylated.tsv and methylated.tsv contain data for unmethylated and methylated probes.
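As an illustration of what a log2-scale methylated / unmethylated ratio looks like, the Python sketch below computes it from a pair of intensities and, alternatively, from a beta value. The offset of 1 follows the M-value definition in Du et al. (2010) as commonly stated; whether this tool applies exactly that offset is an assumption, and the function names are hypothetical.

```python
import math


def log2_ratio(meth, unmeth, offset=1.0):
    """log2 of the methylated / unmethylated intensity ratio.

    Negative intensities are clipped to zero, and a small offset (assumed to
    be 1 here) keeps the ratio defined when an intensity is zero.
    """
    return math.log2((max(meth, 0.0) + offset) / (max(unmeth, 0.0) + offset))


def beta_to_log2_ratio(beta):
    """Equivalent log2 ratio for a beta value strictly between 0 and 1,
    ignoring any offsets."""
    return math.log2(beta / (1.0 - beta))
```

On this scale a value of 0 means equal methylated and unmethylated signal (beta of 0.5), positive values indicate more methylated than unmethylated signal, and negative values the reverse.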
The QC plot shows the detection p-values for the samples. If some samples have clearly higher detection p-values than the rest, they fail the QC check and you should consider removing them from the analysis. When the FinalReport format is used as input, the QC plot also shows the distributions of methylated and unmethylated probes, with unmethylated data shown in green and methylated data in red.
This tool uses the Bioconductor package methylumi. Please cite the articles:
Du P, Kibbe WA, Lin SM (2008) lumi: a pipeline for processing Illumina microarray, Bioinformatics, 24, 1547-1548.
Du P, Zhang X, Huang CC, Jafari N, Kibbe WA, Lin SM (2010) Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis, BMC bioinformatics, 11, 587.
More information on the methylumi package: https://bioconductor.org/packages/release/bioc/html/methylumi.html