Gene Sets Net Correlations Analysis (GSNCA): a multivariate differential coexpression test for gene sets

Yasir Rahmatallah, Frank Emmert-Streib, Galina Glazko

Research output: Contribution to journalArticle

40 Citations (Scopus)
150 Downloads (Pure)

Abstract

Motivation: To date, Gene Set Analysis (GSA) approaches primarily focus on identifying differentially expressed gene sets (pathways). Methods for identifying differentially coexpressed pathways also exist but are mostly based on aggregated pairwise correlations, or other pairwise measures of coexpression. Instead, we propose Gene Sets Net Correlations Analysis (GSNCA), a multivariate differential coexpression test that accounts for the complete correlation structure between genes.

Results: In GSNCA, weight factors are assigned to genes in proportion to the genes' cross-correlations (intergene correlations). The problem of finding the weight vectors is formulated as an eigenvector problem with a unique solution. GSNCA tests the null hypothesis that for a gene set there is no difference in the weight vectors of the genes between two conditions. In simulation studies and the analyses of experimental data, we demonstrate that GSNCA, indeed, captures changes in the structure of genes' cross-correlations rather than differences in the averaged pairwise correlations. Thus, GSNCA infers differences in coexpression networks, however, bypassing method-dependent steps of network inference. As an additional result from GSNCA, we define hub genes as genes with the largest weights and show that these genes correspond frequently to major and specific pathway regulators, as well as to genes that are most affected by the biological difference between two conditions. In summary, GSNCA is a new approach for the analysis of differentially coexpressed pathways that also evaluates the importance of the genes in the pathways, thus providing unique information that may result in the generation of novel biological hypotheses.
Original languageEnglish
Pages (from-to)360-268
Number of pages9
JournalBioinformatics
Volume30
Issue number3
Early online date30 Nov 2013
DOIs
Publication statusPublished - Feb 2014

Fingerprint

Correlation Analysis
Multivariate Analysis
Genes
Gene
Pathway
Weights and Measures
Pairwise
Cross-correlation

Cite this

Rahmatallah, Yasir ; Emmert-Streib, Frank ; Glazko, Galina. / Gene Sets Net Correlations Analysis (GSNCA): a multivariate differential coexpression test for gene sets. In: Bioinformatics. 2014 ; Vol. 30, No. 3. pp. 360-268.
@article{cdf9da6e6b3547379ac4183526234f2b,
title = "Gene Sets Net Correlations Analysis (GSNCA): a multivariate differential coexpression test for gene sets",
abstract = "Motivation: To date, Gene Set Analysis (GSA) approaches primarily focus on identifying differentially expressed gene sets (pathways). Methods for identifying differentially coexpressed pathways also exist but are mostly based on aggregated pairwise correlations, or other pairwise measures of coexpression. Instead, we propose Gene Sets Net Correlations Analysis (GSNCA), a multivariate differential coexpression test that accounts for the complete correlation structure between genes.Results: In GSNCA, weight factors are assigned to genes in proportion to the genes' cross-correlations (intergene correlations). The problem of finding the weight vectors is formulated as an eigenvector problem with a unique solution. GSNCA tests the null hypothesis that for a gene set there is no difference in the weight vectors of the genes between two conditions. In simulation studies and the analyses of experimental data, we demonstrate that GSNCA, indeed, captures changes in the structure of genes' cross-correlations rather than differences in the averaged pairwise correlations. Thus, GSNCA infers differences in coexpression networks, however, bypassing method-dependent steps of network inference. As an additional result from GSNCA, we define hub genes as genes with the largest weights and show that these genes correspond frequently to major and specific pathway regulators, as well as to genes that are most affected by the biological difference between two conditions. In summary, GSNCA is a new approach for the analysis of differentially coexpressed pathways that also evaluates the importance of the genes in the pathways, thus providing unique information that may result in the generation of novel biological hypotheses.",
author = "Yasir Rahmatallah and Frank Emmert-Streib and Galina Glazko",
year = "2014",
month = "2",
doi = "10.1093/bioinformatics/btt687",
language = "English",
volume = "30",
pages = "360--268",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "3",

}

Gene Sets Net Correlations Analysis (GSNCA): a multivariate differential coexpression test for gene sets. / Rahmatallah, Yasir; Emmert-Streib, Frank; Glazko, Galina.

In: Bioinformatics, Vol. 30, No. 3, 02.2014, p. 360-268.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Gene Sets Net Correlations Analysis (GSNCA): a multivariate differential coexpression test for gene sets

AU - Rahmatallah, Yasir

AU - Emmert-Streib, Frank

AU - Glazko, Galina

PY - 2014/2

Y1 - 2014/2

N2 - Motivation: To date, Gene Set Analysis (GSA) approaches primarily focus on identifying differentially expressed gene sets (pathways). Methods for identifying differentially coexpressed pathways also exist but are mostly based on aggregated pairwise correlations, or other pairwise measures of coexpression. Instead, we propose Gene Sets Net Correlations Analysis (GSNCA), a multivariate differential coexpression test that accounts for the complete correlation structure between genes.Results: In GSNCA, weight factors are assigned to genes in proportion to the genes' cross-correlations (intergene correlations). The problem of finding the weight vectors is formulated as an eigenvector problem with a unique solution. GSNCA tests the null hypothesis that for a gene set there is no difference in the weight vectors of the genes between two conditions. In simulation studies and the analyses of experimental data, we demonstrate that GSNCA, indeed, captures changes in the structure of genes' cross-correlations rather than differences in the averaged pairwise correlations. Thus, GSNCA infers differences in coexpression networks, however, bypassing method-dependent steps of network inference. As an additional result from GSNCA, we define hub genes as genes with the largest weights and show that these genes correspond frequently to major and specific pathway regulators, as well as to genes that are most affected by the biological difference between two conditions. In summary, GSNCA is a new approach for the analysis of differentially coexpressed pathways that also evaluates the importance of the genes in the pathways, thus providing unique information that may result in the generation of novel biological hypotheses.

AB - Motivation: To date, Gene Set Analysis (GSA) approaches primarily focus on identifying differentially expressed gene sets (pathways). Methods for identifying differentially coexpressed pathways also exist but are mostly based on aggregated pairwise correlations, or other pairwise measures of coexpression. Instead, we propose Gene Sets Net Correlations Analysis (GSNCA), a multivariate differential coexpression test that accounts for the complete correlation structure between genes.Results: In GSNCA, weight factors are assigned to genes in proportion to the genes' cross-correlations (intergene correlations). The problem of finding the weight vectors is formulated as an eigenvector problem with a unique solution. GSNCA tests the null hypothesis that for a gene set there is no difference in the weight vectors of the genes between two conditions. In simulation studies and the analyses of experimental data, we demonstrate that GSNCA, indeed, captures changes in the structure of genes' cross-correlations rather than differences in the averaged pairwise correlations. Thus, GSNCA infers differences in coexpression networks, however, bypassing method-dependent steps of network inference. As an additional result from GSNCA, we define hub genes as genes with the largest weights and show that these genes correspond frequently to major and specific pathway regulators, as well as to genes that are most affected by the biological difference between two conditions. In summary, GSNCA is a new approach for the analysis of differentially coexpressed pathways that also evaluates the importance of the genes in the pathways, thus providing unique information that may result in the generation of novel biological hypotheses.

U2 - 10.1093/bioinformatics/btt687

DO - 10.1093/bioinformatics/btt687

M3 - Article

VL - 30

SP - 360

EP - 268

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 3

ER -