Frequently Asked Questions

  1. Why are Graphite web networks different from those reported in the original database?
  2. I have selected the CliPPER method but I do not see the results.
  3. Why can I not see the interactive network-based tool for some pathways?
  4. Why does Graphite web support only EntrezGene IDs?
  5. Why do different analyses give different results?
  6. I cannot find a gene/protein in the network, although I see it in the original pathway (KEGG/Reactome). Why?
  7. Why are nodes in the same pathway coloured differently in different analyses?
  8. Which is the best method for pathway analysis?
  9. What is the p-value associated with a path within a pathway in the CliPPER analysis?
  10. I have to compare more than two groups. How can I use Graphite web?
  11. When running GSEA I found a significant pathway, but when I click on it none of the network nodes are coloured. Why?






  1. Why are Graphite web networks different from those reported in the original database?
    Converting pathway formats to gene-gene networks requires a series of transformations that are documented in Sales et al. (2012) and here. The visual representation of a pathway in KEGG or Reactome does not necessarily correspond to its machine-readable format (KGML for KEGG and BioPAX for Reactome). Inaccuracies or missing elements may occur in the XML format, leading to a different network-based representation. Furthermore, before visualization the pathway annotation IDs are converted to EntrezGene IDs, and this conversion can lead to a slightly different topology.
  2. I have selected the CliPPER method but I do not see the results.
    Because of the complexity of its model, CliPPER is computationally intensive. The results therefore take several minutes to be displayed (at most one hour). In the meantime, the result page is refreshed automatically until the results are ready. Please do not close the web page while waiting for the results.
  3. Why can I not see the interactive network-based tool for some pathways?
    Graphite web uses Cytoscape Web to visualize networks. For pathways with more than 150 nodes, Cytoscape Web becomes slow and may crash. For such pathways Graphite web therefore returns a SIF file, which can be visualized locally using Cytoscape.
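    A downloaded SIF file can also be inspected before it is opened in Cytoscape. The sketch below assumes the standard Cytoscape SIF layout (source node, interaction type, one or more target nodes per line, separated by whitespace). The function name `read_sif`, the gene IDs and the interaction labels are made up for illustration and are not taken from Graphite web.

```python
def read_sif(text):
    """Parse SIF text into a set of nodes and a list of
    (source, interaction, target) edges.

    Assumes whitespace-separated fields, which is adequate for
    identifiers without spaces such as EntrezGene IDs.
    """
    nodes, edges = set(), []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        source = fields[0]
        nodes.add(source)  # a line with a single field is an isolated node
        if len(fields) >= 3:
            interaction = fields[1]
            for target in fields[2:]:
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges


# Hypothetical three-gene pathway; IDs and interaction labels are illustrative only.
example = "1017\tactivation\t5925\n5925\tbinding\t1869\n"
nodes, edges = read_sif(example)

MAX_INTERACTIVE_NODES = 150  # threshold stated in the answer above
print(len(nodes), len(edges), len(nodes) > MAX_INTERACTIVE_NODES)
```

    Counting the nodes this way shows whether a pathway falls above the 150-node limit at which the interactive view is replaced by the SIF download.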
  4. Why does Graphite web support only EntrezGene IDs?
    Pathway annotations are based on heterogeneous protein IDs that usually point to different isoforms of the same gene. Microarray data, on the other hand, are gene-based, not transcript-based. Mapping gene expression onto transcript-based networks would require expression profiles to be replicated (if a gene has two alternative transcripts, its expression profile would be duplicated). In statistical models, however, duplicated variables lead to multicollinearity: the model is not able to estimate its parameters. For this reason we decided to use network-based pathways simplified using EntrezGene IDs.
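    The effect of duplicating a profile can be checked numerically. The sketch below is a generic linear-algebra illustration, not Graphite web code, and all values are made up. It builds a two-column design matrix X and computes the determinant of X^T X. When both columns carry the same gene-level profile the determinant is zero, so the least-squares normal equations have no unique solution.

```python
def gram_determinant(col_a, col_b):
    """Determinant of X^T X for the two-column matrix X = [col_a, col_b]."""
    aa = sum(a * a for a in col_a)
    bb = sum(b * b for b in col_b)
    ab = sum(a * b for a, b in zip(col_a, col_b))
    return aa * bb - ab * ab


# Made-up expression profile of one gene across four samples.
gene_expression = [2.0, 3.5, 1.0, 4.0]

# Two alternative transcripts of that gene would both receive the same profile.
transcript_1 = list(gene_expression)
transcript_2 = list(gene_expression)

# A different gene with its own profile, for comparison.
other_gene = [1.0, 0.5, 3.0, 2.0]

det_duplicated = gram_determinant(transcript_1, transcript_2)
det_distinct = gram_determinant(gene_expression, other_gene)

print(det_duplicated)  # 0.0: singular, parameters cannot be estimated
print(det_distinct)    # positive: parameters are identifiable
```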
  5. Why do different analyses give different results?
    The analyses provided by Graphite web are based on completely different null hypotheses (see the Background section of the Tutorial for more details): different methods answer different biological questions. The analysis has to be selected according to the biological question you want to answer.
  6. I cannot find a gene/protein in the network, although I see it in the original pathway (KEGG/Reactome). Why?
    See answer 1.
  7. Why are nodes in the same pathway coloured differently in different analyses?
    The colour of a node depends on the analysis selected. If a gene is up-regulated in one class with respect to the other, this up-regulation is of course the same in every analysis, but the contribution of the gene to the final result differs from one analysis to another. This difference arises from the different models and null hypotheses underlying the analyses.
  8. Which is the best method for pathway analysis?
    As noted above, the choice of pathway analysis depends strictly on the biological question to be answered. Competitive methods answer the question: do the genes within a pathway have the same level of association with the phenotype as the genes in the complement of the gene set? Self-contained methods answer the question: is there at least one gene in the pathway associated with the phenotype? For a more detailed description of the difference between these methods see the tutorial, Goeman et al. (2004) and Tian et al. (2005).
  9. What is the p-value associated with a path within a pathway in the CliPPER analysis?
    CliPPER is based on a two-step empirical approach. In the first step, CliPPER selects only the pathways whose means and concentration matrices differ significantly between the two groups. In the second step, it identifies the paths within each significant pathway that are most associated with the phenotype, using a score that is a weighted combination of the p-values of the cliques composing the path. Currently these paths do not have a p-value, because the null distribution of this score is unknown. They are instead ranked according to the score: the higher the score, the more strongly the path is associated with the phenotype.
  10. I have to compare more than two groups. How can I use Graphite web?
    Currently Graphite web supports only two-sample designs. With more than two groups, the user has to upload a separate matrix (or list of genes) for each comparison to be performed.
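    Preparing one matrix per comparison can be scripted. The following is a minimal sketch that assumes the expression data are held in memory as nested dictionaries. The function name `pairwise_matrices`, the sample names, the group labels and the values are all hypothetical. Writing each sub-matrix to a file in the format expected by Graphite web is left out, since that format is not described here.

```python
from itertools import combinations


def pairwise_matrices(expression, sample_groups):
    """Yield ((group_a, group_b), sub_matrix) for every pair of groups.

    expression:    dict mapping gene ID -> dict mapping sample name -> value
    sample_groups: dict mapping sample name -> group label
    """
    groups = sorted(set(sample_groups.values()))
    for group_a, group_b in combinations(groups, 2):
        # Keep only the samples belonging to the two groups being compared.
        keep = [s for s, g in sample_groups.items() if g in (group_a, group_b)]
        sub = {gene: {s: values[s] for s in keep}
               for gene, values in expression.items()}
        yield (group_a, group_b), sub


# Made-up design with three groups of two samples each.
sample_groups = {"s1": "control", "s2": "control",
                 "s3": "drugA", "s4": "drugA",
                 "s5": "drugB", "s6": "drugB"}
expression = {"1017": {"s1": 5.1, "s2": 5.3, "s3": 6.8,
                       "s4": 7.0, "s5": 4.2, "s6": 4.0}}

comparisons = dict(pairwise_matrices(expression, sample_groups))
for pair, matrix in comparisons.items():
    print(pair, sorted(matrix["1017"]))
```

    Three groups give three two-sample comparisons, each of which would be uploaded as a separate analysis.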
  11. When running GSEA I found a significant pathway, but when I click on it none of the network nodes are coloured. Why?
    GSEA works on the distribution of the t-test statistics of the genes belonging to a pathway. Although no single gene is strictly significant, the statistics can be large enough to significantly shift the mean of the pathway's distribution with respect to that of the other genes. This indicates a possible moderate but coordinated involvement of the pathway in the biological problem.
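    This situation can be reproduced with toy numbers. The sketch below is a simplified comparison of means using a Welch-style z score. It is not the GSEA procedure implemented in Graphite web, and the t statistics and the per-gene cutoff of 2 are made up for illustration. No pathway gene passes the cutoff, yet the pathway mean is clearly shifted relative to the other genes.

```python
from math import sqrt
from statistics import mean, stdev

# Illustrative per-gene cutoff on |t|; an assumption, not a Graphite web setting.
CUTOFF = 2.0

# Made-up t statistics: all pathway genes are moderately positive.
pathway_t = [1.2, 1.5, 1.1, 1.6, 1.3, 1.4, 1.7, 1.2]
other_t = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, -0.8, 0.8, -0.2, 0.2, 0.0]

# Genes that would be highlighted individually: none in this example.
significant_genes = [t for t in pathway_t if abs(t) >= CUTOFF]

# Welch-style z score for the difference between the two means.
shift = mean(pathway_t) - mean(other_t)
standard_error = sqrt(stdev(pathway_t) ** 2 / len(pathway_t)
                      + stdev(other_t) ** 2 / len(other_t))
z_score = shift / standard_error

print(len(significant_genes), round(shift, 3), z_score > 3)
```

    The many small shifts in the same direction add up at the pathway level, which is why a pathway can be reported as significant while no individual node is coloured.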