Getting Started

Hello! Start in the left-hand sidebar by:

1. entering the statistic for gene scoring, for example:

FDR from DE analysis results

module number from WGCNA results

2. entering the expression for gene scoring, for example:

< 0.05 for specifying significant DE genes using an FDR cut off

== 1 for specifying a specific module number from WGCNA

3. uploading a gene score table .csv file with the unfiltered results table from DE analysis or WGCNA

4. uploading a mappings table .txt file with the gene-to-GO term annotation mappings formatted as either:

topGO expected gene-to-GO mappings

PANNZER2 resulting GO prediction details

5. clicking the Run Analysis button, which appears after the format of the input files is checked and verified as valid for analysis

Note that the functional analysis results and plots may take several moments to process depending on the size of the input data tables.
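As an illustration of steps 1 and 2, the following minimal Python sketch shows how a statistic name (such as FDR) and an expression (such as < 0.05) could be combined to pick the genes of interest from a gene score table. This is not the application's own code: the function names and the example values are hypothetical, and the sketch assumes a comma-separated table with a header row, gene IDs in the first column, and numeric scores.

```python
import csv
import io
import operator

# Comparison operators accepted by this sketch
OPERATORS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq,
             "!=": operator.ne, "<": operator.lt, ">": operator.gt}

def parse_expression(expression):
    """Split an expression such as '< 0.05' into a comparison function and a threshold."""
    text = expression.strip()
    # Try two-character operators first so that "<=" is not read as "<"
    for symbol in sorted(OPERATORS, key=len, reverse=True):
        if text.startswith(symbol):
            return OPERATORS[symbol], float(text[len(symbol):])
    raise ValueError(f"unsupported expression: {expression!r}")

def select_genes(table_text, statistic, expression):
    """Return the gene IDs (first column) whose statistic satisfies the expression."""
    compare, threshold = parse_expression(expression)
    reader = csv.reader(io.StringIO(table_text))
    header = next(reader)
    column = header.index(statistic)  # raises ValueError if the statistic is not a column name
    return [row[0] for row in reader if compare(float(row[column]), threshold)]

# Made-up values for illustration only
example_de = "gene,logFC,FDR\ngeneA,2.1,0.001\ngeneB,-0.3,0.60\ngeneC,1.7,0.04\n"
selected = select_genes(example_de, "FDR", "< 0.05")
print(selected)
```

The same helper handles a WGCNA-style selection, for example select_genes(table, "module", "== 1"), provided the module column holds numbers.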

Helpful Tips

Tip 1: The topGO package expects gene-to-GO mappings files to be specifically formatted where:

the first column must contain gene IDs and the second column GO terms

the second column of GO terms must be in a comma separated list format

the first column must be tab or space separated from the second column

the first column of gene IDs must match the gene IDs contained in the gene score table
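The formatting rules in Tip 1 can be checked with a short script before uploading a file. The sketch below is a hypothetical Python helper, not part of topGO or this application. It assumes one gene per line, with the gene ID separated from a comma separated list of GO terms by a tab or space, and it reports gene IDs from the gene score table that have no mapping.

```python
def parse_topgo_mappings(text):
    """Parse lines of 'geneID<tab or space>GO:1,GO:2' into a dict of gene ID -> GO term list."""
    mappings = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(None, 1)  # split at the first run of tabs or spaces
        if len(parts) < 2:
            raise ValueError(f"no GO terms found for line: {line!r}")
        gene_id, go_field = parts
        mappings[gene_id] = [term.strip() for term in go_field.split(",") if term.strip()]
    return mappings

def unmapped_genes(score_table_gene_ids, mappings):
    """Return gene IDs from the gene score table that are missing from the mappings."""
    return sorted(set(score_table_gene_ids) - set(mappings))

# Illustrative mappings; the gene IDs are made up
example_mappings = "geneA\tGO:0005524,GO:0016301\ngeneB GO:0003677\n"
parsed = parse_topgo_mappings(example_mappings)
missing = unmapped_genes(["geneA", "geneB", "geneC"], parsed)
print(parsed)
print(missing)
```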

Tip 2: It is possible to create a gene-to-GO term annotations table with PANNZER2 by:

first, navigating to the Annotate tab

second, uploading a list of protein sequences where the sequence names must match the gene names in the input gene score table

third, selecting Batch queue and entering your email

fourth, selecting the GO prediction details link after receiving the PANNZER2 results

fifth, right clicking and selecting Save As... to download the GO.out.txt annotations table
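The application accepts the PANNZER2 table directly, but it can help to see how such a table relates to the two-column layout described in Tip 1. The following Python sketch makes several assumptions that should be checked against your own download: that the file is tab separated, that it has a header row with columns named qpid (sequence name) and goid (GO identifier), and that the identifiers may lack the GO: prefix. The function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

def pannzer_to_topgo(text):
    """Collapse one-row-per-annotation text into 'gene<TAB>GO:1,GO:2' lines.

    Assumes tab separated input with 'qpid' and 'goid' header columns.
    """
    terms_by_gene = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        go_id = row["goid"].strip()
        if not go_id.startswith("GO:"):
            go_id = "GO:" + go_id  # add the prefix when the table omits it
        if go_id not in terms_by_gene[row["qpid"]]:
            terms_by_gene[row["qpid"]].append(go_id)
    return "\n".join(f"{gene}\t{','.join(terms)}" for gene, terms in terms_by_gene.items())

# Made-up rows for illustration only
example_pannzer = (
    "qpid\tontology\tgoid\tdesc\n"
    "geneA\tMF\t0005524\tATP binding\n"
    "geneA\tMF\t0016301\tkinase activity\n"
    "geneB\tMF\t0003677\tDNA binding\n"
)
converted = pannzer_to_topgo(example_pannzer)
print(converted)
```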

Tip 3: The input gene score table should not be filtered in advance. The functional analysis requires the complete gene universe, which includes all genes detected in the experiment regardless of significance in DE analysis or WGCNA.

Tip 4: The input gene score statistic must match the name of a column in the input gene score table.

Tip 5: The first column of the gene score table is expected to contain gene IDs.

Tip 6: The gene score table must contain at least two columns: one with gene IDs and one with gene scores.
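Tips 4 to 6 describe requirements that can be checked before uploading. The sketch below is a hypothetical Python check, not the validation performed by the application. It assumes a comma-separated table with a header row and reports any of those requirements that are not met.

```python
import csv
import io

def check_gene_score_table(table_text, statistic):
    """Return a list of problems found; an empty list means these basic checks passed."""
    rows = list(csv.reader(io.StringIO(table_text)))
    if not rows:
        return ["table is empty"]
    header = rows[0]
    problems = []
    # Tip 6: at least a gene ID column and a gene score column
    if len(header) < 2:
        problems.append("table needs a gene ID column and at least one gene score column")
    # Tips 4 and 5: the statistic must name a column other than the first (gene ID) column
    if statistic not in header[1:]:
        problems.append(f"statistic {statistic!r} does not match a score column name")
    return problems

# Made-up tables for illustration only
ok_table = "gene,FDR\ngeneA,0.001\ngeneB,0.60\n"
bad_table = "gene\ngeneA\ngeneB\n"
print(check_gene_score_table(ok_table, "FDR"))
print(check_gene_score_table(bad_table, "FDR"))
```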

Data Formatting

Example gene score and gene-to-GO annotation mappings tables are displayed below.

Example Gene Score Tables

Example DE analysis gene score table for five genes:

Example WGCNA gene score table for five genes:

Example DE analysis gene score table of three genes with the minimum expected columns:

Example WGCNA gene score table of three genes with the minimum expected columns:

Example Gene-to-GO Term Mapping Tables

Example topGO gene-to-GO term mapping tables for four genes:

Example two column CSV with gene-to-GO term mapping tables for four genes:

Example PANNZER2 gene-to-GO annotations for two genes:

Processing

The functional analysis results and plots may take several moments to process depending on the size of the input tables.

Helpful Tips

Tip 1: The plots and results may take several moments to appear depending on the size of the input data tables.

Tip 2: Navigate to the Analysis, Exploration, or Results steps by clicking the tabs above.

Tip 3: Further details about the available types of enrichment tests can be found in the topGO manual (e.g., section 6).

Tip 4: It is possible to use both the Fisher's and KS tests since each gene has a score, which represents how it is differentially expressed.

Tip 5: Refer to the topGO manual for more information regarding the available algorithms and test statistics.

Functional Analysis

Begin the functional enrichment or over-representation analysis by selecting a test statistic, algorithm, and p-value cut off.

Select an Algorithm:

Default

Classic

Elim

Available Algorithms:

1. The default (a.k.a. weight01) algorithm used by the topGO package is a mixture of the elim and weight algorithms

2. The classic algorithm performs functional analysis by testing the over-representation of GO terms within the group of differentially expressed genes

3. The elim algorithm is more conservative than the classic method, so you may expect the p-values returned by the former method to be lower bounded by the p-values returned by the latter method

Select a Test Statistic:

Fisher

Kolmogorov-Smirnov

Available Test Statistics:

1. The Fisher's exact test is based on gene counts and can be used to perform over-representation analysis of GO terms

2. The Kolmogorov-Smirnov (KS) like test computes enrichment or rank based on gene scores and can be used to perform gene set enrichment analysis (GSEA)
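To make the count-based test more concrete, the following Python sketch computes a one-sided Fisher's exact (hypergeometric) p-value for a single GO term. It is an illustration under stated assumptions rather than the topGO implementation: the function and argument names are hypothetical, the counts are made up, and each GO term is treated independently, as in the classic algorithm.

```python
from math import comb

def fisher_over_representation(total_genes, significant_genes, annotated, annotated_significant):
    """One-sided p-value for over-representation of a GO term.

    Probability of drawing at least `annotated_significant` significant genes
    when `annotated` genes are drawn from a universe of `total_genes` genes,
    of which `significant_genes` are significant.
    """
    upper = min(annotated, significant_genes)
    tail = sum(
        comb(significant_genes, k) * comb(total_genes - significant_genes, annotated - k)
        for k in range(annotated_significant, upper + 1)
    )
    return tail / comb(total_genes, annotated)

# Made-up counts: 20 genes in the universe, 5 significant,
# 4 annotated to the GO term, 3 of which are significant
p_value = fisher_over_representation(20, 5, 4, 3)
print(round(p_value, 4))
```

Because the calculation depends on the size of the whole gene universe, filtering the gene score table in advance would change the resulting p-values.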

Select P-Value Cut Off:

Click to Update Analysis:

Note that the computed p-values are unadjusted for multiple testing.
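If adjusted p-values are needed, they can be computed from the downloaded results outside of this application. As one possible approach, the Python sketch below applies the Benjamini-Hochberg procedure to a list of raw p-values. The application itself does not perform this step, the function name is hypothetical, and the input values are made up. Note also that the GO term tests are not independent, so adjusted values should be interpreted with care.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices from smallest to largest p-value
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is capped by those ranked above it
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted

# Made-up raw p-values for four GO terms
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```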

Keep in mind that the plots and results may take several moments to update depending on the size of the input data tables.

Exploration

Begin exploring the GO term data and functional analysis results by selecting a GO term category (i.e., ontology level) below.

Note that it may take a moment for the analysis results to appear.

Select GO Term Category:

Biological Process

Molecular Function

Cellular Component

Click to Analyze:

Helpful Tips

Tip 1: Only significant GO terms may be plotted.

Tip 2: Make sure that the GO category is valid for the input GO term IDs.

Range of GO Term P-Values

Download Plot

The above histogram shows the range and frequency of p-values from the enrichment tests for the selected GO level (BP, MF, or CC).

Results for the Top Significant GO Terms:

The above table shows the functional analysis results for up to the top 5 most significant (lowest p-value) GO terms for the selected ontology level (BP, MF, or CC). The significance is determined by the input unadjusted p-value cut off.
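The selection described above can be sketched as a filter followed by a sort. The Python example below is illustrative only: the function name is hypothetical, the GO term IDs are placeholders with made-up p-values, and it assumes that a term counts as significant when its p-value is less than or equal to the cut off.

```python
def top_significant_terms(p_values_by_term, cutoff, limit=5):
    """Return up to `limit` (term, p-value) pairs at or below the cut off, lowest p-value first."""
    significant = [(term, p) for term, p in p_values_by_term.items() if p <= cutoff]
    significant.sort(key=lambda pair: pair[1])
    return significant[:limit]

# Placeholder GO term IDs with made-up unadjusted p-values
example_results = {
    "GO:0000001": 0.200, "GO:0000002": 0.001, "GO:0000003": 0.030,
    "GO:0000004": 0.049, "GO:0000005": 0.010, "GO:0000006": 0.002,
    "GO:0000007": 0.020,
}
top_terms = top_significant_terms(example_results, cutoff=0.05)
print(top_terms)
```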

Density Plots of GO Terms

Enter GO Term ID:

Click to Analyze:

Density Plot

Download Plot

The above density plot shows the distribution of gene ranks for the top GO term of each GO level (BP, MF, or CC). The gene ranks are compared with the null distribution.

Table of Gene IDs

Download Table

The table of gene IDs associated with the selected GO term may be downloaded above.

Table of Gene Data

Download Table

The table of gene data associated with the selected GO term may be downloaded above.

Euler Diagrams of GO Terms

Enter First GO Term ID:

Enter Second GO Term ID:

Click to Analyze:

Download Plot

The above Euler diagram shows the relationship between the sets of genes associated with the selected GO terms.
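The regions of a two-set diagram like this correspond to simple set operations on the gene IDs of each GO term. The Python sketch below illustrates the idea with made-up gene IDs. The function name is hypothetical and the code is not taken from the application.

```python
def euler_regions(first_genes, second_genes):
    """Split two gene lists into the three regions of a two-set diagram."""
    first, second = set(first_genes), set(second_genes)
    return {
        "first_only": sorted(first - second),   # genes annotated only to the first GO term
        "both": sorted(first & second),         # genes annotated to both GO terms
        "second_only": sorted(second - first),  # genes annotated only to the second GO term
    }

# Made-up gene IDs for two GO terms
regions = euler_regions(["geneA", "geneB", "geneC"], ["geneB", "geneC", "geneD"])
print(regions)
```

When one GO term is an ancestor of the other, one of the "only" regions is typically empty, because genes annotated to a child term are also counted for its ancestors.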

Tables of Gene IDs

Gene IDs for First GO Term:

Download Table

Gene IDs for Second GO Term:

Download Table

The tables of gene IDs associated with each of the selected GO terms may be downloaded above.

Tables of Gene Data

Gene Data for First GO Term:

Download Table

Gene Data for Second GO Term:

Download Table

The tables of gene data associated with each of the selected GO terms may be downloaded above.

Subgraphs of Significant GO Terms

Select the Number of Nodes:

Download Subgraphs:

Download PDF

The above subgraph is induced by the selected number of significant GO terms identified by the selected algorithm for scoring GO terms for enrichment. Rectangles indicate the significant terms with colors representing the relative significance, which ranges from dark red (most significant) to bright yellow (least significant).

For each node, some basic information is displayed. The first two lines show the GO identifier and a trimmed GO name. The third line shows the raw p-value. The fourth line shows the number of significant genes and the total number of genes annotated to the respective GO term.

Functional Analysis Results

Results from the GO term functional analysis may be viewed or downloaded below.

Note that it may take a moment for the analysis results to appear.

Dot Plot of Top Significant GO Terms

Download Plot

The above dot plot shows up to the top 5 most significant (lowest p-value) GO terms for each ontology level (BP, MF, CC). The significance is determined by the input unadjusted p-value cut off. The size of each dot indicates the number of observed significant features (e.g., genes) annotated to the GO term, which is compared to the expected number based on the null hypothesis. The dots are colored by the enrichment test p-values.
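The expected number mentioned above can be illustrated with a small calculation. Under the null hypothesis that significant genes are spread evenly across the gene universe, the expected count for a GO term is the number of genes annotated to it multiplied by the overall fraction of significant genes. The Python sketch below uses made-up counts and a hypothetical function name.

```python
def expected_significant(annotated, total_significant, total_genes):
    """Expected number of significant genes annotated to a GO term under the null hypothesis."""
    return annotated * total_significant / total_genes

# Made-up counts: 40 genes annotated to the term, 100 significant genes, 1000 genes in total
expected = expected_significant(annotated=40, total_significant=100, total_genes=1000)
observed = 10
print(expected)             # expected significant genes for this term
print(observed / expected)  # observed-to-expected ratio
```

An observed count well above the expected count suggests over-representation, which the enrichment test p-value then quantifies.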

Table of GO Term Results

Results for BP GO Terms:

Download BP Table

Results for MF GO Terms:

Download MF Table

Results for CC GO Terms:

Download CC Table

Above are the unfiltered tables of enriched or overrepresented GO terms for each of the ontology categories.

Tables of Significant GO Term Results

Results for Significant BP GO Terms:

Download BP Table

Results for Significant MF GO Terms:

Download MF Table

Results for Significant CC GO Terms:

Download CC Table

Above are the tables of significantly enriched or overrepresented GO terms for each ontology category, which have been filtered by the input p-value cut off.

Tables of Gene IDs for All GO Terms

Gene IDs for BP GO Terms:

Download BP Table

Gene IDs for MF GO Terms:

Download MF Table

Gene IDs for CC GO Terms:

Download CC Table

Above are the tables of genes associated with the significantly enriched or overrepresented GO terms for each ontology category, which have been filtered by the input p-value cut off.

Table of Formatted Gene-to-GO Term Mappings

Download Table

The above table of gene-to-GO term annotation mappings has been formatted for use with topGO.

Helpful Information

A tutorial for this application can be found in the scripts directory of the freeCount GitHub.

The latest version of this application may be downloaded from the freeCount GitHub.

Example gene scores and mappings tables are also provided on GitHub.

DE analysis gene score tables may be created from RNA-seq data as described in Bioinformatics Analysis of Omics Data with the Shell & R.

More information about the analysis performed in this application is provided in Gene set enrichment analysis with topGO.

Cite

Elizabeth Mae Brooks, Sheri A Sanders, and Michael E Pfrender. 2024. FreeCount: A Coding Free Framework for Guided Count Data Visualization and Analysis. In Practice and Experience in Advanced Research Computing 2024: Human Powered Computing (PEARC '24). Association for Computing Machinery, New York, NY, USA, Article 37, 1–4. https://doi.org/10.1145/3626203.3670605
