ArrayAnalysis version 0.1.0

Welcome to ArrayAnalysis!

Do you want to analyze microarray or RNA-Seq data?

Raw counts

Processed counts

CEL files

Processed intensities

Data upload

Before you can run the analysis, you first need to upload the expression data and metadata.

1. Upload expression data

The expression data should be supplied as a .zip archive containing all .CEL / .CEL.gz files. The file names should match the sample IDs in the metadata table. Click here for an example zip file.

Browse...

2. Upload metadata

The metadata includes relevant information about the samples (e.g., genotype or experimental group). You can upload the metadata as a .csv/.tsv file or as a Series Matrix File. Click here for an example .csv metadata file. A Series Matrix File can be downloaded from the GEO website.

.tsv/.csv file

Series Matrix File

Browse...
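Because the CEL file names must match the sample IDs in the metadata, it can help to check both files before uploading. The following is a minimal sketch, not part of ArrayAnalysis itself. It assumes that a sample ID is the file name without its .CEL / .CEL.gz extension and that the metadata is a .csv file whose ID column is called `SampleID`; both are assumptions, so adjust them to your own files.

```python
import csv
import io
import zipfile


def cel_sample_ids(zip_bytes):
    """Collect sample IDs from the .CEL / .CEL.gz file names in a zip archive.

    Assumption: the sample ID is the file name without folder and extension.
    """
    ids = set()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]
            lower = base.lower()
            for suffix in (".cel.gz", ".cel"):
                if lower.endswith(suffix):
                    ids.add(base[: -len(suffix)])
                    break
    return ids


def metadata_sample_ids(csv_text, id_column="SampleID"):
    """Collect sample IDs from a metadata table (hypothetical column name)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_column] for row in reader}


def compare_ids(zip_bytes, csv_text, id_column="SampleID"):
    """Return (IDs only in the zip, IDs only in the metadata), both sorted."""
    in_zip = cel_sample_ids(zip_bytes)
    in_meta = metadata_sample_ids(csv_text, id_column)
    return sorted(in_zip - in_meta), sorted(in_meta - in_zip)
```

If both returned lists are empty, every CEL file has a metadata row and vice versa.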

Pre-processing

In this pre-processing step, you can remove samples (e.g., outliers), perform normalization, and choose your desired probeset annotation.

1. Remove samples

Keep all samples

2. Select experimental group

3. Normalization

Use all arrays

Per experimental group

4. Annotation

Custom

Upload

None

Annotation format

Upload annotation file

Browse...

5. Pre-processing

Statistical analysis

In the statistical analysis step, you can find differentially expressed genes by selecting which groups to compare to each other and which covariates to add to the statistical model.

1. Make comparisons

2. Add covariates

3. Add gene annotation

Add gene annotations

4. Perform statistical analysis

Gene set analysis

ORA

GSEA

With Overrepresentation Analysis (ORA), you can identify dysregulated processes and pathways. These processes/pathways are identified by testing whether their genes are overrepresented among the (most) significant genes.

With Gene Set Enrichment Analysis (GSEA), you can find dysregulated processes and pathways. These processes/pathways are identified by testing whether their genes show concordant changes in the data.
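To make the idea of overrepresentation concrete, the sketch below computes a one-sided hypergeometric p-value, a common choice for ORA. The page does not state which test ArrayAnalysis uses, so treat this as an illustration under that assumption; the function and parameter names are hypothetical.

```python
from math import comb


def ora_p_value(universe_size, gene_set_size, selected_size, overlap):
    """Probability of seeing at least `overlap` gene-set members by chance.

    universe_size: number of genes tested in total
    gene_set_size: number of those genes that belong to the pathway
    selected_size: number of significant genes that were selected
    overlap:       number of selected genes that belong to the pathway
    """
    upper = min(gene_set_size, selected_size)
    total = comb(universe_size, selected_size)
    # Sum the hypergeometric probabilities for overlap, overlap + 1, ..., upper.
    tail = sum(
        comb(gene_set_size, k)
        * comb(universe_size - gene_set_size, selected_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

For example, with 10 genes in total, a pathway of 5 genes, and 5 selected genes that all fall in the pathway, only 1 of the 252 possible selections gives this overlap, so the p-value is 1/252.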

1. Select comparison

2. Select gene set collection

3. Select genes

Perform ORA on ...

Upregulated genes only

Downregulated genes only

Both

Select genes based on ...

Threshold

Top N

P threshold

Raw P value

Adjusted P value

logFC threshold

Top N most significant genes

3. Select ranking variable

logFC

-log p-value

-log p-value x sign logFC

4. Select gene identifier

Organism

Which gene ID to use?

5. Perform analysis
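The three ranking variables offered for GSEA can be written down as a small function. This is a sketch only: it assumes a base-10 logarithm, which the page does not specify, and the function names are hypothetical.

```python
import math


def ranking_score(log_fc, p_value, method):
    """Compute one gene's ranking value for the chosen ranking variable."""
    if method == "logFC":
        return log_fc
    # Assumption: base-10 logarithm; p_value must be greater than zero.
    neg_log_p = -math.log10(p_value)
    if method == "-log p-value":
        return neg_log_p
    if method == "-log p-value x sign logFC":
        sign = (log_fc > 0) - (log_fc < 0)
        return neg_log_p * sign
    raise ValueError(f"unknown ranking method: {method}")


def rank_genes(stats, method):
    """Order gene names from highest to lowest ranking value.

    stats maps a gene name to a (logFC, p-value) pair.
    """
    return sorted(
        stats,
        key=lambda gene: ranking_score(stats[gene][0], stats[gene][1], method),
        reverse=True,
    )
```

Note the practical difference: ranking by -log p-value alone places strongly up- and downregulated genes together at the top, whereas multiplying by the sign of the logFC places them at opposite ends of the list.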

Data upload

Before you can run the analysis, you first need to upload the expression data and metadata.

1. Upload expression data

The expression data should be supplied as a .tsv/.csv file or as a Series Matrix File. Click here for an example expression matrix in csv format. A Series Matrix File can be downloaded from the GEO website.

.tsv/.csv file

Series Matrix File

Browse...

2. Upload metadata

The metadata includes relevant information about the samples (e.g., genotype or experimental group). You can upload the metadata as a .csv/.tsv file or as a Series Matrix File. Click here for an example .csv metadata file. A Series Matrix File can be downloaded from the GEO website.

.tsv/.csv file

Series Matrix File

Browse...

Pre-processing

In this pre-processing step, you can remove samples (e.g., outliers) and perform transformation and normalization.

1. Remove samples

Keep all samples

2. Select experimental group

3. Transformation

4. Normalization

Use all arrays

Per experimental group

5. Pre-processing
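The difference between normalizing with all arrays and normalizing per experimental group can be illustrated with quantile normalization. The page does not say which normalization method is applied, so quantile normalization is an assumption here, and the sketch ignores ties and missing values for brevity.

```python
def quantile_normalize(columns):
    """Quantile-normalize samples so that they share one value distribution.

    columns maps a sample name to its list of expression values; all lists
    must have the same length and gene order.
    """
    names = list(columns)
    n_genes = len(columns[names[0]])
    sorted_cols = [sorted(columns[name]) for name in names]
    # Mean of the i-th smallest value across all samples.
    rank_means = [
        sum(col[i] for col in sorted_cols) / len(names) for i in range(n_genes)
    ]
    result = {}
    for name in names:
        values = columns[name]
        order = sorted(range(n_genes), key=lambda i: values[i])
        normalized = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            normalized[gene_index] = rank_means[rank]
        result[name] = normalized
    return result


def normalize(columns, groups=None):
    """Normalize across all samples, or separately within each group.

    groups maps a sample name to its experimental group; None means
    that all samples are normalized together.
    """
    if groups is None:
        return quantile_normalize(columns)
    result = {}
    for group in set(groups.values()):
        members = {s: v for s, v in columns.items() if groups[s] == group}
        result.update(quantile_normalize(members))
    return result
```

Normalizing per group keeps group-level differences in the overall distribution intact, while using all arrays forces every sample onto the same distribution.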

Statistical analysis

In the statistical analysis step, you can select which groups to compare to each other and which covariates to add to the statistical model.

1. Make comparisons

2. Add covariates

3. Add gene annotation

Add gene annotations

4. Perform statistical analysis

Gene set analysis

ORA

GSEA

With Overrepresentation Analysis (ORA), you can identify dysregulated processes and pathways. These processes/pathways are identified by testing whether their genes are overrepresented among the (most) significant genes.

With Gene Set Enrichment Analysis (GSEA), you can find dysregulated processes and pathways. These processes/pathways are identified by testing whether their genes show concordant changes in the data.
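The notion of concordant changes can be illustrated with a running-sum enrichment score. The sketch below is the simple unweighted variant, in which every hit counts equally; it is not necessarily the statistic that ArrayAnalysis computes, and the function name is hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_genes: gene names ordered by the ranking variable, highest first
    gene_set:     set of gene names belonging to the process or pathway
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up at a gene-set member, step down otherwise.
        running += 1 / n_hits if is_hit else -1 / n_misses
        if abs(running) > abs(best):
            best = running
    return best
```

A score near 1 means the gene set clusters at the top of the ranked list, and a score near -1 means it clusters at the bottom. In practice, significance is usually assessed by comparing the score against permuted data, which this sketch leaves out.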

1. Select comparison

2. Select gene set collection

3. Select genes

Perform ORA on ...

Upregulated genes only

Downregulated genes only

Both

Select genes based on ...

Threshold

Top N

P threshold

Raw P value

Adjusted P value

logFC threshold

Top N most significant genes

3. Select ranking variable

logFC

-log p-value

-log p-value x sign logFC

4. Select gene identifier

Organism

Which gene ID to use?

5. Perform analysis
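The gene selection options for ORA (direction, threshold versus top N, raw versus adjusted p-value) can be summarized in one function. This is a sketch under stated assumptions: the field names `gene`, `logFC`, `p`, and `adj_p` are hypothetical, the default cut-offs are placeholders, and strict inequalities are assumed for the thresholds.

```python
def select_genes(results, direction="both", mode="threshold",
                 p_column="adj_p", p_threshold=0.05,
                 logfc_threshold=1.0, top_n=100):
    """Select genes for ORA from a list of per-gene result records.

    direction: "up", "down", or "both"
    mode:      "threshold" (p-value and logFC cut-offs) or "top_n"
    p_column:  "p" for the raw or "adj_p" for the adjusted p-value
    """
    if direction == "up":
        candidates = [r for r in results if r["logFC"] > 0]
    elif direction == "down":
        candidates = [r for r in results if r["logFC"] < 0]
    elif direction == "both":
        candidates = list(results)
    else:
        raise ValueError(f"unknown direction: {direction}")

    if mode == "threshold":
        chosen = [
            r for r in candidates
            if r[p_column] < p_threshold and abs(r["logFC"]) > logfc_threshold
        ]
    elif mode == "top_n":
        # The most significant genes have the smallest p-values.
        chosen = sorted(candidates, key=lambda r: r[p_column])[:top_n]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [r["gene"] for r in chosen]
```

With the top N option the number of selected genes is fixed, which makes results easier to compare between comparisons; with thresholds the number of genes depends on the strength of the effect.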

Data upload

Before you can run the analysis, you first need to upload the expression data and metadata.

1. Upload expression data

The expression data should be supplied as a .tsv/.csv file. The expression data is a matrix with the genes in the rows and the samples in the columns. Click here for an example expression matrix.

Browse...

2. Upload metadata

The metadata includes relevant information about the samples (e.g., genotype or experimental group). You can upload the metadata as a .csv/.tsv file or as a Series Matrix File. Click here for an example .csv metadata file. A Series Matrix File can be downloaded from the GEO website.

.tsv/.csv file

Series Matrix File

Browse...

Pre-processing

In this pre-processing step, you can remove samples (e.g., outliers) and perform gene filtering and normalization.

1. Remove samples

Keep all samples

2. Select experimental group

3. Filtering

Minimum number of counts in smallest group size

4. Pre-processing
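One plausible reading of the filtering option is sketched below: a gene is kept when it reaches the minimum number of counts in at least as many samples as the smallest experimental group contains. This interpretation is an assumption rather than a documented description of the ArrayAnalysis filter, and the default of 10 counts is a placeholder.

```python
from collections import Counter


def filter_genes(counts, groups, min_count=10):
    """Keep genes with enough counts in enough samples.

    counts:    maps a gene name to its list of raw counts, one per sample
    groups:    list with the experimental group of each sample, same order
    min_count: minimum number of counts a sample needs to support the gene
    """
    smallest_group = min(Counter(groups).values())
    kept = {}
    for gene, row in counts.items():
        n_supporting = sum(1 for value in row if value >= min_count)
        if n_supporting >= smallest_group:
            kept[gene] = row
    return kept
```

Tying the required number of samples to the smallest group means that a gene expressed in only one group can still pass the filter.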

Statistical analysis

In the statistical analysis step, you can select which groups to compare to each other and which covariates to add to the statistical model.

1. Make comparisons

2. Add covariates

3. Perform logFC shrinkage

4. Add gene annotation

5. Perform statistical analysis

Gene set analysis

ORA

GSEA

With Overrepresentation Analysis (ORA), you can identify dysregulated processes and pathways. These processes/pathways are identified by testing whether their genes are overrepresented among the (most) significant genes.

With Gene Set Enrichment Analysis (GSEA), you can find dysregulated processes and pathways. These processes/pathways are identified by testing whether their genes show concordant changes in the data.

1. Select comparison

2. Select gene set collection

3. Select genes

Perform ORA on ...

Upregulated genes only

Downregulated genes only

Both

Select genes based on ...

Threshold

Top N

P threshold

Raw P value

Adjusted P value

logFC threshold

Top N most significant genes

3. Select ranking variable

logFC

-log p-value

-log p-value x sign logFC

4. Select gene identifier

Organism

Which gene ID to use?

5. Perform analysis

Data upload

Before you can run the analysis, you first need to upload the expression data and metadata.

1. Upload expression data

The expression data should be supplied as a .tsv/.csv file. The expression data is a matrix with the genes in the rows and the samples in the columns. Click here for an example expression matrix.

Browse...

2. Upload metadata

The metadata includes relevant information about the samples (e.g., genotype or experimental group). You can upload the metadata as a .csv/.tsv file or as a Series Matrix File. Click here for an example .csv metadata file. A Series Matrix File can be downloaded from the GEO website.

.tsv/.csv file

Series Matrix File

Browse...

Pre-processing

In this pre-processing step, you can remove samples (e.g., outliers) and perform transformation, gene filtering, and normalization.

1. Remove samples

Keep all samples

2. Select experimental group

3. Transformation

4. Filtering

5. Normalization

Use all samples

Per experimental group