The 34837 gene sets in the Human Molecular Signatures Database (MSigDB) are
divided into 9 major collections and several subcollections.
See the table below for a brief description of each, and the
Human MSigDB Collections: Details and Acknowledgments page for more
detailed descriptions. See also the latest MSigDB Release Notes.
Click the "browse gene sets" links in the table below to view the gene sets in a collection, or download
the gene sets in a collection by clicking the links under the "Download GMT Files" headings.
For a description of the GMT file format, see the Data Formats guide in the Documentation section.
The gene sets can be downloaded with genes given as NCBI (Entrez) Gene IDs or as HUGO (HGNC) Gene Symbols. JSON bundles containing the
human gene sets as HUGO (HGNC) Gene Symbols, together with additional metadata, are also available.
A SQLite database containing all the Human MSigDB gene sets is available as well.
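
As a rough illustration of reading a downloaded GMT file, the sketch below assumes the usual GMT layout: one gene set per line, tab-separated, with the set name first, a description (often a URL) second, and the member genes after that. The function name `parse_gmt` and the sample set names, URLs, and genes are made up for this example; the Data Formats guide remains the authoritative description of the format.

```python
import io


def parse_gmt(handle):
    """Return {set_name: (description, [genes])} read from a GMT file handle.

    Assumes each line is: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and lines without any genes
        name, description, *genes = fields
        # Drop empty strings left behind by trailing tabs.
        gene_sets[name] = (description, [g for g in genes if g])
    return gene_sets


# Hypothetical two-line GMT content, used in place of a real download.
sample = io.StringIO(
    "EXAMPLE_SET_UP\thttps://example.org/up\tTP53\tMYC\tEGFR\n"
    "EXAMPLE_SET_DN\thttps://example.org/dn\tBRCA1\tPTEN\n"
)
gene_sets = parse_gmt(sample)
for name, (description, genes) in gene_sets.items():
    print(name, len(genes), genes)
```

For a real file, the same function could be called as `parse_gmt(open(path, encoding="utf-8"))`, where `path` points to one of the Gene Symbols or NCBI (Entrez) Gene IDs downloads.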
H: hallmark gene sets
(browse
50 gene sets)
|
Hallmark gene sets summarize and represent specific well-defined biological states or
processes and display coherent expression. These gene sets were generated by a computational
methodology based on identifying overlaps between gene sets in other MSigDB collections and
retaining genes that display coordinate expression.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
C1: positional gene sets
(browse
302 gene sets)
|
Gene sets corresponding to human chromosome cytogenetic bands.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
C2: curated gene sets
(browse
7411 gene sets)
|
Gene sets in this collection are curated from various sources, including online pathway
databases and the biomedical literature. Many sets are also contributed by individual domain
experts. The gene set page for each gene set lists its source. The C2 collection is divided
into the following two subcollections: Chemical and genetic perturbations (CGP) and Canonical
pathways (CP).
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
CGP: chemical and genetic perturbations
(browse
3494 gene sets)
|
Gene sets that represent expression signatures of genetic and chemical perturbations. A number of these
gene sets come in pairs: an xxx_UP gene set representing genes induced by the perturbation and an
xxx_DN gene set representing genes repressed by it.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
CP: canonical pathways
(browse
3917 gene sets)
|
Gene sets from pathway databases. Usually, these gene sets are canonical representations of a
biological process compiled by domain experts.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
BioCarta subset of CP
(browse
292 gene sets)
|
Canonical Pathways gene sets derived from the BioCarta pathway database.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
KEGG_MEDICUS subset of CP
(browse
658 gene sets)
|
Canonical Pathways gene sets derived from the KEGG MEDICUS pathway database.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
PID subset of CP
(browse
196 gene sets)
|
Canonical Pathways gene sets derived from the PID pathway database.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
Reactome subset of CP
(browse
1736 gene sets)
|
Canonical Pathways gene sets derived from the Reactome pathway database.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
WikiPathways subset of CP
(browse
830 gene sets)
|
Canonical Pathways gene sets derived from the WikiPathways pathway database.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
KEGG_LEGACY subset of CP
(browse
186 gene sets)
|
Canonical Pathways gene sets derived from the KEGG pathway database. These are considered Legacy gene sets since the introduction of the gene sets based on the more recent KEGG MEDICUS data.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
C3: regulatory target gene sets
(browse
3713 gene sets)
|
Gene sets representing potential targets of regulation by transcription factors or microRNAs. The sets
consist of genes grouped by elements they share in their non-protein coding regions. The elements represent
known or likely cis-regulatory elements in promoters and 3'-UTRs.
The C3 collection is divided into two subcollections: microRNA targets (MIR) and transcription
factor targets (TFT).
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
MIR: microRNA targets
(browse
2598 gene sets)
|
All miRNA target prediction gene sets. Combined superset of the miRDB and MIR_LEGACY subsets.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
miRDB subset of MIR
(browse
2377 gene sets)
|
Gene sets containing high-confidence gene-level predictions of human miRNA targets as catalogued by the miRDB v6.0 algorithm
(Chen and Wang, 2020).
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
MIR_LEGACY subset of MIR
(browse
221 gene sets)
|
Older gene sets that contain genes sharing putative target sites (seed matches) of human mature miRNA in their 3'-UTRs.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
TFT: transcription factor targets
(browse
1115 gene sets)
|
All transcription factor target prediction gene sets. Combined superset of the GTRD and TFT_LEGACY subsets.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
GTRD subset of TFT
(browse
505 gene sets)
|
Gene sets whose genes share GTRD-predicted (Kolmykov et al. 2021) binding sites for the indicated
transcription factor in the region from -1000 to +100 bp around the transcription start site (TSS).
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
TFT_LEGACY subset of TFT
(browse
610 gene sets)
|
Older gene sets whose genes share upstream cis-regulatory motifs that can function as potential transcription factor
binding sites. Based on work by Xie et al. 2005.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
C4: computational gene sets
(browse
1006 gene sets)
|
Computational gene sets defined by mining large collections of cancer-oriented expression data.
The C4 collection is divided into three subcollections: 3CA, CGN and CM.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
3CA: Curated Cancer Cell Atlas
(browse
148 gene sets)
|
Gene sets mined from the Curated Cancer Cell Atlas (3CA) metaprograms. These sets consist of genes that are coordinately upregulated in subpopulations of cells within 24 tumor types,
covering both generic and lineage-specific cellular processes. The resource underlying this collection is described in Gavish et al. 2023.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
CGN: cancer gene neighborhoods
(browse
427 gene sets)
|
Gene sets defined by expression neighborhoods centered on 380 cancer-associated genes. This collection is described
in Subramanian, Tamayo et al. 2005.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
CM: cancer modules
(browse
431 gene sets)
|
Gene sets defined by Segal et al. 2004.
Briefly, the authors compiled gene sets ('modules') from a variety of resources such as KEGG, GO, and others. By mining
a large compendium of cancer-related microarray data, they identified 456 such modules as significantly changed in a
variety of cancer conditions.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
C5: ontology gene sets
(browse
16107 gene sets)
|
Gene sets that contain genes annotated by the same ontology term. The C5 collection is divided into two subcollections:
the first is derived from the Gene Ontology (GO) resource, which contains BP, CC, and MF components, and the second is derived
from the Human Phenotype Ontology (HPO).
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
GO: Gene Ontology gene sets
(browse
10454 gene sets)
|
All gene sets derived from Gene Ontology.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
BP: subset of GO
(browse
7608 gene sets)
|
Gene sets derived from the GO Biological Process ontology.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
CC: subset of GO
(browse
1026 gene sets)
|
Gene sets derived from the GO Cellular Component ontology.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
MF: subset of GO
(browse
1820 gene sets)
|
Gene sets derived from the GO Molecular Function ontology.
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
HPO: Human Phenotype Ontology
(browse
5653 gene sets)
|
Gene sets derived from the Human Phenotype Ontology.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
C6: oncogenic signature gene sets
(browse
189 gene sets)
|
Gene sets that represent signatures of cellular pathways that are often dysregulated in cancer.
The majority of signatures were generated directly from microarray data from NCBI GEO or from internal
unpublished profiling experiments involving perturbation of known cancer genes.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
C7: immunologic signature gene sets
(browse
5219 gene sets)
|
Gene sets that represent cell states and perturbations within the immune system.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
ImmuneSigDB subset of C7
(browse
4872 gene sets)
|
Gene sets representing chemical and genetic perturbations of the immune system
generated by manual curation of published studies in human and mouse immunology.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
VAX: vaccine response gene sets
(browse
347 gene sets)
|
Gene sets curated by the Human Immunology Project Consortium (HIPC) describing human
transcriptomic immune responses to vaccinations.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
C8: cell type signature gene sets
(browse
840 gene sets)
|
Gene sets that contain curated cluster markers for cell types identified in single-cell sequencing studies of human tissue.
details
|
Download GMT Files
Gene Symbols
NCBI (Entrez) Gene IDs
JSON bundle
|
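
The gene set counts listed in the table can be cross-checked with simple arithmetic. The sketch below copies the counts from the table and checks two things: that the nine major collections add up to the total quoted in the introduction, and that each collection the descriptions present as a combination of subcollections equals the sum of its parts. The variable names are illustrative, and the counts are only as current as this page. The named subsets of CP (BioCarta, KEGG_MEDICUS, and so on) are deliberately left out, since their counts do not add up to the CP total.

```python
# Gene set counts copied from the table above.
counts = {
    "H": 50, "C1": 302, "C2": 7411, "C3": 3713, "C4": 1006,
    "C5": 16107, "C6": 189, "C7": 5219, "C8": 840,
    "CGP": 3494, "CP": 3917,
    "MIR": 2598, "TFT": 1115,
    "3CA": 148, "CGN": 427, "CM": 431,
    "GO": 10454, "HPO": 5653,
    "BP": 7608, "CC": 1026, "MF": 1820,
}

major = ["H", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8"]

# Parent collections and the subcollections they are said to consist of.
parts = {
    "C2": ["CGP", "CP"],
    "C3": ["MIR", "TFT"],
    "C4": ["3CA", "CGN", "CM"],
    "C5": ["GO", "HPO"],
    "GO": ["BP", "CC", "MF"],
}

total = sum(counts[name] for name in major)

# Collect any parent whose count differs from the sum of its parts.
mismatches = {
    parent: (counts[parent], sum(counts[child] for child in children))
    for parent, children in parts.items()
    if counts[parent] != sum(counts[child] for child in children)
}

print("total across major collections:", total)
print("mismatches:", mismatches)
```

With the counts shown, the total comes to 34837 and no mismatches are reported.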