|
Gene Sets from Community Contributors
This page contains references to gene sets and collections from community contributors. These are not part of MSigDB but may be useful for certain analyses. The descriptions and all other information given below are courtesy of the contributors. Note that these contributions are under copyright and license terms as specified by the authors rather than the MSigDB license terms. If you have gene sets to contribute that might benefit others, either for this page or for inclusion in MSigDB, feel free to contact us at genesets@broadinstitute.org.

SysMyo Muscle Gene Sets
SysMyo has contributed a collection of Muscle Gene Sets:
"More than ten thousand samples of muscle transcriptomic data have been uploaded to the public Gene
Expression Omnibus in the past ten years, representing many millions of dollars of research expenditure
and incalculable hours of research effort. These data ought to serve as a massive reference set for
ongoing and future studies of neuromuscular disorders. One way to distil the data and render them more
accessible to bench researchers is to extract from each study lists of genes ("gene sets") that were
differentially expressed. With careful curation, each transcriptomic dataset may yield multiple comparisons,
not only relating to the primary focus of that study, such as a pathology or an experimental treatment, but
also more general comparisons not necessarily envisaged by the study's authors, but relating to factors
such as age, sex, and muscle group."
See their website for more information.

PorSignDB
Nicolaas Van Renne et al. have contributed PorSignDB:
"The Porcine Signature Database (PorSignDB) is a collection of annotated gene sets for use with GSEA
software. These gene sets were mostly derived from in vivo derived transcriptomic data, and describe a wide
spectrum of (patho)physiological states of different tissue types. Only a minority of gene sets describe cell
culture systems. Although the original data stems from pigs (Sus Scrofa), gene identifiers were adapted to human
orthologs in order to fit into the MSigDB collection and facilitate its application to data from any mammalian
species..."
See their website for more information.

BrainCortex_CellTypeSpecificGenes
Megan Hastings Hagenauer et al. have contributed the BrainCortex_CellTypeSpecificGenes gene sets, described in
https://www.biorxiv.org/content/early/2017/12/20/089391.full.pdf+html (preprint).
"Psychiatric illness is unlikely to arise from pathology occurring uniformly across all cell types in
affected brain regions. Despite this, transcriptomic analyses of the human brain have typically been
conducted using macro-dissected tissue due to the difficulty of performing single-cell type analyses with
donated post-mortem brains. To address this issue statistically, we compiled a database of several thousand
transcripts that were specifically-enriched in one of 10 primary cortical cell types, as identified in
previous publications..."
See their website for more information.

DSigDB
Minjae Yoo et al. have contributed DSigDB, described in
https://academic.oup.com/bioinformatics/article/31/18/3069/241009.
"We report the creation of Drug Signatures Database (DSigDB), a new gene set resource that relates
drugs/compounds and their target genes, for gene set enrichment analysis (GSEA). DSigDB currently holds
22527 gene sets, consists of 17389 unique compounds covering 19531 genes. We also developed an online
DSigDB resource that allows users to search, view and download drugs/compounds and gene sets. DSigDB gene
sets provide seamless integration to GSEA software for linking gene expressions with drugs/compounds for
drug repurposing and translational research."
See their website for more information.

Caenorhabditis elegans Co-Expression Cliques
Lukas Schmauder and Klaus Richter have contributed a clique map of the C. elegans transcriptome, described in
https://www.nature.com/articles/s41598-021-91690-6.
"Nematode development is characterized by progression through several larval stages. Thousands of genes were found in large scale RNAi-experiments to block this development at certain steps, two of which target the molecular chaperone HSP-90 and its cofactor UNC-45. Aiming to define the cause of arrest, we here investigate the status of nematodes after treatment with RNAi against hsp-90 and unc-45 by employing an in-depth transcriptional analysis of the arrested larvae. To identify misregulated transcriptional units, we calculate and validate genome-wide coexpression cliques covering the entire nematode genome. We define 307 coexpression cliques and more than half of these can be related to organismal functions by GO-term enrichment, phenotype enrichment or tissue enrichment analysis..... With most of the defined gene cliques showing concerted behaviour at some stage of development from embryo to late adult, the “clique map” together with the clique-specific GO-terms, tissue and phenotype assignments will be a valuable tool in understanding concerted responses on the genome-wide level in Caenorhabditis elegans."
See their GitHub repository for more information and to obtain the clique gene sets.

Saccharomyces cerevisiae Co-Expression Cliques
Siyuan Sima, Lukas Schmauder, and Klaus Richter have contributed a clique map of the S. cerevisiae transcriptome, described in
http://microbialcell.com/researcharticles/2019a-sima-microbial-cell/.
"We generated a set of 72 co-regulation cliques using the information from S. cerevisiae 3196 microarray experiments. The obtained cliques performed highly significant in gene ontology and transcription factor enrichment analyses. We then tested the clique set on individual microarray experiments reporting on responses to pheromone, glycerol versus glucose based growth and the cellular response to heat. In all cases a highly significant determination of affected expression cliques was possible based on their average expression differences, the positions of their genes within hit rankings (UpRegScore) or the enrichment of the Top200 hits in certain cliques. The 72 cliques were finally used to compare experiments, which reported on the transcriptional response to polyglutamine proteins of different lengths. Using the predefined clique set it is possible to identify with high sensitivity and good significance sample and condition specific changes to gene expression. We thus conclude that an analysis, starting with these 72 preformed expression cliques, can complement traditional microarray analyses by visualizing the entire response on a static genome-wide gene set."
See their GitHub repository for more information and to obtain the clique gene sets. |
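The S. cerevisiae description above mentions scoring cliques by the enrichment of the Top200 hits. As a rough illustration of that idea only, the sketch below counts how many of the top-ranked genes fall into each clique and attaches a one-sided hypergeometric p-value. This is not the contributors' implementation: the hypergeometric test is a generic stand-in chosen here, the UpRegScore is not reproduced, and the names `hypergeom_tail`, `clique_enrichment`, `ranked_genes`, and `cliques` are hypothetical.

```python
from math import comb


def hypergeom_tail(universe: int, clique_size: int, n_hits: int, overlap: int) -> float:
    """P(X >= overlap) when n_hits genes are drawn without replacement
    from `universe` genes, of which `clique_size` belong to the clique."""
    total = comb(universe, n_hits)
    upper = min(clique_size, n_hits)
    favourable = sum(
        comb(clique_size, k) * comb(universe - clique_size, n_hits - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total


def clique_enrichment(ranked_genes, cliques, top_n=200):
    """Return {clique name: (overlap with top hits, p-value)}.

    ranked_genes: gene identifiers ordered from strongest to weakest hit
                  (assumed input; the ranking criterion is up to the user).
    cliques:      mapping of clique name to an iterable of gene identifiers.
    """
    universe = set(ranked_genes)
    top_hits = set(ranked_genes[:top_n])
    results = {}
    for name, members in cliques.items():
        # Ignore clique members that were not measured in this experiment.
        measured = set(members) & universe
        overlap = len(measured & top_hits)
        p_value = hypergeom_tail(len(universe), len(measured), len(top_hits), overlap)
        results[name] = (overlap, p_value)
    return results
```

With many cliques tested at once, the resulting p-values would normally need a multiple-testing correction before any clique is called affected.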