PsychENCODE Project Phase II (2019-2024)

Publications by PsychENCODE Consortium Members during Phase II

Single-Cell Genomics and Regulatory Networks for 388 Human Brains

Prashant Emani, Jason Liu, Declan Clarke, Matthew Jensen, Jonathan Warrell, Chirag Gupta, Ran Meng, Che Yu Lee, Siwei Xu, Cagatay Dursun, Shaoke Lou, et al., PsychENCODE Consortium, Matthew Girgenti, Jing Zhang, Daifeng Wang, Daniel Geschwind, Mark Gerstein

Science, 2024

Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet, little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multi-omics datasets into a resource comprising >2.8M nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550K cell-type-specific regulatory elements and >1.4M single-cell expression-quantitative-trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ∼250 disease-risk genes and drug targets with associated cell types.

Data and Materials Availability:

Code and Data

Molecular Cascades and Cell Type-Specific Signatures in ASD Revealed by Single Cell Genomics

Brie Wamsley, et al., PsychENCODE Consortium, Daniel Geschwind

Science, 2024

Understanding how genetic variation exerts its effects on the human brain in health and disease has been greatly informed by functional genomic characterization. Studies over the last decade have demonstrated robust evidence of convergent transcriptional and epigenetic profiles in post-mortem cerebral cortex from individuals with Autism Spectrum Disorder (ASD). Here, we perform deep single-nucleus RNA sequencing (snRNA-seq) to elucidate changes in cell composition, cellular transcriptomes and putative candidate drivers associated with ASD, which we corroborate using snATAC-seq and spatial profiling. We find changes in cell state composition representing transitions from homeostatic to reactive profiles in microglia and astrocytes, a pattern extending to oligodendrocytes and blood-brain barrier cells. We identify profound changes in differential expression involving thousands of genes across neuronal and glial subtypes, of which a substantial portion can be accounted for by specific transcription factor networks that are significantly enriched in common and rare genetic risk for ASD. These data, which are available as part of the PsychENCODE consortium, provide robust causal anchors and resultant molecular phenotypes for understanding ASD changes in human brain.


Data and Materials Availability:

Transcriptional Data ASD single-cell browser

Analysis Code

Data and Content

Systems Biology Dissection of PTSD and MDD Across Brain Regions, Cell Types, and Blood

Nikolaos Daskalakis, et al., PsychENCODE Consortium, Charles Nemeroff, Joel Kleinman, Kerry Ressler

Science, 2024

The molecular pathology of stress-related disorders remains elusive. Our brain multiregion study of posttraumatic stress disorder (PTSD) and major depressive disorder (MDD) included the central nucleus of the amygdala, hippocampal dentate gyrus, and medial prefrontal cortex (mPFC). Genes and exons within the mPFC carried most disease signals replicated across two independent cohorts. Pathways pointed to immune function, neuronal and synaptic regulation, and stress hormones. Multiomic factor and gene network analyses provided the underlying genomic structure. Single nucleus RNA sequencing in dorsolateral PFC revealed dysregulated (stress-related) signals. Analyses of brain-blood intersections in >50,000 UK Biobank participants were conducted along with fine-mapping of the results of PTSD and MDD genome-wide association studies. Our data suggest shared and distinct molecular pathology in both disorders and propose potential therapeutic targets and biomarkers.


Data and Materials Availability:

GitHub

Zenodo

MVP GWAS summary statistics

Data

Results

Single-Cell Multi-Cohort Dissection of the Schizophrenia Transcriptome

W. Brad Ruzicka, Shahin Mohammadi, John Fullard, Jose Davila-Velderrain, et al., PsychENCODE Consortium, Panos Roussos, Manolis Kellis

Science, 2024

The complexity and heterogeneity of schizophrenia have hindered mechanistic elucidation and the development of more effective therapies. Here, we performed single-cell dissection of schizophrenia-associated transcriptomic changes in the human prefrontal cortex across 140 individuals in two independent cohorts. Excitatory neurons were the most affected cell group, with transcriptional changes converging on neurodevelopment and synapse-related molecular pathways. Transcriptional alterations included known genetic risk factors, suggesting convergence of rare and common genomic variants on neuronal population-specific alterations in schizophrenia. Based on the magnitude of schizophrenia-associated transcriptional change, we identified two populations of individuals with schizophrenia marked by expression of specific excitatory and inhibitory neuronal cell states. This single-cell atlas links transcriptomic changes to etiological genetic risk factors, contextualizing established knowledge within the human cortical cytoarchitecture and facilitating mechanistic understanding of schizophrenia pathophysiology and heterogeneity.

Data and Materials Availability:

Data

A Data-Driven Single-Cell and Spatial Transcriptomic Map of the Human Prefrontal Cortex 

Louise Huuki-Myers, et al., PsychENCODE Consortium, Leonardo Collado-Torres, Kristen Maynard

Science, 2024

The molecular organization of the human neocortex historically has been studied in the context of its histological layers. However, emerging spatial transcriptomic technologies have enabled unbiased identification of transcriptionally defined spatial domains that move beyond classic cytoarchitecture. We used the Visium spatial gene expression platform to generate a data-driven molecular neuroanatomical atlas across the anterior-posterior axis of the human dorsolateral prefrontal cortex. Integration with paired single-nucleus RNA-sequencing data revealed distinct cell type compositions and cell-cell interactions across spatial domains. Using PsychENCODE and publicly available data, we mapped the enrichment of cell types and genes associated with neuropsychiatric disorders to discrete spatial domains.


Data and Materials Availability:

spatialDLPFC

Raw Data

Figures and Tables

Source Data jhpce#spatialDLPFC jhpce#DLPFC_snRNAseq

Analysis Code

Genetic Regulation of Cell Type Specific Chromatin Accessibility Shapes Etiology of Brain Diseases

Biao Zeng, Jaroslav Bendl, et al., PsychENCODE Consortium, John Fullard, Gabriel Hoffman, Panos Roussos

Science, 2024

Nucleotide variants in cell type-specific gene regulatory elements in the human brain are risk factors for human disease. We measured chromatin accessibility in 1,932 aliquots of sorted neurons and non-neurons from 616 human postmortem brains, and identified 34,539 open chromatin regions with chromatin accessibility quantitative trait loci (caQTLs). Only 10.4% of caQTLs are shared between neurons and non-neurons, supporting cell type-specific genetic regulation of the brain regulome. Incorporating allele-specific chromatin accessibility improves statistical fine-mapping and refines the molecular mechanisms that underlie disease risk. We screened 19,893 brain QTLs and identified the functional impact of 476 regulatory variants. Combined, this comprehensive resource captures variation in the human brain regulome and provides insights into disease etiology.


Data and Materials Availability:

Source Data AMP-AD cohort

Data Access Requirements AMP-AD cohort

Source Data CMP cohort

Massively Parallel Characterization of Regulatory Elements in the Developing Human Cortex  

Chengyu Deng, Sean Whalen, et al., PsychENCODE Consortium, Nadav Ahituv, Katherine Pollard

Science, 2024

Nucleotide changes in gene regulatory elements are important determinants of neuronal development and disease. Using massively parallel reporter assays in primary human cells from mid-gestation cortex and cerebral organoids, we interrogated the cis-regulatory activity of 102,767 open chromatin regions, including thousands of sequences with cell-type specific accessibility and variants associated with brain gene regulation. In primary cells, we identified 46,802 active enhancer sequences and 164 variants that alter enhancer activity. Activity was comparable in organoids and primary cells, suggesting that organoids provide an adequate model for the developing cortex. Using deep learning, we decoded the sequence basis and upstream regulators of enhancer activity. This work establishes a comprehensive catalog of functional gene regulatory elements and variants in human neuronal development.

Data and Materials Availability:

Data

Cross-Ancestry Atlas of Gene, Isoform, and Splicing Regulation in the Developing Human Brain

Cindy Wen, et al., PsychENCODE Consortium, Chunyu Liu, Michael Gandal

Science, 2024

Neuropsychiatric genome-wide association studies (GWAS), including those for autism, schizophrenia, and bipolar disorder, are strongly enriched for genomic regulatory elements in the developing brain. However, prioritizing candidate risk genes and mechanisms is challenging without a unified regulatory atlas. Across 672 diverse developing human brain samples, we identified 15,752 genes harboring gene, isoform, and/or splicing quantitative trait loci (xQTLs) and mapped 3,739 QTLs to cellular contexts. Gene expression heritability drops during brain development, likely reflecting both increasing cellular heterogeneity and intrinsic properties of neurons as they mature. Isoform-level regulation, particularly in the second trimester, mediated the largest proportion of GWAS heritability. Via colocalization, we prioritized candidate mechanisms for ~60% of GWAS loci across disorders, exceeding adult brain findings. Finally, we contextualized results within gene/isoform co-expression networks, revealing the comprehensive landscape of genetic regulation in development and disease.

Data and Materials Availability:

Interactive Portal

Brain Expression Data

BioProject

Extended Data

Analysis Code

Synapse Knowledge Portal

Developmental Isoform Diversity in the Human Neocortex Informs Neuropsychiatric Risk Mechanisms

Ashok Patowary, Pan Zhang, Connor Jops, et al., PsychENCODE Consortium, Michael Gandal, Luis de la Torre-Ubieta

Science, 2024

RNA splicing is highly prevalent in the brain and has strong links to neuropsychiatric disorders, yet the role of cell-type-specific splicing or transcript-isoform diversity during human brain development has not been systematically investigated. Here, we leveraged single-molecule long-read sequencing to deeply profile the full-length transcriptome of the germinal zone (GZ) and cortical plate (CP) regions of the developing human neocortex at tissue and single-cell resolution. We identified 214,516 unique isoforms, of which 72.6% are novel (unannotated in Gencode-v33), and uncovered a substantial contribution of transcript-isoform diversity, regulated by RNA binding proteins, in defining cellular identity in the developing neocortex. We leveraged this comprehensive isoform-centric gene annotation to re-prioritize thousands of rare de novo risk variants and elucidate genetic risk mechanisms for neuropsychiatric disorders.

Data and Materials Availability:

Exploration of Dataset

IsoSeq Bulk and Single-cell

Mid-gestation Neocortex Isoform UCSC track hub

Using a Comprehensive Atlas and Predictive Models to Reveal the Complexity and Evolution of Brain-Active Regulatory Elements

Henry Pratt, Greg Andrews, Nicole Shedd, et al., PsychENCODE Consortium, Zhiping Weng

Science Advances, 2024

Most genetic variants associated with psychiatric disorders are located in non-coding regions of the genome. To enhance understanding of their functional implications, we integrate epigenetic data from the PsychENCODE Consortium and other published sources to construct a comprehensive atlas of candidate brain cis-regulatory elements. Using deep learning, we model these elements' sequence syntax, and predict how binding sites for lineage-specific transcription factors contribute to cell type-specific gene regulation in various types of glia and neurons. The elements' evolutionary history suggests new regulatory information in the brain emerges primarily via smaller sequence mutations within conserved mammalian elements, rather than entirely new human- or primate-specific sequences. However, primate-specific candidate elements, particularly those active during fetal brain development and in excitatory neurons and astrocytes, are implicated in the heritability of brain-related human traits. Additionally, we introduce PsychSCREEN, a web-based platform offering interactive visualization of PsychENCODE-generated genetic and epigenetic data from diverse brain cell types in individuals with psychiatric disorders and healthy controls.

Data and Materials Availability:

PsychSCREEN Web Tool

Data

Brain Cell-Type Shifts in Alzheimer’s Disease, Autism, and Schizophrenia Interrogated using Methylomics and Genetics

Chloe Yap, et al., PsychENCODE Consortium, Michael Gandal

Science Advances, 2024

Most neuropsychiatric disorders lack clearly defined cellular or molecular markers. Previous studies investigating neuropathologic signatures of psychiatric diagnoses have often relied on small cohorts and discrete cell-type markers, and have been unable to disentangle cause from consequence. Furthermore, efforts to investigate brain cell-type proportion (CTP) shifts have been hampered by the expense and sampling bias of single-cell experiments and a focus on deconvolution of bulk RNA-seq, whose biological and statistical properties make it an inferior data type for CTP deconvolution. Here, we leverage advances in brain single-cell methylomics and develop a novel framework to deconvolve 7 brain CTPs from bulk DNA methylation data. We apply this framework to uniformly processed bulk methylation data from 1,270 postmortem human brain samples, including donors diagnosed with autism (n=31), schizophrenia (n=186), and Alzheimer’s disease (n=300). We observe subtle but global diagnosis-associated CTP shifts for Alzheimer’s disease (endothelial cell loss), autism (increased microglia) and schizophrenia (decreased oligodendrocytes), with the former two robust to replication. There were also substantial sex- and age-related CTP shifts. We found significant associations between endothelial cell loss and increased common variant risk for Alzheimer’s disease, implying that endothelial cell loss may play a causal or pleiotropic role in Alzheimer’s disease. In a genome-wide association study, we identify 5 loci significantly associated with cell-type compositional shifts, which in turn mapped to MYT1 (inhibitory neurons), CSF1 (astrocytes), SIPA1L2, GLRX5 and SHPK-TRPV1. These results systematically characterize cell-type vulnerability across neurodevelopmental and neurodegenerative diagnoses and provide a framework for investigation of cellular compositional shifts in the biology of neuropsychiatric traits.

Data and Materials Availability:

Analysis Code

GWAS Summary Statistics

Processed Methylations Beta Matrix

ROSMAP, LIBD

UCLA-ASD Genotypes

AD Knowledge Portal

SNP genotypes were downloaded from dbGaP accession phs000417.v2.p1. UCLA-ASD

Evaluating Performance and Applications of Sample-Wise Cell Deconvolution Methods on Human Brain Transcriptomic Data

Rujia Dai, et al., PsychENCODE Consortium, Chao Chen, Chunyu Liu

Science Advances, 2024

Sample-wise deconvolution methods estimate cell-type proportions and gene expressions in bulk-tissue samples, yet their performance and biological applications remain unexplored, particularly in human brain transcriptomic data. Here, nine deconvolution methods were evaluated with sample-matched data from bulk-tissue RNAseq, single-cell/nuclei (sc/sn) RNAseq, and immunohistochemistry. A total of 1,130,767 nuclei/cells from 149 adult postmortem brains and 72 organoid samples were used. The results showed the best performance of dtangle for estimating cell proportions and bMIND for estimating sample-wise cell-type gene expressions. For eight brain cell types, 25,273 cell-type eQTLs were identified with deconvoluted expressions (decon-eQTLs). The results showed that decon-eQTLs explained more schizophrenia GWAS heritability than bulk-tissue or single-cell eQTLs did alone. Differential gene expressions associated with Alzheimer's disease, schizophrenia, and brain development were also examined using the deconvoluted data. Our findings, which were replicated in bulk-tissue and single-cell data, provided insights into the biological applications of deconvoluted data in multiple brain disorders.


Data and Materials Availability:

scRNAseq_HybridCT_BICCNandSestan/CMC

HumanStudies/Yale-ASD/Data/RNAseq

Yale-ASD/Data/RNAseq/SingleNuclei

BrainGVEX/Data/RNA-seq

CommonMind

Analysis Code 

Analysis Results

Characterization of Enhancer Activity in Early Human Neurodevelopment using Massively Parallel Reporter Assay (MPRA) and Forebrain Organoids

Davide Capauto, et al., PsychENCODE Consortium, Alexej Abyzov, Flora Vaccarino

Scientific Reports, 2024

Regulation of gene expression through enhancers is one of the major processes shaping the structure and function of the human brain during development. High-throughput assays have predicted thousands of enhancers involved in neurodevelopment, and confirming their activity through orthogonal functional assays is crucial. Here, we utilized Massively Parallel Reporter Assays (MPRAs) in stem cells and forebrain organoids to evaluate the activity of ~7,000 gene-linked enhancers previously identified in human fetal tissues and brain organoids. We used a Gaussian mixture model to evaluate the contribution of background noise in the measured activity signal to confirm the activity of ~35% of the tested enhancers, with most showing temporal-specific activity, suggesting their evolving role in neurodevelopment. The temporal specificity was further supported by the correlation of activity with gene expression. Our findings provide a valuable gene regulatory resource to the scientific community. 

Data and Materials Availability:

Content

Source Bulk RNA-seq

scENCORE: Leveraging Single-Cell Epigenetic Data to Predict Chromatin Conformation using Graph Embedding

Ziheng Duan, et al., PsychENCODE Consortium, Jing Zhang

Briefings in Bioinformatics, 2024

Dynamic compartmentalization of eukaryotic DNA into active and repressed states enables diverse transcriptional programs to arise from a single genetic blueprint, whereas its dysregulation can be strongly linked to a broad spectrum of diseases. While single-cell Hi-C experiments allow for chromosome conformation profiling across many cells, they are still expensive and not widely available for most labs. Here, we propose an alternate approach, scENCORE, to computationally reconstruct chromatin compartments from the more affordable and widely accessible single-cell epigenetic data. First, scENCORE constructs a long-range epigenetic correlation graph to mimic chromatin interaction frequencies, where nodes and edges represent genome bins and their correlations. Then, it learns the node embeddings to cluster genome regions into A/B compartments and aligns different graphs to quantify chromatin conformation changes across conditions. Benchmarking using cell-type-matched Hi-C experiments demonstrates that scENCORE can robustly reconstruct A/B compartments in a cell-type-specific manner. Furthermore, our chromatin conformation switching studies highlight substantial compartment-switching events that may introduce substantial regulatory and transcriptional changes in psychiatric disease. In summary, scENCORE allows accurate and cost-effective A/B compartment reconstruction to delineate higher-order chromatin structure heterogeneity in complex tissues.

Data and Materials Availability:

Supplementary Data