Transcriptional Changes following Cellular Knockdown of the Schizophrenia Risk Gene SETD1A Are Enriched for Common Variant Association with the Disorder

Loss of function mutations in SETD1A are the first experiment-wide significant findings to emerge from exome sequencing studies of schizophrenia. Although SETD1Ais known to encode a histone methyltransferase, the consequences of reduced SETD1A activity on gene expression in neural cells have, to date, been unknown. To explore transcriptional changes through which genetic perturbation of SETD1A could confer risk for schizophrenia, we have performed genome-wide gene expression profiling of a commonly used human neuroblastoma cell line in which SETD1A expression has been experimentally reduced using RNA interference (RNAi). We identified 1,031 gene expression changes that were significant in two separate RNAi conditions compared with control, including effects on genes of known neurodevelopmental importance such as DCX and DLX5. Genes that were differentially expressed following SETD1A knockdown were enriched for annotation to metabolic pathways, peptidase regulator activity and integrin-mediated regulation of cell adhesion. Moreover, differentially expressed genes were enriched for common variant association with schizophrenia, suggesting a degree of molecular convergence between this rare schizophrenia risk factor and susceptibility variants for the disorder operating more generally.


Introduction
Large-scale genomic studies have successfully identified many risk loci for schizophrenia [1]. These include common genetic variants with individually weak effects on susceptibility, as well as rare variants of stronger effect. Building on earlier observations [2], a recent large-scale exome sequencing study identified loss of function (LoF) variants in the SETD1A gene (encoding SET Domain Containing 1A) as the first exome-wide significant genetic risk factor for schizophrenia [3]. Although rare in the disorder, LoF mutations in SETD1A are substantially depleted in people without a neuropsychiatric diagnosis, identified in only 2 of 45,376 such individuals (compared This article is licensed under the Creative Commons Attribution 4.0 International License (CC BY) (http://www.karger.com/Services/ OpenAccessLicense). Usage, derivative works and distribution are permitted provided that proper credit is given to the author and the original publisher. DOI: 10.1159/000497181 to a frequency of 0.13% in schizophrenia cases) in the study of Singh et al. [3]. Furthermore, rare, heterozygous LoF mutations in SETD1A have been found in individuals diagnosed as having developmental disorders and childhood apraxia of speech [3,4].
These mutations (nonsense, frameshifting and splice site mutations) are broadly distributed throughout SETD1A and are predicted to lead to premature translational termination before the conserved SET domain encoded by exons [16][17][18][19]. Although the mutant allele may be transcribed, it is likely to be degraded by nonsense mediated decay effectively resulting in SETD1A haplo-insufficiency. It is therefore unlikely that the protein produced from the mutant allele may behave in a dominant negative manner. These heterozygous mutations are therefore predicted to result in only 1 functional copy of SETD1A.
In contrast to the majority of loci identified by genome-wide association studies (GWAS) where the target gene(s) and pathogenic mechanism are often unclear [5], this finding establishes haplo-insufficiency of SETD1A as a defined genetic risk factor for schizophrenia.
Functional investigation of genes at schizophrenia risk loci (e.g. [6][7][8][9]) can serve as a starting point for uncovering the biological basis of the disorder. SETD1A is known to encode a broadly expressed protein that regulates chromatin structure by catalyzing the methylation of histone H3 at lysine residue 4 (H3K4me). Methylation of H3K4 is indicative of positive gene regulation, with monomethylation (H3K4me1) enriched at active and primed enhancers, while trimethylation (H3K4me3) is usually found at the promoters of actively transcribed genes [10]. Reduced SETD1A activity through pathogenic LoF mutations is therefore likely to have a major impact on gene expression via alterations to the histone code. While gene expression changes have been described following SETD1A knockdown in non-neural cells [11][12][13], to date, no such investigations have been reported in cells of neural origin. Therefore, in order to explore transcriptional changes through which genetic perturbation of SETD1A could confer risk for schizophrenia, we have performed genome-wide gene expression profiling of a human neural cell line in which SETD1A expression has been experimentally reduced.

Materials and Methods
Cell Culture and siRNA Transfection SH-SY5Y cells (ATCC) were maintained in DMEM/F 12 containing glutamine and 10% fetal bovine serum. Knockdown of endogenous SETD1A expression was achieved by RNA interference (RNAi), using two non-overlapping siRNA targeting full-length SETD1A mRNA (Ensembl ENSG00000099381) in two separate RNAi conditions (ThermoFisher Scientific). Silencer select SETD1A siRNA #1 (s18789) targets exons 2/3 of SETD1A, and silencer select SETD1A siRNA #2 (s18790) targets exon 14. A negative control siRNA (ThermoFisher Scientific), with no sequence similarity with any human transcript, was used for the control comparison condition. Twenty-four hours before transfection, cells were seeded into T25 flasks at a density of 300,000 cells/flask. 10 nM control siRNA, SETD1A siRNA #1 or SETD1A siRNA #2 were transfected into individual flasks using Lipofectamine RNAiMAX (ThermoFisher Scientific), with 4 biological replicates per condition. In order to estimate transfection efficiency, an additional T25 flask of seeded cells was transfected with 10 nM BLOCK-iT TM Alexa Fluor ® Red Fluorescent Control oligonucleotide (ThermoFisher Scientific), using the same reagent, and visualized after 24 h by fluorescence microscopy. Cells were harvested for RNA extraction 96 h after transfection.

RNA Extraction and cDNA Synthesis
Total RNA was extracted from control and SETD1A RNAi conditions using Trizol (ThermoFisher Scientific), following the manufacturer's instructions. Residual genomic DNA was removed by addition of 2 μL Turbo DNA-free TM (ThermoFisher Scientific) and incubation at 37 ° C for 30 min. Integrity of total RNA was assessed using the Agilent 2100 Bioanalyzer (Agilent Technologies). All samples had an RNA integrity number > 9.5.

Assessment of SETD1A RNA Knockdown
Quantitative PCR (qPCR) was used to assess the level of SETD1A RNA knockdown in each SETD1A siRNA condition (relative to the control siRNA condition) prior to genome-wide gene expression profiling. Total RNA was converted to cDNA using random decamers and SuperScript ® III (ThermoFisher Scientific). SETD1A RNA expression was measured using the qPCR primers: F: 5′-ACCTGGCCAGATTCATCAAC-3′ and 5′-CGATC TTC-TT CTGGGACTCG-3′. Expression of GAPDH and ACTB was simultaneously measured as internal control genes. The expression of these housekeeping genes was subsequently found not to differ between siRNA and control conditions in either cell line in the microarray data (all p > 0.05). Reactions were performed using SYBR TM Green PCR Master Mix (ThermoFisher Scientific) on an Applied Biosystems StepOnePlus Real-Time PCR system. Triplicate qPCR reactions were performed to measure expression of each gene in each cDNA sample. Expression of each gene was measured against a standard curve constructed by serial dilution of pooled cDNA. Mean measures of SETD1A expression were divided by the geometric average of the mean measures for the 2 internal control genes to yield a normalized SETD1A expression value for each sample. Normalized SETD1A expression values were compared between each SETD1A siRNA condition and the negative control siRNA condition by t tests (2-tailed). of the resulting cRNA were measured by SpectroStar and the Agilent Bioanalyser. Finally cRNA was hybridized to the HumanHT-12 v4 BeadChip overnight followed by washing, staining and scanning using the Bead Array Reader (Illumina). All microarray data from this study have been submitted to the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/), with the accession No. GSE118941.

Identification of Differentially Expressed Probes
Data were extracted from GenomeStudio software (Illumina) and variance stabilizing transformation and robust spline normalization applied using the lumi Bioconductor package [14]. Data were then log 2 transformed. Probes that were detected (detection p value < 0.05) in all control replicates and in all replicates of at least one of the knockdown conditions were retained for further analysis. Probes at which gene expression in each SETD1A siRNA condition differed from the siRNA control condition were identified by individual t tests (two-tailed). Consistently with previous studies [6,7] gene expression changes with p value < 0.05 in both SETD1A knockdown conditions and a concordant direction of effect were used for all subsequent analyses.

Validation of Gene Expression Changes
We used qPCR to confirm expression changes in selected genes following SETD1A knockdown using RNA extracted from the same samples as sent for microarray. The 10 samples that passed microarray QC were included for all analyses. Expression of EIF4A1 and FUBP3 was simultaneously measured as internal control genes (selected as control genes from the microarray results). Primer sequences are listed in online supplementary Table S1 (see www.karger.com/doi/10.1159/000/497181 for all online suppl. material). Reactions were carried out in triplicate, with expression of each gene measured against a standard curve, as described above. Expression changes that occurred in the same direction as in the microarray and with a t test p < 0.05 (1-tailed) were considered significant.

Gene Ontology/Pathway Analysis
We tested for potential enrichment of high confidence gene expression changes within gene ontology (GO) terms and KEGG pathways using the DAVID Bioinformatics Resource [15]. We tested for enrichment of (1) all genes shared by the two SETD1A siRNA conditions, (2) genes shared by the two siRNA conditions that were downregulated in association with SETD1A knockdown, and (3) genes shared by the two siRNA conditions that were upregulated in association with SETD1A knockdown. All gene probes that had a detection p value < 0.05 in all samples were used as the background comparison. We highlight enrichments with a false discovery rate (FDR) < 5%.
Grouping of categories was performed using Enrichment Map [16]. Nominally significant (uncorrected p < 0.05) categories identified by DAVID analysis of the downregulated genes were used as input for grouping by Enrichment Map using default setting. The cluster containing the KEGG pathway "hsa01100: metabolic pathways" was retained. Category size (i.e., number of genes) was used to scale the size of each node. The overlap coefficient was used to scale the size of the edges.
Overlap with brain co-expression modules [17] was performed using Fisher exact tests. Genes common to both studies were used as the background.

Gene Set Enrichment Analysis Using Schizophrenia GWAS Data
Differentially expressed genes shared by the two SETD1A siRNA conditions were used as a gene set for enrichment testing with MAGMA [18]. MAGMA performs tests of competitive gene set enrichment using summary results from GWAS, controlling for confounders such as gene size, gene density and linkage disequilibrium. We tested for enrichment of schizophrenia common variant signal using summary data from a recent schizophrenia GWAS containing a total of 105,318 participants [19]. A window of 10 kb around each gene was included for single-nucleotide polymorphism annotations. MAGMA was run with other parameters set to the default. In order to control for any inflation of signal arising from use of a neural cell line, we used the set of protein-coding genes detected as expressed (p value < 0.05), as well as the default set of all protein-coding genes, for background comparisons.

Results
We used two non-overlapping siRNA (SETD1A siRNA #1 and siRNA #2) to knock down endogenous SETD1A expression in a human neuroblastoma cell line (SH-SY5Y). Transfection efficiency, as indexed by uptake of BLOCK-iT TM Alexa Fluor ® Red Fluorescent Control oligonucleotide, was estimated to be > 90% (online suppl. Fig. S1). At harvest, mean SETD1A RNA expression (as indexed by qPCR) was reduced by 49.5% (p = 0.007) in the SETD1A siRNA #1 condition and 57% (p = 0.007) in the SETD1A siRNA #2 condition, relative to the control siRNA condition (Fig. 1).
We detected gene expression at 16,906 microarray probes, corresponding to 12,514 unique Entrez gene IDs. Gene expression differed significantly (p < 0.05) between the SETD1A siRNA #1 condition and the control siRNA condition at 2,384 microarray probes, and between the SETD1A siRNA #2 and the control siRNA condition at 4,369 microarray probes. We produced a high confidence set of genes affected by SETD1A knockdown by first identifying 1,184 probes that were differentially expressed in both siRNA conditions compared to the control condition (overlap p = 2.3 × 10 -162 , hypergeometric test). Next, we restricted this list to probes for which the direction of expression change was consistent between the two siRNA conditions. Of the 1,184 overlapping probes at p < 0.05, 1,131 (96%) had a concordant direction of effect between the two siRNA conditions. Of these, 514 were upregulated and 617 downregulated (online suppl. Table S2).
We sought to validate selected gene expression changes using qPCR. Of the 7 selected genes, we were able to confirm significant directional expression changes associated with both SETD1A siRNA conditions for 5, and significant directional changes associated with one SETD1A siRNA condition for the remaining 2 (Table 1). These include expression changes in genes of known neurodevelopmental importance such as DCX, encoding doublecortin, and DLX5, encoding Distal-Less Homeobox 5 [20,21].
We used GO/KEGG pathway enrichment analysis to investigate potential overrepresentation of differentially expressed genes shared by the two SETD1A siRNA conditions within particular biological annotations. Compared to the background of all detected genes, this gene set was significantly enriched for annotation to the KEGG pathway "hsa01100: metabolic pathways" (FDR = 0.07%) as well as the GO terms "GO: 0033628∼regulation of cell adhesion mediated by integrin" (FDR = 2.07%) and "GO: 0061134∼peptidase regulator activity" (FDR = 4.49%). A full list of nominally significant annotations among differentially expressed genes can be found in online supplementary Table S3. We also performed separate enrichment analyses for genes shared by the two SETD1A siRNA conditions that were either down-or upregulated in association with SETD1A knockdown (online suppl. Tables S4 and S5). Downregulated genes were also enriched for the KEGG pathway "hsa01100: metabolic pathways" (FDR = 0.23%) as well as the GO term "GO: 0061134∼peptidase regulator activity" (FDR = 2.58%), but were additionally enriched for the GO terms "GO: 0005739∼mitochondrion" (FDR = 0.29%) and "GO: 0006506∼GPI anchor biosynthetic process" (FDR = 3.21%). Upregulated genes were enriched for terms relating to phosphatase activity, with "GO: 0050308∼sugarphosphatase activity" the most significant term (FDR = 1.36%). Along with several of the enriched GO categories, the KEGG pathway "hsa01100: metabolic pathways" contains a large number of genes. We therefore investigated the relationship between all nominally significant (uncorrected p < 0.05) categories identified from the DAVID analysis of the downregulated genes using Enrichment Map. The KEGG pathway "hsa01100: metabolic pathways" shares genes with KEGG and GO categories relating to carbon and carbohydrate metabolism, which are then linked to mitochondrial terms (online suppl. Fig. S2).
Although SETD1A is part of a brain gene co-expression module enriched for gene regulators [17], we observed no enrichment of dysregulated genes in these brain developmental modules (online suppl. Table S6). We also observed that its expression was apparent in many tissues, with little evidence for enrichment in brain tissue (online suppl. Fig. S3). Rare LoF mutations in SETD1A appear to be a strong risk factor for schizophrenia, but the extent to which their effects converge on molecules affected by common risk variants for the disorder is unknown. We therefore tested for enrichment of schizophrenia association signal in genes differentially expressed following SETD1A knockdown using summary statistics from the largest schizophrenia GWAS published to date [19]. Genes that were differentially expressed in association with both SETD1A siRNA conditions were enriched for schizophrenia association using either the default of all protein-coding genes as the background comparison (p = 0.016) or all genes expressed in the tested SH-SY5Y cells (p = 0.036). MAG-MA schizophrenia association statistics for differentially expressed genes are provided in online supplementary Table S7.

Discussion
LoF variants in SETD1A are the first findings from exome sequencing studies of schizophrenia to reach experiment-wide significance [3]. Although the role of SETD1A in regulating gene expression can be inferred through its function as a histone methyltransferase, the consequences of reduced SETD1A activity on the neural transcriptome have, to date, been unknown. In order to investigate transcriptional changes through which LoF mutations in SETD1A could confer risk for schizophrenia, we have mimicked the 50% reduction in functional SETD1A expression that is expected to result from such variants in a human neural cell line. SETD1A knockdown was found to bring about widespread changes in gene expression that included genes of known neurodevelopmental importance (e.g., DCX and DLX5). The differentially expressed genes were also enriched for common variant association with schizophrenia. These findings thus suggest a degree of molecular convergence between this rare, strong effect schizophrenia risk factor and common susceptibility variants for the disorder.
Consistent with its predicted role as a fundamental regulator of gene expression, knockdown of SETD1A RNA by individual siRNAs resulted in thousands of changes in gene expression in SH-SY5Y cells. We note that many of these gene expression changes are likely to be downstream of direct regulatory effects of SETD1A at individual gene loci. Although widespread, high confidence differentially expressed genes showed significant enrichment within specific pathways and cellular components, most notably metabolic pathways and mitochon-dria. SETD1A manipulation in other cell types resulted in expression changes to genes involved in DNA repair [12]. We found some supportive evidence for the DNA repair pathways in our data. "GO: 0006298∼mismatch repair" was nominally enriched (uncorrected p = 0.03) in the all and downregulated gene classes. To fully understand cell type specific consequences of reduced SETD1A expression, additional models will need to be used (e.g., stem cell-derived cell types), including those derived from different common variant backgrounds.
SETD1A mutations add to the growing list of histone lysine methyltransferases and demethylases linked to human disease [22]. The majority of LoF mutations within these genes have been identified in patients with intellectual disability. We observed pronounced expression changes in several genes important for brain development following SETD1A knockdown, including increased expression of DCX, which plays a crucial role in neural migration [20], and reduced expression of DLX5, which has been reported to regulate the development of cortical interneurons [21]. It is possible that disturbances in these as well as other differentially expressed neurodevelopmental genes contribute to the developmental learning difficulties as well as increased schizophrenia risk described in SETD1A LoF mutation carriers [3].
In recent years, there have been major advances in our understanding of the genetic architecture of schizophrenia, with identification of risk alleles spanning the entire frequency spectrum. An important question is whether rare risk variants of strong effect operate through pathways of general relevance to schizophrenia. The data presented here suggest that effects of rare SETD1A mutations do indeed impinge on molecules that are more commonly affected in the disorder.