Analysis of 200,000 Exome-Sequenced UK Biobank Subjects Implicates Genes Involved in Increased and Decreased Risk of Hypertension

Background: Previous analyses have identified common variants along with some specific genes and rare variants which are associated with risk of hypertension, but much remains to be discovered. Methods and Results: Exome-sequenced UK Biobank participants were phenotyped based on having a diagnosis of hypertension or taking anti-hypertensive medication to produce a sample of 66,123 cases and 134,504 controls. Variants with minor allele frequency (MAF) <0.01 were subjected to a gene-wise weighted burden analysis, with higher weights assigned to variants which are rarer and/or predicted to have more severe effects. Of 20,384 genes analysed, 2 genes were exome-wide significant, DNMT3A and FES. Also strongly implicated were GUCY1A1 and GUCY1B1, which code for the subunits of soluble guanylate cyclase. There was further support for the previously reported effects of variants in NPR1 and protective effects of variants in DBH. An inframe deletion in CACNA1D with MAF = 0.005, rs72556363, is associated with modestly increased risk of hypertension. Other biologically plausible genes highlighted consist of CSK, AGTR1, ZYX, and PREP. All variants implicated were rare, and cumulatively they are not predicted to make a large contribution to the population risk of hypertension. Conclusions: This approach confirms and clarifies previously reported findings and also offers novel insights into biological processes influencing hypertension risk, potentially facilitating the development of improved therapeutic interventions. This research has been conducted using the UK Biobank Resource.


Introduction
Hypertension is an important risk factor for disease that has a heritable component, and a recent large genome-wide association study of common variant effects identified 901 loci with enrichment in relevant tissues (blood vessels, heart, adrenal tissue, and adipose tissue) and pathways (angiotensinogen, calcium channels, progesterone, natriuretic peptide receptor, angiotensin-converting enzyme, angiotensin receptors, and endothelin receptors) [1]. Selection pressures tend to mean that common variants individually have small effect sizes and it can be difficult to interpret their biological effects [2]. By contrast, rare variants can potentially have large effect sizes and clear biological mechanisms, as exemplified by a number of monogenic causes of hypertension such as congenital adrenal hyperplasia, familial hyperaldosteronism and pseudohypoaldosteronism, which can be caused by variants in CYP11B1, CYP11B2, KLHL3, CUL3, SCNN1B, SCNN1G, CYP17A1, HSD11B2, NR3C2, and KCNJ5 [3]. Additionally, variants in CACNA1H, CACNA1D, and CLCN2 have now also been identified as causes of familial hyperaldosteronism, while somatic mutations in ATP1A1 or ATP2B3 can produce aldosterone-producing adrenal adenomas with consequent hypertension [4]. Recessively acting variants in GUCY1A1 (previously labelled GUCY1A3) can cause moyamoya disease, and 2 unrelated subjects with moyamoya disease who also had achalasia and hypertension were found to have compound heterozygous variants in this gene [5]. A more recent study has shown that moyamoya disease is itself a risk factor for hypertension [6]. Using hypertension or blood pressure as the phenotype, gene-based analyses aggregating rare, non-synonymous variants implicated PTMT1, DBH and NPR1 in a large meta-analysis and also showed that the minor allele of a rare non-synonymous variant in DBH, rs3025380, was associated with lower blood pressure [7].
Another study reported that 3 individual non-synonymous variants in NPR1 were associated with increased (rs35479618 and rs116245325) and decreased (rs61757359) blood pressure and showed that this could be explained by the effects of these variants on guanylate cyclase activity [8]. KDM1A codes for LSD1, which removes methyl groups from the methylated lysine 4 residue of histone 3 (H3K4), and there have been reports of association between variants in KDM1A and salt-sensitive hypertension in humans, while heterozygous Lsd1 knock-out mice have salt-sensitive hypertension with increased aldosterone production [9][10][11].
The growing availability of sequence data means that it may become possible to study the wider effects of rare functional variants in the general population. This may implicate novel genes or may demonstrate a wider role for genes already implicated in severe familial disorders. Exome sequence data is now available for 200,000 of the 500,000 UK Biobank subjects [12]. We have recently analysed this in order to illuminate the effect of rare coding variants on susceptibility to hyperlipidaemia and a number of other common traits with complex inheritance, and we now apply the same approach to study the contribution of rare variants to risk of developing hypertension [13].

Methods
The UK Biobank dataset was downloaded along with the variant call files for 200,632 subjects who had undergone exome sequencing and genotyping by the UK Biobank Exome Sequencing Consortium using the GRCh38 assembly with coverage 20× at 95.6% of sites on average [12]. The UK Biobank participants are volunteers intended to be broadly representative of the UK population and are not selected on the basis of having any health condition. The UK Biobank had obtained ethics approval from the North West Multi-Centre Research Ethics Committee, which covers the UK (approval number: 11/NW/0382) and had obtained informed consent from all participants. The UK Biobank approved an application for use of the data (ID 51119) and ethics approval for the analyses was obtained from the UCL Research Ethics Committee (11527/001). All variants were annotated using the standard software packages VEP, PolyPhen, and SIFT [14][15][16]. To obtain population principal components reflecting ancestry, version 2.0 of plink (https://www.cog-genomics.org/plink/2.0/) was run with the options --maf 0.1 --pca 20 approx [17,18].
The UK Biobank sample contains 503,317 subjects, of whom 94.6% are of white ethnicity. As we have discussed previously, it has become standard practice for investigators to simply discard data from participants with other ancestries and we regard this as regrettable [19]. We demonstrated that if population principal components are included as covariates, then it is possible to include all participants, regardless of ancestry, in the type of weighted burden analysis described here without inflation of the test statistic.
To define cases, a similar approach was used as was previously implemented for the investigation of hyperlipidaemia and T2D [13,20,21]. The hypertension phenotype was determined from 4 sources in the dataset: self-reported diagnosis recorded as hypertension or essential hypertension; reporting taking medication for high blood pressure; reporting taking any of a list of named medications commonly used to treat high blood pressure (https://www.nhs.uk/conditions/high-blood-pressure-hypertension/); having an ICD-10 diagnosis of essential hypertension, hypertensive heart disease, or hypertensive renal disease in hospital records or as a cause of death. Subjects in any of these categories were deemed to be cases with hypertension, while all other subjects were taken to be controls.
Pulse 2021;9:17-29. DOI: 10.1159/000517419
The same analytic methods as had been used previously were applied, with the description repeated here for the reader's convenience. The SCOREASSOC program was used to carry out a weighted burden analysis to test whether, in each gene, sequence variants that were rarer and/or predicted to have more severe functional effects occurred more commonly in cases than in controls. Attention was restricted to rare variants with minor allele frequency (MAF) ≤0.01 in both cases and controls. As previously described, variants were weighted by overall MAF, so that variants with MAF = 0.01 were given a weight of 1, while very rare variants with MAF close to 0 were given a weight of 10 [19]. Variants were also weighted according to their functional annotation using the GENEVARASSOC program, which was used to generate input files for weighted burden analysis by SCOREASSOC [22,23]. The weights were informed by the analysis of the effects of different categories of variant in LDLR on hyperlipidaemia risk [13]. Variants predicted to cause complete loss of function (LOF) of the gene were assigned a weight of 100. Non-synonymous variants were assigned a weight of 5, but if PolyPhen annotated them as possibly or probably damaging, then 5 or 10, respectively, was added to this, and if SIFT annotated them as deleterious, then 20 was added. In order to allow exploration of the effects of different types of variants on disease risk, the variants were also grouped into broader categories to be used in multivariate analyses as described below. The full set of weights and categories is displayed in Table 1. As described previously, the weight due to MAF and the weight due to functional annotation were multiplied together to provide an overall weight for each variant. Variants were excluded if there were >10% of genotypes missing in the controls or cases or if the heterozygote count was smaller than both homozygote counts in the controls or cases.
If a subject was not genotyped for a variant, then they were assigned the subject-wise average score for that variant. For each subject, a genewise weighted burden score was derived as the sum of the variantwise weights, each multiplied by the number of alleles of the variant that the given subject possessed. For variants on the X chromosome, hemizygous males were treated as homozygotes.
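The weighting scheme and gene-wise score described above can be illustrated with a minimal Python sketch. It is not the SCOREASSOC or GENEVARASSOC implementation: all function names are hypothetical, only the annotation weights quoted in the text are covered (the full set is in Table 1), the handling of missing genotypes is omitted, and the MAF weight is assumed to fall linearly from 10 at MAF = 0 to 1 at MAF = 0.01 because the text fixes only those two endpoints.

```python
# Illustrative sketch only. Function names and the linear MAF interpolation
# are assumptions, not the behaviour of SCOREASSOC or GENEVARASSOC.

MAF_THRESHOLD = 0.01


def maf_weight(maf, max_weight=10.0):
    """Weight of 1 at MAF = 0.01, rising to max_weight as MAF approaches 0.

    The linear shape is an assumption; the text gives only the endpoints.
    """
    if not 0.0 <= maf <= MAF_THRESHOLD:
        raise ValueError("variant is not rare enough to be included")
    return max_weight - (max_weight - 1.0) * (maf / MAF_THRESHOLD)


def functional_weight(is_lof=False, polyphen=None, sift_deleterious=False):
    """Annotation weight for an LOF or non-synonymous variant (values from the text)."""
    if is_lof:
        return 100.0
    weight = 5.0  # base weight for a non-synonymous variant
    if polyphen == "possibly_damaging":
        weight += 5.0
    elif polyphen == "probably_damaging":
        weight += 10.0
    if sift_deleterious:
        weight += 20.0
    return weight


def burden_score(variants, allele_counts):
    """Sum over variants of (MAF weight x annotation weight x allele count)."""
    total = 0.0
    for variant, count in zip(variants, allele_counts):
        annotation = {k: v for k, v in variant.items() if k != "maf"}
        total += maf_weight(variant["maf"]) * functional_weight(**annotation) * count
    return total
```

For example, a subject carrying one copy of a very rare LOF variant would receive a far larger score from that variant than a subject homozygous for a non-synonymous variant at the MAF threshold, which is the intended behaviour of the multiplicative weighting.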
For each gene, logistic regression analysis was carried out with hypertension as the dependent variable, including the first 20 population principal components and sex as covariates, and a likelihood ratio test was performed comparing the likelihoods of the models with and without the gene-wise burden score. This is a test for association between the gene-wise burden score and caseness, and the statistical significance was summarized as a signed log p value (SLP), which is the log10 of the p value given a positive sign if the score is higher in cases and a negative sign if it is higher in controls, so that its absolute value is −log10 p. No other clinical or demographic covariates were included in the analyses.
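As a rough illustration of how a likelihood ratio test result is turned into an SLP, the sketch below takes the maximised log-likelihoods of the two logistic models as given. The function names are hypothetical, and the test is assumed to have one degree of freedom because the models differ only by the burden score term.

```python
import math


def lrt_p_value(loglik_without_score, loglik_with_score):
    """p value of a 1-df likelihood ratio test between nested models."""
    statistic = max(0.0, 2.0 * (loglik_with_score - loglik_without_score))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


def signed_log_p(p_value, score_higher_in_cases):
    """SLP: -log10(p), positive if the score is higher in cases, else negative."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p value must lie in (0, 1]")
    magnitude = -math.log10(p_value)
    return magnitude if score_higher_in_cases else -magnitude
```

Under this convention a gene whose variants are enriched in controls, and may therefore be protective, receives a negative SLP.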
Gene set analyses were carried out as before using the 1,454 "all GO gene sets, gene symbols" pathways as listed in the file c5.all.v5.0.symbols.gmt downloaded from the Molecular Signatures Database at http://www.broadinstitute.org/gsea/msigdb/collections.jsp [24]. For each set of genes, the natural logs of the gene-wise p values were summed according to Fisher's method to produce a χ2 statistic with degrees of freedom equal to twice the number of genes in the set. The p value associated with this χ2 statistic was expressed as a minus log10 p as a test of association of the set with the hypertension phenotype. For selected genes, additional analyses were carried out to clarify the contribution of different categories of variant. As described previously, logistic regression analyses were performed on the counts of the separate categories of variant as listed in Table 1, again with hypertension as the dependent variable and including principal components and sex as covariates, to estimate the effect size for each category [13]. The odds ratios (ORs) associated with each category were estimated along with their standard errors, and the Wald statistic was used to obtain a p value, except for categories in which variants occurred fewer than 50 times, in which case Fisher's exact test was applied to the variant counts. The associated p value was converted to an SLP, again with the sign being positive if the OR was >1, indicating that variants in that category tended to increase risk. Data manipulation and statistical analyses were performed using GENEVARASSOC, SCOREASSOC, and R [25].
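Fisher's method as used for the gene set analysis can be sketched as follows. This is a generic illustration with a hypothetical function name, not the code used in the study. It relies on the closed form for the upper tail of a χ2 distribution with an even number of degrees of freedom and works in log space so that large gene sets do not cause overflow.

```python
import math


def fisher_combined_p(p_values):
    """Combine p values with Fisher's method: X = -2 * sum(ln p), df = 2k."""
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p value is required")
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    half = -sum(math.log(p) for p in p_values)  # X / 2
    if half == 0.0:
        return 1.0
    # For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    log_terms = [i * math.log(half) - math.lgamma(i + 1) - half for i in range(k)]
    peak = max(log_terms)
    combined = math.exp(peak) * sum(math.exp(t - peak) for t in log_terms)
    return min(1.0, combined)


def minus_log10(p_value):
    """Express a p value as minus log10 p."""
    return -math.log10(p_value)
```

With a single gene the combined p value equals that gene's own p value, which is a convenient sanity check on the formula.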

Results
There were 66,123 cases and 134,504 controls, and 20,384 genes had qualifying variants. Given this number of informative genes, the critical threshold for the absolute value of the SLP to declare a result as formally statistically significant is −log10(0.05/20,384) = 5.61, and this was achieved by 2 genes, DNMT3A (SLP = 8.21) and FES (SLP = 6.10). The quantile-quantile plot for the SLPs obtained for all genes except DNMT3A is shown in Figure 1. This shows that the test appears to be well-behaved and conforms well with the expected distribution. Omitting the genes with the 100 highest and 100 lowest SLPs, which might be capturing a real biological effect, the gradient for positive SLPs is 1.096 with intercept at 0.006 and the gradient for negative SLPs is 1.080 with intercept at 0.02, indicating only modest inflation of the test statistic in spite of the fact that participants of all ancestries are included. Table 2 shows all the genes achieving SLP with absolute value >3, equivalent to an uncorrected p value of 0.001. Given that 20,384 genes were tested, one would expect that by chance about 20 would reach this level of significance, whereas in fact there are 42. Thus, it is possible that some of these highly ranked genes do demonstrate a biological signal, which fails to reach statistical significance after correction for multiple testing. For NPR1, the analysis was repeated excluding data from the previously reported individually significant variants rs35479618, rs116245325, and rs61757359. This resulted in a reduction of the SLP from 5.14 to 4.38. For DBH, the analysis was repeated without rs3025380, and this resulted in a change in SLP from −3.40 to −2.11. The full list of results for all genes is provided in online suppl. Table 1; see www.karger.com/doi/10.1159/000517419 for all online suppl. material.
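The multiple-testing arithmetic in this paragraph can be reproduced with a few lines of Python. The variable names are illustrative; the inputs are the figures quoted in the text.

```python
import math

N_GENES = 20384  # number of informative genes reported in the text

# Bonferroni-corrected critical value for the absolute SLP at alpha = 0.05.
critical_slp = -math.log10(0.05 / N_GENES)

# Number of genes expected to reach |SLP| > 3 (p < 0.001) by chance alone.
expected_by_chance = N_GENES * 0.001

print(f"critical |SLP| = {critical_slp:.2f}")
print(f"expected genes with |SLP| > 3 = {expected_by_chance:.1f}")
```

The observed count of 42 genes is about twice the number expected under the null, which is the basis for suggesting that some sub-threshold genes may carry real signal.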
In order to see if any additional genes were highlighted by analysing gene sets, gene set analysis was performed as described above after first dividing the gene-wise log p values by the average inflation factor of 1.09 before combining them using Fisher's method. Given that 1,454 sets were tested, the critical minus log10 p needed to declare a result significant after correction for multiple testing is log10(1,454 × 20) = 4.46, and this was achieved by just one set, labelled CHROMATIN. This set contains 35 genes, including DNMT3A. The full results of the gene set analyses are listed in online suppl. Table 2. Four other sets achieved SLP >3, and for these sets all genes with an absolute value of SLP >1.3 (equivalent to p < 0.05) are listed in online suppl. Table 3. All of these sets also contained DNMT3A, and none of the other nominally significant genes appeared to be obviously relevant to hypertension.
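A short sketch of the inflation adjustment and the set-wise threshold follows. The function name is hypothetical; the inflation factor and number of sets are those quoted in the text. Dividing a log p value by the inflation factor is equivalent to raising the p value to the power 1/1.09, which moves it towards 1.

```python
import math

INFLATION_FACTOR = 1.09  # average inflation factor quoted in the text
N_SETS = 1454            # number of gene sets tested


def deflate_p(p_value, inflation=INFLATION_FACTOR):
    """Divide log p by the inflation factor and return the adjusted p value."""
    return math.exp(math.log(p_value) / inflation)


# Critical minus log10 p for a set: -log10(0.05 / N_SETS) = log10(N_SETS * 20).
critical_set_mlp = math.log10(N_SETS * 20)
```

The adjusted gene-wise p values would then be combined per set with Fisher's method as described in the Methods.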
For the genes listed in Table 2 which appeared to be of interest, additional multivariate analyses were performed to elucidate the contribution to the overall result from different categories of variant. The results of this analysis for DNMT3A are shown in Table 3A. From this, it can be seen that the signal comes from disruptive and splice site variants, which are predicted to cause LOF and which are between them associated with an OR of about 1.9. However, variants annotated as probably damaging by PolyPhen are also commoner in cases and are associated with an OR of 1.5. Table 3B shows that the result for FES is driven by a small number of disruptive variants which are commoner in cases with OR of 2.8. It is striking that both GUCY1A1 and GUCY1B1 are ranked among the top 7 genes since they code for subunits of the same guanylate cyclase, and their results are shown in full in Table 3C and D. This shows that while the effect for GUCY1A1 is driven by LOF variants, these are very rare in GUCY1B1, and for this gene there seems to be an additional contribution from an excess of 5′ UTR variants among cases. These occur at 44 different locations, and detailed inspection of the output file showed these all to be individually rare, such that there was no single variant which could be seen to be making a significant contribution to the overall effect. For the other genes thought to be of interest, Table 4 provides a summary of the results for LOF variants along with any other variant category individually significant at p < 0.05. Full results for analyses of variant categories are presented in online suppl. Table 4.
Of the previously implicated genes listed in the Introduction, aside from GUCY1A1, only the following were significant at p < 0.05: CYP11B1 (SLP = 1.63), NR3C2 (SLP = 1.67), and CACNA1H (SLP = −2.05). Analyses of the variant categories were carried out for these genes, and a summary of the results is shown in Table 5, which provides the results for LOF variants and for other categories that were significant at p < 0.05. These are mostly unremarkable, and it is difficult to draw firm conclusions although a few results are worth noting. For most genes, LOF variants are very rare so that one cannot gain a clear estimate of their effect. However, the results for WNK1, WNK4, CLCN2, and ATP2B3 suggest that LOF variants in these genes do not have a very major effect on risk of hypertension. The most striking result is that for CACNA1D, the InDel category produces SLP = 8.10 with OR = 1.30. InDel variants occur at 9 locations in this gene, and inspection of the detailed results revealed that 8 of these are very rare, so the result reflects the effect of a single inframe deletion, 3:53808664-CCTT>C. This is rs72556363, which results in the loss of a phenylalanine residue, p.Phe1923del, and according to gnomAD, it has allele frequency of 0.0043 in non-Finnish Europeans and is extremely rare or absent from other populations. In the UK Biobank subjects, we observe a frequency of 0.0049 in controls and 0.0064 in cases. Full results for analyses of variant categories in these genes are presented in online suppl. Table 5.
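As a crude consistency check on the rs72556363 result, an unadjusted allelic odds ratio can be computed directly from the two allele frequencies quoted above. This is only a sketch with a hypothetical function name: the OR of 1.30 reported for the InDel category came from logistic regression with principal components and sex as covariates, so exact agreement is not expected.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of the allele per chromosome in cases divided by the odds in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Allele frequencies of rs72556363 quoted in the text.
crude_or = allelic_odds_ratio(0.0064, 0.0049)
print(f"crude allelic OR = {crude_or:.2f}")
```

The crude value is close to 1.3, in line with the covariate-adjusted estimate.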

Discussion
These analyses provide a broad overview of some impacts of very rare genetic variants in a large sample broadly representative of the population. It should be pointed out that the analytic approach used assumes that all variants in a gene have the same direction of effect. Since any random variant is more likely to impair the function of a gene than to enhance it, weighted burden analysis is not expected to detect effects of gain of function variants because these are likely to be swamped by other variants. Likewise, the method is expected to be relatively insensitive at detecting variants that have a purely recessive effect. Of course, it is quite possible that the effect of complete LOF of one gene may be modified by functional variants affecting the other copy of the gene, but such complex heterozygous effects are difficult to detect. Additionally, a population sample differs from a specially recruited case-control sample in that it is not enriched for the phenotype in question and hence will have less power to detect either rare recessively acting variants or extremely rare variants with a dominant effect. Thus, we expect that there may well be important rare variant effects in addition to the ones that this study has highlighted.
Another contrast with specifically defined case-control studies is that the phenotype needs to be derived from measures that are provided, and, in order to improve power, it is desirable to use information that is available for a large number of participants. The phenotype used here attempts to reflect a clinical diagnosis of hypertension but of course will not do this as accurately as could be achieved in a specifically ascertained sample.
Some participants would be on antihypertensive medication for indications other than hypertension, some would have undiagnosed hypertension, and for some, the diagnosis would simply be incorrect. Measured blood pressure itself was not used to inform the phenotype, in part because it might reflect blood pressure on medication and in part because single measurements may be less informative than indirectly relying on the clinical decisions which have fed into assigning a diagnosis or starting a prescription. It is certainly the case that the phenotype used here does not correspond exactly with what one would use for a systematically recruited sample in a case-control study focussing explicitly on hypertension. Any lack of accuracy in the phenotype is likely to reduce power rather than to produce false-positive results. The advantages of having a large sample to some extent balance the disadvantage of a less accurate phenotype.
When considering the contribution of risk within the population, we should note that the variants involved are rare. Although an MAF threshold of 0.01 was used, the majority of variants analysed are very much rarer than this, and for the variant categories with the most severe consequences, the cumulative frequency of variants in the category is also low. This means that few subjects will carry more than one variant with a severe consequence, and we can say that the mean count of variants of a particular category is a good approximation for the proportion of the subjects who carry a variant of that category.
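The approximation in the last sentence can be checked numerically. The sketch below assumes, purely for illustration, that the number of variants of a given category carried by a subject follows a Poisson distribution; the function name is hypothetical.

```python
import math


def carrier_proportion(mean_count):
    """Proportion of subjects with at least one variant, assuming Poisson counts."""
    # 1 - exp(-m), computed with expm1 for accuracy when m is small.
    return -math.expm1(-mean_count)


for mean_count in (0.001, 0.01, 0.05):
    print(mean_count, carrier_proportion(mean_count))
```

For mean counts of the order of 1 in 1,000 the two quantities are almost identical, and the approximation only starts to overstate the carrier proportion noticeably once the mean count reaches several percent.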
The result for DNMT3A seems unlikely to be due to chance since it would remain significant at p = 0.00016 even after correction for the number of genes tested. The results show that LOF variants in this gene are associated with nearly doubling the OR for hypertension and are present in about 1 in 1,000 people, while a slightly larger number will have a variant annotated as probably damaging by PolyPhen, which moderately increases hypertension risk. DNMT3A is a DNA methyltransferase, and non-synonymous variants in it have been reported as causes of 2 different syndromes, Tatton-Brown-Rahman syndrome with overgrowth and intellectual disability and Heyn-Sproul-Jackson syndrome with microcephalic dwarfism, neither of which has hypertension as part of the phenotype [26,27]. Missense variants in DNMT3A have also been reported in autism spectrum disorder, and mice with Dnmt3a haploinsufficiency produced by heterozygous deletion of exon 19 have increased body weight and some behavioural alterations [28]. In a series of 210 patients with an overgrowth syndrome similar to Sotos syndrome but with no NSD1 mutation, 4 had de novo non-synonymous mutations in DNMT3A and 2 had stop variants [29]. One of the stop variants was inherited from a normal mother in whom it was thought a somatic mutation had occurred and for the other the father's DNA was not available. Given the frequency of LOF variants in DNMT3A observed in our samples, it seems possible that the observation of 2 patients with stop variants was coincidental. Thus, it may be that while certain specific non-synonymous variants can cause severe phenotypes, generally reduced functioning of DNMT3A does not cause marked problems but is associated with increased risk of hypertension.
Although DNMT3A variants have not previously been reported to be associated with hypertension, as reviewed recently these findings are consistent with a body of results relating to the histone lysine demethylase LSD1, which together implicate LSD1 hypofunction in salt-sensitive hypertension [30]. In our data, disruptive variants in KDM1A, the gene for LSD1, are associated with an OR of 1.9, but they are too rare for firm conclusions to be drawn. DNMT3A methylates DNA conditional on the associated H3K4 residue being unmethylated and LSD1 accomplishes this demethylation, meaning that reduced function of LSD1 is expected to reduce DNMT3A-dependent DNA methylation [31,32]. In mice, Dnmt3a deficiency, which can be produced by knockdown or as a consequence of foetal exposure to dexamethasone or low-protein maternal diet, has been shown to lead to reduced methylation of the gene for angiotensin receptor type 1a, Agtr1a, leading to increased Agtr1a expression and salt-induced hypertension [33]. Thus, a consistent picture emerges that reduced DNMT3A activity can increase risk for hypertension, whether this is due to genetic variants in DNMT3A itself, LSD1 hypofunction, or the uterine environment.
Although FES (SLP = 6.10) only just reaches criteria for exome-wide significance, confidence in this result is somewhat increased by the fact that a nearby SNP, rs2521501, shows robust evidence for association [34]. However, the mechanisms by which it might influence hypertension risk are unclear as it codes for a tyrosine kinase which is involved in various signalling pathways and which may have a role in haematopoiesis and regulating the innate immune response [35].
The results for GUCY1A1 (SLP = 5.54) and GUCY1B1 (SLP = 3.92) are more compelling, given that they code for 2 different subunits of the same protein, soluble guanylate cyclase, and given that recessively acting variants in GUCY1A1 have previously been reported in cases of moyamoya disease with hypertension [5]. Soluble guanylate cyclase is responsible for detecting NO signalling in order to produce vasodilation and other responses, and the central role of this pathway in the control of blood pressure is well established from animal studies, while guanylate cyclase stimulators have been developed as treatments for pulmonary hypertension [36]. The findings reported here are the first to directly demonstrate that impaired functioning of either of these genes represents a risk factor for systemic hypertension in the general population, with nearly 1 in 1,000 people carrying a LOF variant in one of them associated with an OR of approximately 2.
The results for NPR1 (SLP = 5.14) represent a replication of the previously reported findings and confirm that although certain non-synonymous variants such as rs61757359 may be associated with reduced blood pressure, in general variants that impair the functioning of this gene increase the risk of hypertension [7,8]. Around 1 in 500 people carries a variant annotated by PolyPhen as probably damaging, and overall, such variants are associated with a modest increase in risk with OR = 1.32 (1.05-1.66).
SMAD6 (SLP = 4.10) has a role in signalling pathways and has not previously been clearly implicated in hypertension risk, although a recent report describes how exome sequencing of 37 children with renovascular hypertension revealed a frameshift variant in SMAD6, classified as likely pathogenic, in 1 patient [37]. SMAD6 variants are known to predispose to cardiovascular malformations including bicuspid aortic valve-related aortopathy [38][39][40]. The results reported here suggest that LOF variants in this gene may moderately increase the risk of hypertension in the general population.
Recessively acting variants in IFT172 (SLP = 3.39) can cause ciliopathies, and there is a report of a child with compound heterozygous variants who presented with growth retardation and subsequently developed retinopathy, metaphyseal dysplasia, and, at the age of 11 years, hypertension [41]. However, there does not seem to be other evidence to implicate IFT172 in hypertension risk, so this result would require replication in other samples.
CSK (SLP = 3.37) is located in the 15q24 locus, which, as reviewed recently, is implicated by multiple GWAS for hypertension [42]. Following up the results of eQTL analyses, these authors demonstrated that mice with gene silencing or haploinsufficiency of Csk had increased blood pressure and showed that this effect could be moderated by PP2, an inhibitor of Src. Although the results reported here are not formally significant after correction for multiple testing, the additional support provided by these GWAS findings and functional studies does suggest that variants in CSK, in particular those annotated as deleterious by SIFT, might be a risk factor for hypertension.
The results for DBH (SLP = −3.40) provide further support for the previously reported findings that variants in this gene are associated with blood pressure [7]. In particular, the results suggest that variants annotated as deleterious by SIFT are on average associated with a slightly reduced risk of developing hypertension. Although these variants are individually rare, about 1 person in 20 carries one of them.
AGTR1 (SLP = −3.77) codes for a receptor for angiotensin II, so it seems very plausible from a biological point of view that variants impairing its function might be protective against hypertension in spite of the fact that no association with common variants has been detected [43]. The results suggest that very rare gene disruptive variants can about halve the OR for hypertension.
ZYX (SLP = −3.83) is potentially of interest because it codes for zyxin, the protein responsible for sensing stretch in endothelial cells and vascular smooth muscle cells, as occurs in hypertension, and mediating their response to this by changing the expression of other genes [44]. The findings reported here suggest that impaired functioning of this gene may reduce risk of hypertension.
PREP (SLP = −5.03) narrowly fails to meet conventional criteria for exome-wide significance but is clearly of interest because its product, prolyl endopeptidase, also known as prolyl oligopeptidase or post-proline cleaving enzyme, has recently been shown to be responsible for converting circulating angiotensin II to angiotensin-(1-7) in the circulation and in the lungs, a process which is largely independent of ACE2, which carries out this conversion in the kidney [45]. The results we report suggest that rare functional variants in PREP are protective against hypertension, but it is not clear which categories of variant are responsible, and, although LOF variants are commoner in controls, they are too rare for conclusions to be drawn. In mice, loss of this gene results in reduced ability to metabolize angiotensin II and hence produces a more prolonged systemic hypertensive response to exogenously administered angiotensin II [45]. While it is not obvious why impaired functioning of this gene might be protective against hypertension, these findings do seem worthy of further exploration.
The finding that an inframe deletion in CACNA1D, rs72556363, is associated with increased hypertension is consistent with reports that very rare germline and somatic variants in this gene can result in aldosterone-producing adenomas and primary aldosteronism [46]. Although the variant varies in frequency between populations, essentially being restricted to those with European ancestry, this result does not seem likely to be due to an artefact of population stratification, because the frequency in controls is similar to that reported in non-Finnish Europeans, whereas the frequency in cases is even higher. The variant, which is found in about 1% of subjects with European ancestry, seems to represent a modest risk factor for hypertension without producing severe hyperaldosteronism.
Overall, these analyses provide an overview of some of the impacts rare coding variants may have on the risk of hypertension in the general population. The validity of some novel findings will become clearer when exome sequence data is released for the remaining 300,000 UK Biobank participants or if they can be tested in other samples or followed up in functional studies. All the variants implicated are very rare and in view of their effect sizes arguably do not make an important contribution to risk from a public health point of view. Nor are they probably helpful as individual measures of risk, partly because they can only be detected by sequencing. Although it may be reasonable to assume that LOF variants within a given gene will tend to have a similar effect on phenotype, the same cannot be said of non-synonymous variants, and even within a given category the effect of such variants is likely to vary considerably. Thus, for variants that are individually extremely rare, it is in general not possible to make a clear interpretation regarding their likely effect. The main value of these findings is probably in highlighting genes and biological pathways of relevance in order to ultimately inform improved therapeutic approaches.