Publications

Changes in intestinal microbiome composition are associated with inflammatory, metabolic, and malignant disorders. We studied how exocrine pancreatic function affects intestinal microbiota. We performed 16S ribosomal RNA gene sequencing analysis of stool samples from 1795 volunteers from the population-based Study of Health in Pomerania who had no history of pancreatic disease. We also measured fecal pancreatic elastase by enzyme-linked immunosorbent assay and performed quantitative imaging of secretin-stimulated pancreatic fluid secretion. Associations of exocrine pancreatic function with microbial diversity or individual genera were calculated by permutational analysis of variance or linear regression, respectively. Differences in pancreatic elastase levels associated with significantly (P < .0001) greater changes in microbiota diversity than with participant age, body mass index, sex, smoking, alcohol consumption, or dietary factors. Significant changes in the abundance of 30 taxa, such as an increase in Prevotella (q < .0001) and a decrease of Bacteroides (q < .0001), indicated a shift from a type-1 to a type-2 enterotype. Changes in pancreatic fluid secretion alone were also associated with changes in microbial diversity (P = .0002), although to a lesser degree. In an analysis of fecal samples from 1795 volunteers, pancreatic acinar cell, rather than duct cell, function is presently the single most significant host factor to be associated with changes in intestinal microbiota composition.

C-reactive protein (CRP) is a sensitive biomarker of chronic low-grade inflammation and is associated with multiple complex diseases. The genetic determinants of chronic inflammation remain largely unknown, and the causal role of CRP in several clinical outcomes is debated. We performed two genome-wide association studies (GWASs), on HapMap and 1000 Genomes imputed data, of circulating amounts of CRP by using data from 88 studies comprising 204,402 European individuals. Additionally, we performed in silico functional analyses and Mendelian randomization analyses with several clinical outcomes. The GWAS meta-analyses of CRP revealed 58 distinct genetic loci (p < 5 × 10⁻⁸). After adjustment for body mass index in the regression analysis, the associations at all except three loci remained. The lead variants at the distinct loci explained up to 7.0% of the variance in circulating amounts of CRP. We identified 66 gene sets that were organized in two substantially correlated clusters, one mainly composed of immune pathways and the other characterized by metabolic pathways in the liver. Mendelian randomization analyses revealed a causal protective effect of CRP on schizophrenia and a risk-increasing effect on bipolar disorder. Our findings provide further insights into the biology of inflammation and could lead to interventions for treating inflammation and its clinical consequences.

Irritable bowel syndrome (IBS) shows genetic predisposition; however, large-scale, powered gene mapping studies are lacking. We sought to exploit existing genetic (genotype) and epidemiological (questionnaire) data from a series of population-based cohorts for IBS genome-wide association studies (GWAS) and their meta-analysis. Based on questionnaire data compatible with Rome III Criteria, we identified a total of 1335 IBS cases and 9768 asymptomatic individuals from 5 independent European genotyped cohorts. Individual GWAS were carried out with sex-adjusted logistic regression under an additive model, followed by meta-analysis using the inverse variance method. Functional annotation of significant results was obtained via a computational pipeline exploiting ontology and interaction networks, and tissue-specific and gene set enrichment analyses. Suggestive GWAS signals (P ≤ 5.0 × 10⁻⁶) were detected for 7 genomic regions, harboring 64 gene candidates to affect IBS risk via functional or expression changes. Functional annotation of this gene set convincingly (best FDR-corrected P = 3.1 × 10⁻¹⁰) highlighted regulation of ion channel activity as the most plausible pathway affecting IBS risk. Our results confirm the feasibility of population-based studies for gene-discovery efforts in IBS, identify risk genes and loci to be prioritized in independent follow-ups, and pinpoint ion channels as important players and potential therapeutic targets warranting further investigation.
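The inverse variance method named in this abstract can be illustrated with a minimal sketch. It assumes that each cohort contributes one effect estimate (for example a log odds ratio) and its standard error for a given variant. The function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance meta-analysis for one variant.

    betas: per-cohort effect estimates (e.g. log odds ratios).
    standard_errors: per-cohort standard errors of those estimates.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical per-cohort estimates for a single variant.
beta, se, z, p = inverse_variance_meta([0.10, 0.20], [0.10, 0.10])
print(beta, se, z, p)
```

Cohorts with smaller standard errors, typically the larger ones, dominate the pooled estimate, which is the main design property of this weighting scheme.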

OMICs subsume different physiological layers including the genome, transcriptome, proteome and metabolome. Recent advances in analytical techniques allow for the exhaustive determination of biomolecules in all OMICs levels from less invasive human specimens such as blood and urine. Investigating OMICs in deeply characterized population-based or experimental studies has led to seminal improvement of our understanding of genetic determinants of thyroid function, identified putative thyroid hormone target genes and thyroid hormone-induced shifts in the plasma protein and metabolite content. Consequently, plasma biomolecules have been suggested as surrogates of tissue-specific action of thyroid hormones. This review provides a brief introduction to OMICs in thyroid research with a particular focus on metabolomics studies in humans elucidating the important role of thyroid hormones for whole body metabolism in adults.

The incidence of neuroendocrine neoplasias (NEN) continues to increase. Since the primary tumor cannot be diagnosed in some cases of metastatic disease, new biomarkers are clearly needed to find the most probable site of origin. Tissue samples from 79 patients were analyzed and microRNA profiles were generated from a total of 76 primary tumors, 31 lymph node and 14 solid organ metastases. NEN metastases were associated with elevated levels of miR-30a-5p, miR-210, miR-339-3p, miR-345 and miR-660. Three microRNAs showed a strong correlation between proliferation index and metastatic disease in general (miR-150, miR-21 and miR-660). Further, each anatomic location (primary or metastatic) had one or more site-specific microRNAs more highly expressed in these tissues. Comparison between primary tumors and metastases revealed an overlap only in pancreatic (miR-127) and ileal tumors (let-7g, miR-200a and miR-331). This thorough analysis of gastroenteropancreatic neuroendocrine tumors demonstrates site-specific microRNA profiles, correlation with proliferation indices as well as corresponding nodal and distant metastases. Using microRNA profiling might improve NEN diagnostics by linking metastases to a most probable site of origin.

In recent years, human microbiota, especially gut microbiota, have emerged as an important yet complex trait influencing human metabolism, immunology, and diseases. Many studies are investigating the forces underlying the observed variation, including the human genetic variants that shape human microbiota. Several preliminary genome-wide association studies (GWAS) have been completed, but more are necessary to achieve a fuller picture. Here, we announce the MiBioGen consortium initiative, which has assembled 18 population-level cohorts and some 19,000 participants. Its aim is to generate new knowledge for the rapidly developing field of microbiota research. Each cohort has surveyed the gut microbiome via 16S rRNA sequencing and genotyped their participants with full-genome SNP arrays. We have standardized the analytical pipelines for both the microbiota phenotypes and genotypes, and all the data have been processed using identical approaches. Our analysis of microbiome composition shows that we can reduce the potential artifacts introduced by technical differences in generating microbiota data. We are now in the process of benchmarking the association tests and performing meta-analyses of genome-wide associations. All pipeline and summary statistics results will be shared using public data repositories. We present the largest consortium to date devoted to microbiota-GWAS. We have adapted our analytical pipelines to suit multi-cohort analyses and expect to gain insight into host-microbiota cross-talk at the genome-wide level. And, as an open consortium, we invite more cohorts to join us (by contacting one of the corresponding authors) and to follow the analytical pipeline we have developed.

The human leukocyte antigen (HLA) haplotype DRB1*15:01 is the major risk factor for multiple sclerosis (MS). Here, we find that DRB1*15:01 is hypomethylated and predominantly expressed in monocytes among carriers of DRB1*15:01. A differentially methylated region (DMR) encompassing HLA-DRB1 exon 2 is particularly affected and displays methylation-sensitive regulatory properties in vitro. Causal inference and Mendelian randomization provide evidence that HLA variants mediate risk for MS via changes in the HLA-DRB1 DMR that modify HLA-DRB1 expression. Meta-analysis of 14,259 cases and 171,347 controls confirms that these variants confer risk from DRB1*15:01 and also identifies a protective variant (rs9267649, p < 3.32 × 10⁻⁸, odds ratio = 0.86) after conditioning for all MS-associated variants in the region. rs9267649 is associated with increased DNA methylation at the HLA-DRB1 DMR and reduced expression of HLA-DRB1, suggesting a modulation of the DRB1*15:01 effect. Our integrative approach provides insights into the molecular mechanisms of MS susceptibility and suggests putative therapeutic strategies targeting a methylation-mediated regulation of the major risk gene.

Hypertension represents a major cardiovascular risk factor. The pathophysiology of increased blood pressure (BP) is not yet completely understood. Transcriptome profiling offers possibilities to uncover genetic effects on BP. Based on 2 populations including 2549 individuals, a meta-analysis of monocytic transcriptome-wide profiles was performed to identify transcripts associated with BP. Replication was performed in 2 independent studies of whole-blood transcriptome data including 1990 individuals. For identified candidate genes, a direct link between long-term changes in BP and gene expression over time and by treatment with BP-lowering therapy was assessed. The predictive value of protein levels encoded by candidate genes for subsequent cardiovascular disease was investigated. Eight transcripts (CRIP1, MYADM, TIPARP, TSC22D3, CEBPA, F12, LMNA, and TPPP3) were identified jointly accounting for up to 13% (95% confidence interval, 8.7-16.2) of BP variability. Changes in CRIP1, MYADM, TIPARP, LMNA, TSC22D3, CEBPA, and TPPP3 expression were associated with BP changes; among these, CRIP1 gene expression was additionally correlated with measures of cardiac hypertrophy. Assessment of circulating CRIP1 (cysteine-rich protein 1) levels as biomarkers showed a strong association with increased risk for incident stroke (hazard ratio, 1.06; 95% confidence interval, 1.03-1.09; P = 5.0 × 10⁻⁵). Our comprehensive analysis of global gene expression highlights 8 novel transcripts significantly associated with BP, providing a link between gene expression and BP. Translational approaches further established evidence for the potential use of CRIP1 as an emerging disease-related biomarker.

Using oral contraceptives has been implicated in the aetiology of stress-related disorders like depression. Here, we followed the hypothesis that oral contraceptives deregulate the HPA axis by elevating circulating cortisol levels. We report for a sample of 233 pre-menopausal women increased circulating cortisol levels in those using oral contraceptives. For women taking oral contraceptives, we observed alterations in circulating phospholipid levels and elevated triglycerides and found evidence for increased glucocorticoid signalling as the transcript levels of the glucocorticoid-regulated genes DDIT4 and FKBP5 were increased in whole blood. The effects were statistically mediated by cortisol. The associations of oral contraceptives with higher FKBP5 mRNA and altered phospholipid levels were modified by rs1360780, a genetic variant implicated in psychiatric diseases. Accordingly, the methylation pattern of FKBP5 intron 7 was altered in women taking oral contraceptives depending on the rs1360780 genotype. Moreover, oral contraceptives modified the association of circulating cortisol with depressive symptoms, potentially explaining conflicting results in the literature. Finally, women taking oral contraceptives displayed smaller hippocampal volumes than non-using women. In conclusion, the integrative analyses of different types of physiological data provided converging evidence indicating that oral contraceptives may cause effects analogous to chronic psychological stressors regarding the regulation of the HPA axis.

MicroRNAs (miRNAs) are important non-coding modulators controlling patterns of gene expression. However, profiling and validation of circulating miRNA levels related to adverse cardiovascular outcome have not been performed in patients with an acute coronary syndrome (ACS). In a multicentre, prospective ACS cohort, 1002 out of 2168 patients presented with ST-segment elevation myocardial infarction (STEMI). Sixty-three STEMI patients experienced an adjudicated major cardiovascular event (MACE, defined as cardiac death or recurrent myocardial infarction) within 1 year of follow-up. From a miRNA profiling in a matched derivation case-control cohort, 14 miRNAs were selected for validation. Comparing 63 cases vs. 126 controls, 3 miRNAs were significantly differentially abundant. In patients with MACE, miR-26b-5p levels (P = 0.038) were decreased, whereas miR-320a (P = 0.047) and miR-660-5p (P = 0.01) levels were increased. MiR-26b-5p has been suggested to prevent adverse cardiomyocyte hypertrophy, whereas miR-320a promotes cardiomyocyte death and apoptosis, and miR-660-5p has been related to active platelet production. This suggests that miR-26b-5p, miR-320a, and miR-660-5p may reflect alterations of different pathophysiological pathways involved in clinical outcome after ACS. Consistently, these three miRNAs reliably discriminated cases from controls [area under the receiver-operating characteristic curve (AUC) in age- and sex-adjusted Cox regression for miR-26b-5p = 0.707, miR-660-5p = 0.683, and miR-320a = 0.672]. Combination of the three miRNAs further increased AUC to 0.718. Importantly, addition of the three miRNAs to both the Global Registry of Acute Coronary Events (GRACE) score and a clinical model increased AUC from 0.679 to 0.720 and 0.722 to 0.732, respectively, with a net reclassification improvement of 0.20 in both cases.
This is the first study performing profiling and validation of miRNAs that are associated with adverse cardiovascular outcome in patients with STEMI. MiR-26b-5p, miR-320a, and miR-660-5p discriminated for MACE and increased risk prediction when added to the GRACE score and a clinical model. These findings suggest that the release of specific miRNAs into circulation may reflect the activation of molecular pathways that impact on clinical outcome after STEMI.

Determinations of thyrotropin (TSH) and free thyroxine (FT4) represent the gold standard in evaluation of thyroid function. To screen for novel peripheral biomarkers of thyroid function and to characterize FT4-associated physiological signatures in human plasma we used an untargeted OMICS approach in a thyrotoxicosis model. A sample of 16 healthy young men were treated with levothyroxine for 8 weeks and plasma was sampled before the intake was started as well as at two points during treatment and after its completion, respectively. Mass spectrometry-derived metabolite and protein levels were related to FT4 serum concentrations using mixed-effect linear regression models in a robust setting. To compile a molecular signature discriminating between thyrotoxicosis and euthyroidism, a random forest was trained and validated in a two-stage cross-validation procedure. Despite the absence of obvious clinical symptoms, mass spectrometry analyses detected 65 metabolites and 63 proteins exhibiting significant associations with serum FT4. A subset of 15 molecules allowed a robust and good prediction of thyroid hormone function (AUC = 0.86) without prior information on TSH or FT4. Main FT4-associated signatures indicated increased resting energy expenditure, augmented defense against systemic oxidative stress, decreased lipoprotein particle levels, and increased levels of complement system proteins and coagulation factors. Further association findings question the reliability of kidney function assessment under hyperthyroid conditions and suggest a link between hyperthyroidism and cardiovascular diseases via increased dimethylarginine levels. Our results emphasize the power of untargeted OMICs approaches to detect novel pathways of thyroid hormone action. Furthermore, beyond TSH and FT4, we demonstrated the potential of such analyses to identify new molecular signatures for diagnosis and treatment of thyroid disorders. 
This study was registered at the German Clinical Trials Register (DRKS) [DRKS00011275] on the 16th of November 2016.

Variation in body fat distribution contributes to the metabolic sequelae of obesity. The genetic determinants of body fat distribution are poorly understood. The goal of this study was to gain new insights into the underlying genetics of body fat distribution by conducting sample-size-weighted fixed-effects genome-wide association meta-analyses in up to 9,594 women and 8,738 men of European, African, Hispanic and Chinese ancestry, with and without sex stratification, for six traits associated with ectopic fat (hereinafter referred to as ectopic-fat traits). In total, we identified seven new loci associated with ectopic-fat traits (ATXN1, UBE2E2, EBF1, RREB1, GSDMB, GRAMD3 and ENSA; P < 5 × 10⁻⁸; false discovery rate < 1%). Functional analysis of these genes showed that loss of function of either Atxn1 or Ube2e2 in primary mouse adipose progenitor cells impaired adipocyte differentiation, suggesting physiological roles for ATXN1 and UBE2E2 in adipogenesis. Future studies are necessary to further explore the mechanisms by which these genes affect adipocyte biology and how their perturbations contribute to systemic metabolic disease.
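A sample-size-weighted fixed-effects meta-analysis is commonly computed by combining per-study Z-scores with weights proportional to the square root of each study's sample size. The sketch below assumes that formulation. The abstract does not state which software or exact weighting was used, and the function name and input values are hypothetical.

```python
import math


def sample_size_weighted_meta(z_scores, sample_sizes):
    """Combine per-study Z-scores using sqrt(N) weights.

    z_scores: signed per-study Z statistics for one variant, aligned to
        the same effect allele.
    sample_sizes: number of individuals in each study.
    Returns (combined_z, two_sided_p).
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    # Sum of squared weights equals the total sample size.
    denominator = math.sqrt(sum(w ** 2 for w in weights))
    combined_z = numerator / denominator
    p = math.erfc(abs(combined_z) / math.sqrt(2.0))
    return combined_z, p


# Hypothetical Z-scores from two equally sized studies.
combined_z, p = sample_size_weighted_meta([2.0, 2.0], [100, 100])
print(combined_z, p)
```

Unlike inverse-variance weighting, this scheme needs only Z-scores and sample sizes, which can be convenient when effect sizes are not on a comparable scale across studies.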

Periodontitis is characterized by inflammation of the gingival tissue. The main risk factors are socioeconomic factors, sex, age, smoking, and diabetes, but periodontal disease also has a genetic background. Previous genome-wide association studies failed to reveal genome-wide significant associations of single common single-nucleotide polymorphisms with chronic periodontitis. Using the Illumina ExomeChip data of 6,576 participants of the German population-based cohort studies Study of Health in Pomerania (SHIP) and SHIP-Trend, the authors performed single variant and also gene-based association studies of rare and common exonic variations on different periodontal case definitions. Although this study comprised the largest sample size to date to assess genetic predisposition for chronic periodontitis, the authors found no significant association. This study emphasizes that for chronic periodontitis, large sample sizes will be necessary to find genetic associations, even when examining rare genetic variants.

Caffeine is the most widely consumed psychoactive substance in the world and presents with wide interindividual variation in metabolism. This variation may modify potential adverse or beneficial effects of caffeine on health. We conducted a genome-wide association study (GWAS) of plasma caffeine, paraxanthine, theophylline, theobromine and paraxanthine/caffeine ratio among up to 9,876 individuals of European ancestry from six population-based studies. A single SNP at 6p23 (near CD83) and several SNPs at 7p21 (near AHR), 15q24 (near CYP1A2) and 19q13.2 (near CYP2A6) met GW-significance (P < 5 × 10⁻⁸) and were associated with one or more metabolites. Variants at 7p21 and 15q24 associated with higher plasma caffeine and lower plasma paraxanthine/caffeine (slow caffeine metabolism) were previously associated with lower coffee and caffeine consumption behavior in GWAS. Variants at 19q13.2 associated with higher plasma paraxanthine/caffeine (slow paraxanthine metabolism) were also associated with lower coffee consumption in the UK Biobank (n = 94,343, P < 1.0 × 10⁻⁶). Variants at 2p24 (in GCKR), 4q22 (in ABCG2) and 7q11.23 (near POR) that were previously associated with coffee consumption in GWAS were nominally associated with plasma caffeine or its metabolites. Taken together, we have identified genetic factors contributing to variation in caffeine metabolism and confirm an important modulating role of systemic caffeine levels in dietary caffeine consumption behavior. Moreover, candidate genes identified encode proteins with important clinical functions that extend beyond caffeine metabolism.

Atrial fibrillation (AF) is a heritable disease that affects more than thirty million individuals worldwide. Extensive efforts have been devoted to the study of genetic determinants of AF. The objective of our study is to examine the effect of gene-gene interaction on AF susceptibility. We performed a large-scale association analysis of gene-gene interactions with AF in 8,173 AF cases, and 65,237 AF-free referents collected from 15 studies for discovery. We examined putative interactions between genome-wide SNPs and 17 known AF-related SNPs. The top interactions were then tested for association in an independent cohort for replication, which included more than 2,363 AF cases and 114,746 AF-free referents. One interaction, between rs7164883 at the HCN4 locus and rs4980345 at the SLC28A1 locus, was found to be significantly associated with AF in the discovery cohorts (interaction OR = 1.44, 95% CI: 1.27-1.65, P = 4.3 × 10⁻⁸). Eight additional gene-gene interactions were also marginally significant (P < 5 × 10⁻⁷). However, none of the top interactions were replicated. In summary, we did not find significant interactions that were associated with AF susceptibility. Future increases in sample size and denser genotyping might facilitate the identification of gene-gene interactions associated with AF.

Ionizing radiation is known to induce genomic lesions, such as DNA double strand breaks, whose repair can lead to mutations that can modulate cellular and organismal fate. Soon after radiation exposure, cells induce transcriptional changes and alterations of cell cycle programs to respond to the received DNA damage. Radiation-induced mutations occur through misrepair in a stochastic manner and increase the risk of developing cancers years after the incident, especially after high dose radiation exposures. Here, the authors analyzed the transcriptomic response of primary human gingival fibroblasts exposed to increasing doses of acute high dose-rate x rays. In the dataset obtained after 0.5 and 5 Gy x-ray exposures and two different repair intervals (0.5 h and 16 h), the authors discovered several radiation-induced fusion transcripts in conjunction with dose-dependent gene expression changes involving a total of 3,383 genes. Principal component analysis of repeated experiments revealed that the duration of the post-exposure repair intervals had a stronger impact than irradiation dose. Subsequent overrepresentation analyses showed a number of KEGG gene sets and WikiPathways, including pathways known to relate to radioresistance in fibroblasts (Wnt, integrin signaling). Moreover, a significant radiation-induced modulation of microRNA targets was detected. The data sets on IR-induced transcriptomic alterations in primary gingival fibroblasts will facilitate genomic comparisons in various genotoxic exposure scenarios.

Platelet production, maintenance, and clearance are tightly controlled processes indicative of platelets’ important roles in hemostasis and thrombosis. Platelets are common targets for primary and secondary prevention of several conditions. They are monitored clinically by complete blood counts, specifically with measurements of platelet count (PLT) and mean platelet volume (MPV). Identifying genetic effects on PLT and MPV can provide mechanistic insights into platelet biology and their role in disease. Therefore, we formed the Blood Cell Consortium (BCX) to perform a large-scale meta-analysis of Exomechip association results for PLT and MPV in 157,293 and 57,617 individuals, respectively. Using the low-frequency/rare coding variant-enriched Exomechip genotyping array, we sought to identify genetic variants associated with PLT and MPV. In addition to confirming 47 known PLT and 20 known MPV associations, we identified 32 PLT and 18 MPV associations not previously observed in the literature across the allele frequency spectrum, including rare large effect (FCER1A), low-frequency (IQGAP2, MAP1A, LY75), and common (ZMIZ2, SMG6, PEAR1, ARFGAP3/PACSIN2) variants. Several variants associated with PLT/MPV (PEAR1, MRVI1, PTGES3) were also associated with platelet reactivity. In concurrent BCX analyses, there was overlap of platelet-associated variants with red (MAP1A, TMPRSS6, ZMIZ2) and white (PEAR1, ZMIZ2, LY75) blood cell traits, suggesting common regulatory pathways with shared genetic architecture among these hematopoietic lineages. Our large-scale Exomechip analyses identified previously undocumented associations with platelet traits and further indicate that several complex quantitative hematological, lipid, and cardiovascular traits share genetic factors.

White blood cells play diverse roles in innate and adaptive immunity. Genetic association analyses of phenotypic variation in circulating white blood cell (WBC) counts from large samples of otherwise healthy individuals can provide insights into genes and biologic pathways involved in production, differentiation, or clearance of particular WBC lineages (myeloid, lymphoid) and also potentially inform the genetic basis of autoimmune, allergic, and blood diseases. We performed an exome array-based meta-analysis of total WBC and subtype counts (neutrophils, monocytes, lymphocytes, basophils, and eosinophils) in a multi-ancestry discovery and replication sample of ∼157,622 individuals from 25 studies. We identified 16 common variants (8 of which were coding variants) associated with one or more WBC traits, the majority of which are pleiotropically associated with autoimmune diseases. Based on functional annotation, these loci included genes encoding surface markers of myeloid, lymphoid, or hematopoietic stem cell differentiation (CD69, CD33, CD87), transcription factors regulating lineage specification during hematopoiesis (ASXL1, IRF8, IKZF1, JMJD1C, ETS2-PSMG1), and molecules involved in neutrophil clearance/apoptosis (C10orf54, LTA), adhesion (TNXB), or centrosome and microtubule structure/function (KIF9, TUBD1). Together with recent reports of somatic ASXL1 mutations among individuals with idiopathic cytopenias or clonal hematopoiesis of undetermined significance, the identification of a common regulatory 3’ UTR variant of ASXL1 suggests that both germline and somatic ASXL1 mutations contribute to lower blood counts in otherwise asymptomatic individuals. These association results shed light on genetic mechanisms that regulate circulating WBC counts and suggest a prominent shared genetic architecture with inflammatory and autoimmune diseases.

Red blood cell (RBC) traits are important heritable clinical biomarkers and modifiers of disease severity. To identify coding genetic variants associated with these traits, we conducted meta-analyses of seven RBC phenotypes in 130,273 multi-ethnic individuals from studies genotyped on an exome array. After conditional analyses and replication in 27,480 independent individuals, we identified 16 new RBC variants. We found low-frequency missense variants in MAP1A (rs55707100, minor allele frequency [MAF] = 3.3%, p = 2 × 10⁻¹⁰ for hemoglobin [HGB]) and HNF4A (rs1800961, MAF = 2.4%, p < 3 × 10⁻⁸ for hematocrit [HCT] and HGB). In African Americans, we identified a nonsense variant in CD36 associated with higher RBC distribution width (rs3211938, MAF = 8.7%, p = 7 × 10⁻¹¹) and showed that it is associated with lower CD36 expression and strong allelic imbalance in ex vivo differentiated human erythroblasts. We also identified a rare missense variant in ALAS2 (rs201062903, MAF = 0.2%) associated with lower mean corpuscular volume and mean corpuscular hemoglobin (p < 8 × 10⁻⁹). Mendelian mutations in ALAS2 are a cause of sideroblastic anemia and erythropoietic protoporphyria. Gene-based testing highlighted three rare missense variants in PKLR, a gene mutated in Mendelian non-spherocytic hemolytic anemia, associated with HGB and HCT (SKAT p < 8 × 10⁻⁷). These rare, low-frequency, and common RBC variants showed pleiotropy, being also associated with platelet, white blood cell, and lipid traits. Our association results and functional annotation suggest the involvement of new genes in human erythropoiesis. We also confirm that rare and low-frequency variants play a role in the architecture of complex human traits, although their phenotypic effect is generally smaller than originally anticipated.

We conducted a genome-wide association study (GWAS) on multiple sclerosis (MS) susceptibility in German cohorts with 4888 cases and 10,395 controls. In addition to associations within the major histocompatibility complex (MHC) region, 15 non-MHC loci reached genome-wide significance. Four of these loci are novel MS susceptibility loci. They map to the genes L3MBTL3, MAZ, ERG, and SHMT1. The lead variant at SHMT1 was replicated in an independent Sardinian cohort. Products of the genes L3MBTL3, MAZ, and ERG play important roles in immune cell regulation. SHMT1 encodes a serine hydroxymethyltransferase catalyzing the transfer of a carbon unit to the folate cycle. This reaction is required for regulation of methylation homeostasis, which is important for establishment and maintenance of epigenetic signatures. Our GWAS approach in a defined population with limited genetic substructure detected associations not found in larger, more heterogeneous cohorts, thus providing new clues regarding MS pathogenesis.

It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense.

Lower muscle strength in midlife predicts disability and mortality in later life. Blood-borne factors, including growth differentiation factor 11 (GDF11), have been linked to muscle regeneration in animal models. We aimed to identify gene transcripts associated with muscle strength in adults. We performed a meta-analysis of whole blood gene expression (overall 17,534 unique genes measured by microarray) and hand-grip strength in four independent cohorts (n = 7,781, ages: 20-104 yr, weighted mean = 56), adjusted for age, sex, height, weight, and leukocyte subtypes. Separate analyses were performed in subsets (older/younger than 60, men/women). Expression levels of 221 genes were associated with strength after adjustment for cofactors and for multiple statistical testing, including ALAS2 (rate-limiting enzyme in heme synthesis), PRF1 (perforin, a cytotoxic protein associated with inflammation), IGF1R, and IGF2BP2 (both insulin-like growth factor related). We identified statistical enrichment for hemoglobin biosynthesis, innate immune activation, and the stress response. Ten genes were associated only in younger individuals, four in men only and one in women only. For example, PIK3R2 (a negative regulator of PI3K/AKT growth pathway) was negatively associated with muscle strength in younger (<60 yr) individuals but not in older (≥ 60 yr) individuals. We also show that 115 genes (52%) have not previously been linked to muscle in NCBI PubMed abstracts. This first large-scale transcriptome study of muscle strength in human adults confirmed associations with known pathways and provides new evidence for over half of the genes identified. There may be age- and sex-specific gene expression signatures in blood for muscle strength.

Non-cellular blood circulating microRNAs (plasma miRNAs) represent a promising source for the development of prognostic and diagnostic tools owing to their minimally invasive sampling, high stability, and simple quantification by standard techniques such as RT-qPCR. So far, the majority of association studies involving plasma miRNAs were disease-specific case-control analyses. In contrast, in the present study, plasma miRNAs were analysed in a sample of 372 individuals from a population-based cohort study, the Study of Health in Pomerania (SHIP). Quantification of miRNA levels was performed by RT-qPCR using the Exiqon Serum/Plasma Focus microRNA PCR Panel V3.M covering 179 different miRNAs. Of these, 155 were included in our analyses after quality control. Associations between plasma miRNAs and the phenotypes age, body mass index (BMI), and sex were assessed via a two-step linear regression approach per miRNA. The first step regressed out the technical parameters and the second step determined the remaining associations between the respective plasma miRNA and the phenotypes of interest. After regressing out technical parameters and adjusting for the respective other two phenotypes, 7, 15, and 35 plasma miRNAs were significantly (q < 0.05) associated with age, BMI, and sex, respectively. Additional adjustment for the blood cell parameters identified 12 and 19 miRNAs to be significantly associated with age and BMI, respectively. Most of the BMI-associated miRNAs likely originate from the liver. Sex-associated differences in miRNA levels were largely determined by differences in blood cell parameters. Thus, only 7 as compared to originally 35 sex-associated miRNAs displayed sex-specific differences after adjustment for blood cell parameters. These findings emphasize that circulating miRNAs are strongly impacted by age, BMI, and sex. Hence, these parameters should be considered as covariates in association studies based on plasma miRNA levels.
The established experimental and computational workflow can now be used in future screening studies to determine associations of plasma miRNAs with defined disease phenotypes.
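
The two-step regression idea described in this abstract can be illustrated with a minimal sketch. It is not the study's actual pipeline: it assumes a single technical parameter and a single phenotype, uses plain ordinary least squares, and omits the multiple-testing correction. The function names `fit_line`, `residuals`, and `two_step_association` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x, y):
    """Part of y that is not explained by a linear fit on x."""
    slope, intercept = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def two_step_association(level, technical, phenotype):
    """Step 1: regress out the technical parameter from the miRNA level.
    Step 2: regress the adjusted level on the phenotype of interest.
    Returns the step-2 slope (effect estimate)."""
    adjusted = residuals(technical, level)
    slope, _ = fit_line(phenotype, adjusted)
    return slope


# Toy data (invented for illustration): level = 3 + 2*technical + 0.5*phenotype
technical = [1, -1, 1, -1]
phenotype = [1, 1, -1, -1]
level = [5.5, 1.5, 4.5, 0.5]
effect = two_step_association(level, technical, phenotype)
```

In this toy data the technical parameter and the phenotype are uncorrelated, so the second step recovers the phenotype effect exactly; with correlated covariates a residual-based two-step estimate can differ from a joint model.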

Fibrinogen, coagulation factor VII (FVII), and factor VIII (FVIII) and its carrier von Willebrand factor (vWF) play key roles in hemostasis. Previously identified common variants explain only a small fraction of the trait heritabilities, and additional variations may be explained by associations with rarer variants with larger effects. The aim of this study was to identify low-frequency (minor allele frequency [MAF] ≥0.01 and <0.05) and rare (MAF <0.01) variants that influence plasma concentrations of these 4 hemostatic factors by meta-analyzing exome chip data from up to 76,000 participants of 4 ancestries. We identified 12 novel associations of low-frequency (n = 2) and rare (n = 10) variants across the fibrinogen, FVII, FVIII, and vWF traits that were independent of previously identified associations. Novel loci were found within previously reported genes and had effect sizes much larger than and independent of previously identified common variants. In addition, associations at KCNT1, HID1, and KATNB1 identified new candidate genes related to hemostasis for follow-up replication and functional genomic analysis. Newly identified low-frequency and rare-variant associations accounted for modest amounts of trait variance and therefore are unlikely to increase predicted trait heritability but provide new information for understanding individual variation in hemostasis pathways.
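
The frequency classes used in this abstract can be written down as a small helper. This is only a sketch: the thresholds for low-frequency and rare variants are those stated above, whereas labelling everything with MAF ≥ 0.05 as "common" is an assumption, and the function names are hypothetical.

```python
def minor_allele_frequency(alt_allele_count, total_allele_count):
    """MAF is the frequency of the less common of two alleles."""
    freq = alt_allele_count / total_allele_count
    return min(freq, 1.0 - freq)


def classify_variant(maf):
    """Classify a variant by minor allele frequency (MAF)."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf < 0.01:
        return "rare"            # MAF < 0.01, as in the abstract
    if maf < 0.05:
        return "low-frequency"   # 0.01 <= MAF < 0.05, as in the abstract
    return "common"              # assumption: complement of the two classes


# Example: 30 copies of the alternate allele among 2,000 chromosomes
example_maf = minor_allele_frequency(30, 2000)
example_class = classify_variant(example_maf)
```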

Genome-wide association studies with metabolic traits (mGWAS) uncovered many genetic variants that influence human metabolism. These genetically influenced metabotypes (GIMs) contribute to our metabolic individuality, our capacity to respond to environmental challenges, and our susceptibility to specific diseases. While metabolic homeostasis in blood is a well-investigated topic in large mGWAS with over 150 known loci, metabolic detoxification through urinary excretion has only been addressed by a few small mGWAS with only 11 associated loci so far. Here we report the largest mGWAS to date, combining targeted and non-targeted ¹H NMR analysis of urine samples from 3,861 participants of the SHIP-0 cohort and 1,691 subjects of the KORA F4 cohort. We identified and replicated 22 loci with significant associations with urinary traits, 15 of which are new (HIBCH, CPS1, AGXT, XYLB, TKT, ETNPPL, SLC6A19, DMGDH, SLC36A2, GLDC, SLC6A13, ACSM3, SLC5A11, PNMT, SLC13A3). Two-thirds of the urinary loci also have a metabolite association in blood. For all but one of the 6 loci where significant associations target the same metabolite in blood and urine, the genetic effects have the same direction in both fluids. In contrast, for the SLC5A11 locus, we found increased levels of myo-inositol in urine whereas mGWAS in blood reported decreased levels for the same genetic variant. This might indicate less effective re-absorption of myo-inositol in the kidneys of carriers. In summary, our study more than doubles the number of known loci that influence urinary phenotypes. It thus allows novel insights into the relationship between blood homeostasis and its regulation through excretion. The newly discovered loci also include variants previously linked to chronic kidney disease (CPS1, SLC6A13), pulmonary hypertension (CPS1), and ischemic stroke (XYLB).
By establishing connections from gene to disease via metabolic traits our results provide novel hypotheses about molecular mechanisms involved in the etiology of diseases.

Homozygosity has long been associated with rare, often devastating, Mendelian disorders, and Darwin was one of the first to recognize that inbreeding reduces evolutionary fitness. However, the effect of the more distant parental relatedness that is common in modern human populations is less well understood. Genomic data now allow us to investigate the effects of homozygosity on traits of public health importance by observing contiguous homozygous segments (runs of homozygosity), which are inferred to be homozygous along their complete length. Given the low levels of genome-wide homozygosity prevalent in most human populations, information is required on very large numbers of people to provide sufficient power. Here we use runs of homozygosity to study 16 health-related quantitative traits in 354,224 individuals from 102 cohorts, and find statistically significant associations between summed runs of homozygosity and four complex traits: height, forced expiratory lung volume in one second, general cognitive ability and educational attainment (P < 1 × 10⁻³⁰⁰, 2.1 × 10⁻⁶, 2.5 × 10⁻¹⁰ and 1.8 × 10⁻¹⁰, respectively). In each case, increased homozygosity was associated with decreased trait value, equivalent to the offspring of first cousins being 1.2 cm shorter and having 10 months’ less education. Similar effect sizes were found across four continental groups and populations with different degrees of genome-wide homozygosity, providing evidence that homozygosity, rather than confounding, directly contributes to phenotypic variance. Contrary to earlier reports in substantially smaller samples, no evidence was seen of an influence of genome-wide homozygosity on blood pressure and low density lipoprotein cholesterol, or ten other cardio-metabolic traits.
Since directional dominance is predicted for traits under directional evolutionary selection, this study provides evidence that increased stature and cognitive function have been positively selected in human evolution, whereas many important risk factors for late-onset complex diseases may not have been.
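
To make the notion of summed runs of homozygosity concrete, the following sketch scans genotype calls along one chromosome and sums the lengths of contiguous homozygous segments. It is a simplified illustration, not the calling procedure used in the study: it assumes genotypes coded as 0, 1, or 2 copies of an allele (1 being heterozygous), tolerates no heterozygous or missing calls inside a run, and uses a hypothetical `min_length` cut-off in base pairs.

```python
def runs_of_homozygosity(positions, genotypes, min_length):
    """Return (start, end) positions of contiguous homozygous segments
    spanning at least min_length base pairs."""
    runs = []
    start = None
    last = None
    for pos, genotype in zip(positions, genotypes):
        if genotype in (0, 2):          # homozygous call extends the run
            if start is None:
                start = pos
            last = pos
        else:                           # any other call ends the run
            if start is not None and last - start >= min_length:
                runs.append((start, last))
            start = None
    if start is not None and last - start >= min_length:
        runs.append((start, last))      # run reaching the last marker
    return runs


def summed_roh(runs):
    """Total length covered by the detected runs."""
    return sum(end - start for start, end in runs)


# Toy example with invented marker positions and genotype calls
positions = [100, 200, 300, 400, 500, 600, 700]
genotypes = [0, 2, 2, 1, 0, 0, 1]
runs = runs_of_homozygosity(positions, genotypes, min_length=150)
total = summed_roh(runs)
```

In the toy example the first segment (positions 100 to 300) passes the length cut-off, while the second (500 to 600) is too short and is dropped. Dividing the summed length by the length of the genome covered by markers would give a genome-wide homozygosity fraction.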

Excess body weight is a major risk factor for cardiometabolic diseases. The complex molecular mechanisms of body weight change-induced metabolic perturbations are not fully understood. Specifically, in-depth molecular characterization of long-term body weight change in the general population is lacking. Here, we pursued a multi-omic approach to comprehensively study metabolic consequences of body weight change during a seven-year follow-up in a large prospective study. We used data from the population-based Cooperative Health Research in the Region of Augsburg (KORA) S4/F4 cohort. At follow-up (F4), two-platform serum metabolomics and whole blood gene expression measurements were obtained for 1,631 and 689 participants, respectively. Using weighted correlation network analysis, omics data were clustered into modules of closely connected molecules, followed by the formation of a partial correlation network from the modules. Association of the omics modules with previous annual percentage weight change was then determined using linear models. In addition, we performed pathway enrichment analyses, stability analyses, and assessed the relation of the omics modules with clinical traits. Four metabolite and two gene expression modules were significantly and stably associated with body weight change (P-values ranging from 1.9 × 10⁻⁴ to 1.2 × 10⁻²⁴). The four metabolite modules covered major branches of metabolism, with VLDL, LDL and large HDL subclasses, triglycerides, branched-chain amino acids and markers of energy metabolism among the main representative molecules. One gene expression module suggests a role of weight change in red blood cell development. The other gene expression module largely overlaps with the lipid-leukocyte (LL) module previously reported to interact with serum metabolites, for which we identify additional co-expressed genes. The omics modules were interrelated and showed cross-sectional associations with clinical traits.
Moreover, weight gain and weight loss showed largely opposing associations with the omics modules. Long-term weight change in the general population globally associates with serum metabolite concentrations. An integrated metabolomics and transcriptomics approach improved the understanding of molecular mechanisms underlying the association of weight gain with changes in lipid and amino acid metabolism, insulin sensitivity, mitochondrial function as well as blood cell development and function.

Alzheimer’s disease (AD) is a devastating neurodegenerative disorder characterized by early intraneuronal amyloid-β (Aβ) accumulation, extracellular deposition of Aβ peptides, and intracellular hyperphosphorylated tau aggregates. These lesions cause dendritic and synaptic alterations and induce an inflammatory response in the diseased brain. Although the neuropathological characteristics of AD have been known for decades, the molecular mechanisms causing the disease are still under investigation. Studying gene expression changes in postmortem AD brain tissue can yield new insights into the molecular disease mechanisms. To that end, one can employ transgenic AD mouse models and next-generation sequencing technology. In this study, a whole-brain transcriptome analysis was carried out using the well-characterized APP/PS1KI mouse model for AD. These mice display a robust phenotype reflected by working memory deficits at 6 months of age, a significant neuron loss in a variety of brain areas including the CA1 region of the hippocampus, and a severe amyloid pathology. Based on deep sequencing, differentially expressed genes (DEGs) between 6-month-old WT or PS1KI and APP/PS1KI were identified and verified by qRT-PCR. Compared to WT mice, 250 DEGs were found in APP/PS1KI mice, while 186 DEGs could be found compared to PS1KI control mice. Most of the DEGs were upregulated in APP/PS1KI mice and belong to either inflammation-associated pathways or lysosomal activation, which is likely due to the robust intraneuronal accumulation of Aβ in this mouse model. Our comprehensive brain transcriptome study further highlights APP/PS1KI mice as a valuable model for AD, covering molecular inflammatory and immune responses.

Individualized Medicine aims at providing optimal treatment for an individual patient at a given time based on their specific genetic and molecular characteristics. This requires excellent clinical stratification of patients as well as the availability of genomic data and biomarkers as prerequisites for the development of novel diagnostic tools and therapeutic strategies. The University Medicine Greifswald, Germany, has launched the “Greifswald Approach to Individualized Medicine” (GANI_MED) project to address major challenges of Individualized Medicine. Herein, we describe the implementation of the scientific and clinical infrastructure that allows future translation of findings relevant to Individualized Medicine into clinical practice. Clinical patient cohorts (N > 5,000) with an emphasis on metabolic and cardiovascular diseases are being established following a standardized protocol for the assessment of medical history, laboratory biomarkers, and the collection of various biosamples for bio-banking purposes. A multi-omics based biomarker assessment including genome-wide genotyping, transcriptome, metabolome, and proteome analyses complements the multi-level approach of GANI_MED. Comparisons with the general background population as characterized by our Study of Health in Pomerania (SHIP) are performed. A central data management structure has been implemented to capture and integrate all relevant clinical data for research purposes. Ethical research projects on informed consent procedures, reporting of incidental findings, and economic evaluations were launched in parallel.

One of the central research questions on the etiology of Alzheimer’s disease (AD) is the elucidation of the molecular signatures triggered by the amyloid cascade of pathological events. Next-generation sequencing allows the identification of genes involved in disease processes in an unbiased manner. We have combined this technique with the analysis of two AD mouse models: (1) The 5XFAD model develops early plaque formation, intraneuronal Aβ aggregation, neuron loss, and behavioral deficits. (2) The Tg4-42 model expresses N-truncated Aβ4-42 and develops neuron loss and behavioral deficits, albeit without plaque formation. Our results show that learning and memory deficits in the Morris water maze and fear conditioning tasks in Tg4-42 mice at 12 months of age are similar to the deficits in 5XFAD animals. This suggested that comparative gene expression analysis between the models would allow the dissection of plaque-related and -unrelated disease-relevant factors. Using deep sequencing, differentially expressed genes (DEGs) were identified and subsequently verified by quantitative PCR. Nineteen DEGs were identified in pre-symptomatic young 5XFAD mice, and none in young Tg4-42 mice. In the aged cohort, 131 DEGs were found in 5XFAD and 56 DEGs in Tg4-42 mice. Many of the DEGs specific to the 5XFAD model belong to neuroinflammatory processes typically associated with plaques. Interestingly, 36 DEGs were identified in both mouse models, indicating common disease pathways associated with behavioral deficits and neuron loss.

Growth factor receptor-mediated signaling is now recognized as a complex signaling network, which is initiated by recruiting specific patterns of adaptor proteins to the intracellular domain of epidermal growth factor receptor (EGFR). Approaches to globally identify EGFR-binding proteins are required to elucidate this network. We affinity-purified EGFR with its interacting proteins by coprecipitation from lysates of A431 cells. A total of 183 proteins were repeatedly detected in high-resolution MS measurements. For 15 of these, direct interactions with EGFR were listed in the iRefIndex interaction database, including Grb2, shc-1, SOS1 and 2, STAT 1 and 3, AP2, UBS3B, and ERRFI. The newly developed Cytoscape plugin ModuleGraph allowed retrieving and visualizing 93 well-described protein complexes that contained at least one of the proteins found to interact with EGFR in our experiments. Abundances of 14 proteins were modulated more than twofold upon EGFR activation, of which the clathrin-associated adaptor complex AP-2 showed 4.6-fold enrichment. These proteins were further annotated with different cellular compartments. Finally, interactions of AP-2 proteins and the newly discovered interaction of CIP2A could be verified. In conclusion, a powerful technique is presented that allowed identification and quantitative assessment of the EGFR interactome to provide further insight into EGFR signaling.

The prioritization of candidate disease genes is often based on integrated datasets and their network representation, with genes as nodes connected by edges for biological relationships. However, the majority of prioritization methods do not allow for a straightforward integration of the user’s own input data. Therefore, we developed the Cytoscape plugin NetworkPrioritizer, which specifically supports the integrative network-based prioritization of candidate disease genes or other molecules. Our versatile software tool computes a number of important centrality measures to rank nodes based on their relevance for network connectivity and provides different methods to aggregate and compare rankings. NetworkPrioritizer and the online documentation are freely available at http://www.networkprioritizer.de
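
The general idea of ranking nodes by centrality and aggregating the rankings can be sketched as follows. This is not NetworkPrioritizer's code or API: the choice of degree and closeness centrality, the mean-rank aggregation, the alphabetical tie-breaking, and all function and node names are assumptions made for illustration.

```python
from collections import deque


def degree_centrality(graph):
    """Number of neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def ranks(scores):
    """Rank 1 is the highest score; ties are broken alphabetically."""
    ordered = sorted(scores, key=lambda n: (-scores[n], n))
    return {node: position for position, node in enumerate(ordered, start=1)}


def aggregate(rankings):
    """Order nodes by their mean rank across all rankings."""
    nodes = list(rankings[0])
    mean_rank = {n: sum(r[n] for r in rankings) / len(rankings) for n in nodes}
    return sorted(nodes, key=lambda n: (mean_rank[n], n))


# Toy undirected network with invented node names (adjacency sets)
graph = {
    "HUB": {"A", "B", "C"},
    "A": {"HUB"},
    "B": {"HUB"},
    "C": {"HUB", "D"},
    "D": {"C"},
}
priority = aggregate([ranks(degree_centrality(graph)),
                      ranks(closeness_centrality(graph))])
```

In this toy network both measures agree that `HUB` is the most central node, followed by `C`; in larger networks the measures often disagree, which is where an aggregation step becomes useful.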

Many efforts are still devoted to the discovery of genes involved with specific phenotypes, in particular, diseases. High-throughput techniques are thus applied frequently to detect dozens or even hundreds of candidate genes. However, the experimental validation of many candidates is often an expensive and time-consuming task. Therefore, a great variety of computational approaches has been developed to support the identification of the most promising candidates for follow-up studies. The biomedical knowledge already available about the disease of interest and related genes is commonly exploited to find new gene-disease associations and to prioritize candidates. In this review, we highlight recent methodological advances in this research field of candidate gene prioritization. We focus on approaches that use network information and integrate heterogeneous data sources. Furthermore, we discuss current benchmarking procedures for evaluating and comparing different prioritization methods.

Proteins and their interactions are essential for the survival of each human cell. Knowledge of their tissue occurrence is important for understanding biological processes. Therefore, we analyzed microarray and high-throughput RNA-sequencing data to identify tissue-specific and universally expressed genes. Gene expression data were used to investigate the presence of proteins, protein interactions and protein complexes in different tissues. Our comparison shows that the detection of tissue-specific genes and proteins strongly depends on the applied measurement technique. We found that microarrays are less sensitive for lowly expressed genes than high-throughput sequencing. Functional analyses based on microarray data are thus biased towards highly expressed genes. This also means that previous biological findings based on microarrays might have to be re-examined using high-throughput sequencing results.