Linkage Analysis

Linkage analysis is used to measure genetic distances between loci of interest and to determine the order of more than two loci on a chromosomal segment.

From: The Laboratory Mouse, 2004

Linkage Analysis

Magnus Dehli Vigeland, in Pedigree Analysis in R, 2021

9.1.5 Multipoint Analysis

A practical challenge in linkage analysis is the limited information carried by a single marker. Recall from the example in Section 9.1 that heterozygosity at the marker locus is a prerequisite for inferring recombination. This is not well supported by the SNP markers on most commercially available arrays, whose average heterozygosity is only around 25%. A remedy to this problem is to consider several adjacent markers simultaneously, referred to as multipoint analysis.
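As a rough illustration of why a single SNP carries limited information, the expected heterozygosity of a marker under Hardy-Weinberg proportions is one minus the sum of the squared allele frequencies. The sketch below is a hypothetical helper written for this illustration (it is not part of the ped suite), and the allele frequencies are made-up values.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), assuming Hardy-Weinberg proportions."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Illustrative (assumed) allele frequencies, not taken from any real array.
typical_snp = expected_heterozygosity([0.85, 0.15])   # biallelic SNP with a rare allele
best_case_snp = expected_heterozygosity([0.5, 0.5])   # upper bound for two alleles
microsatellite = expected_heterozygosity([0.2] * 5)   # five equally common alleles

print(round(typical_snp, 3), round(best_case_snp, 3), round(microsatellite, 3))
```

A biallelic marker can never exceed a heterozygosity of 0.5, which is one reason several adjacent SNPs are combined in multipoint analysis.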

A drawback of multipoint analysis is the immense computational challenge. For example, the complexity of the Elston–Stewart algorithm, on which the ped suite packages are based, grows exponentially in the number of markers, making it unsuited for multipoint analysis.

A better algorithm for multipoint analysis was proposed by Lander and Green (1987). Based on an ingenious hidden Markov model, it can handle virtually any number of markers, with the trade-off that the pedigree must be fairly small. The current gold standard implementation of the Lander–Green algorithm is the MERLIN software (Abecasis et al., 2002), which handles pedigrees with up to 20–25 members. The paramlink2 package offers functionality for running MERLIN from within R, as we demonstrate in Section 9.3.5.


URL:

https://www.sciencedirect.com/science/article/pii/B9780128244302000090

Analysis of Genetic Linkage

Rita M. Cantor, in Emery and Rimoin's Principles and Practice of Medical Genetics and Genomics (Seventh Edition), 2019

8.2.1 Recombination: Biological Basis of Linkage Analysis

Linkage analysis is based on the biological phenomenon of genetic recombination, which occurs in the parental gametes during the process of meiosis before the eggs and sperm are produced. In a parental gamete, when a pair of chromosomes, one from each grandparent, aligns in the first metaphase, an exchange of chromosomal material often occurs via a crossover event, with the crossover location thought to be determined by chance. This recombination of genetic material results in chromosomes different from those that would be inherited from either parent alone. Thus, each child inherits a unique set of chromosomes that are recombinants of the grandparents'. Linkage analysis is based on identifying recombination events between genetic markers and trait loci and inferring whether a trait and marker alleles are traveling in close proximity on the same chromosome or are farther away or on different chromosomes. The fundamental principle of linkage analysis is that for any two loci on the same chromosome, the closer they are to each other, the less likely it is that they will undergo recombination. Linked genes are those located close enough to each other on a chromosome that an expected crossover rate within the genetic material separating them at meiosis is less than 50%. Although recombination rates are not uniform across the genome, this principle has provided an effective biological model for linkage analysis.


URL:

https://www.sciencedirect.com/science/article/pii/B9780128125373000081

Skeletal Genetics

Dongbing Lai, ... Kenneth E. White, in Basic and Applied Bone Biology (Second Edition), 2019

Linkage Analysis

Linkage analysis is the classic genetic method. It uses families with multiple affected members, preferably large pedigrees, and tests the co-segregation of a chromosome region marked by polymorphic genetic markers with a trait locus. This is realized by utilizing recombination, which is DNA exchange between non-sister chromatids of homologous chromosomes via crossover during normal meiosis. If a marker is close to a trait locus, the rate of recombination is low, and thus the marker is more likely to be co-inherited with the trait locus than a marker farther from it. A logarithm of the odds (LOD) score is calculated to assess the statistical significance of the estimated recombination rate. LOD scores ≥3 are considered statistically significant. To perform whole-genome linkage analysis, hundreds of microsatellite markers (multiallelic markers) or thousands of single-nucleotide polymorphism markers (SNPs, which have only two alleles) evenly spaced across the entire genome are needed.
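The LOD score can be made concrete with a minimal sketch for the simplest textbook situation: fully informative, phase-known meioses in which recombinants can be counted directly. The function name and the counts below are assumptions for illustration; real pedigrees require likelihood calculations performed by dedicated linkage software.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    Compares the likelihood at recombination fraction theta with the
    likelihood under free recombination (theta = 0.5), on a log10 scale.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical data: 1 recombinant observed among 20 informative meioses.
z = lod_score(recombinants=1, meioses=20, theta=0.05)
print(f"LOD at theta = 0.05: {z:.2f}")
print("meets the LOD >= 3 threshold:", z >= 3.0)
```

Under these assumed counts the score exceeds 3, so the marker and trait locus would be declared linked by the conventional threshold.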

There are two major classes of linkage analysis: parametric and nonparametric. Parametric linkage analysis is the traditional method. A disease model such as dominant, additive, or recessive is specified, and usually large pedigrees that show a clear Mendelian inheritance pattern are analyzed. In nonparametric linkage analysis, the disease model is left unspecified. This approach is used in analyses of multiple small pedigrees, in which the Mendelian inheritance pattern is hard to determine. Regions identified by linkage analysis are usually large and include multiple genes, so fine mapping using additional genetic markers is necessary to narrow down the regions that harbor the causal genes.

Many genes related to bone phenotypes have been identified using linkage approaches, such as LRP5 for osteoporosis-pseudoglioma syndrome, SOST for sclerosteosis and van Buchem disease, and BMP2 for osteoporosis. Although carriers of mutations in these genes are rare, these findings helped to identify pathways that are important to bone phenotypes, such as the Wnt signaling pathway, and shed light on the etiologies of bone disorders. However, in general, linkage analysis is underpowered to detect variants with small effects in common/complex diseases and has not been widely used in the GWAS era (since 2005). Recently, with the reduced cost of sequencing technology, researchers are revisiting linkage approaches to analyze family-based sequence data to identify rare variants linked to the phenotype of interest.


URL:

https://www.sciencedirect.com/science/article/pii/B9780128132593000099

Paracellular Channel in Human Disease

Jianghui Hou, in The Paracellular Channel, 2019

Abstract

Linkage analysis and positional cloning have led to the discovery of TJ genes and mutant alleles that cause various human diseases. More than 10 Mendelian diseases related to paracellular channel dysfunction have been solved on the molecular level. These include genetic disorders affecting the skin, the liver, the kidney, the inner ear, and the blood-brain barrier. In each case, identification of a gene sparks intense investigation of the cellular mechanism that can relate a genotype to a phenotype. Knowledge of the causal mechanism for disease not only facilitates the development of new treatment but also provides critical insight into the basic biology and physiology of the paracellular channel. For example, the paracellular pathway in the kidney is particularly important for mineral metabolism. Genetic variations in three claudin genes, claudin-14, claudin-16, and claudin-19, are associated with renal diseases of Ca++ and Mg++ imbalance. Virtually every aspect of claudin biology, e.g. gene transcription, protein translation, trafficking, interaction, and transport function, plays an important role in paracellular channelopathy.


URL:

https://www.sciencedirect.com/science/article/pii/B9780128146354000085

The quest for genetic sequence variants conferring risk of endometriosis

Sun-Wei Guo, in Human Reproductive Genetics, 2020

Linkage analysis

Linkage analysis is a statistical genetic method that aims to identify chromosomal regions that cosegregate with a disease of interest through pedigrees [52]. In this approach, one does not need to know anything about the molecular genetic mechanisms underlying the disease itself. Through the collection of pedigrees enriched with patients with the disease, one could use an existing genetic map and localize the responsible gene in a particular region. However, while this method is very successful in localizing genes responsible for rare genetic disorders such as cystic fibrosis and Huntington's disease in which (1) there is strong evidence for a major gene (unequivocal Mendelian inheritance), and (2) the mode of inheritance is well-elaborated, it was much less successful for more common diseases such as diabetes and cardiovascular diseases, in which multiple genes are apparently involved and the mode of inheritance is often obscure.

As endometriosis is quite common with no apparent mode of inheritance, the only published linkage analysis of endometriosis attempted to replicate the linkage between endometriosis and the galactose-1-phosphate uridyl transferase (GALT) gene and ended with a negative result [53].


URL:

https://www.sciencedirect.com/science/article/pii/B9780128165614000065

DNA Genetic Testing

Ronald J Trent PhD, BSc(Med), MBBS (Sydney), DPhil (Oxon), FRACP, FRCPA, FFSc, FTSE, in Molecular Medicine (Fourth Edition), 2012

Linkage Analysis

Linkage analysis is useful in research strategies such as positional cloning (Chapter 2) but is rarely used for DNA diagnosis because it is an indirect approach to mutation detection. It works by finding co-segregation between DNA polymorphisms and the disease phenotype in members of a family [8]. For linkage analysis it is necessary to have a family under study containing at least one known affected individual, or one confirmed normal member. It is also necessary to have DNA polymorphisms that are located physically close to the gene causing the disease. Once these two prerequisites are available, the inheritance of the different polymorphisms through the family can be followed and individual markers can be linked to the genetic disorder or the normal phenotype (Figure 3.9). Other members of the family or a fetus in utero can then be assessed with the same DNA polymorphisms to predict normal or abnormal phenotypes.

Figure 3.9. DNA linkage study.

Understanding how DNA polymorphisms are used to follow a disease within a family (called linkage analysis) is a difficult concept. Essentially, a polymorphism is used as a surrogate marker for a chromosomal location or gene. In the case of the β globin gene depicted here, each individual has two genes and so two polymorphic markers should be detectable. To undertake linkage analysis the first step involves identifying family-specific DNA polymorphic markers that will distinguish the two β globin genes. The polymorphisms are not mutations but simply DNA sequence changes or fragment sizes that allow the two genes to be distinguished. Once the polymorphisms are identified, they are traced in a family and compared to the clinical phenotypes. In the pedigree given the two parents are β thalassemia carriers. Their carrier status is easily determined by blood counts and special hematology tests for thalassemia. They have a female child who has homozygous β thalassemia (β thalassemia major) (→), and they also have a normal male. The thalassemia status for a third (female) child is (?). The mother is also pregnant and the fetus (indicated by a triangle) has an unknown thalassemia status. Let us assume that the underlying β globin gene mutations cannot be identified in this family. Therefore, linkage analysis is the next approach to use. The polymorphisms which distinguish the two β globin genes in this family are defined by the letters a and b. Both the parents are carriers and have the a/b polymorphic markers. This information alone is not enough for diagnosis. The key individual for this is the homozygous-affected child who is b/b. This shows that the polymorphic marker b identifies the mutant β thalassemia gene in this family. Therefore, it can be assumed that the marker a defines the normal gene. This is confirmed by showing the normal child is a/a. 
The child with the unknown status is a/b and so she must be a carrier (which could have been more appropriately determined through a blood count than a DNA test). The fetus can have three combinations and these will predict the genetic status, i.e. a/a (= normal), b/b (= homozygous-affected) and a/b (= carrier).
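The reasoning in this caption can be sketched in a few lines of code. This is only an illustration under the same assumptions as the figure (both parents are a/b carriers, paternity is correct, and no recombination occurs between the marker and the β globin gene); the function names are hypothetical.

```python
def infer_mutant_marker(affected_genotype):
    """Return the marker allele that tags the mutant gene in this family.

    Relies on a homozygous-affected child: both of that child's genes are
    mutant, so the marker is informative only if both marker alleles match.
    """
    first, second = affected_genotype
    if first != second:
        raise ValueError("affected child is heterozygous at the marker: uninformative")
    return first


def predict_status(genotype, mutant_marker):
    """Predict status from the number of copies of the mutant-tagging marker."""
    copies = list(genotype).count(mutant_marker)
    return {0: "normal", 1: "carrier", 2: "homozygous-affected"}[copies]


# The homozygous-affected child in the pedigree is b/b.
mutant = infer_mutant_marker(("b", "b"))

# The three marker combinations the fetus could carry.
predictions = {g: predict_status(g, mutant) for g in [("a", "a"), ("a", "b"), ("b", "b")]}
for genotype, status in predictions.items():
    print("/".join(genotype), "=", status)
```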

It may be difficult to get families with phenotypes that are unequivocal, and so a linkage study involves a lot of work. It will not always be possible to undertake such studies, because key family members might be unavailable. DNA polymorphisms can also be uninformative if they do not allow disease and normal phenotypes to be distinguished. Linkage studies have a number of intrinsic problems including: (1) Non-paternity, which will give a false connection between a DNA polymorphism and the disease gene being studied, and (2) Recombination of DNA segments – which is a function of the distance between a polymorphic marker and the gene of interest. Although oversimplified, a physical distance of 1 Mb in DNA is roughly equivalent to a genetic distance of 1 cM (cM = centimorgan). 1 cM indicates a 1% recombination potential – i.e. in 100 meioses there will be, on average, one recombination event between the DNA polymorphism and the target DNA of interest. The use of intragenic polymorphisms such as SNPs located within the introns or exons of genes, or microsatellites found within introns or polymorphisms located in the immediate 5′ or 3′ region of genes reduces the risk of recombination.
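The 1 cM ≈ 1% rule holds only over short distances, because the recombination fraction cannot exceed 50%. One common way to convert between the two is Haldane's map function, which assumes crossovers occur independently (no interference). The text above does not specify a map function, so its use here is an assumption made purely for illustration.

```python
import math


def haldane_theta(distance_cm):
    """Recombination fraction for a genetic distance in cM (Haldane map function)."""
    morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def haldane_distance_cm(theta):
    """Genetic distance in cM for a recombination fraction below 0.5."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * theta)


for d in (1, 10, 50, 200):
    print(f"{d:>3} cM -> theta = {haldane_theta(d):.4f}")
```

At 1 cM the recombination fraction is very close to 0.01, while at large distances it levels off below 0.5, which is why distant loci on the same chromosome behave as if unlinked.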

Another trick when using DNA polymorphisms is to group a number across a segment into a haplotype. In other words, a single DNA polymorphism may not be informative, but when it is used in conjunction with other polymorphisms, its value increases. As well as increasing the informativeness of polymorphisms, haplotypes help to identify recombination events (Figure 3.10).

Figure 3.10. Detecting recombination using flanking DNA markers in the adult polycystic kidney disease locus (PKD1).

(1) The three polymorphic markers and their alleles for the PKD1 locus are: a or b; c or d; e or f. The open box (□) is the normal gene and its associated polymorphisms are a,c,e; the filled box (■) is the mutant gene and its associated polymorphisms are b,d,f. (2) The pedigree illustrates the segregation patterns for the above three polymorphisms. I-1 (female) has PKD1. Two of her children (II-1, II-2) are clinically affected, and so they allow the mutant-specific haplotype to be identified as bdf/ since this is what the three have in common. The one male offspring (II-3) has not inherited the maternal bdf/ haplotype which is consistent with his normal phenotype at age 50 years. The remaining female sibling (II-4) is a problem. Her adf/adf genotype does not fit. Non-paternity is unlikely since it is the maternal haplotype that is the problem. This is an example of recombination that has occurred somewhere between the a/b and the c/d loci (shown in panel 3). The mutant-specific haplotype has now become adf/ rather than bdf/. Therefore, II-4 has actually inherited the PKD1 mutation which would have been missed if only one set of polymorphisms (a/b) had been used in this linkage study, i.e. the recombination event would not have been detected and II-4 incorrectly diagnosed as normal.


URL:

https://www.sciencedirect.com/science/article/pii/B9780123814517000037

Amyotrophic Lateral Sclerosis

Jemeen Sreedharan, Robert H. Brown Jr., in Rosenberg's Molecular and Genetic Basis of Neurological and Psychiatric Disease (Fifth Edition), 2015

Optineurin

Linkage analysis of consanguineous Japanese families with ALS led to the identification of autosomal recessive mutations in the OPTN gene, which encodes optineurin. 51 Subsequently, rare heterozygous mutations were identified in European ALS cases. 52 Truncation and missense mutations are responsible for less than 1% of fALS. It is clear that there is clinical diversity among patients carrying OPTN mutations. In addition to the ALS cases, some OPTN mutations cause open-angle glaucoma. 53 Moreover, a recent GWAS also highlighted a role for optineurin in Paget disease of bone. 54

Optineurin is a multifunctional transcription factor with roles in Golgi membrane trafficking, inflammation, vasoconstriction, and apoptosis. 55 How its mutations compromise motor neuron viability is unclear. While it is likely that OPTN mutations mediate motor neuron disease via loss of function of the protein, an acquired toxic function cannot be excluded. Some data support the original suggestion that the mutations impair the ability of OPTN to inhibit activation of the nuclear factor kappa-B (NFκB) pathway. 51 The mechanistic complexity is underscored by the finding that both recessive and heterozygous mutations in OPTN cause ALS, although an important observation is that OPTN cases demonstrate TDP-43 inclusions on neuropathological examination. 51


URL:

https://www.sciencedirect.com/science/article/pii/B9780124105294000875

The Human Hypothalamus in Health and Disease

Marshall L. Summar, in Progress in Brain Research, 1992

Linkage analysis.

Linkage analysis was used to determine the statistical relationships of the loci in question by comparing the meiotic recombination rates between loci. The frequency of this recombination (θ) and its associated odds ratio are determined by use of the method of maximum likelihood (Conneally and Rivas, 1980). These calculations were performed using version 4.7 of the LINKAGE programs MLINK, CILINK, LINKMAP, and CLODSCORE (Lathrop and Lalouel, 1984). A LOD score of 3 (odds of 1000/1) was used as the point at which significant evidence of linkage was achieved. Equal recombination rates were assumed in males and females (θm = θf) for the two-point scores, and the map function option was not used. The θs which correspond to LOD(θmax – 1) are used as the confidence interval for this study (Conneally et al., 1985). The two-point linkage scores were calculated using MLINK, and multi-locus mapping was performed using the program LINKMAP. The order of the loci was determined by assigning probabilities to the various orders using the program LINKMAP and selecting the most probable configuration. Using the map distance of 0.15 cM between AVP/OT/PDYN and D3H12, the other loci were tested against this group to determine the most probable order based on the recombination frequencies.
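The maximum likelihood estimate of θ and the interval of θ values whose LOD score lies within one unit of the maximum can be illustrated with a simple grid search. This is a minimal sketch for phase-known recombinant counts, not a reproduction of the LINKAGE programs, which evaluate likelihoods on full pedigrees; the function names and the counts are assumptions.

```python
import math


def lod(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses."""
    k, n = recombinants, meioses
    return (k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
            - n * math.log10(0.5))


def theta_support_interval(recombinants, meioses, drop=1.0, step=0.001):
    """Grid-search estimate of theta and the interval where LOD >= max - drop."""
    grid = [i * step for i in range(1, int(round(0.5 / step)) + 1)]
    scores = [lod(recombinants, meioses, t) for t in grid]
    z_max = max(scores)
    theta_hat = grid[scores.index(z_max)]
    inside = [t for t, z in zip(grid, scores) if z >= z_max - drop]
    return theta_hat, z_max, (min(inside), max(inside))


# Hypothetical data: 2 recombinants among 20 informative meioses.
theta_hat, z_max, (lower, upper) = theta_support_interval(2, 20)
print(f"theta_hat = {theta_hat:.3f}, max LOD = {z_max:.2f}")
print(f"1-LOD support interval: {lower:.3f} to {upper:.3f}")
```

For counted recombinants the estimate is simply the observed proportion of recombinants; the grid search is used here only because it extends naturally to reading off the support interval.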


URL:

https://www.sciencedirect.com/science/article/pii/S0079612308645811

Handbook of Basal Ganglia Structure and Function, Second Edition

J.R. Crittenden, A.M. Graybiel, in Handbook of Behavioral Neuroscience, 2016

B Primary Dystonia Caused by Mutations in GNAL, Encoding the D1 Dopamine Receptor-Coupled Protein, Gαolf

Linkage analysis and whole genome exon sequencing led to the identification of dominant mutations in GNAL as a causative factor for adult-onset primary cervical dystonia (spasmodic torticollis), in which involuntary muscle contractions lead to twisting of the neck and nearby muscles (Fuchs et al., 2013; Vemula et al., 2013). GNAL encodes Gαolf, which in mice is enriched in MSNs, striatal cholinergic interneurons, dopaminergic neurons, cerebellar neurons, and olfactory epithelium (Belluscio et al., 1998). Within the mouse striatum, immunoreactivity for Gαolf is markedly enriched in the neuropil of striosomes, relative to that of the surrounding matrix (Sako et al., 2010).

Gαolf was named for its expression in olfactory neurons, where it transduces olfactory receptor binding to the activation of adenylyl cyclase 3 (Wong et al., 2000). In the striatum, D1 receptors (enriched in dMSNs) and A2A receptors (enriched in iMSNs) are coupled to Gαolf to activate adenylyl cyclase 5. Mice lacking Gαolf exhibit hyperkinesia and severe defects in olfaction (Wong et al., 2000). The finding that dystonia and microsmia co-occur in an African-American family with GNAL mutations (Vemula et al., 2013) supports the possibility that GNAL is essential for the function of homologous neural circuits in mice and humans. Perhaps related to this are the olfactory deficits that occur early in Huntington's disease and Parkinson's disease, psychomotor disorders with dysregulation of Gαolf (Corvol et al., 2004), and striosome–matrix pathology.


URL:

https://www.sciencedirect.com/science/article/pii/B9780128022061000398

Comparative Medical Genetics

Petra Werner, ... Urs Giger, in Clinical Biochemistry of Domestic Animals (Sixth Edition), 2008

2 Genetic Analysis

The development of genome maps allowed for the mapping of genes without further knowledge of their function. Thousands of genetic markers mapped throughout the genome enable genome-wide linkage or association studies looking for at least one of these markers to segregate with the disease. Because the location of the disease gene is initially not known, genetic markers, such as microsatellites and SNPs, covering the whole genome should be analyzed. The greater the number of markers analyzed, usually several hundred to several thousand, the higher the likelihood of finding one of these markers close to the disease locus. Most linkage or association analyses in animals are currently based on microsatellites, but with the increasing number of animal genomes sequenced and analyzed for SNP markers, faster and easier analysis with SNP microarray chips will soon be available for animals, as with the current human Genechip.

a Linkage Analysis

Linkage analysis is based on the same principle of recombination used for genetic linkage mapping. However, unlike a genetic marker, the genotype of the disease locus is not known. Therefore, it is important to know the mode of inheritance of the disorder. Pedigree analysis or experimental breeding can help to identify how a disease is inherited. Single gene diseases are usually easier to evaluate and are commonly classified into Mendelian inheritance patterns as described earlier: autosomal recessive, autosomal dominant, and X-linked inheritance. More complex inheritance patterns are due to the involvement of two or more genes (polygenic) necessary to cause disease, variable penetrance, variable expressivity, and influences from the environment.

Once a mode of inheritance is established, the underlying genotype at the disease locus is inferred and analyzed for linkage with all genetic markers that were tested, which is mostly done with the help of computer programs. If a marker is located close to the disease locus, the result will show no or a very small recombination fraction between the marker and the disease locus. Based on this recombination fraction, a numeric value, called the LOD score, is calculated. This value expresses the likelihood that the result is due to linkage between the tested marker and the disease locus rather than by chance. For example, if the LOD score has a value of 3, this indicates that obtained results are a thousand times (10³) more likely due to linkage between the tested marker and disease than by chance. In most cases, an LOD score ≥3 is statistically significant. Once linkage is established to a marker, the chromosomal region surrounding the marker can be analyzed for potential candidate genes (positional candidate gene approach). Frequently, more markers will have to be analyzed in that area to confirm and further narrow the genome region of interest.

b Association Study

Genotyping data from hundreds of markers analyzed in groups of affected and unaffected animals can be evaluated for differences in allele frequencies in the two groups, thus demonstrating association between a genetic marker and the disease phenotype. If the marker and the disease locus are located close to each other, both loci will be inherited together, through several generations, and recombination between the two will be rare. Consequently, specific alleles of the marker and the disease locus will mostly be found together within the group of affected animals, which means they are associated (they are said to be in linkage disequilibrium). Therefore, an association study compares the frequency of marker alleles within the two groups, and an increased occurrence of a specific marker allele in the group of affected animals indicates that this marker is located at or close to the disease gene.
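The comparison of allele frequencies between the two groups can be sketched with a Pearson chi-square statistic on a 2×2 table of allele counts. The choice of this particular test is an assumption for illustration (the text above does not name one), and the counts are invented.

```python
def allele_chi_square(affected_counts, unaffected_counts):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table.

    Each argument is (count of marker allele 1, count of marker allele 2)
    observed in that group of animals.
    """
    a, b = affected_counts
    c, d = unaffected_counts
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column of the table must be non-empty")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts: 50 affected and 50 unaffected animals, two alleles each.
chi2 = allele_chi_square(affected_counts=(60, 40), unaffected_counts=(40, 60))

# 3.841 is the 5% critical value of the chi-square distribution with 1 df.
print(f"chi-square = {chi2:.2f}; exceeds 5% critical value: {chi2 > 3.841}")
```

In a genome-wide study with hundreds of markers, a stricter threshold that accounts for multiple testing would be needed; the single-test critical value is shown only to keep the example small.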

c Positional Candidate Gene Approach

A major goal of a genome-wide linkage analysis is to find the gene or genes responsible for the development of the disease or phenotype that was used for the study. The markers found to be linked allow the assignment of the disease locus to a chromosomal area, and the more markers that are tested, the narrower the region will become. A small region is desirable to minimize the number of possible candidate genes that needs to be analyzed for mutations. Because the approximate location of the candidate gene is known, this method is called the positional candidate gene approach. Genes coding for products with a known function that could be involved in the development of the disease will be considered first for analysis.


URL:

https://www.sciencedirect.com/science/article/pii/B9780123704917000027