Polymorphism, Single Nucleotide
Most cited papers:
Stacey B Gabriel,
Stephen F Schaffner,
Huy Nguyen,
Jamie M Moore,
Jessica Roy,
Brendan Blumenstiel,
John Higgins,
Matthew DeFelice,
Amy Lochner,
Maura Faggart,
Shau Neen Liu-Cordero,
Charles Rotimi,
Adebowale Adeyemo,
Richard Cooper,
Ryk Ward,
Eric S Lander,
Mark J Daly,
David Altshuler
Whitehead/MIT Center for Genome Research, Cambridge, MA 02139, USA.
Haplotype-based methods offer a powerful approach to disease gene mapping, based on the association between causal mutations and the ancestral haplotypes on which they arose. As part of The SNP Consortium Allele Frequency Projects, we characterized haplotype patterns across 51 autosomal regions (spanning 13 megabases of the human genome) in samples from Africa, Europe, and Asia. We show that the human genome can be parsed objectively into haplotype blocks: sizable regions over which there is little evidence for historical recombination and within which only a few common haplotypes are observed. The boundaries of blocks and specific haplotypes they contain are highly correlated across populations. We demonstrate that such haplotype frameworks provide substantial statistical power in association studies of common genetic variation across each region. Our results provide a foundation for the construction of a haplotype map of the human genome, facilitating comprehensive genetic association studies of human disease.
Mesh-terms: Africa; African Americans; African Continental Ancestry Group :: genetics; Alleles; Asian Continental Ancestry Group :: genetics; China; Chromosome Mapping; Computational Biology; Computer Simulation; Europe; European Continental Ancestry Group :: genetics; Genome, Human; Genotype; Haplotypes; Human; Japan; Linkage Disequilibrium; Models, Genetic; Polymorphism, Single Nucleotide; Recombination, Genetic; Support, Non-U.S. Gov't; Variation (Genetics) ;
J P Hugot,
M Chamaillard,
H Zouali,
S Lesage,
J P Cézard,
J Belaiche,
S Almer,
C Tysk,
C A O'Morain,
M Gassull,
V Binder,
Y Finkel,
A Cortot,
R Modigliani,
P Laurent-Puig,
C Gower-Rousseau,
J Macry,
J F Colombel,
M Sahbatou,
G Thomas
Crohn's disease and ulcerative colitis, the two main types of chronic inflammatory bowel disease, are multifactorial conditions of unknown aetiology. A susceptibility locus for Crohn's disease has been mapped to chromosome 16. Here we have used a positional-cloning strategy, based on linkage analysis followed by linkage disequilibrium mapping, to identify three independent associations for Crohn's disease: a frameshift variant and two missense variants of NOD2, encoding a member of the Apaf-1/Ced-4 superfamily of apoptosis regulators that is expressed in monocytes. These NOD2 variants alter the structure of either the leucine-rich repeat domain of the protein or the adjacent region. NOD2 activates nuclear factor NF-κB; this activating function is regulated by the carboxy-terminal leucine-rich repeat domain, which has an inhibitory role and also acts as an intracellular receptor for components of microbial pathogens. These observations suggest that the NOD2 gene product confers susceptibility to Crohn's disease by altering the recognition of these components and/or by over-activating NF-κB in monocytes, thus documenting a molecular model for the pathogenic mechanism of Crohn's disease that can now be further investigated.
Mesh-terms: Alleles; Carrier Proteins; Chromosomes, Human, Pair 16; Cloning, Molecular; Colitis, Ulcerative :: genetics; Crohn Disease :: etiology; Crohn Disease :: genetics; Gene Frequency; Genetic Predisposition to Disease; Genotype; Human; Leucine; Linkage (Genetics) ; NF-kappa B :: metabolism; Polymorphism, Single Nucleotide; Proteins :: genetics; Repetitive Sequences, Amino Acid; Signal Transduction; Support, Non-U.S. Gov't; Variation (Genetics) ;
R Sachidanandam,
D Weissman,
S C Schmidt,
J M Kakol,
L D Stein,
G Marth,
S Sherry,
J C Mullikin,
B J Mortimore,
D L Willey,
S E Hunt,
C G Cole,
P C Coggill,
C M Rice,
Z Ning,
J Rogers,
D R Bentley,
P Y Kwok,
E R Mardis,
R T Yeh,
B Schultz,
L Cook,
R Davenport,
M Dante,
L Fulton,
L Hillier,
R H Waterston,
J D McPherson,
B Gilman,
S Schaffner,
W J Van Etten,
D Reich,
J Higgins,
M J Daly,
B Blumenstiel,
J Baldwin,
N Stange-Thomann,
M C Zody,
L Linton,
E S Lander,
D Altshuler
We describe a map of 1.42 million single nucleotide polymorphisms (SNPs) distributed throughout the human genome, providing an average density on available sequence of one SNP every 1.9 kilobases. These SNPs were primarily discovered by two projects: The SNP Consortium and the analysis of clone overlaps by the International Human Genome Sequencing Consortium. The map integrates all publicly available SNPs with described genes and other genomic features. We estimate that 60,000 SNPs fall within exon (coding and untranslated regions), and 85% of exons are within 5 kb of the nearest SNP. Nucleotide diversity varies greatly across the genome, in a manner broadly consistent with a standard population genetic model of human history. This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.
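The density figure in the abstract above can be checked with simple arithmetic. The following is a minimal sketch, not part of the paper: it multiplies the two quoted numbers (1.42 million SNPs, one SNP per 1.9 kb) to infer how much "available sequence" the map must have covered. The variable and function names are illustrative, and the implied sequence length is an inference rather than a figure stated in the abstract.

```python
# Figures quoted in the abstract.
snp_count = 1_420_000   # SNPs in the map
spacing_bp = 1_900      # average spacing: one SNP every 1.9 kb

# Implied length of the "available sequence" (an inference, not a quoted figure).
implied_sequence_bp = snp_count * spacing_bp


def snps_per_kb(count: int, sequence_bp: int) -> float:
    """Average number of SNPs per kilobase of sequence."""
    return count / (sequence_bp / 1_000)


print(f"Implied available sequence: {implied_sequence_bp / 1e9:.2f} Gb")
print(f"Density: {snps_per_kb(snp_count, implied_sequence_bp):.3f} SNPs per kb")
```

Under these figures the map would span roughly 2.7 billion bases of sequence, which is the right order of magnitude for a draft human genome.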
Augustine Kong,
Daniel F Gudbjartsson,
Jesus Sainz,
Gudrun M Jonsdottir,
Sigurjon A Gudjonsson,
Bjorgvin Richardsson,
Sigrun Sigurdardottir,
John Barnard,
Bjorn Hallbeck,
Gisli Masson,
Adam Shlien,
Stefan T Palsson,
Michael L Frigge,
Thorgeir E Thorgeirsson,
Jeffrey R Gulcher,
Kari Stefansson
Determination of recombination rates across the human genome has been constrained by the limited resolution and accuracy of existing genetic maps and the draft genome sequence. We have genotyped 5,136 microsatellite markers for 146 families, with a total of 1,257 meiotic events, to build a high-resolution genetic map meant to: (i) improve the genetic order of polymorphic markers; (ii) improve the precision of estimates of genetic distances; (iii) correct portions of the sequence assembly and SNP map of the human genome; and (iv) build a map of recombination rates. Recombination rates are significantly correlated with both cytogenetic structures (staining intensity of G bands) and sequence (GC content, CpG motifs and poly(A)/poly(T) stretches). Maternal and paternal chromosomes show many differences in locations of recombination maxima. We detected systematic differences in recombination rates between mothers and between gametes from the same mother, suggesting that there is some underlying component determined by both genetic and environmental factors that affects maternal recombination rates.
Whitehead Institute/Massachusetts Institute of Technology, Center for Genome Research, Cambridge, Massachusetts, USA. mjdaly@genome.wi.mit.edu
Linkage disequilibrium (LD) analysis is traditionally based on individual genetic markers and often yields an erratic, non-monotonic picture, because the power to detect allelic associations depends on specific properties of each marker, such as frequency and population history. Ideally, LD analysis should be based directly on the underlying haplotype structure of the human genome, but this structure has remained poorly understood. Here we report a high-resolution analysis of the haplotype structure across 500 kilobases on chromosome 5q31 using 103 single-nucleotide polymorphisms (SNPs) in a European-derived population. The results show a picture of discrete haplotype blocks (of tens to hundreds of kilobases), each with limited diversity punctuated by apparent sites of recombination. In addition, we develop an analytical model for LD mapping based on such haplotype blocks. If our observed structure is general (and published data suggest that it may be), it offers a coherent framework for creating a haplotype map of the human genome.
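To make the marker-by-marker LD picture described above concrete, here is a minimal sketch of the two standard pairwise LD statistics, D' and r², computed from phased two-SNP haplotypes. It is not the analytical model developed in the paper; the function name `pairwise_ld`, the 0/1 allele coding, and the toy data are assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence, Tuple


def pairwise_ld(haplotypes: Sequence[Tuple[int, int]]) -> Tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic SNPs.

    Each haplotype is a pair (a, b) of alleles coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    counts = Counter(tuple(h) for h in haplotypes)
    # Frequencies of allele 1 at each SNP, and of the (1, 1) haplotype.
    p_a = sum(c for (a, _), c in counts.items() if a == 1) / n
    p_b = sum(c for (_, b), c in counts.items() if b == 1) / n
    p_ab = counts[(1, 1)] / n
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("both SNPs must be polymorphic in the sample")
    d = p_ab - p_a * p_b
    # D' rescales D by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Toy sample: only three of the four possible haplotypes are observed.
sample = [(1, 1)] * 5 + [(1, 0)] * 1 + [(0, 0)] * 4
d, d_prime, r2 = pairwise_ld(sample)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

In the toy sample D' reaches its maximum because one of the four possible haplotypes is absent, while r² stays below 1 because the two SNPs have different allele frequencies. This dependence on marker frequency is one reason single-marker LD measures can look erratic.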
P Kuehl,
J Zhang,
Y Lin,
J Lamba,
M Assem,
J Schuetz,
P B Watkins,
A Daly,
S A Wrighton,
S D Hall,
P Maurel,
M Relling,
C Brimer,
K Yasuda,
R Venkataramanan,
S Strom,
K Thummel,
M S Boguski,
E Schuetz
Department of Molecular and Cell Biology, University of Maryland at Baltimore, Baltimore, Maryland, USA.
Variation in the CYP3A enzymes, which act in drug metabolism, influences circulating steroid levels and responses to half of all oxidatively metabolized drugs. CYP3A activity is the sum activity of the family of CYP3A genes, including CYP3A5, which is polymorphically expressed at high levels in a minority of Americans of European descent and Europeans (hereafter collectively referred to as 'Caucasians'). Only people with at least one CYP3A5*1 allele express large amounts of CYP3A5. Our findings show that single-nucleotide polymorphisms (SNPs) in CYP3A5*3 and CYP3A5*6 that cause alternative splicing and protein truncation result in the absence of CYP3A5 from tissues of some people. CYP3A5 was more frequently expressed in livers of African Americans (60%) than in those of Caucasians (33%). Because CYP3A5 represents at least 50% of the total hepatic CYP3A content in people polymorphically expressing CYP3A5, CYP3A5 may be the most important genetic contributor to interindividual and interracial differences in CYP3A-dependent drug clearance and in responses to many medicines.
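The expression percentages above can be turned into rough allele-frequency estimates. The sketch below is a back-of-envelope calculation, not an analysis from the paper. It rests on two assumptions that the abstract does not state: that the fraction of livers expressing CYP3A5 equals the fraction of people carrying at least one CYP3A5*1 allele, and that genotypes are in Hardy-Weinberg equilibrium, so that the carrier fraction is 1 - (1 - p)² for allele frequency p. The function name is hypothetical.

```python
import math


def allele_freq_from_carriers(carrier_fraction: float) -> float:
    """Allele frequency p implied by the fraction of carriers (>= 1 copy).

    Assumes Hardy-Weinberg equilibrium: carrier_fraction = 1 - (1 - p)**2.
    """
    if not 0.0 <= carrier_fraction <= 1.0:
        raise ValueError("carrier_fraction must lie in [0, 1]")
    return 1.0 - math.sqrt(1.0 - carrier_fraction)


# Expression percentages quoted in the abstract, treated here as carrier fractions.
for group, expressers in [("African Americans", 0.60), ("Caucasians", 0.33)]:
    p = allele_freq_from_carriers(expressers)
    print(f"{group}: implied CYP3A5*1 frequency ~ {p:.2f}")
```

These are order-of-magnitude estimates only; the liver samples behind the quoted percentages may be small, and directly genotyped allele frequencies should be preferred where available.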
G C Johnson,
L Esposito,
B J Barratt,
A N Smith,
J Heward,
G Di Genova,
H Ueda,
H J Cordell,
I A Eaves,
F Dudbridge,
R C Twells,
F Payne,
W Hughes,
S Nutland,
H Stevens,
P Carr,
E Tuomilehto-Wolf,
J Tuomilehto,
S C Gough,
D G Clayton,
J A Todd
JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/Medical Research Council Building, Hills Road, Cambridge, UK.
Genome-wide linkage disequilibrium (LD) mapping of common disease genes could be more powerful than linkage analysis if the appropriate density of polymorphic markers were known and if the genotyping effort and cost of producing such an LD map could be reduced. Although different metrics that measure the extent of LD have been evaluated, even the most recent studies have not placed significant emphasis on the most informative and cost-effective method of LD mapping: that based on haplotypes. We have scanned 135 kb of DNA from nine genes, genotyped 122 single-nucleotide polymorphisms (SNPs; approximately 184,000 genotypes) and determined the common haplotypes in a minimum of 384 European individuals for each gene. Here we show how knowledge of the common haplotypes and the SNPs that tag them can be used to (i) explain the often complex patterns of LD between adjacent markers, (ii) reduce genotyping significantly (in this case from 122 to 34 SNPs), (iii) scan the common variation of a gene sensitively and comprehensively and (iv) provide key fine-mapping data within regions of strong LD. Our results also indicate that, at least for the genes studied here, the current version of dbSNP would have been of limited utility for LD mapping because many common haplotypes could not be defined. A directed re-sequencing effort of the approximately 10% of the genome in or near genes in the major ethnic groups would aid the systematic evaluation of the common variant model of common disease.
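The idea of SNPs that "tag" common haplotypes can be illustrated with a small sketch. The code below is not the selection procedure used in the paper. It is a simple greedy heuristic, swapped in for illustration, that keeps adding the SNP which separates the most still-indistinguishable pairs of haplotypes. The function name `greedy_tag_snps`, the string encoding of haplotypes, and the toy data are all assumptions.

```python
from itertools import combinations
from typing import List, Sequence


def greedy_tag_snps(haplotypes: Sequence[str]) -> List[int]:
    """Pick SNP column indices that together distinguish every pair of haplotypes.

    Each haplotype is a string of alleles, one character per SNP.
    Greedy heuristic: the result is not guaranteed to be the smallest possible set.
    """
    if not haplotypes:
        return []
    if len(set(haplotypes)) != len(haplotypes):
        raise ValueError("haplotypes must be distinct")
    n_snps = len(haplotypes[0])
    if any(len(h) != n_snps for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    # Pairs of haplotypes that the chosen tags cannot yet tell apart.
    unresolved = set(combinations(range(len(haplotypes)), 2))
    tags: List[int] = []
    while unresolved:
        def resolved_by(snp: int) -> set:
            return {(i, j) for (i, j) in unresolved
                    if haplotypes[i][snp] != haplotypes[j][snp]}

        # Ties are broken in favour of the lowest SNP index.
        best = max(range(n_snps), key=lambda s: len(resolved_by(s)))
        tags.append(best)
        unresolved -= resolved_by(best)
    return tags


# Toy example: four common haplotypes typed at five SNPs.
common = ["00000", "11000", "00110", "11111"]
print(greedy_tag_snps(common))
```

In the toy example two of the five SNPs are enough to identify all four haplotypes, which mirrors in miniature the reduction from 122 to 34 SNPs reported in the abstract.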
J C Venter,
M D Adams,
E W Myers,
P W Li,
R J Mural,
G G Sutton,
H O Smith,
M Yandell,
C A Evans,
R A Holt,
J D Gocayne,
P Amanatides,
R M Ballew,
D H Huson,
J R Wortman,
Q Zhang,
C D Kodira,
X H Zheng,
L Chen,
M Skupski,
G Subramanian,
P D Thomas,
J Zhang,
G L Gabor Miklos,
C Nelson,
S Broder,
A G Clark,
J Nadeau,
V A McKusick,
N Zinder,
A J Levine,
R J Roberts,
M Simon,
C Slayman,
M Hunkapiller,
R Bolanos,
A Delcher,
I Dew,
D Fasulo,
M Flanigan,
L Florea,
A Halpern,
S Hannenhalli,
S Kravitz,
S Levy,
C Mobarry,
K Reinert,
K Remington,
J Abu-Threideh,
E Beasley,
K Biddick,
V Bonazzi,
R Brandon,
M Cargill,
I Chandramouliswaran,
R Charlab,
K Chaturvedi,
Z Deng,
V Di Francesco,
P Dunn,
K Eilbeck,
C Evangelista,
A E Gabrielian,
W Gan,
W Ge,
F Gong,
Z Gu,
P Guan,
T J Heiman,
M E Higgins,
R R Ji,
Z Ke,
K A Ketchum,
Z Lai,
Y Lei,
Z Li,
J Li,
Y Liang,
X Lin,
F Lu,
G V Merkulov,
N Milshina,
H M Moore,
A K Naik,
V A Narayan,
B Neelam,
D Nusskern,
D B Rusch,
S Salzberg,
W Shao,
B Shue,
J Sun,
Z Wang,
A Wang,
X Wang,
J Wang,
M Wei,
R Wides,
C Xiao,
C Yan,
A Yao,
J Ye,
M Zhan,
W Zhang,
H Zhang,
Q Zhao,
L Zheng,
F Zhong,
W Zhong,
S Zhu,
S Zhao,
D Gilbert,
S Baumhueter,
G Spier,
C Carter,
A Cravchik,
T Woodage,
F Ali,
H An,
A Awe,
D Baldwin,
H Baden,
M Barnstead,
I Barrow,
K Beeson,
D Busam,
A Carver,
A Center,
M L Cheng,
L Curry,
S Danaher,
L Davenport,
R Desilets,
S Dietz,
K Dodson,
L Doup,
S Ferriera,
N Garg,
A Gluecksmann,
B Hart,
J Haynes,
C Haynes,
C Heiner,
S Hladun,
D Hostin,
J Houck,
T Howland,
C Ibegwam,
J Johnson,
F Kalush,
L Kline,
S Koduru,
A Love,
F Mann,
D May,
S McCawley,
T McIntosh,
I McMullen,
M Moy,
L Moy,
B Murphy,
K Nelson,
C Pfannkoch,
E Pratts,
V Puri,
H Qureshi,
M Reardon,
R Rodriguez,
Y H Rogers,
D Romblad,
B Ruhfel,
R Scott,
C Sitter,
M Smallwood,
E Stewart,
R Strong,
E Suh,
R Thomas,
N N Tint,
S Tse,
C Vech,
G Wang,
J Wetter,
S Williams,
M Williams,
S Windsor,
E Winn-Deen,
K Wolfe,
J Zaveri,
K Zaveri,
J F Abril,
R Guigó,
M J Campbell,
K V Sjolander,
B Karlak,
A Kejariwal,
H Mi,
B Lazareva,
T Hatton,
A Narechania,
K Diemer,
A Muruganujan,
N Guo,
S Sato,
V Bafna,
S Istrail,
R Lippert,
R Schwartz,
B Walenz,
S Yooseph,
D Allen,
A Basu,
J Baxendale,
L Blick,
M Caminha,
J Carnes-Stine,
P Caulk,
Y H Chiang,
M Coyne,
C Dahlke,
A Mays,
M Dombroski,
M Donnelly,
D Ely,
S Esparham,
C Fosler,
H Gire,
S Glanowski,
K Glasser,
A Glodek,
M Gorokhov,
K Graham,
B Gropman,
M Harris,
J Heil,
S Henderson,
J Hoover,
D Jennings,
C Jordan,
J Jordan,
J Kasha,
L Kagan,
C Kraft,
A Levitsky,
M Lewis,
X Liu,
J Lopez,
D Ma,
W Majoros,
J McDaniel,
S Murphy,
M Newman,
T Nguyen,
N Nguyen,
M Nodell,
S Pan,
J Peck,
M Peterson,
W Rowe,
R Sanders,
J Scott,
M Simpson,
T Smith,
A Sprague,
T Stockwell,
R Turner,
E Venter,
M Wang,
M Wen,
D Wu,
M Wu,
A Xia,
A Zandieh,
X Zhu
A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history.
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
Mesh-terms: Algorithms; Animals; Chromosome Banding; Chromosome Mapping; Chromosomes, Artificial, Bacterial; Computational Biology; Consensus Sequence; CpG Islands; DNA, Intergenic; Databases, Factual; Evolution, Molecular; Exons; Female; Gene Duplication; Genes; Genome, Human; Human; Human Genome Project; Introns; Male; Phenotype; Physical Chromosome Mapping; Polymorphism, Single Nucleotide; Proteins :: genetics; Proteins :: physiology; Pseudogenes; Repetitive Sequences, Nucleic Acid; Retroelements; Sequence Analysis, DNA :: methods; Species Specificity; Support, Non-U.S. Gov't; Variation (Genetics) ;
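The sequencing figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The following sketch is not from the paper: it recomputes mean read length, fold coverage, the combined coverage of the assemblies, and the number of differences expected between two haploid genomes, using only numbers quoted above. The variable names are illustrative. The recomputed coverage comes out close to, but not exactly at, the quoted 5.11-fold, presumably because the authors used a slightly different genome-size denominator; that explanation is a guess.

```python
# Figures quoted in the abstract.
READS = 27_271_853            # high-quality sequence reads
TOTAL_BP = 14.8e9             # total sequenced bases
CONSENSUS_BP = 2.91e9         # euchromatic consensus length
CELERA_FOLD = 5.11            # quoted coverage from the whole-genome shotgun reads
SHREDDED_PUBLIC_FOLD = 2.9    # quoted coverage from shredded public data
BP_PER_DIFFERENCE = 1250      # one difference per 1250 bp between two haploid genomes


def fold_coverage(total_bp: float, genome_bp: float) -> float:
    """Average number of times each base of the genome is sequenced."""
    return total_bp / genome_bp


mean_read_bp = TOTAL_BP / READS
recomputed_fold = fold_coverage(TOTAL_BP, CONSENSUS_BP)
combined_fold = CELERA_FOLD + SHREDDED_PUBLIC_FOLD
expected_pairwise_differences = CONSENSUS_BP / BP_PER_DIFFERENCE

print(f"Mean read length: {mean_read_bp:.0f} bp")
print(f"Recomputed coverage: {recomputed_fold:.2f}-fold")
print(f"Combined coverage: {combined_fold:.2f}-fold")
print(f"Expected differences between two haploid genomes: {expected_pairwise_differences:,.0f}")
```

The combined figure is consistent with the "eightfold" effective coverage stated in the abstract, and the difference rate of 1 bp per 1250 corresponds to a nucleotide diversity of 8 × 10^-4.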
Albert O Edwards,
Robert Ritter 3rd,
Kenneth J Abel,
Alisa Manning,
Carolien Panhuysen,
Lindsay A Farrer
Age-related macular degeneration (AMD) is a common, late-onset, and complex trait with multiple risk factors. Concentrating on a region harboring a locus for AMD on 1q25-31, the ARMD1 locus, we tested single-nucleotide polymorphisms for association with AMD in two independent case-control populations. Significant association (P = 4.95 × 10^-10) was identified within the regulation of complement activation locus and was centered over a tyrosine-402 → histidine-402 protein polymorphism in the gene encoding complement factor H. Possession of at least one histidine at amino acid position 402 increased the risk of AMD 2.7-fold and may account for 50% of the attributable risk of AMD.
Mesh-terms: Aged; Alleles; Amino Acid Substitution; Case-Control Studies; Chromosomes, Human, Pair 1 :: genetics; Complement Activation :: genetics; Complement Factor H :: genetics; Complement Factor H :: physiology; Female; Gene Frequency; Genetic Predisposition to Disease; Genotype; Haplotypes; Histidine; Homozygote; Humans; Linkage Disequilibrium; Macular Degeneration :: etiology; Macular Degeneration :: genetics; Male; Middle Aged; Multigene Family; Polymorphism, Single Nucleotide; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, P.H.S. ; Risk Factors; Tyrosine; Variation (Genetics) ;
R B Kim,
B F Leake,
E F Choo,
G K Dresser,
S V Kubba,
U I Schwarz,
A Taylor,
H G Xie,
J McKinsey,
S Zhou,
L B Lan,
J D Schuetz,
E G Schuetz,
G R Wilkinson
Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA. richard.kim@mcmail.vanderbilt.edu
MDR1 (P-glycoprotein) is an important factor in the disposition of many drugs, and the involved processes often exhibit considerable interindividual variability that may be genetically determined. Single-strand conformational polymorphism analysis and direct sequencing of exonic MDR1 deoxyribonucleic acid from 37 healthy European American and 23 healthy African American subjects identified 10 single nucleotide polymorphisms (SNPs), including 6 nonsynonymous variants, occurring in various allelic combinations. Population frequencies of the 15 identified alleles varied according to racial background. Two synonymous SNPs (C1236T in exon 12 and C3435T in exon 26) and a nonsynonymous SNP (G2677T, Ala893Ser) in exon 21 were found to be linked (MDR1*2) and occurred in 62% of European Americans and 13% of African Americans. In vitro expression of MDR1 encoding Ala893 (MDR1*1) or a site-directed Ser893 mutation (MDR1*2) indicated enhanced efflux of digoxin by cells expressing the MDR1-Ser893 variant. In vivo functional relevance of this SNP was assessed with the known P-glycoprotein drug substrate fexofenadine as a probe of the transporter's activity. In humans, MDR1*1 and MDR1*2 variants were associated with differences in fexofenadine levels, consistent with the in vitro data: the area under the plasma level-time curve was almost 40% greater in the *1/*1 genotype than in the *2/*2 genotype, and the *1/*2 heterozygotes had an intermediate value, suggesting enhanced in vivo P-glycoprotein activity among subjects with the MDR1*2 allele. Thus allelic variation in MDR1 is more common than previously recognized and involves multiple SNPs whose allelic frequencies vary between populations, and some of these SNPs are associated with altered P-glycoprotein function.
Mesh-terms: Africa :: ethnology; African Continental Ancestry Group :: genetics; Alleles; Anti-Allergic Agents :: pharmacokinetics; Area Under Curve; Cloning, Molecular; DNA Primers; Digoxin :: pharmacokinetics; Enzyme Inhibitors :: pharmacokinetics; Europe :: ethnology; European Continental Ancestry Group :: genetics; Genes, MDR :: genetics; Genotype; Haplotypes; Human; P-Glycoprotein :: metabolism; Polymerase Chain Reaction; Polymorphism, Single Nucleotide; Sequence Analysis, DNA; Support, Non-U.S. Gov't; Support, U.S. Gov't, P.H.S. ; Terfenadine :: analogs & derivatives; Terfenadine :: pharmacokinetics; Time Factors; Variation (Genetics) ;

