Scottish and Northumbrian statisticians' meeting: 20 May 2005, University of Newcastle upon Tyne
Spatial approaches to the analysis of genetic association studies
David Balding
Centre for Biostatistics, Imperial College London
Approaches to the analysis of genetic association studies that go beyond using one marker at a time are often based on the notion of "haplotype", which can be thought of as a group of markers that are treated as a unit for analysis. There is support for this approach in the discovery over recent years that much of the genome has a block-like structure, with strong statistical dependence between markers within blocks, and little between blocks. However, the "block" model of the human genome gives an imperfect description of reality, and current haplotype-based analyses have important limitations: haplotypes are not directly observed, but must be inferred from genotype data; it is difficult to model the relationships between haplotypes that are similar and so may have recent shared ancestry; and it is also difficult to accommodate rare haplotypes. I will discuss an approach to the analysis of genetic association studies that is based on a stochastic search for clusters in a metric space of haplotypes.