School of Mathematics
Statistics research areas
Research in statistics is carried out within research group areas of probability, statistics and statistical bioinformatics. These groups overlap considerably, and our research topics span them all.
In this section
Shape analysis
Spatial linear model
Sums of independent random variables
Robustness
Medical image analysis
Classification
Shape analysis
This is a large area of current research activity aiming to develop methods to describe shapes and to summarise succinctly variability and changes through time. The field is an extension of multivariate analysis but is complicated by the invariance of the shape of an object under translation, rotation and scale changes.
Of particular interest is the case where the description of the outline of an object can be made using landmarks. Applications include automatic chromosome identification and analysis, fitting the outline of a human hand in a noisy image and the description of growth in mouse vertebrae. These projects involve the development of suitable mathematical models including the complex Bingham distribution, the use of deformable templates and analyses based on Kendall and Bookstein coordinates.
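As an illustration of the landmark-based coordinates mentioned above, the following is a minimal sketch of Bookstein coordinates for two-dimensional landmarks, written in Python. It assumes landmarks are stored as complex numbers and that the first two landmarks form the baseline, sent to -0.5 and +0.5; the function names and the example triangle are hypothetical and are not taken from the projects described here.

```python
import cmath


def bookstein_coordinates(landmarks):
    """Map 2-D landmarks (as complex numbers) to Bookstein coordinates.

    Assumption: the first two landmarks form the baseline and are sent
    to -0.5 and +0.5 on the real axis.
    """
    z1, z2 = landmarks[0], landmarks[1]
    if z1 == z2:
        raise ValueError("baseline landmarks must be distinct")
    return [(z - z1) / (z2 - z1) - 0.5 for z in landmarks]


def similarity_transform(landmarks, shift, angle, scale):
    """Rotate by `angle`, rescale by `scale`, then translate by `shift`."""
    factor = scale * cmath.exp(1j * angle)
    return [factor * z + shift for z in landmarks]


# Hypothetical triangle and a translated, rotated, rescaled copy of it.
triangle = [0 + 0j, 2 + 0j, 1 + 3j]
moved = similarity_transform(triangle, shift=5 - 2j, angle=0.7, scale=3.0)

coords_original = bookstein_coordinates(triangle)
coords_moved = bookstein_coordinates(moved)

# Largest discrepancy between the two sets of shape coordinates.
max_difference = max(abs(a - b) for a, b in zip(coords_original, coords_moved))
print(max_difference)
```

Because the baseline registration removes location, rotation and scale, the two sets of coordinates agree up to rounding error, which is the invariance that complicates ordinary multivariate analysis of landmark data.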
There have been substantial developments in face recognition with plastic surgery in mind. Three-dimensional pictures of the head can be produced from a laser scanner. Two questions arise: (i) how to average such pictures, and (ii) how to predict the effect of changes made in plastic surgery. In collaboration, our aim will be to produce a statistical summary of shape changes.
A continuing project with the Department of Anatomy is related to the effect of evolution on size and shape. This is examined through a selection process on mice through several generations using data on vertebrae available as digital images. Further research is needed on the optimum selection of landmarks and the analysis of the evolutionary hypothesis, as well as analysis without using landmarks. These are two-dimensional problems and their extension to three dimensions would be extremely useful in anatomy and anthropology.
Spatial linear model
The spatial linear model is fundamental to a number of techniques used in image processing, for example, for locating gold/ore deposits, or creating maps. There are many unresolved problems in this area such as the behaviour of maximum likelihood estimators and predictors, and diagnostic tools. There are strong connections between kriging predictors for the spatial linear model and spline methods of interpolation and smoothing. The two-dimensional version of splines/kriging can be used to construct deformations of the plane, which are of key importance in shape analysis.
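To make the idea of a kriging predictor concrete, here is a minimal one-dimensional sketch in Python. It assumes simple kriging with a known mean and an exponential covariance function with fixed parameters; these choices, the function names and the data are illustrative assumptions, not the models studied in the group.

```python
import math


def exp_cov(s, t, sill=1.0, range_=1.0):
    """Exponential covariance between sites s and t (assumed model)."""
    return sill * math.exp(-abs(s - t) / range_)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def simple_kriging(sites, values, target, mean=0.0):
    """Predict the field at `target` from observations at `sites`."""
    cov = [[exp_cov(s, t) for t in sites] for s in sites]
    c0 = [exp_cov(s, target) for s in sites]
    weights = solve(cov, c0)  # kriging weights: C^{-1} c0
    return mean + sum(w * (v - mean) for w, v in zip(weights, values))


# Hypothetical observations along a line.
sites = [0.0, 1.0, 2.5, 4.0]
values = [1.2, 0.7, -0.3, 0.4]

at_data_site = simple_kriging(sites, values, target=1.0)
between_sites = simple_kriging(sites, values, target=1.7)
print(at_data_site, between_sites)
```

With no measurement-error term the predictor reproduces the observed value at a data site, which is the interpolation property it shares with spline methods.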
Sums of independent random variables
In this evergreen area of research, the main goal is to describe the distributions of sums of independent random variables via a few simple functionals of their terms. This goal was achieved long ago, and remarkably well, at the levels of limit theorems (as in the law of large numbers or the central limit theorem), rate of convergence results (the Berry-Esseen theorem), and asymptotic expansions (so-called Edgeworth expansions). At the more difficult and more useful level of optimal explicit inequalities, however, almost no final results appear to be known. An infamous example of an open problem here is to prove an optimal version of the Berry-Esseen theorem; a less well-known one concerns Chebyshev's inequality, while a similar problem involving mean absolute deviations has recently been solved.
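For reference, the two inequalities named above can be written as follows in their standard textbook forms. The notation (sigma, rho, C, Phi) is introduced here for illustration and does not come from the original text; the first display assumes independent, identically distributed terms with mean zero, variance sigma squared and finite third absolute moment rho.

```latex
% Berry-Esseen theorem: X_1, ..., X_n i.i.d. with
% E X_1 = 0,  E X_1^2 = \sigma^2 > 0,  E |X_1|^3 = \rho < \infty.
\sup_{x \in \mathbb{R}}
  \left|
    P\!\left( \frac{X_1 + \dots + X_n}{\sigma \sqrt{n}} \le x \right) - \Phi(x)
  \right|
  \le \frac{C \rho}{\sigma^{3} \sqrt{n}}

% Chebyshev's inequality: X with mean \mu and finite variance \sigma^2, t > 0.
P\bigl( |X - \mu| \ge t \sigma \bigr) \le \frac{1}{t^{2}}
```

Here Phi is the standard normal distribution function and C is an absolute constant. The open problem referred to above is to find the smallest value of C for which the first inequality holds.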
Another goal is to investigate the precise role of the tools available for studying sums of independent random variables, such as cumulants. Here a recently published result, characterizing cumulants by their homomorphism properties, appears to be improvable.
Robustness
One of the classic problems of robustness theory involves the simultaneous estimation of location and scatter from a set of multivariate data. More work is needed to better understand questions of influence, uniqueness, and breakdown. A classic class of estimators used in multivariate robustness is the class of M-estimators. These have good local robustness properties (i.e. they are not sensitive to a few gross outliers in the data) but have a poor breakdown point (bounded above by 1/(p+1) in p dimensions). A new class of estimators called constrained M-estimators overcomes these theoretical problems at the price of being more difficult to compute in practice. More work is needed to understand and fine-tune the behaviour of these estimators.
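The following Python sketch illustrates the idea of an M-estimator in the simplest setting. It is a deliberate simplification: it treats univariate location only, with Huber weights and a scale fixed at the normalised median absolute deviation, whereas the research described above concerns multivariate location and scatter. The tuning constant, function name and data are assumptions made for illustration.

```python
import statistics


def huber_location(data, k=1.345, tol=1e-10, max_iter=100):
    """Huber M-estimate of location by iteratively reweighted means.

    Assumption: the scale is held fixed at 1.4826 * MAD rather than
    being estimated jointly with the location.
    """
    mu = statistics.median(data)
    mad = statistics.median([abs(x - mu) for x in data])
    scale = 1.4826 * mad
    if scale == 0:
        return mu  # at least half the data coincide with the median
    for _ in range(max_iter):
        # Full weight inside the threshold, down-weighting outside it.
        weights = [
            1.0 if abs(x - mu) <= k * scale else k * scale / abs(x - mu)
            for x in data
        ]
        new_mu = sum(w * x for w, x in zip(weights, data)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Hypothetical sample with one gross outlier.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 100.0]
robust = huber_location(sample)
ordinary = statistics.mean(sample)
print(robust, ordinary)
```

The outlier pulls the ordinary mean far from the bulk of the data, while the M-estimate stays close to it.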
Medical image analysis
There are many opportunities for research in medical imaging. The individual projects usually originate in departments which specialize in various branches of medicine and are associated with the Leeds General Infirmary or St James' Hospital in Leeds. Many of these naturally overlap with shape analysis. The projects have been made possible by the work of the Centre of Medical Imaging Research (CoMIR). Some examples are described below.
The Department of Medical Physics based at Leeds General Infirmary is one of the international leaders in the field of data fusion in medical imaging. One research area where we are involved is the investigation of data fusion problems arising in the construction of maps of the brain using two different imaging methods, magnetic resonance imaging (MRI) and computed tomography (CT).
The Department of Orthopaedics at Leeds has been working on X-ray data of the spine to investigate the onset of scoliosis. Spinal shapes are extracted using the Quantec System, which obtains surface measurements from the backs of patients and, in particular, extracts a line that follows the vertebral column. For patients with scoliosis, this line will be deformed in shape. The methodology for the statistical shape analysis of these continuous curves (spine lines) needs to be further developed. Initial work in two dimensions uses smoothed principal components and leads to a few scores which summarize the shape of the line. Further work on curvature, torsion and the extension to 3-D curves is required. This work is in collaboration with the Academic Unit of Orthopaedic Surgery at St James' Hospital, Leeds.
Another example in medical imaging is the identification and assessment of a tumour in an image of, say, the brain or liver. The boundaries in such images are not well defined. A classification method that allows fuzzy rather than hard assignment, previously used to estimate the percentage of crops in a pixel, can be applied to obtain boundary information on tumours. However, some statistical work on the confidence intervals of these percentages and the extension to general covariances is required. The images are available through collaboration with the Department of Medical Physics.
Classification
Classification or discrimination involves learning a rule whereby a new observation can be assigned to one of a set of pre-defined classes. Current approaches can be grouped into three historical strands: statistical, machine learning and neural network. The classical statistical methods make distributional assumptions. Many other methods are distribution free and require some regularisation so that the rule performs well on unseen data. Recent interest has focussed on the ability of classification methods to generalize.
For example, two related methods which are distribution free are the k-nearest neighbour classifier and the kernel density estimation approach. In both methods, there are several problems of importance: the choice of smoothing parameter(s) or k, and choice of appropriate metrics or selection of variables. These problems can be addressed by cross-validation methods, but this is computationally slow. An analysis of the relationship with a neural net approach (LVQ) should yield faster methods.
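The choice of k by cross-validation can be sketched in Python as follows. This is a minimal illustration assuming Euclidean distance, majority voting and leave-one-out cross-validation; the function names and the toy data are hypothetical. Each candidate k requires a pass over every held-out point and its distances to all other points, which is why the approach is slow on large data sets.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest neighbours."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: (math.dist(train_points[i], query), i),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_error(points, labels, k):
    """Leave-one-out misclassification rate of the k-NN rule."""
    errors = 0
    for i, (point, label) in enumerate(zip(points, labels)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_predict(rest_points, rest_labels, point, k) != label:
            errors += 1
    return errors / len(points)


def choose_k(points, labels, candidates):
    """Return the candidate k with the smallest leave-one-out error."""
    return min(candidates, key=lambda k: loo_error(points, labels, k))


# Hypothetical data: two well-separated clusters in the plane.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

best_k = choose_k(points, labels, candidates=[1, 3, 7])
print(best_k, [loo_error(points, labels, k) for k in (1, 3, 7)])
```

Choosing k too large here forces every vote to include all four points of the opposite cluster, so the held-out point is always outvoted; small k avoids this on such cleanly separated data.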
For further information on our research please visit the Group pages.
© Copyright Leeds 2011

