Summary
The aim of the project is to establish functional data analysis models for count data. The statistical analysis of count data is a specialized research area within statistics because of its high practical significance. Count data arise in two main ways. The first is direct counting, for example the number of passengers or the number of hospital admissions. The second is the recording of categorical data, for example the classification of items into ordered, mutually exclusive quality categories according to their level of defect.
Functional count data occur frequently in practice. The first motivating dataset is the number of taxi passengers recorded over several years across different zones in New York; see Dubey and Müller (2020). The second motivating dataset is the number of hospital admissions recorded over several years across different areas in Leeds; see Liu et al. (2024). However, methodology for functional count data lags far behind. As far as we know, only two relevant works exist: Canale and Dunson (2012) and Sentürk et al. (2014). The main limitation of these two works is that they do not provide dimension reduction, so follow-up regression, classification, and clustering cannot be carried out in the functional data analysis (FDA) framework.
Therefore, in this project we aim to establish a functional principal component analysis (FPCA) framework (including multiple FPCA) for functional count data. FPCA is an important tool in FDA because of its utility for dimension reduction and for exploring modes of variation.
References:
Canale, A., and Dunson, D. B. (2012). A Bayesian nonparametric model for count functional data. In XLVI Riunione Scientifica SIS (pp. 1-8). CLEUP-Coop. Libraria Editrice Università di Padova.
Dubey, P., and Müller, H. G. (2020). Functional models for time-varying random objects. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82(2), 275-327.
Liu, H., Aivaliotis, G., Kumar, V., and Houwing-Duistermaat, J. (2024). On Estimation of the Effect Lag of Predictors and Prediction in a Functional Linear Model. Statistics in Biosciences, 16(1), 1-24.
Sentürk, D., Dalrymple, L. S., and Nguyen, D. V. (2014). Functional linear models for zero-inflated count data with application to modelling hospitalizations in patients on dialysis. Statistics in Medicine, 33(27), 4825-4840.
Student profile:
The successful PhD candidate should have a solid background in mathematics and statistics, with a strong interest in statistical modelling of count data. The key skill required for the project is competent use of R.
