About the Project
Diffusion-based generative models for images, such as those of Ho et al. (2020) and Dockhorn et al. (2021), have had a significant impact and are now routinely used on smartphones.
A promising application of these models in machine learning is generative data augmentation: using synthetic samples to increase the diversity of an existing dataset. This could yield better performance, or a similar level of performance with much less labelled training data.
Although there have been promising initial results in this area, such as Zheng et al. (2023), generative data augmentation is not yet completely understood, and significant improvements in computational efficiency are still needed before such methods can be widely adopted.
We will develop new algorithms and evaluate them numerically on a range of machine learning test problems. These methods aim to improve the consistency and computational scalability of generative data augmentation.
Explainability and uncertainty quantification will be considered as well.
In addition to algorithmic advances, the project will develop a mathematical theory establishing consistency results for these methods.
Applications for an August 2025 start will be considered until the 15th of May, 2025.
References
Zheng, C., Wu, G., & Li, C. (2023). Toward understanding generative data augmentation. Advances in Neural Information Processing Systems, 36, 54046-54060.
Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840-6851.
Dockhorn, T., Vahdat, A., & Kreis, K. (2021). Score-based generative modeling with critically-damped Langevin diffusion. arXiv preprint arXiv:2112.07068.
Dombrowski, A. K., Gerken, J. E., Müller, K. R., & Kessel, P. (2023). Diffeomorphic counterfactuals with generative models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(5), 3257-3274.
Paulin, D., Whalley, P. A., Chada, N. K., & Leimkuhler, B. (2024). Sampling from Bayesian neural network posteriors with symmetric minibatch splitting Langevin dynamics. arXiv preprint arXiv:2410.19780. Accepted for AISTATS 2025.