Browsing by Author "Heinonen, Markus, Dr., Aalto University, Department of Computer Science, Finland"
- Deep learning methods for modeling of spatiotemporal dynamical systems governed by partial differential equations
School of Science | Doctoral dissertation (article-based) (2025) Iakovlev, Valerii
This dissertation focuses on data-driven modeling of spatiotemporal dynamical systems, using observational data to develop models that approximate the underlying dynamical processes. Spatiotemporal modeling has a rich history with numerous successful applications. It has been continually advanced by technological and methodological improvements, evolving from early qualitative approaches to modern sophisticated deep learning methods. Despite recent progress enabled by deep learning, which has shown promise in modeling complex systems like weather patterns, traffic dynamics, and crowd flows, current deep learning-based spatiotemporal models face significant limitations. These include restricted applicability due to simplifying assumptions (such as fully observed states on fixed grids), data inefficiency requiring large datasets for good generalization, and long training times coupled with instabilities arising from complex loss landscapes. This dissertation addresses these challenges by developing novel deep learning-based models and techniques that enhance the flexibility, data efficiency, and stability of spatiotemporal systems modeling. To extend the applicability of deep learning-based spatiotemporal models, a graph-based continuous-time model inspired by the method of lines is introduced, enabling modeling on irregular spatiotemporal grids. This is further extended to a space-time continuous model operating in latent space, allowing for learning dynamics from partially observed and noisy states. Finally, a model incorporating a spatiotemporal point process is developed to learn system dynamics from unstructured observations made at random times and locations. To improve data efficiency, the models leverage the locality bias inherent in PDE systems and require significantly fewer training trajectories to generalize than previous methods.
To enhance training stability and speed, an amortized Bayesian multiple shooting technique is proposed, extending classical multiple shooting to the Bayesian setting and modern computational regimes. This method stabilizes training and reduces training time by up to an order of magnitude. Additionally, a latent space interpolation technique is introduced to further accelerate training without compromising predictive accuracy. Overall, this dissertation advances the field of data-driven spatiotemporal modeling by introducing deep learning methods and techniques that are more widely applicable, data-efficient, and computationally efficient. These developments enable the modeling of a broader spectrum of complex dynamical systems under more realistic conditions than was previously possible.
- Modelling non-stationary functions with Gaussian processes
School of Science | Doctoral dissertation (article-based) (2019) Remes, Sami
Gaussian processes (GPs) are a central piece of non-parametric Bayesian methods, which allow placing priors over functions in settings such as classification and regression. The prior is described using a kernel function that encodes a similarity between any two points in the input space, and thus defines the properties of functions that are modelled by the GP. In applying Gaussian processes, the choice of the kernel is crucial, and the commonly used standard kernels often perform unsatisfactorily because they assume stationarity. This thesis presents approaches to modelling non-stationarity in Gaussian processes from two different perspectives. First, this thesis presents a formulation of a non-stationary spectral mixture kernel for univariate outputs, focusing on modelling the non-stationarity in the input space. The construction is based on the spectral mixture (SM) kernel, which has been derived for stationary functions using the Fourier duality implied by Bochner's theorem. The work done in this thesis extends the SM kernel to the non-stationary case. This is achieved by two complementary approaches, both based on replacing the constant frequency parameters by input-dependent functions. The first approach models the latent functions describing the frequency surface as Gaussian processes. In the second approach, the functions are directly modelled as a neural network, the parameters of which are optimized with respect to the variational evidence lower bound (ELBO). Second, this thesis presents a kernel suitable for modelling non-stationary couplings between multiple output variables of interest in the context of multi-task or multi-output GP regression. The construction of the kernel is based on a Hadamard product of two kernels, which model the different aspects of dependencies between the outputs.
The part of the kernel modelling the input-dependent couplings is based on a generalized Wishart process, which is a stochastic process on time-varying positive-definite matrices, in this case describing the changing dependencies between the outputs. The proposed Hadamard product kernel is applied in a latent factor model to enrich the latent variable prior distribution, that is, to model correlations within the latent variables explicitly. This results in the latent correlation Gaussian process model (LCGP). This thesis additionally considers novel, flexible models for classification of multi-view data, specifically one based on a mixture of group factor analyzers (GFA). The model is closely related to the LCGP, which builds a classifier in the latent variable space, while the classifier in the GFA mixture is based on the mixture assignments. GFA also allows modelling dependencies between groups of variables, which is not done by the LCGP. Applying Gaussian processes and adapting the proposed multi-output kernel would make the multi-view model even more general. The methods introduced in this thesis allow modelling non-stationary functions in Gaussian processes in a flexible way. The proposed kernels can be applied very generally, and the approaches introduced to derive them can also be applied to derive other types of non-stationary kernels.
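The idea in the second abstract of replacing the constant parameters of the spectral mixture kernel by input-dependent functions can be illustrated with a small NumPy sketch. The stationary kernel below is the standard one-dimensional SM form. The non-stationary variant is only one plausible construction, assumed here for illustration (a Gibbs-type envelope multiplied by a cosine of the phase difference); the abstract does not state the thesis's exact formula. The names `sm_kernel`, `nonstationary_sm_kernel`, and the `(w_fn, mu_fn, ell_fn)` triples are hypothetical.

```python
import numpy as np


def sm_kernel(x1, x2, weights, means, variances):
    """Stationary spectral mixture kernel for 1-D inputs.

    k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi tau mu_q)
    """
    tau = x1[:, None] - x2[None, :]
    k = np.zeros_like(tau, dtype=float)
    for w, mu, v in zip(weights, means, variances):
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * tau * mu)
    return k


def nonstationary_sm_kernel(x1, x2, components):
    """Input-dependent variant (an assumed form, not taken from the abstract).

    `components` is a list of (w_fn, mu_fn, ell_fn) triples. Each function maps
    an array of inputs to an array of the same shape: a weight, a frequency,
    and a positive length-scale.
    """
    tau = x1[:, None] - x2[None, :]
    k = np.zeros_like(tau, dtype=float)
    for w_fn, mu_fn, ell_fn in components:
        w1, w2 = w_fn(x1)[:, None], w_fn(x2)[None, :]
        l1, l2 = ell_fn(x1)[:, None], ell_fn(x2)[None, :]
        # Phase mu(x) * x evaluated at each input separately.
        p1, p2 = (mu_fn(x1) * x1)[:, None], (mu_fn(x2) * x2)[None, :]
        denom = l1**2 + l2**2
        # Gibbs kernel: a Gaussian envelope with input-dependent length-scale.
        gibbs = np.sqrt(2.0 * l1 * l2 / denom) * np.exp(-(tau**2) / denom)
        k += w1 * w2 * gibbs * np.cos(2.0 * np.pi * (p1 - p2))
    return k


# Example: one component whose frequency increases with the input.
x = np.linspace(0.0, 5.0, 40)
components = [(
    lambda t: 1.0 + 0.5 * np.sin(t),   # weight function
    lambda t: 0.5 + 0.2 * t,           # frequency function
    lambda t: 0.8 + 0.1 * t,           # length-scale function
)]
K = nonstationary_sm_kernel(x, x, components)
```

With constant functions this variant reduces to the stationary kernel (with weight w squared and variance 1 / (4 pi^2 l^2)), which is a convenient sanity check. In the thesis the input-dependent functions are themselves modelled as Gaussian processes or as a neural network; the fixed lambdas here only stand in for them.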