MATH 401

Homepage for my senior project.
Author

Andrew Moore

Published

January 16, 2025

Modified

April 25, 2025

Welcome

Hello! This website contains the work I completed during MATH 401, my senior project and the final course of my BS in Applied Mathematics. I completed the course during Spring 2025; Dr. Grady Wright (Professor, Boise State Department of Mathematics) served as my project advisor. During the semester, I focused on Gaussian Processes (GPs), a generalization of the multivariate Gaussian distribution, and on how GPs can be useful in the context of various regression problems.

You can browse my writings on different subtopics from the sidebar, or by searching with the magnifying glass in the upper-right corner of the page. This website was built using Quarto, with data analysis, inference, and visualization conducted in the R and Stan programming languages. My poster’s layout was built from an excellent Typst template, with a few formatting tweaks.

Project Summary

We began the semester by examining the connections between Gaussian Processes and Kernel Ridge Regression (KRR), reading Kanagawa et al. (2018) as a starting point. The two approaches are theoretically related in that both rely on positive-definite kernels to accomplish interpolation or data fitting. In practice, the approaches are comparable; for example, the KRR estimator is equivalent to the GP posterior mean function. However, the approaches differ in their quantification of uncertainty¹ and in how they define hypothesis spaces².
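To make the equivalence concrete (following Kanagawa et al. 2018): given training inputs \(X\), targets \(\mathbf{y}\), kernel matrix \(K\), and noise variance \(\sigma^2\), the GP posterior mean and the KRR estimator with regularization parameter \(\lambda = \sigma^2/n\) coincide:

\[ \bar{f}(x) = k(x, X)\left(K + \sigma^2 I\right)^{-1} \mathbf{y} = k(x, X)\left(K + n\lambda I\right)^{-1} \mathbf{y}. \]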

With this introduction in hand, I worked to gain an understanding of Gaussian Process regression by simulating univariate data. From this basic case, we moved to multidimensional problems to explore how GPs can be useful for regression problems with a spatial dimension. Early applications of GPR were in geostatistics, where the technique (more commonly referred to as Kriging) has been used to predict the distribution of minerals over a sampled area. We concluded by applying GPR to a vector-valued outcome, specifically velocity fields. Here I drew upon the Intrinsic Coregionalization Model (ICM), discussed in Alvarez et al. (2012), which constructs a similarity matrix summarizing cross-output dependencies and combines it with the kernel/Gram matrix built from the researcher’s training data. The page Multioutput Gaussian Process Regression explores these topics using a simulation of Hurricane Isabel’s velocity field, and also includes my attempts to conduct hyperparameter inference using the Stan programming language. I also explored multioutput/vector-valued GPR with Particle Image Velocimetry data from Harlander, Wright, and Egbers (2012) in my project’s poster.
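As a rough illustration of the ICM construction (a minimal sketch with hypothetical data, not the exact code used in the project), the cross-output similarity matrix \(B\) is combined with the input Gram matrix \(K\) via a Kronecker product:

```r
# Minimal ICM sketch: combine a cross-output similarity matrix B with
# an input Gram matrix K to form the full multioutput covariance.
# (Hypothetical example data; alpha and rho are RBF hyperparameters.)
rbf_kernel <- function(X1, X2, alpha = 1, rho = 1) {
  D2 <- outer(rowSums(X1^2), rowSums(X2^2), "+") - 2 * X1 %*% t(X2)
  alpha^2 * exp(-D2 / (2 * rho^2))
}

set.seed(1)
X <- matrix(runif(20), ncol = 2)           # 10 training locations in 2D
K <- rbf_kernel(X, X)                      # 10 x 10 input Gram matrix
B <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)   # similarity across 2 outputs (u, v)
K_icm <- kronecker(B, K)                   # 20 x 20 ICM covariance matrix
```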

Weekly Meetings

These are brief notes I took to summarize topics and materials I discussed with Dr. Wright during our scheduled meetings. Some links in these notes that point within this site may be dead; I ultimately consolidated several pages into the set available today in the sidebar.

4/25/25

  • Wrap-up. Working on finishing the remaining reflections.

4/14/25

  • Review of current draft
    • Should tighten up which variables are used in Section I; it would be better to use \(\mathbf{z}\) rather than \(\mathbf{x}\) (which is what we use to refer to training data).
  • Scaling of the PIV vector field is off (all the magnitudes are too small), and the field appears scrambled compared to what we expect. There is a bug in my code somewhere.

4/4/25

  • Summary of my progress re: poster drafting

3/28/25

  • Reading about the ICM and the LMC (Linear Model of Coregionalization)
  • Updating Stan code used for inference of hyperparameters.

3/14/25

  • Writeup of multioutput GPR is close to finished. Used data from a simulation of Hurricane Isabel to test the approach.
  • I’ve been working on hyperparameter estimation for multioutput GPR (a sketch of the objective is given after this list).
    • I’ve been using RMSE to measure performance on unseen test data.
    • My attempts to optimize the log-likelihood function were not giving better performance than the naive defaults of \(\alpha = 1\) and \(\rho = 1\).
    • Because of this, I began testing the Stan programming language for estimation, which appears to work a bit better (though still without improving on the defaults).
  • Multioutput GPR appears to relate to the Matrix Normal distribution; in some circumstances, the two may even be equivalent.
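For reference, this is roughly the objective being optimized (a minimal sketch reusing the rbf_kernel helper from the ICM example above; the project’s actual code differed):

```r
# Log marginal likelihood of a GP with an RBF kernel and noise sd sigma,
# parameterized on the log scale so optim() can search unconstrained.
gp_log_marginal <- function(par, X, y) {
  alpha <- exp(par[1]); rho <- exp(par[2]); sigma <- exp(par[3])
  K <- rbf_kernel(X, X, alpha, rho) + sigma^2 * diag(nrow(X))
  R <- chol(K)                              # K = t(R) %*% R
  a <- backsolve(R, forwardsolve(t(R), y))  # a = K^{-1} y
  -0.5 * sum(y * a) - sum(log(diag(R))) - 0.5 * length(y) * log(2 * pi)
}

# The defaults alpha = rho = 1 correspond to par[1] = par[2] = 0, e.g.:
# fit <- optim(c(0, 0, 0), function(p) -gp_log_marginal(p, X, y))
```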

3/10/25

  • I needed to cancel due to a conflict. We’ll postpone until 3/14.

3/3/25

  • Still working on code for multioutput GPR.
  • Reviewed my CV and discussed topics related to Reflection 2.
    • Learned that Tukey co-developed the landmark Cooley-Tukey algorithm for computing the Fast Fourier Transform (FFT).

2/24/25

  • Reading Bonilla, Chai, & Williams (2007)
  • Working towards implementing a simulation of multi-task GPR
  • Last week, I completed a basic simulation of bivariate GPR, which can be viewed here. It’s light on theory/notation, but I think it follows fairly straightforwardly from the univariate case.
    • I need to start doing more formal parameter estimation for the kernel/covariance functions. Currently, I’ve just been using arbitrary values for \(\alpha\), \(\rho\), and \(\sigma\) in the RBF kernel (whose standard form is given after this list). These are fine for demonstration purposes, but aren’t representative of what a full application would entail.
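For reference, the standard isotropic form of the RBF kernel with these hyperparameters is

\[ k(x, x') = \alpha^2 \exp\!\left(-\frac{\lVert x - x' \rVert^2}{2\rho^2}\right), \]

where \(\alpha\) controls the marginal standard deviation and \(\rho\) the length scale, while \(\sigma^2\) enters as observation-noise variance added to the diagonal of the Gram matrix.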

2/14/25

  • Reading and searching for more background information about multi-output Gaussian Processes
    • e.g., vector-valued GPs (blog post)
    • Completed review of the multivariate Gaussian distribution (link)
    • Still working on modeling bivariate data, but not (yet) fruitfully

Ideally, we could model the velocity components jointly as a vector-valued function of position,

\[ \begin{bmatrix} u \\ v \end{bmatrix} = f(x, y) + \epsilon \]

rather than something like \[ \begin{aligned} u &= f_1(x) + \epsilon_1 \\ v &= f_2(y) + \epsilon_2 \end{aligned} \]

2/7/25

  • Discussed the PIV laboratory data that Dr. Wright worked on in 2012.
  • \(\partial_x u + \partial_y v + \partial_z w = 0\)
    • recall: \(u\), \(v\), and \(w\) are the velocity components in the \(x\), \(y\), and \(z\) directions; this is the incompressibility (divergence-free) condition.
  • Goal is to induce Rossby waves within a rotating cylinder.
    • Preprocessing of the data may introduce artifacts; this is an opportunity to use statistical methods to fill in or correct them.

1/31/25

1/23/25

  • Reviewed sections 1-3. Covered the algorithm that the authors present on page N.
  • Goal: try to have some workable code to demonstrate KRR and GPR on simulated data; let’s understand what the methods are producing. (A toy comparison is sketched below.)
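A minimal sketch of such a demonstration (hypothetical simulated data, not the project’s actual code), showing that the two estimates coincide when \(\lambda = \sigma^2/n\):

```r
# Toy 1D comparison: the KRR estimate and the GP posterior mean coincide
# when lambda = sigma^2 / n. (Hypothetical data for illustration.)
set.seed(42)
n <- 30
x <- sort(runif(n, 0, 2 * pi))
y <- sin(x) + rnorm(n, sd = 0.2)
k <- function(a, b, rho = 0.5) exp(-outer(a, b, "-")^2 / (2 * rho^2))

sigma2 <- 0.2^2                          # observation-noise variance
xs <- seq(0, 2 * pi, length.out = 200)   # prediction grid

gp_mean <- k(xs, x) %*% solve(k(x, x) + sigma2 * diag(n), y)   # GP posterior mean
lambda <- sigma2 / n
krr <- k(xs, x) %*% solve(k(x, x) + n * lambda * diag(n), y)   # KRR estimate
max(abs(gp_mean - krr))                  # numerically zero
```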

1/16/25

  • Discussion, admin work
  • Decided to cover the first three sections of Kanagawa et al. (2018)
  • Mentioned __ textbook.

References

Alvarez, Mauricio A, Lorenzo Rosasco, and Neil D Lawrence. 2012. “Kernels for Vector-Valued Functions: A Review.” Foundations and Trends in Machine Learning 4 (3): 195–266.
Harlander, Uwe, Grady B Wright, and Christoph Egbers. 2012. “Reconstruction of the 3D Flow Field in a Differentially Heated Rotating Annulus by Synchronized Particle Image Velocimetry and Infrared Thermography Measurements.” In 16th International Symposium on Applications of Laser Techniques to Fluid Mechanics, Lisbon, Portugal.
Kanagawa, Motonobu, Philipp Hennig, Dino Sejdinovic, and Bharath K Sriperumbudur. 2018. “Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences.” arXiv preprint arXiv:1807.02582.

Footnotes

  1. For GPs, uncertainty is represented through the posterior variance and can be estimated/visualized via draws from the posterior distribution. KRR has no analogous built-in notion, although Kanagawa et al. (2018) show that the GP posterior variance can be interpreted, from the kernel-methods perspective, as a “worst-case error” over functions in a Reproducing Kernel Hilbert Space (RKHS).↩︎

  2. For KRR, it is assumed that the target function belongs to an RKHS (or can be well approximated by a function in an RKHS). However, the support of a GP prior is not identical to the corresponding RKHS (Kanagawa et al. 2018, 3).↩︎