AI/ML for mathematicians

Posts

Convolutional neural networks: 1. convolutional layers (2-dimensional)

- July 05, 2024

From super flexible to somewhat flexible

In the first posting of this blog, we discussed how to build the most flexible form of a neural network, called the feedforward neural network (FNN). There are various versions of universal approximation theorems stating that such neural networks can approximate any map $\mathbb{R}^m \rightarrow \mathbb{R}^d$ in a large class (e.g., continuous maps on a compact set with the supremum norm, or $L^p$ maps with the $L^p$-norm); we shall refer to any such statement as a universality result. Unfortunately, no universality result provides an algorithm to find an FNN.

Disclaimer. I have personally gone through a decent number of references, and the preceding sentence is what seems to be true based on my search rather than the absolute truth. However, it is evident that stochastic gradient descent (SGD) or any of its variants cannot guarantee that each step (which we call back propagation) is actually a descent process, so it is safe to say that a ...
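As a small illustration of the point about SGD, here is a minimal Python sketch on a hypothetical toy problem (the data `targets`, the step size `lr`, and all function names are assumptions chosen for illustration). The full loss is the average of $\frac{1}{2}(w - a_i)^2$ over two targets $a_1 = -1$ and $a_2 = 1$, so its minimizer is $w = 0$. A single SGD step that only looks at the second target moves $w$ away from the minimizer, and the full loss goes up.

```python
# Toy data (assumed for illustration): two targets whose mean is 0.
targets = [-1.0, 1.0]

def full_loss(w):
    """Average of 0.5 * (w - a_i)^2 over all targets."""
    return 0.5 * sum((w - a) ** 2 for a in targets) / len(targets)

def sample_grad(w, i):
    """Gradient of 0.5 * (w - a_i)^2 with respect to w, for one sample i."""
    return w - targets[i]

def sgd_step(w, i, lr):
    """One SGD update that uses only the gradient of sample i."""
    return w - lr * sample_grad(w, i)

w0 = 0.0                        # already the minimizer of full_loss
w1 = sgd_step(w0, i=1, lr=0.5)  # step computed from the target a = 1 only

loss_before = full_loss(w0)
loss_after = full_loss(w1)
print(loss_before, loss_after)  # the full loss increases after this step
```

The step is a descent step for the single sampled term, but not for the averaged loss, which is the quantity we actually care about.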

First posting: How does a neural network work?

- June 13, 2024

Purpose of this blog

The goal of this blog is to unravel ideas in artificial intelligence (AI) and machine learning (ML) for people who have some training in mathematics but not necessarily in computer science. This does not mean that I will always "prove" things. Rather, I will often try to find heuristic explanations that are mathematically sound, or summarize the ideas of some important proofs. As the first posting of this blog, we start off by understanding how a neural network works, which is central to AI. Of course, this is a vast topic, so for now we will only study the essential principles of how it works and discuss more specific topics in separate postings. The content of this posting is technically about feedforward neural networks, which are neural networks of the simplest architecture. However, I will try to explain some intuitions on how to generalize to other architectures at the end, so that those feel more natural when we discuss them in later postings. Remark about r...
