Showing posts from June, 2024

First posting: How does a neural network work?

Purpose of this blog

The goal of this blog is to unravel ideas in artificial intelligence (AI) and machine learning (ML) for people who have some training in mathematics but not necessarily in computer science. This does not mean that I will always "prove" things. Rather, I will often look for heuristic explanations that are mathematically sound, or summarize the ideas behind some important proofs.

As the first posting of this blog, we start with how a neural network works, which is central to AI. This is a vast topic, so for now we will study only the essential principles and leave more specific topics to separate postings. Strictly speaking, this posting is about feedforward neural networks, which are the neural networks with the simplest architecture. However, at the end I will try to give some intuition for how the ideas generalize to other architectures, so that those feel more natural when we discuss them in later postings.

Remark about r...