Exploring the structure in deep networks: Group, manifold and category theory

School of Science | Master's thesis

Language

en

Pages

77

Abstract

Modern deep learning has achieved remarkable success in recent years, yet we lack a comprehensive understanding of why it performs well on some tasks while failing on others. This thesis develops a mathematical framework for understanding and designing neural networks through the lenses of group theory, differential geometry, and category theory. We begin by analyzing the symmetry structure of parameter spaces. For the traditional deep learning architecture, consisting of linear layers, nonlinear activations, and regularization, we prove that the linear part possesses maximal $\mathrm{GL}_n(\mathbb{R})$ symmetry. Nonlinear activations break this symmetry to proper subgroups; we analyze ReLU and sigmoid as examples. We then study how regularization with different norms affects symmetry, in particular Schatten-$p$ norms and entry-wise $\ell_p$ norms. This analysis connects the choice of activation and regularization to the geometry of the representations we want to learn. Next, we introduce Path Equivariant Networks (PENs), which generalize classical group equivariance from point-wise constraints $F(g \cdot x) = \rho(g) \cdot F(x)$ to path-wise constraints on manifolds. We prove that classical group equivariance arises as a special case under certain conditions. As an extension of this idea, we introduce content-pose decomposition, which factors the data manifolds into a symmetry-carrying pose (living in the group $G$) and a symmetry-free content (living in the quotient $U = X/G$). Finally, we provide a categorical formalization in which equivariant maps are natural transformations between functors. The naturality condition captures the essence of symmetry-preserving computation. This work contributes to the theoretical foundation for the view that designing a neural network is fundamentally a choice of which structures in the data to retain.
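
The symmetry statements in the abstract can be illustrated numerically. The following is a minimal NumPy sketch, not code from the thesis: it assumes a two-layer network with hidden width `n`, and the names `W1`, `W2`, `reparametrize`, and the chosen dimensions are hypothetical. It checks that the reparametrization $(W_2, W_1) \mapsto (W_2 A^{-1}, A W_1)$ leaves a purely linear network unchanged for an invertible $A$, that a generic $A$ changes the function once ReLU is inserted, and that positive diagonal rescalings combined with permutations (a commonly cited surviving subgroup; the thesis's exact statement may differ) still preserve it.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out, batch = 4, 3, 2, 5  # hypothetical sizes, chosen for illustration

W1 = rng.standard_normal((n, d_in))      # first linear layer
W2 = rng.standard_normal((d_out, n))     # second linear layer
x = rng.standard_normal((d_in, batch))   # a batch of inputs, one per column


def relu(z):
    return np.maximum(z, 0.0)


def linear_net(W2, W1, x):
    return W2 @ (W1 @ x)


def relu_net(W2, W1, x):
    return W2 @ relu(W1 @ x)


def reparametrize(W2, W1, A):
    """Act with an invertible A on the hidden layer: (W2, W1) -> (W2 A^{-1}, A W1)."""
    return W2 @ np.linalg.inv(A), A @ W1


# A generic matrix (invertible with probability one for this construction).
A = rng.standard_normal((n, n)) + n * np.eye(n)
W2a, W1a = reparametrize(W2, W1, A)

# Linear network: the A^{-1} A in the middle cancels, so the function is unchanged.
linear_same = np.allclose(linear_net(W2a, W1a, x), linear_net(W2, W1, x))

# ReLU network: relu(A z) != A relu(z) in general, so the function changes.
relu_generic_same = np.allclose(relu_net(W2a, W1a, x), relu_net(W2, W1, x))

# Permutation times positive diagonal: relu commutes with this transformation.
P = np.eye(n)[rng.permutation(n)]
D = np.diag(rng.uniform(0.5, 2.0, size=n))
W2m, W1m = reparametrize(W2, W1, P @ D)
relu_monomial_same = np.allclose(relu_net(W2m, W1m, x), relu_net(W2, W1, x))

print(linear_same, relu_generic_same, relu_monomial_same)
```

Under these assumptions, the first and third checks should hold up to floating-point error, while the second should fail for a generic $A$, which is the sense in which the nonlinearity restricts the symmetry to a proper subgroup.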

Supervisor

Jung, Alex

Thesis advisor

Schnoor, Ekkehard
