hillbig: Theoretical analysis of deep learning by Dr. Taiji Suzuki, covering representation ability, generalization ability, and optimization theory. He covers a wide range of important topics, including the recent Neural Tangent Kernel and double descent. I don't think there is anything this comprehensive in English.
I get an error when I access the original SlideShare, but I can still view it via X/Twitter. A cached copy, perhaps?
As an easy-to-understand concrete example: for a function whose value is determined only by the distance from the origin, a four-layer network needs only polynomially many units in the number of dimensions (in fact, I think it's essentially linear).
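A hedged reconstruction of why depth helps here (my paraphrase of the standard depth-separation argument, not the slide's exact statement):

```latex
% Radial target: the value depends only on the distance from the origin.
\[
  f(x) = g(\lVert x \rVert_2), \qquad x \in \mathbb{R}^d .
\]
% A deeper network can first compute \lVert x \rVert_2^2 = \sum_{i=1}^d x_i^2
% with O(d) units (a small squaring sub-network per coordinate), then apply a
% one-dimensional approximation of g, so the total size is polynomial
% (essentially linear) in d. Shallow two-layer networks, by contrast, are
% known to need exponentially many units for some radial g
% (Eldan & Shamir, 2016).
```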
Reproducing kernel Hilbert space
Kernel ridge regression is re-described in terms of a reproducing kernel Hilbert space (RKHS), but I'll skip that part.
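Since that part is skipped, here is a minimal kernel ridge regression sketch in NumPy for reference (the RBF kernel and the hyperparameters gamma and lam are my illustrative choices, not from the slides): solve (K + λnI)α = y, then predict f(x) = Σ_i α_i k(x_i, x).

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix: k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def krr_fit(X, y, lam=1e-2, gamma=1.0):
    """Solve (K + lam * n * I) alpha = y for the representer coefficients."""
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * n * np.eye(n), y)

def krr_predict(X_train, alpha, X_test, gamma=1.0):
    """Predict f(x) = sum_i alpha_i * k(x_i, x)."""
    return rbf_kernel(X_test, X_train, gamma) @ alpha

# Toy usage: regress a noisy sine.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
alpha = krr_fit(X, y)
print(krr_predict(X, alpha, np.array([[0.0], [1.5]])))
```

Note that the kernel k is fixed in advance here; the contrast with deep learning is exactly the next point.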
Deep learning can be interpreted as learning the kernel function itself, adaptively to the data.
...
Approximation performance by function class
The various function classes mentioned in past discussions are special cases of Besov spaces.
→Sparsity.
Deep learning is superior when the smoothness of the target function is spatially non-uniform (see the Besov-norm sketch below).
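For concreteness, the standard definition of the Besov norm (my addition for reference; the slides' notation may differ):

```latex
% With \omega_r(f, t)_p the r-th modulus of smoothness in L^p
% and r = \lfloor s \rfloor + 1:
\[
  \lvert f \rvert_{B^s_{p,q}}
    = \left( \int_0^\infty \bigl( t^{-s}\, \omega_r(f, t)_p \bigr)^q
      \frac{dt}{t} \right)^{1/q},
  \qquad
  \lVert f \rVert_{B^s_{p,q}}
    = \lVert f \rVert_{L^p} + \lvert f \rvert_{B^s_{p,q}} .
\]
```

As I understand Suzuki's results, small p (below 2) permits spatially inhomogeneous smoothness, and that is the regime where sparse, adaptive estimators such as deep networks provably beat linear estimators like fixed-kernel methods.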
The non-stochastic (deterministic) gradient method can take exponential time to escape saddle points.
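A toy illustration of this (my own minimal sketch, not the construction from the literature; it shows the milder fact that unperturbed gradient descent can fail to escape at all, while a tiny perturbation escapes quickly): on f(x, y) = (x² − y²)/2 the origin is a saddle, and plain gradient descent started exactly on the stable manifold y = 0 converges to the saddle.

```python
import numpy as np

def grad(p):
    """Gradient of the saddle function f(x, y) = (x**2 - y**2) / 2."""
    x, y = p
    return np.array([x, -y])

eta, steps = 0.1, 300

# Plain GD started exactly on the stable manifold (y = 0): y stays 0 forever.
p = np.array([1.0, 0.0])
for _ in range(steps):
    p = p - eta * grad(p)
print("plain GD:    ", p)  # ~ (0, 0): converges to the saddle, never escapes

# Perturbed GD: tiny noise seeds the unstable direction, which then grows
# geometrically (factor 1 + eta per step) and leaves the saddle.
rng = np.random.default_rng(0)
p = np.array([1.0, 0.0])
for _ in range(steps):
    p = p - eta * grad(p) + 1e-8 * rng.standard_normal(2)
print("perturbed GD:", p)  # |y| is large: the iterate has escaped
```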
Neural Tangent Kernel
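As a reference for what the NTK computes, a minimal NumPy sketch of the empirical NTK of a two-layer ReLU network (my own illustration using the usual 1/√m NTK scaling; nothing here is taken from the slides): the kernel is the Gram matrix of per-parameter gradients, Θ(x, x′) = ⟨∇_θ f(x), ∇_θ f(x′)⟩.

```python
import numpy as np

rng = np.random.default_rng(0)

def ntk_features(X, W, a):
    """Per-parameter gradients of f(x) = a @ relu(W @ x) / sqrt(m).

    Rows index inputs, columns index parameters; the empirical NTK
    Gram matrix is then J @ J.T.
    """
    m = W.shape[0]
    pre = X @ W.T                             # (n, m) pre-activations w_j . x
    d_a = np.maximum(pre, 0.0) / np.sqrt(m)   # grads w.r.t. output weights a_j
    mask = (pre > 0).astype(float)            # ReLU derivative
    # Grad w.r.t. w_j is (a_j / sqrt(m)) * 1[w_j . x > 0] * x.
    d_W = (mask * (a / np.sqrt(m)))[:, :, None] * X[:, None, :]  # (n, m, d)
    return np.concatenate([d_a, d_W.reshape(len(X), -1)], axis=1)

n, d, m = 5, 3, 10000                         # a few inputs, one wide layer
X = rng.standard_normal((n, d))
W = rng.standard_normal((m, d))
a = rng.standard_normal(m)

J = ntk_features(X, W, a)
K = J @ J.T                                   # empirical NTK Gram matrix (n, n)
print(np.round(K, 3))
```

As the width m grows, this Gram matrix concentrates around a fixed kernel, which is why sufficiently wide networks trained with small learning rates behave like kernel regression.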
Mean Field
This page is auto-translated from /nishio/鈴木大慈-深層学習の数理 using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.