
DNN

Chapter 1

Applied Mathematics and Machine Learning Basics

Linear Algebra

\[\begin{aligned} & Ax = b \\ & x = A^{-1}b \end{aligned}\]
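A quick numerical sketch of the two lines above (my own example, not from the notes): in practice one solves Ax = b directly rather than forming A^{-1} explicitly, but both give the same x.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Prefer np.linalg.solve over forming A^{-1} explicitly; it is faster and more stable.
x = np.linalg.solve(A, b)

# Check that the two formulations agree: x = A^{-1} b.
x_via_inverse = np.linalg.inv(A) @ b
assert np.allclose(x, x_via_inverse)
print(x)  # [0.8 1.4]
```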

Deep Feedforward Networks

Linear Models

6.1 Example: Learning XOR
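As a concrete sketch of this section's point, here is the standard hand-constructed one-hidden-layer ReLU network that computes XOR exactly; the weight values follow the usual textbook construction, and the variable names are mine.

```python
import numpy as np

W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
c = np.array([0.0, -1.0])
w = np.array([1.0, -2.0])
b = 0.0

def xor_net(x):
    h = np.maximum(0.0, x @ W + c)   # hidden layer: ReLU(W^T x + c)
    return h @ w + b                 # linear output layer

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print([float(xor_net(x)) for x in X])  # [0.0, 1.0, 1.0, 0.0]
```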

6.2 Gradient-Based Learning

6.2.1 Cost Functions

\[\begin{aligned} \theta_{ML} = \arg\max_{\theta} \sum_{i = 1}^{m} \log p_{model}(x^{(i)};\theta) \end{aligned}\] \[\begin{aligned} \theta_{ML} = \arg\max_{\theta} \mathbb{E}_{x \sim \hat{p}_{data}}[\log p_{model}(x;\theta)] \end{aligned}\]
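A toy illustration (my own, assuming a unit-variance Gaussian as p_model): the two formulations above differ only by a factor of 1/m, so they share the same argmax, and for a Gaussian that argmax is the sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=500)

thetas = np.linspace(0.0, 4.0, 401)
# Summed log-likelihood of a unit-variance Gaussian, up to an additive constant.
log_lik = np.array([-0.5 * np.sum((data - t) ** 2) for t in thetas])

theta_ml = thetas[np.argmax(log_lik)]
# Both values are close to 2.0; dividing by m (the expectation form) leaves the argmax unchanged.
print(theta_ml, data.mean())
```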
6.2.1.1 Learning Conditional Distributions with Maximum Likelihood
\[\begin{aligned} J(\theta) = -\mathbb{E}_{x, y \sim \hat{p}_{data}} [\log p_{model}(y|x)] \end{aligned}\]
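A minimal sketch of evaluating this cost on a batch (the helper name `nll` is mine, not from the notes): J(θ) is just the average negative log-probability the model assigns to the correct labels.

```python
import numpy as np

def nll(probs, labels):
    """probs: (m, K) predicted class probabilities; labels: (m,) integer classes."""
    eps = 1e-12  # guard against log(0)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))

probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4]])
labels = np.array([0, 1, 0])
print(nll(probs, labels))  # ≈ 0.28
```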
6.2.1.2 Learning Conditional Statistics

6.2.2 Output Units

6.2.2.1 Linear Units for Gaussian Output Distribution
\[\begin{aligned} p(y|x) = \mathcal{N} (y;\hat{y}, I) \end{aligned}\]
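One standard step worth spelling out here: with this Gaussian output distribution, maximizing the log-likelihood is equivalent to minimizing mean squared error, since
\[\begin{aligned} -\log p(y|x) = \frac{1}{2} \lVert y - \hat{y} \rVert^{2} + const \end{aligned}\]
where the constant does not depend on \(\hat{y}\).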
6.2.2.2 Sigmoid Units for Bernoulli Output Distribution
\[\begin{aligned} f(x) = \ln(1 + e^{x}) \end{aligned}\]
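The softplus above gives a numerically stable way to write the sigmoid unit's loss. A sketch (my own helper names), assuming the label y ∈ {0, 1} and the logit z: the negative log-likelihood −log σ((2y − 1)z) equals softplus((1 − 2y)z), so it can be computed from z without ever forming the sigmoid.

```python
import numpy as np

def softplus(x):
    # log(1 + e^x) computed without overflow for large |x|
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

def bernoulli_nll_from_logit(z, y):
    """z: logit; y: label in {0, 1}."""
    return softplus((1.0 - 2.0 * y) * z)

print(bernoulli_nll_from_logit(np.array([5.0, 5.0]), np.array([1.0, 0.0])))
# small loss when the logit agrees with the label, large loss when it does not
```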
6.2.2.3 Softmax Units for Multinoulli Output Distributions
\[s(x_{i}) = \frac{e^{x_{i}}}{\sum_{j = 1}^{n} e^{x_{j}}}\]
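A small implementation sketch (mine): subtracting max(x) before exponentiating leaves the softmax unchanged but avoids overflow.

```python
import numpy as np

def softmax(x):
    z = x - np.max(x)   # shift for stability; softmax is invariant to adding a constant
    e = np.exp(z)
    return e / e.sum()

print(softmax(np.array([1.0, 2.0, 3.0])))  # ≈ [0.09, 0.245, 0.665]
```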

6.2.2.4 Other Output Types

6.3 Hidden Units

6.3.1 Rectified Linear Units and Their Generalizations
\[h_{i} = g(z, \alpha)_{i} = \max(0, z_{i}) + \alpha_{i} \min(0, z_{i})\]
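A sketch of this generalized rectifier (my own function name): α_i = 0 recovers the plain ReLU, a small fixed α_i gives a leaky ReLU, a learned α_i gives a PReLU, and α_i = −1 gives the absolute-value rectifier.

```python
import numpy as np

def generalized_relu(z, alpha):
    return np.maximum(0.0, z) + alpha * np.minimum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(generalized_relu(z, alpha=0.0))    # ReLU:       [ 0.    0.    0.   1.5]
print(generalized_relu(z, alpha=0.01))   # leaky ReLU: [-0.02 -0.005 0.   1.5]
print(generalized_relu(z, alpha=-1.0))   # |z|:        [ 2.    0.5   0.   1.5]
```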
6.3.2 Logistic Sigmoid and Hyperbolic Tangent
6.3.3 Other Hidden Units

6.4 Architecture Design

6.4.1 Universal Approximation Properties and Depth
6.4.2 Other Architectural Considerations

6.5 Back-Propagation and Other Differentiation Algorithms

6.5.1 Computational Graphs
6.5.2 Chain Rule of Calculus
\[\frac{dz}{dx} = \frac{dz}{dy} \times \frac{dy}{dx}\]
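A tiny numerical check of this rule (my own example): for z = sin(y) with y = x², the analytic chain-rule derivative matches a finite-difference estimate.

```python
import numpy as np

def z_of_x(x):
    return np.sin(x ** 2)

def dz_dx(x):
    dz_dy = np.cos(x ** 2)   # dz/dy with y = x^2
    dy_dx = 2.0 * x          # dy/dx
    return dz_dy * dy_dx

x0, eps = 1.3, 1e-6
numeric = (z_of_x(x0 + eps) - z_of_x(x0 - eps)) / (2 * eps)
print(dz_dx(x0), numeric)   # the two values agree closely
```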

6.5.3 Recursively Applying the Chain Rule to Obtain Backprop

6.5.4 Back-Propagation Computation in Fully Connected MLP
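A minimal numpy sketch of what this section covers, in my own notation (one ReLU hidden layer, linear output, squared-error loss): the forward pass stores the activations, and the backward pass applies the chain rule layer by layer from the output back to the input.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))          # batch of 4 inputs, 3 features
y = rng.normal(size=(4, 1))          # regression targets

W1 = rng.normal(size=(3, 5)) * 0.1   # input -> hidden
b1 = np.zeros(5)
W2 = rng.normal(size=(5, 1)) * 0.1   # hidden -> output
b2 = np.zeros(1)

# Forward pass
a1 = X @ W1 + b1                     # pre-activation of hidden layer
h1 = np.maximum(0.0, a1)             # ReLU
y_hat = h1 @ W2 + b2                 # linear output
loss = 0.5 * np.mean(np.sum((y_hat - y) ** 2, axis=1))

# Backward pass: apply the chain rule layer by layer, output to input
m = X.shape[0]
g = (y_hat - y) / m                  # dLoss/dy_hat
dW2 = h1.T @ g
db2 = g.sum(axis=0)
g = g @ W2.T                         # back through the output layer
g = g * (a1 > 0)                     # back through the ReLU
dW1 = X.T @ g
db1 = g.sum(axis=0)

print(loss, dW1.shape, dW2.shape)    # gradients ready for a gradient step
```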

6.5.5 Symbol-to-Symbol Derivatives

6.5.6 General Back-Propagation

6.5.7 Example: Back-Propagation for MLP Training

6.5.8 Complications

6.5.9 Differentiation outside the Deep Learning Community

6.5.10 Higher-Order Derivatives

6.6 Historical Notes