๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
TIL/Coursera(Google ML Bootcamp)

[Deep Learning Specialization] Shallow Neural Network - Week 3

by seowit 2021. 8. 20.

๐Ÿ“œ ๊ฐ•์˜ ์ •๋ฆฌ 

* These are notes I took while studying Andrew Ng's Deep Learning Specialization course on Coursera.

* ์˜์–ด ๊ณต๋ถ€๋ฅผ ํ•˜๋ ค๊ณ  ์˜์–ด๋กœ ๊ฐ•์˜๋ฅผ ์ •๋ฆฌํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ํ˜น์‹œ ํ‹€๋ฆฐ ๋ถ€๋ถ„์ด๋‚˜ ์–ด์ƒ‰ํ•œ ๋ถ€๋ถ„์ด ์žˆ๋‹ค๋ฉด ๋Œ“๊ธ€๋กœ ์•Œ๋ ค์ฃผ์‹œ๊ฑฐ๋‚˜ ๋„˜์–ด๊ฐ€์ฃผ์‹œ๋ฉด ๊ฐ์‚ฌํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค


1. Neural networks overview

In this lecture I learned the new notation used to represent a shallow neural network. Square brackets, as in a^[1], denote the layer number, and a superscript (i), as in x^(i), denotes the i-th training example.
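
With this notation, forward propagation for a single training example x^(i) in a two-layer network is written as (g^[l] is the activation function of layer l):

z^[1](i) = W^[1] x^(i) + b^[1],   a^[1](i) = g^[1](z^[1](i))

z^[2](i) = W^[2] a^[1](i) + b^[2],   a^[2](i) = g^[2](z^[2](i)) = ŷ^(i)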

 

2. Neural network representation

We can denote the input layer x as a^[0] and call it "layer zero", so technically there are three layers in a "two-layer neural network"; the input layer is simply not counted.

A two-layer neural network consists of the following (parameter shapes are sketched in the snippet right after this list):

  • Input layer a^[0]
  • Hidden layer a^[1] (including W^[1], b^[1])
  • Output layer a^[2] (including W^[2], b^[2])
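
Below is a minimal NumPy sketch of the corresponding parameter shapes. The sizes n_x = 3 and n_1 = 4 are example values I picked; the small random weights and zero biases follow the initialization from the lectures.

```python
import numpy as np

n_x = 3   # size of the input layer a^[0]   (example value)
n_1 = 4   # number of hidden units in layer 1 (example value)
n_2 = 1   # single output unit in layer 2

# Small random weights and zero biases, as in the lectures
W1 = np.random.randn(n_1, n_x) * 0.01   # W^[1] has shape (n_1, n_x)
b1 = np.zeros((n_1, 1))                 # b^[1] has shape (n_1, 1)
W2 = np.random.randn(n_2, n_1) * 0.01   # W^[2] has shape (n_2, n_1)
b2 = np.zeros((n_2, 1))                 # b^[2] has shape (n_2, 1)
```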

 

3. Vectorizing Across Multiple Examples
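
I left this section empty, but the key idea from the lecture is to stack the m training examples as the columns of a matrix X of shape (n_x, m), so that one matrix product per layer replaces the for-loop over examples. A minimal sketch under that assumption (the sizes and the tanh/sigmoid choice are just for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

n_x, n_1, m = 3, 4, 5                 # input size, hidden size, number of examples
X  = np.random.randn(n_x, m)          # each column is one training example x^(i)
W1 = np.random.randn(n_1, n_x) * 0.01
b1 = np.zeros((n_1, 1))
W2 = np.random.randn(1, n_1) * 0.01
b2 = np.zeros((1, 1))

# Vectorized forward propagation: one matrix product per layer, no loop over the m examples
Z1 = W1 @ X + b1                      # Z^[1] has shape (n_1, m); b1 broadcasts across columns
A1 = np.tanh(Z1)                      # A^[1] = g^[1](Z^[1])
Z2 = W2 @ A1 + b2                     # Z^[2] has shape (1, m)
A2 = sigmoid(Z2)                      # A^[2] = Y_hat, one prediction per column
```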

4. Activation functions

  • Sigmoid: g(z) = 1 / (1 + e^(-z))
    • Derivative: g'(z) = g(z)(1 - g(z))
  • tanh
    • Derivative: g'(z) = 1 - (g(z))^2
    • For hidden units, the tanh is pretty much strictly superior to the sigmoid.
  • ReLU: g(z) = max(0, z)
  • Leaky ReLU: g(z) = max(0.01z, z)
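
A minimal NumPy sketch of these four activations (the 0.01 slope for Leaky ReLU is just a commonly used small constant):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))      # squashes z to (0, 1)

def tanh(z):
    return np.tanh(z)                # squashes z to (-1, 1), zero-centered

def relu(z):
    return np.maximum(0, z)          # identity for z > 0, zero otherwise

def leaky_relu(z, slope=0.01):
    return np.maximum(slope * z, z)  # small slope for z < 0 instead of exactly 0
```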

 

5. Why do you need a non-linear activation function?

Z^[1] = W^[1] x + b^[1],   A^[1] = Z^[1]   (identity "activation")

Z^[2] = W^[2] A^[1] + b^[2] = W^[2] W^[1] x + W^[2] b^[1] + b^[2] = W' x + b'

"W'x + b'" is a linear function. If you don't have an activation function, then no matter how many layers your neural network has, all it's doing is just computing a linear activation function. So you might as well not have any hidden layers.

6. Derivatives of Activation Functions
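
I did not write anything under this heading, so here is a short recap of the derivatives covered in the lecture (the ReLU derivative at z = 0 is technically undefined; in practice it is just set to 0 or 1):

  • Sigmoid: g'(z) = g(z)(1 - g(z))
  • tanh: g'(z) = 1 - (g(z))^2
  • ReLU: g'(z) = 0 if z < 0, 1 if z > 0
  • Leaky ReLU: g'(z) = 0.01 if z < 0, 1 if z > 0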

๋Œ“๊ธ€