๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
TIL/Coursera(Google ML Bootcamp)

[Sequence Models] Recurrent Neural Networks

by seowit 2021. 9. 27.

๐Ÿ“œ ๊ฐ•์˜ ์ •๋ฆฌ 

 

* Coursera ๊ฐ•์˜ ์ค‘ Andrew Ng ๊ต์ˆ˜๋‹˜์˜ Sequence Models ๊ฐ•์˜๋ฅผ ๊ณต๋ถ€ํ•˜๊ณ  ์ •๋ฆฌํ•œ ๋‚ด์šฉ์ž…๋‹ˆ๋‹ค.

* ์ด๋ฏธ์ง€ ์ถœ์ฒ˜ : Deeplearning.AI


Recurrent Neural Networks

1. Why Sequence Models?

2. Notation

์ด์ „๊ณผ ๋™์ผํ•˜๊ฒŒ i๋ฒˆ์งธ example์— ๋Œ€ํ•ด (i)๋กœ ํ‘œํ˜„ํ•˜๊ณ  x_(i)์—์„œ ์‹œํ€€์Šค๋ฅผ ๋‚˜ํƒ€๋‚ด๊ธฐ ์œ„ํ•œ ๋ฐฉ๋ฒ•์œผ๋กœ <t>๋ฅผ ์‚ฌ์šฉ

10K๊ฐœ์˜ ๋‹จ์–ด๊ฐ€ ์žˆ๋Š” ์‚ฌ์ „์ด ์žˆ๋‹ค๊ณ  ํ•  ๋•Œ, ๊ฐ ๋‹จ์–ด์˜ ์œ„์น˜๋ฅผ 1๋งŒ ์ฐจ์› shape์˜ ์›ํ•ซ ๋ฒกํ„ฐ๋กœ ํ‘œํ˜„

 

3. Recurrent Neural Network Model

๋ฌธ์žฅ์„ ์ผ๋ฐ˜ ์‹ ๊ฒฝ๋ง์œผ๋กœ ํ•™์Šตํ•˜์ง€ ๋ชปํ•˜๋Š” ์ด์œ ๋Š” ์ธํ’‹๊ณผ ์•„์›ƒํ’‹์˜ ํฌ๊ธฐ๊ฐ€ ์ผ์ •ํ•˜์ง€ ์•Š๊ณ , ํ•œ ๋‹จ์–ด์˜ ์ค‘์š”๋„๊ฐ€ ๋ฌธ์žฅ๋งˆ๋‹ค ๋‹ค๋ฅด๊ธฐ ๋•Œ๋ฌธ์— ํŠน์ง•์„ ๋ฝ‘๊ธฐ ์–ด๋ ต๋‹ค๋Š” ์ ์ž…๋‹ˆ๋‹ค.

์ˆœํ™˜์‹ ๊ฒฝ๋ง์„ ๋‹ค์ด์–ด๊ทธ๋žจ์œผ๋กœ ํ‘œํ˜„ํ•˜๋ฉด ์•„๋ž˜์™€ ๊ฐ™๋‹ค. ๋ณดํ†ต ์˜ค๋ฅธ์ชฝ(2๋ฒˆ)๊ณผ ๊ฐ™์ด ํ‘œ๊ธฐํ•˜์ง€๋งŒ ์™ผ์ชฝ(1๋ฒˆ)์ด ๋” ์ง๊ด€์ ์œผ๋กœ ์ดํ•ดํ•˜๊ธฐ ์‰ฝ๋‹ค. ๋ฌธ์žฅ์—์„œ ํ•œ ๋‹จ์–ด(x_t)์”ฉ ์‹ ๊ฒฝ๋ง์˜ ์ธํ’‹์œผ๋กœ ๋„ฃ๋Š”๋ฐ ๊ทธ ์ „ ๋‹จ๊ณ„์˜ ๊ฒฐ๊ณผ(a_t)๋ฅผ ํ•จ๊ป˜ ๋„ฃ์–ด์ค€๋‹ค.

์ด์ „์˜ ๊ฐ’์„ ๋„ฃ์–ด์คŒ์œผ๋กœ์จ y_<3>๋ฅผ ๊ตฌํ•˜๋Š” ๊ณผ์ •์— x_<1>, x_<2>, x_<3>๊ฐ€ ๋ชจ๋‘ ์˜ํ–ฅ์„ ์ฃผ๊ฒŒ ๋œ๋‹ค.

์—ฌ๊ธฐ์„œ(RNN) ๋ฌธ์ œ์ ์€ x_<1>, x_<2>, x_<3> ๋Š” y_<3>์— ์˜ํ–ฅ์„ ์ฃผ์ง€๋งŒ x_<4>, x_<5> ๋“ฑ์˜ ๋’ค์— ์žˆ๋Š” ๋‹จ์–ด๋“ค์€ ๊ณ ๋ คํ•˜์ง€ ๋ชปํ•œ๋‹ค๋Š” ์ ์ด๋‹ค. ์•„๋ž˜ Teddy ์˜ˆ์ œ๋ฅผ ๋ณด๋ฉด ์•ž์˜ ์ •๋ณด๋กœ๋งŒ Teddy๊ฐ€ ๊ณฐ์ธํ˜•์ธ์ง€ ๋Œ€ํ†ต๋ น์ธ์ง€ ์•Œ ์ˆ˜ ์—†๋‹ค.

 

์œ„์˜ ๊ณผ์ •์„ ๊น”๋”ํ•˜๊ฒŒ ํ‘œํ˜„ํ•˜๊ณ  a์™€ y์— ๋Œ€ํ•œ ์ž์„ธํ•œ ์‹์„ ๊ตฌํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

์ด๋ฅผ ๋” ๊ฐ„๋‹จํ•˜๊ฒŒ ํ‘œํ˜„ํ•ด๋ณด๋ฉด ์•„๋ž˜์™€ ๊ฐ™๋‹ค. a์™€ x๋ฅผ ์Œ“์•„์„œ ๋ฌถ์Œ์œผ๋กœ ํ‘œํ˜„ํ•˜๋Š” ๊ฒƒ์ด๋‹ค. 

 

4. Backpropagation through time

This depicts backpropagation using the cross-entropy loss familiar from logistic regression: a loss is computed at each time step, and the overall loss is the sum of the per-step losses.
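The overall loss, L = Σ_t L<t>(ŷ<t>, y<t>) with L<t> = -Σ_i y_i<t> log ŷ_i<t>, can be sketched as follows. The predictions below are made-up numbers for illustration:

```python
import numpy as np

def cross_entropy(y_true, y_hat):
    """Per-time-step loss L<t> = -sum_i y_i<t> * log(yhat_i<t>)."""
    return -np.sum(y_true * np.log(y_hat))

# Illustrative: three time steps over a 4-word vocabulary.
y_hats = [np.array([0.7, 0.1, 0.1, 0.1]),
          np.array([0.2, 0.5, 0.2, 0.1]),
          np.array([0.1, 0.1, 0.1, 0.7])]
y_trues = [np.eye(4)[0], np.eye(4)[1], np.eye(4)[3]]  # one-hot targets

# The overall loss sums the per-step cross-entropy losses.
total_loss = sum(cross_entropy(y, yh) for y, yh in zip(y_trues, y_hats))
```

Backpropagation through time then runs the gradient of this summed loss backwards across the unrolled time steps.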

RNN์˜ ์ฃผ์š” motivation์€ input sequence์˜ ๊ธธ์ด์™€ output sequence์˜ ๊ธธ์ด๊ฐ€ ๊ฐ™๋‹ค๋Š” ์ (Tx == Ty)์ด๋‹ค.

5. Different Types of RNNs

RNN์—๋Š” Tx = Ty์ธ many-to-many ์ผ€์ด์Šค๋„ ์žˆ๊ณ  Tx != Ty์ธ many-to-one ์ผ€์ด์Šค๋„ ์žˆ๋‹ค. ์ „์ž์˜ ์˜ˆ์‹œ๋กœ๋Š” ์–ธ์–ด ๋ฒˆ์—ญํ•˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ์žˆ๊ณ , ํ›„์ž ์˜ˆ์‹œ๋กœ๋Š” ์˜ํ™” ํ‰๋ก ์„ ํ‰์ ์œผ๋กœ ๋ณ€ํ™˜ํ•˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ์žˆ๋‹ค. 

์ด ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ one-to-one, one-to-many ๋“ฑ๋“ฑ ์•„๋ž˜์™€ ๊ฐ™์€ ๋‹ค์–‘ํ•œ ์ข…๋ฅ˜์˜ RNN์ด ์กด์žฌํ•œ๋‹ค.

6. Language Model and Sequence Generation

 

 

๋Œ“๊ธ€