Learning Algorithms#

random variables matrix vector eignevector

what is machine learning?#

์ „ํ†ต์ ์ธ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์€ (rule based) explicit programming์ด์—ˆ๋‹ค. ์ฆ‰ ํŠน์ • ์กฐ๊ฑด์—์„œ ํŠน์ • ๊ฒฐ๊ณผ๋ฅผ ๋„์ถœํ•˜๋Š” ๋ฐฉ์‹์ด์—ˆ๋‹ค. ํ•˜์ง€๋งŒ ์ด๋Ÿฐ ๋ฐฉ์‹์€ ๊ทœ์น™์„ ๋งŒ๋“œ๋Š” ๊ฒƒ์ด ์–ด๋ ต๊ณ , ๊ทœ์น™์ด ๋ณต์žกํ•ด์ง€๋ฉด ๋ณต์žกํ•ด์งˆ์ˆ˜๋ก ๊ทœ์น™์„ ๋งŒ๋“œ๋Š” ๊ฒƒ์ด ์–ด๋ ค์›Œ์ง„๋‹ค. ๋˜ํ•œ ๊ทœ์น™ ๊ธฐ๋ฐ˜์˜ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์€ ๊ทธ ๊ทœ์น™๋“ค์„ ํ”ผํ•  ๋ณ€์น™๋“ค์„ ๋ชจ๋‘ ์ปค๋ฒ„ํ•ด์•ผ ํ•˜๋Š”๋ฐ ๊ทธ๊ฒƒ์€ ์‚ฌ์‹ค์ƒ ์–ด๋ ต๋‹ค. ๊ทธ๋ ‡๊ธฐ ๋•Œ๋ฌธ์— ๊ธฐ๊ณ„๊ฐ€ ์Šค์Šค๋กœ pattern์„ learningํ•˜๋Š” ์ ‘๊ทผ๋ฒ•์ด ๋“ฑ์žฅํ–ˆ๋Š”๋ฐ, ๊ทธ๊ฒƒ์ด Machine learning์ด๋‹ค.

Definition of machine learning#

Field of study that gives computers the ability to learn without being explicitly programmed

โ€“ Machine learning Definition by Arthur Samuel 1959

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

โ€“ Tom Mitchell 1998 well-posed learning problem

Supervised learning ์ง€๋„ ํ•™์Šต#

๊ฐ€์žฅ ๊ธฐ๋ณธ์ ์ธ ๋จธ์‹ ๋Ÿฌ๋‹ ํˆด์€ supervised learning ์ง€๋„ํ•™์Šต ์ด๋‹ค. Given a dataset with inputs X and labels Y, learn a mapping from X to y. ์ฆ‰, ํŠน์ • input(X)์— ๋Œ€ํ•ด์„œ label or targe(y) output์ด ์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ์…‹์ด ์ฃผ์–ด์ง„ ๊ฒฝ์šฐ๋ฅผ ๋งํ•œ๋‹ค. ์ปดํ“จํ„ฐ๋Š” ์ด X์™€ y ์‚ฌ์ด์˜ ๊ด€๊ณ„๋ฅผ ํ•™์Šต ๋ฐ ์œ ์ถ”๋ฅผ ํ•˜๊ฒŒ ๋œ๋‹ค. ์ง€๋„ํ•™์Šต์˜ ํฐ ๋ถ„๋ฅ˜๋Š” regression ๊ณผ classification์ด ์žˆ๋‹ค.

Regression ํšŒ๊ท€#

the value y yorโ€™re trying to predict is continuous

โ€”Andrew Ng

ํšŒ๊ท€๋ž€ output y์˜ ๊ฐ’์ด ์—ฐ์†์ ์ธ ์ฆ‰ continuousํ•œ ๊ฐ’์„ ๊ฐ€์ง€๋Š” ๊ฒƒ์„ ๋งํ•œ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด์„œ ์ง‘์˜ ํฌ๊ธฐ์— ๋”ฐ๋ฅธ ์ง‘๊ฐ’์„ ์˜ˆ์ธกํ•˜๋Š” ๊ฒƒ์ด ํšŒ๊ท€ ๋ฌธ์ œ์ด๋‹ค. ์ง‘์˜ ํฌ๊ธฐ๋Š” ์—ฐ์†์ ์ธ ๊ฐ’์ด๊ธฐ ๋•Œ๋ฌธ์— ํšŒ๊ท€ ๋ฌธ์ œ์ด๋‹ค. ๋ฐ˜๋ฉด์— ์ง‘์˜ ํฌ๊ธฐ์— ๋”ฐ๋ผ์„œ ์ง‘์ด ์žˆ๋Š” ๋™๋„ค๋ฅผ ์˜ˆ์ธกํ•˜๋Š” ๊ฒƒ์€ classification ๋ฌธ์ œ์ด๋‹ค. ๋™๋„ค๋Š” ์—ฐ์†์ ์ธ ๊ฐ’์ด ์•„๋‹ˆ๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค.

์กฐ๊ธˆ ๋” ๋‚˜์•„๊ฐ€์„œ Lienar regression์€ ์–ด๋–ค ์˜๋ฏธ์ผ๊นŒ? ์„ ํ˜• ํšŒ๊ท€๋ž€ ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์žฅ ์ž˜ ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ๋Š” ์„ ํ˜• ํ•จ์ˆ˜๋ฅผ ์ฐพ๋Š” ๊ฒƒ์ด๋‹ค. ์„ ํ˜•์˜ ์˜๋ฏธ input x์˜ ๊ฐ’์˜ ๋ณ€ํ™”์— ๋”ฐ๋ผ์„œ ์˜ˆ์ธก๊ฐ€๋Šฅํ•˜๊ฒŒ output y์˜ ๊ฐ’์ด ์–ด๋–ป๊ฒŒ ๋ณ€ํ™”ํ•˜๋Š”์ง€๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ๊ฒƒ์ด๋‹ค. ์ฆ‰ input x์˜ ๊ฐ’์˜ ๋ณ€ํ™”์— ๋”ฐ๋ผ y ๊ฐ’์€ ์ข…์†์ ์œผ๋กœ ๋ณ€ํ•จ์œผ๋กœ x๋Š” ๋…๋ฆฝ ๋ณ€์ˆ˜์ด๊ณ  y๋Š” ์ข…์† ๋ณ€์ˆ˜๋ผ๊ณ  ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค.

์„ ํ˜• ํšŒ๊ท€๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์€ ํ˜•ํƒœ๋ฅผ ๊ฐ€์ง„๋‹ค.

\[ f(x) = w^Tx + b \]

์—ฌ๊ธฐ์„œ \(w\)๋Š” weight, \(b\)๋Š” bias์ด๋‹ค. ์ด๊ฒƒ์„ ํ–‰๋ ฌ๋กœ ํ‘œํ˜„ํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

\[\begin{split} f(x) = w^Tx + b = \begin{bmatrix}w_1 & w_2 & \cdots & w_n\end{bmatrix}\begin{bmatrix}x_1 \\ x_2 \\ \vdots \\ x_n\end{bmatrix} + b \end{split}\]

์•„๋งˆ 2๊ฐ•์—์„œ ๋” ์ž์„ธํ•˜๊ฒŒ ๋‹ค๋ฃฐ ๋“ฏํ•˜๋‹ค.

Classification ๋ถ„๋ฅ˜#

classification refers to that y takes on a discrete number of variables

ํ•˜๋‚˜์˜ feature ์ฆ‰ ํŠน์„ฑ์œผ๋กœ plotํ•˜๋Š” ๊ฒƒ์€ ์‰ฝ์ง€๋งŒ, ํ˜„์‹ค์ ์œผ๋กœ ์–ด๋–ค ๋ฐ์ดํ„ฐ๋ฅผ ๋ถ„์„ํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ์—ฌ๋Ÿฌ feature๋“ค๊ณผ y์˜ ๊ด€๊ณ„๋ฅผ mappingํ•˜๋Š”๊ฒŒ ๋Œ€๋ถ€๋ถ„์˜ ๊ฒฝ์šฐ์ผ ๊ฒƒ์ด๋‹ค.

๊ทธ๋Ÿฐ ๊ฒฝ์šฐ์—๋Š” ์–ด๋–ค feature๊ฐ€ ์ค‘์š”ํ•œ์ง€ feature extracting์„ ํ•˜๊ธฐ๋„ ํ•˜๊ณ , feature๋“ค์„ ์กฐํ•ฉํ•ด์„œ ์ƒˆ๋กœ์šด feature๋ฅผ ๋งŒ๋“ค๊ธฐ๋„ ํ•œ๋‹ค. ์ด๋ ‡๊ฒŒ feature๋“ค์„ ์กฐํ•ฉํ•ด์„œ ์ƒˆ๋กœ์šด feature๋ฅผ ๋งŒ๋“ค์–ด๋‚ด๋Š” ๊ฒƒ์„ feature engineering์ด๋ผ๊ณ  ํ•œ๋‹ค.

๋˜ํ•œ

svm์€ ๋ฌดํ•œ๋Œ€์˜ input features๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค? kernels ๋ฅผ ๋‚˜์ค‘์— ์‚ฌ์šฉํ•ด์„œ ๋ฌดํ•œ๋Œ€์˜ ์ •๋ณด๋ฅผ ๋จธ์‹ ๋Ÿฌ๋‹์— ์ด์šฉํ•  ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์„ ๋‚˜์ค‘์— ๋‚˜์˜จ๋‹ค๊ณ  ํ•œ๋‹ค.

Unsupervised learning ๋น„์ง€๋„ ํ•™์Šต#

just input x, no labels

  • k-means clustering

  • ๊ตฌ๊ธ€ ๋‰ด์Šค๋Š” ๋งค์ผ์˜ ๋‰ด์Šค๋ฅผ ํด๋Ÿฌ์Šคํ„ฐ๋งํ•ด์„œ ๋น„์Šทํ•œ ๋‰ด์Šค๋“ค ๋ผ๋ฆฌ ๋ฌถ๋Š” ๊ฒƒ์—์„œ ์˜ˆ๋ฅผ ๋“ค ์ˆ˜ ์žˆ๋‹ค.

  • Organize computing clusters

  • social network analysis

  • market segmentation

  • astronomical data analysis

Reinforcement learning ๊ฐ•ํ™”ํ•™์Šต#

if behaves well, โ€˜good dogโ€™ compliment

๊ฐœ๋ฅผ ํ‚ค์šฐ๋Š” ๊ฒƒ๊ณผ ๋น„์Šทํ•˜๋‹ค. ๊ฐœ๊ฐ€ ์ž˜ํ•˜๋ฉด ์นญ์ฐฌ์„ ํ•ด์ฃผ๊ณ  ์ž˜๋ชปํ•˜๋ฉด ํ˜ผ๋‚ด์ฃผ๋Š” ๊ฒƒ์ฒ˜๋Ÿผ. ์‹œ๊ฐ„์ด ์ง€๋‚จ์— ๋”ฐ๋ผ ์ž˜ํ•˜๋ฉด ์ƒ์„ ์ฃผ๋Š” ๋ฐฉ์‹์œผ๋กœ ์ง„ํ–‰๋œ๋‹ค. ๊ฒŒ์ž„ ํ”Œ๋ ˆ์ž‰์—์„œ ์ด๋Ÿฐ ์œ ํ˜•์„ ๋งŽ์ด ๋ณผ ์ˆ˜ ์žˆ๋‹ค.

Deep learning#