Multilayer Perceptron (MLP)

This post looks at the structure and working principles of the MLP, the starting point of deep learning, and implements one in Python.


Introduction

When you take your first steps into deep learning, the first model you are likely to meet is the multilayer perceptron (MLP). The MLP is the most basic form of artificial neural network and a key concept underlying more complex deep learning architectures.

In this post, I go over the structure and working principles of the MLP, which can fairly be called the starting point of deep learning, and implement a model with a short Python example.

1. What Is a Multilayer Perceptron (MLP)?

An MLP is an artificial neural network with one or more hidden layers between the input layer and the output layer. Each layer consists of multiple nodes (neurons), and the layers are fully connected: every node in one layer is connected to every node in the next.

The perceptron, the simplest neural network, can only solve linearly separable problems. An MLP overcomes this limitation by stacking hidden layers and applying a non-linear activation function (for example ReLU or sigmoid) in each of them, which lets it handle complex non-linear problems as well.
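
To make the point about non-linearity concrete, here is a minimal sketch in plain Python. XOR is a classic problem that is not linearly separable, yet a network with one hidden ReLU layer can compute it. The weights below are hand-picked for illustration (an assumption of this sketch, not learned values), and `dense`, `relu`, and `xor_mlp` are hypothetical helper names.

```python
def relu(values):
    """ReLU activation: negative values become 0."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: each output node sees every input."""
    return [
        sum(x * w for x, w in zip(inputs, row)) + b
        for row, b in zip(weights, biases)
    ]


# Hand-picked parameters (illustrative only, not learned).
W_hidden = [[1.0, 1.0], [1.0, 1.0]]  # 2 hidden nodes, 2 weights each
b_hidden = [0.0, -1.0]
W_out = [[1.0, -2.0]]                # 1 output node, 2 weights
b_out = [0.0]


def xor_mlp(x1, x2):
    hidden = relu(dense([x1, x2], W_hidden, b_hidden))
    return dense(hidden, W_out, b_out)[0]


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_mlp(a, b))
```

Without the ReLU step the two layers would collapse into a single linear map, and no choice of weights could reproduce XOR.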

[Figure: Basic structure of an MLP]

2. How an MLP Learns

Training an MLP consists of two main phases: forward propagation and backward propagation (backpropagation).

  1. Forward propagation: the input data travels from the input layer through the hidden layers to the output layer. Each node multiplies the signals arriving from the previous layer by its weights, adds a bias, applies the activation function, and passes the result on to the next layer. The discrepancy between the final output (the prediction) and the actual value (the label), as measured by a loss function, is called the loss.
  2. Backward propagation: the weights and biases of each connection are updated in the direction that reduces the loss. Working from the output layer back toward the input layer, the network computes how much each weight contributed to the loss (its gradient), and an optimization algorithm such as gradient descent then adjusts the weights accordingly. Repeating this cycle many times moves the model toward the weights that minimize the loss.
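
The two phases above can be sketched in plain Python for a tiny network with 2 inputs, 2 tanh hidden nodes, and 1 linear output. The network size, the squared-error loss, the starting parameters, and the learning rate are all assumptions chosen to keep the arithmetic readable; a real framework such as Keras does the same bookkeeping automatically.

```python
import math


def forward(params, x):
    """Forward pass: weighted sum + bias, tanh activation, linear output."""
    W1, b1, W2, b2 = params
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    y_hat = sum(w * h for w, h in zip(W2, hidden)) + b2
    return hidden, y_hat


def loss_fn(params, x, y):
    """Squared-error loss between prediction and label."""
    _, y_hat = forward(params, x)
    return 0.5 * (y_hat - y) ** 2


def backward(params, x, y):
    """Backward pass: chain rule from the output back toward the input."""
    W1, b1, W2, b2 = params
    hidden, y_hat = forward(params, x)
    d_out = y_hat - y                          # dL/dy_hat
    grad_W2 = [d_out * h for h in hidden]
    grad_b2 = d_out
    # tanh'(z) = 1 - tanh(z)^2
    d_hidden = [d_out * w * (1.0 - h * h) for w, h in zip(W2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in d_hidden]
    grad_b1 = d_hidden
    return grad_W1, grad_b1, grad_W2, grad_b2


def gradient_descent_step(params, grads, lr):
    """Move every parameter a small step against its gradient."""
    W1, b1, W2, b2 = params
    gW1, gb1, gW2, gb2 = grads
    return (
        [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W1, gW1)],
        [b - lr * g for b, g in zip(b1, gb1)],
        [w - lr * g for w, g in zip(W2, gW2)],
        b2 - lr * gb2,
    )


# Illustrative starting parameters and a single training sample.
params = ([[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1], [0.7, -0.5], 0.05)
x, y = [1.0, 2.0], 1.0

loss_before = loss_fn(params, x, y)
grads = backward(params, x, y)
params_after = gradient_descent_step(params, grads, lr=0.1)
loss_after = loss_fn(params_after, x, y)
print(f"loss before: {loss_before:.4f}, after one step: {loss_after:.4f}")
```

A single update already lowers the loss on this sample; training repeats the same forward, backward, and update cycle over many samples and epochs.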

3. Implementing an MLP in Python

Concepts alone may not make it click. Here I build a simple MLP using the Iris dataset from scikit-learn and TensorFlow/Keras.

A. Installing the required libraries

First, install the required libraries.

pip3 install tensorflow scikit-learn

B. Example source code

This is a simple MLP that classifies an iris flower into one of three species (Setosa, Versicolor, Virginica) based on the length and width of its petals and sepals (4 features).

import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.datasets import load_iris
import numpy as np

# 1. Load and preprocess the data
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target.reshape(-1, 1)  # (n_samples,) -> (n_samples, 1)

# One-hot encode the target (y)
# 0 -> [1, 0, 0]
# 1 -> [0, 1, 0]
# 2 -> [0, 0, 1]
encoder = OneHotEncoder(sparse_output=False)  # sparse_output requires scikit-learn >= 1.2
y_onehot = encoder.fit_transform(y)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y_onehot, test_size=0.2, random_state=42)

# Standardize the inputs (X). The scaler is fitted on the training set only,
# so that test-set statistics do not leak into training.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# 2. Build the MLP model (using Keras)
model = tf.keras.Sequential([
    # Input layer (4 features)
    tf.keras.Input(shape=(4,)),

    # First hidden layer (10 nodes), ReLU activation
    tf.keras.layers.Dense(10, activation='relu'),

    # Second hidden layer (10 nodes), ReLU activation
    tf.keras.layers.Dense(10, activation='relu'),

    # Output layer (3 classes), softmax activation (for multi-class classification)
    tf.keras.layers.Dense(3, activation='softmax')
])

# 3. Compile the model
# Optimizer: Adam
# Loss function: categorical_crossentropy (for multi-class classification)
# Metric: accuracy
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Print a summary of the model structure
model.summary()
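
As a sanity check on the architecture, the number of trainable parameters in a fully connected network can be worked out by hand: each node has one weight per input plus one bias. The short sketch below does this for the 4-10-10-3 layout used here; `dense_param_count` is a hypothetical helper name, not a Keras function.

```python
# Layer widths: input features, hidden 1, hidden 2, output classes
layer_sizes = [4, 10, 10, 3]


def dense_param_count(n_in, n_out):
    """Each of the n_out nodes has n_in weights plus one bias."""
    return n_in * n_out + n_out


per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
print(per_layer)       # [50, 110, 33]
print(sum(per_layer))  # 193
```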

 

[Figure: loss and accuracy over the course of training]

# 4. Train the model
# Train for 100 epochs (one epoch = one full pass over the training set)
# batch_size: number of samples processed per weight update
history = model.fit(X_train, y_train, epochs=100, batch_size=5, verbose=1)

# 5. Evaluate the model
# Measure performance on the held-out test data
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f'\nTest Accuracy: {accuracy*100:.2f}%')

# 6. Predict on new data
# Create a new iris sample (it must be standardized with the same scaler)
new_flower_data = np.array([[5.1, 3.5, 1.4, 0.2]])  # close to the Setosa species
new_flower_data_scaled = scaler.transform(new_flower_data)

# Run the prediction
prediction = model.predict(new_flower_data_scaled)
predicted_class = np.argmax(prediction, axis=1)

print(f'\nNew data prediction: {prediction}')
print(f'Predicted class: {iris.target_names[predicted_class][0]}')

# Output (exact probabilities vary from run to run):
# New data prediction: [[9.9966228e-01 3.3776104e-04 1.1798655e-13]]
# Predicted class: setosa
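
To show what the softmax output layer and the categorical cross-entropy loss compute, here is a minimal sketch in plain Python. The raw scores passed to `softmax` are made-up values for illustration; the probability vector is the one printed in the example output above, and the label [1, 0, 0] assumes the sample really is Setosa.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_true_onehot, y_prob):
    """-sum(t * log(p)); with a one-hot label only the true class counts."""
    return -sum(t * math.log(p) for t, p in zip(y_true_onehot, y_prob) if t > 0)


# Hypothetical raw scores for the three classes.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))

# Probabilities from the example output, scored against a Setosa label.
prediction = [9.9966228e-01, 3.3776104e-04, 1.1798655e-13]
loss = categorical_crossentropy([1, 0, 0], prediction)
print(f"cross-entropy: {loss:.6f}")                       # about 0.000338
print("predicted index:", prediction.index(max(prediction)))  # 0
```

The closer the probability assigned to the true class is to 1, the closer the loss is to 0, which is what the optimizer pushes toward during training.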

 

๋งˆ์น˜๋ฉฐ

MLPs remain strong and effective models for classification and regression on structured (tabular) data.
They are also used as components of more complex models such as CNNs and RNNs (typically as the final output stage),
so a clear understanding of their structure and principles provides a solid foundation for studying deep learning.

