https://archive.ics.uci.edu/ml/datasets/concrete+compressive+strength
This time, let's work through an example that predicts the compressive strength of concrete using Keras. Download the dataset from the site above.
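If you'd rather load the data directly in code, a sketch like the one below should work. Note that the direct file URL is my assumption based on the usual UCI repository layout and may have changed, so verify it on the dataset page first.

import pandas as pd

# Assumed direct link into the UCI repository; check the dataset page if it fails.
url = ('https://archive.ics.uci.edu/ml/machine-learning-databases/'
       'concrete/compressive/Concrete_Data.xls')
df = pd.read_excel(url)  # reading .xls files requires the xlrd package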
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from keras.layers import *
from keras.models import *
from keras.utils import *
from sklearn.preprocessing import *
import seaborn as sns
Import the required libraries.
df = pd.read_excel('Concrete_Data.xls')
df.head()
| Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa, megapascals)
---|---|---|---|---|---|---|---|---|---
0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.986111 |
1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.887366 |
2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.269535 |
3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.052780 |
4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.296075 |
The data we read in looks like the table above.
df.columns
"""
Index(['Cement (component 1)(kg in a m^3 mixture)',
'Blast Furnace Slag (component 2)(kg in a m^3 mixture)',
'Fly Ash (component 3)(kg in a m^3 mixture)',
'Water (component 4)(kg in a m^3 mixture)',
'Superplasticizer (component 5)(kg in a m^3 mixture)',
'Coarse Aggregate (component 6)(kg in a m^3 mixture)',
'Fine Aggregate (component 7)(kg in a m^3 mixture)', 'Age (day)',
'Concrete compressive strength(MPa, megapascals) '],
dtype='object')
"""
The column names are as shown above.
df.rename(
columns={'Cement (component 1)(kg in a m^3 mixture)' : 'cement',
'Blast Furnace Slag (component 2)(kg in a m^3 mixture)' : 'blast',
'Fly Ash (component 3)(kg in a m^3 mixture)' : 'fly',
'Water (component 4)(kg in a m^3 mixture)' : 'water',
'Superplasticizer (component 5)(kg in a m^3 mixture)' : 'super',
'Coarse Aggregate (component 6)(kg in a m^3 mixture)' : 'coarse',
'Fine Aggregate (component 7)(kg in a m^3 mixture)' : 'fine',
'Age (day)' : 'age',
'Concrete compressive strength(MPa, megapascals) ' : 'strength'}, inplace=True)
To make the code easier to work with, we shorten the verbose column names.
df.head()
| cement | blast | fly | water | super | coarse | fine | age | strength
---|---|---|---|---|---|---|---|---|---
0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.986111 |
1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.887366 |
2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.269535 |
3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.052780 |
4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.296075 |
Check that the columns were renamed as intended.
X = df.drop(['strength'], axis=1)  # features: everything except the target
Y = df['strength']                 # target: compressive strength in MPa
scaler = MinMaxScaler()
X = scaler.fit_transform(X)        # scale each feature into [0, 1]
X.shape
"""
(1030, 8)
"""
Next, we take every column except strength as the input features, and scale them into the [0, 1] range with MinMaxScaler so training behaves well. (Note that fitting the scaler on the full dataset before splitting leaks test-set statistics into training; a leak-free variant is sketched after the split below.)
sns.pairplot(df)
The pairplot shows the pairwise relationships and distributions of the columns. Training tends to go better when the features are well spread out rather than clumped; the snippet below quantifies this with correlations.
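A small sketch that prints each feature's correlation with the target and draws a heatmap; it assumes the renamed df from above:

import matplotlib.pyplot as plt
import seaborn as sns

# Correlation of every feature with the target, strongest first.
print(df.corr()['strength'].sort_values(ascending=False))

# The full correlation matrix as an annotated heatmap.
sns.heatmap(df.corr(), annot=True, fmt='.2f', cmap='coolwarm')
plt.show()

In this dataset, cement content and age typically show the strongest positive correlations with strength, and water a negative one.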
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.1)
X_train.shape
"""
(927, 8)
"""
We use scikit-learn's train_test_split to split the data: with test_size=0.1, the function holds out 10% of the rows as a test set and returns the remainder for training. Passing a fixed random_state would make the split reproducible.
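As noted above, a leak-free preprocessing order fits the scaler only on the training rows. A minimal sketch, assuming the renamed df from above; the random_state value is just an arbitrary choice for reproducibility:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X_raw = df.drop(['strength'], axis=1)
Y_all = df['strength']

# Split first, then derive scaling statistics from the training portion only.
X_train, X_test, Y_train, Y_test = train_test_split(
    X_raw, Y_all, test_size=0.1, random_state=42)

scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)  # min/max learned from training rows
X_test = scaler.transform(X_test)        # same transform reused on the test rows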
model = Sequential()
model.add(Dense(256, input_shape=(8,), activation='relu'))  # 8 scaled input features
model.add(Dense(128, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='relu'))  # single regression output; relu keeps it non-negative
model.compile(loss='mse', optimizer='adam')
model.summary()
hist = model.fit(X_train, Y_train, epochs=100, validation_split=0.1)  # 10% held out for validation
"""
Layer (type) Output Shape Param #
=================================================================
dense_5 (Dense) (None, 256) 2304
_________________________________________________________________
dense_6 (Dense) (None, 128) 32896
_________________________________________________________________
dense_7 (Dense) (None, 32) 4128
_________________________________________________________________
dense_8 (Dense) (None, 1) 33
=================================================================
Total params: 39,361
Trainable params: 39,361
Non-trainable params: 0
_________________________________________________________________
Train on 834 samples, validate on 93 samples
Epoch 1/100
834/834 [==============================] - 0s 580us/step - loss: 1429.1455 - val_loss: 1002.4135
Epoch 2/100
834/834 [==============================] - 0s 115us/step - loss: 528.6366 - val_loss: 303.4880
Epoch 3/100
834/834 [==============================] - 0s 119us/step - loss: 229.3984 - val_loss: 203.1050
Epoch 4/100
834/834 [==============================] - 0s 119us/step - loss: 187.8111 - val_loss: 176.7382
Epoch 5/100
834/834 [==============================] - 0s 116us/step - loss: 162.9026 - val_loss: 156.8786
Epoch 6/100
834/834 [==============================] - 0s 114us/step - loss: 145.4408 - val_loss: 138.2341
Epoch 7/100
834/834 [==============================] - 0s 122us/step - loss: 131.0700 - val_loss: 128.6725
Epoch 8/100
834/834 [==============================] - 0s 120us/step - loss: 122.2626 - val_loss: 121.5253
Epoch 9/100
834/834 [==============================] - 0s 117us/step - loss: 117.8350 - val_loss: 119.4496
Epoch 10/100
834/834 [==============================] - 0s 126us/step - loss: 113.6401 - val_loss: 114.5831
Epoch 11/100
834/834 [==============================] - 0s 124us/step - loss: 111.4505 - val_loss: 112.7722
Epoch 12/100
834/834 [==============================] - 0s 121us/step - loss: 108.4970 - val_loss: 109.1458
Epoch 13/100
834/834 [==============================] - 0s 122us/step - loss: 106.8510 - val_loss: 108.2624
Epoch 14/100
834/834 [==============================] - 0s 115us/step - loss: 102.4254 - val_loss: 105.4226
Epoch 15/100
834/834 [==============================] - 0s 115us/step - loss: 102.5407 - val_loss: 104.1681
Epoch 16/100
834/834 [==============================] - 0s 112us/step - loss: 98.5970 - val_loss: 99.1225
Epoch 17/100
834/834 [==============================] - 0s 116us/step - loss: 95.1558 - val_loss: 99.9388
Epoch 18/100
834/834 [==============================] - 0s 113us/step - loss: 92.6347 - val_loss: 96.2228
Epoch 19/100
834/834 [==============================] - 0s 126us/step - loss: 88.5141 - val_loss: 98.0514
Epoch 20/100
834/834 [==============================] - 0s 122us/step - loss: 85.5895 - val_loss: 85.3493
Epoch 21/100
834/834 [==============================] - 0s 115us/step - loss: 81.2445 - val_loss: 83.1137
Epoch 22/100
834/834 [==============================] - 0s 113us/step - loss: 77.7913 - val_loss: 77.3656
Epoch 23/100
834/834 [==============================] - 0s 114us/step - loss: 77.4275 - val_loss: 77.2062
Epoch 24/100
834/834 [==============================] - 0s 115us/step - loss: 74.6809 - val_loss: 71.1784
Epoch 25/100
834/834 [==============================] - 0s 114us/step - loss: 70.9468 - val_loss: 66.5246
Epoch 26/100
834/834 [==============================] - 0s 111us/step - loss: 73.2065 - val_loss: 65.0481
Epoch 27/100
834/834 [==============================] - 0s 116us/step - loss: 63.8469 - val_loss: 59.5192
Epoch 28/100
834/834 [==============================] - 0s 114us/step - loss: 68.9593 - val_loss: 71.8911
Epoch 29/100
834/834 [==============================] - 0s 115us/step - loss: 63.5867 - val_loss: 55.2200
Epoch 30/100
834/834 [==============================] - 0s 113us/step - loss: 59.2258 - val_loss: 58.6557
Epoch 31/100
834/834 [==============================] - 0s 114us/step - loss: 57.8298 - val_loss: 51.3701
Epoch 32/100
834/834 [==============================] - 0s 115us/step - loss: 56.0855 - val_loss: 49.2646
Epoch 33/100
834/834 [==============================] - 0s 119us/step - loss: 54.9721 - val_loss: 52.2159
Epoch 34/100
834/834 [==============================] - 0s 126us/step - loss: 52.9757 - val_loss: 50.7561
Epoch 35/100
834/834 [==============================] - 0s 119us/step - loss: 52.1276 - val_loss: 44.1845
Epoch 36/100
834/834 [==============================] - 0s 115us/step - loss: 51.8647 - val_loss: 44.7191
Epoch 37/100
834/834 [==============================] - 0s 113us/step - loss: 48.8898 - val_loss: 46.6701
Epoch 38/100
834/834 [==============================] - 0s 119us/step - loss: 52.6059 - val_loss: 40.7133
Epoch 39/100
834/834 [==============================] - 0s 128us/step - loss: 48.0287 - val_loss: 37.4926
Epoch 40/100
834/834 [==============================] - 0s 125us/step - loss: 46.6562 - val_loss: 40.3896
Epoch 41/100
834/834 [==============================] - 0s 114us/step - loss: 45.7802 - val_loss: 42.5468
Epoch 42/100
834/834 [==============================] - 0s 112us/step - loss: 48.0991 - val_loss: 39.0044
Epoch 43/100
834/834 [==============================] - 0s 115us/step - loss: 44.6652 - val_loss: 48.1392
Epoch 44/100
834/834 [==============================] - 0s 114us/step - loss: 44.9251 - val_loss: 35.2333
Epoch 45/100
834/834 [==============================] - 0s 114us/step - loss: 42.9037 - val_loss: 37.9893
Epoch 46/100
834/834 [==============================] - 0s 116us/step - loss: 44.2775 - val_loss: 50.2931
Epoch 47/100
834/834 [==============================] - 0s 113us/step - loss: 44.1532 - val_loss: 34.7995
Epoch 48/100
834/834 [==============================] - 0s 114us/step - loss: 40.0724 - val_loss: 36.7891
Epoch 49/100
834/834 [==============================] - 0s 114us/step - loss: 41.4432 - val_loss: 37.5936
Epoch 50/100
834/834 [==============================] - 0s 121us/step - loss: 42.5412 - val_loss: 34.2832
Epoch 51/100
834/834 [==============================] - 0s 115us/step - loss: 38.5718 - val_loss: 33.4898
Epoch 52/100
834/834 [==============================] - 0s 114us/step - loss: 38.5902 - val_loss: 33.4109
Epoch 53/100
834/834 [==============================] - 0s 114us/step - loss: 38.4109 - val_loss: 36.4448
Epoch 54/100
834/834 [==============================] - 0s 116us/step - loss: 39.3725 - val_loss: 35.7690
Epoch 55/100
834/834 [==============================] - 0s 121us/step - loss: 38.8388 - val_loss: 33.4616
Epoch 56/100
834/834 [==============================] - 0s 115us/step - loss: 37.8989 - val_loss: 36.6492
Epoch 57/100
834/834 [==============================] - 0s 115us/step - loss: 37.0896 - val_loss: 45.6021
Epoch 58/100
834/834 [==============================] - 0s 116us/step - loss: 38.5968 - val_loss: 33.0854
Epoch 59/100
834/834 [==============================] - 0s 115us/step - loss: 36.5730 - val_loss: 35.5852
Epoch 60/100
834/834 [==============================] - 0s 139us/step - loss: 39.7343 - val_loss: 32.3643
Epoch 61/100
834/834 [==============================] - 0s 120us/step - loss: 36.0273 - val_loss: 33.9282
Epoch 62/100
834/834 [==============================] - 0s 113us/step - loss: 36.3586 - val_loss: 34.9153
Epoch 63/100
834/834 [==============================] - 0s 113us/step - loss: 37.7078 - val_loss: 31.4244
Epoch 64/100
834/834 [==============================] - 0s 116us/step - loss: 34.0002 - val_loss: 33.8998
Epoch 65/100
834/834 [==============================] - 0s 114us/step - loss: 34.8807 - val_loss: 36.9429
Epoch 66/100
834/834 [==============================] - 0s 115us/step - loss: 34.2639 - val_loss: 41.7181
Epoch 67/100
834/834 [==============================] - 0s 115us/step - loss: 36.1606 - val_loss: 33.2617
Epoch 68/100
834/834 [==============================] - 0s 126us/step - loss: 34.2432 - val_loss: 36.9399
Epoch 69/100
834/834 [==============================] - 0s 116us/step - loss: 33.2924 - val_loss: 34.7811
Epoch 70/100
834/834 [==============================] - 0s 119us/step - loss: 33.9718 - val_loss: 42.2463
Epoch 71/100
834/834 [==============================] - 0s 114us/step - loss: 33.3999 - val_loss: 32.5764
Epoch 72/100
834/834 [==============================] - 0s 118us/step - loss: 36.4881 - val_loss: 31.7392
Epoch 73/100
834/834 [==============================] - 0s 115us/step - loss: 35.3655 - val_loss: 33.9083
Epoch 74/100
834/834 [==============================] - 0s 119us/step - loss: 33.2319 - val_loss: 30.5091
Epoch 75/100
834/834 [==============================] - 0s 126us/step - loss: 34.1242 - val_loss: 29.8430
Epoch 76/100
834/834 [==============================] - 0s 120us/step - loss: 32.0600 - val_loss: 31.4169
Epoch 77/100
834/834 [==============================] - 0s 118us/step - loss: 32.1489 - val_loss: 32.2625
Epoch 78/100
834/834 [==============================] - 0s 120us/step - loss: 31.6808 - val_loss: 32.4065
Epoch 79/100
834/834 [==============================] - 0s 114us/step - loss: 31.1104 - val_loss: 32.3849
Epoch 80/100
834/834 [==============================] - 0s 132us/step - loss: 30.8013 - val_loss: 31.2789
Epoch 81/100
834/834 [==============================] - 0s 114us/step - loss: 31.1945 - val_loss: 30.9757
Epoch 82/100
834/834 [==============================] - 0s 113us/step - loss: 30.8913 - val_loss: 31.5188
Epoch 83/100
834/834 [==============================] - 0s 114us/step - loss: 30.7836 - val_loss: 31.5137
Epoch 84/100
834/834 [==============================] - 0s 120us/step - loss: 31.6711 - val_loss: 29.7293
Epoch 85/100
834/834 [==============================] - 0s 116us/step - loss: 30.1709 - val_loss: 46.2979
Epoch 86/100
834/834 [==============================] - 0s 125us/step - loss: 40.2524 - val_loss: 41.9256
Epoch 87/100
834/834 [==============================] - 0s 122us/step - loss: 33.3113 - val_loss: 35.6902
Epoch 88/100
834/834 [==============================] - 0s 117us/step - loss: 31.0072 - val_loss: 30.4290
Epoch 89/100
834/834 [==============================] - 0s 114us/step - loss: 29.0793 - val_loss: 30.0556
Epoch 90/100
834/834 [==============================] - 0s 119us/step - loss: 29.6565 - val_loss: 31.6022
Epoch 91/100
834/834 [==============================] - 0s 122us/step - loss: 30.1623 - val_loss: 33.7353
Epoch 92/100
834/834 [==============================] - 0s 115us/step - loss: 29.2801 - val_loss: 29.8321
Epoch 93/100
834/834 [==============================] - 0s 114us/step - loss: 28.3933 - val_loss: 29.5040
Epoch 94/100
834/834 [==============================] - 0s 114us/step - loss: 27.8342 - val_loss: 28.7708
Epoch 95/100
834/834 [==============================] - 0s 115us/step - loss: 29.4938 - val_loss: 28.7717
Epoch 96/100
834/834 [==============================] - 0s 115us/step - loss: 28.5992 - val_loss: 29.8672
Epoch 97/100
834/834 [==============================] - 0s 118us/step - loss: 28.1570 - val_loss: 30.7702
Epoch 98/100
834/834 [==============================] - 0s 116us/step - loss: 34.0191 - val_loss: 30.1975
Epoch 99/100
834/834 [==============================] - 0s 115us/step - loss: 28.6245 - val_loss: 29.6248
Epoch 100/100
834/834 [==============================] - 0s 128us/step - loss: 27.7708 - val_loss: 30.6792
"""
Above, we build a simple fully connected model from the prepared data and train it for 100 epochs; the loss falls from over 1400 to below 30.
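The val_loss in the log still fluctuates near the end, so one common refinement is to train longer but stop automatically once validation loss stops improving. A sketch, assuming the same model and data; restore_best_weights needs a reasonably recent Keras release:

from keras.callbacks import EarlyStopping

# Stop after 10 epochs with no val_loss improvement, keeping the best weights seen.
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
hist = model.fit(X_train, Y_train, epochs=200,
                 validation_split=0.1, callbacks=[early_stop])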
plt.figure(figsize=(10,10))
plt.plot(hist.history['loss'], color='r', label='loss')
plt.plot(hist.history['val_loss'], color='b', label='val_loss')  # key is 'val_loss', not 'var_loss'
plt.title('loss')
plt.legend()
The training curves are shown above. Both the training and validation loss fall steadily, so training went well.
score = model.evaluate(X_test, Y_test)  # returns the test-set MSE (the compiled loss)
print(score)
pred = model.predict(X_test[-5:])       # predict the last five test samples
print(pred)
print(Y_test[-5:])                      # compare against the true strengths
"""
103/103 [==============================] - 0s 78us/step
32.46895279004736
[[16.99518 ]
[16.98122 ]
[44.119606]
[56.167183]
[ 9.190986]]
607 18.415904
788 18.126324
352 51.434910
127 55.495923
685 13.664035
Name: strength, dtype: float64
"""
Finally, we evaluate the trained model on the held-out test set. The score of about 32.5 is the test MSE, not an accuracy percentage: it corresponds to an RMSE of roughly 5.7 MPa, meaning the predictions above are typically off by several MPa, so the model is not yet performing well.
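To put the error in more interpretable terms, a small sketch using scikit-learn's regression metrics, assuming the X_test and Y_test arrays from above:

import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

Y_pred = model.predict(X_test).flatten()

print('RMSE:', np.sqrt(np.mean((Y_test - Y_pred) ** 2)))  # same units as strength (MPa)
print('MAE :', mean_absolute_error(Y_test, Y_pred))
print('R^2 :', r2_score(Y_test, Y_pred))                  # 1.0 would be a perfect fit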
This dataset seems worth pursuing with a wider range of models and experiments.
If I get another chance, I'll try varying the layers and testing further to improve performance.
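As one hedged starting point for such a follow-up, the variant below replaces the relu output with a linear one (the usual head for regression, avoiding the risk of predictions getting stuck at zero) and adds light dropout; the layer sizes are just illustrative:

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(256, input_shape=(8,), activation='relu'))
model.add(Dropout(0.2))                 # light regularization against overfitting
model.add(Dense(128, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1))                     # linear output, the standard head for regression
model.compile(loss='mse', optimizer='adam', metrics=['mae'])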