[Machine Learning] [TensorFlow 2.0] Saving and Loading Models

2023.05.21 16:14

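With version 2.0, TensorFlow became far more intuitive and easier to use than before. This post looks at how to save a model and load it again with TensorFlow 2.0.

I. Introduction

Why save a model?

Saving and restoring a model takes real effort and time, often dozens of lines of code or more. So why go to the trouble?

The reason is that if something goes wrong during training and the job is interrupted, you can load the saved model and resume from that point. Saving a model also lets you share it with other people, who can build on it to improve its accuracy and efficiency and produce a better model.

Let's now look at how to use ‘checkpoint’ and ‘callback’ to save a model at any point.

II. Basic Setup

1. Installing TensorFlow

Before building a model, let's install TensorFlow.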

try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass

!pip install -q pyyaml h5py  # Required to save models in HDF5 format

import os

import tensorflow as tf
from tensorflow import keras

print(tf.version.VERSION)

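**As the code comment (#) notes, please run the code above in a Colab environment** https://colab.research.google.com

The code above selects TensorFlow 2.x in Colab, installs the packages needed for HDF5 saving, and imports TensorFlow.

2. Loading the Dataset

This story uses the MNIST dataset as the running example for saving and loading. To keep the code fast, we use only the first 1000 examples of the dataset.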

(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

train_labels = train_labels[:1000]
test_labels = test_labels[:1000]

train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0

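The code above prepares 1000 training examples and 1000 test examples, flattening each 28x28 image into a vector of 784 values scaled to the range [0, 1].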
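To make the effect of the reshape and scaling step concrete, here is a minimal sketch using NumPy only. The array `fake_images` is synthetic stand-in data (an assumption for illustration, not the real MNIST download), so the snippet runs without TensorFlow.

```python
import numpy as np

# Stand-in for MNIST images: 1000 grayscale images of 28x28 uint8 pixels.
# (Synthetic data, used only to show the shape and value range.)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)

# Same preprocessing as above: flatten each image and scale to [0, 1].
flat_images = fake_images.reshape(-1, 28 * 28) / 255.0

print(flat_images.shape)   # (1000, 784)
print(flat_images.dtype)   # float64
print(flat_images.min() >= 0.0, flat_images.max() <= 1.0)  # True True
```

The 784 in the flattened shape is the same 784 that appears as the input size of the first layer of the model.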

3. Defining the Model

Let's build a simple Sequential model.

# Define a simple sequential model
def create_model():
  model = tf.keras.models.Sequential([
    keras.layers.Dense(512, activation='relu', input_shape=(784,)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
  ])

  model.compile(optimizer='adam',
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])

  return model

The ‘create_model()’ function lets us build the model in a single call.

# Create a basic model instance
model = create_model()

# Display the model's architecture
model.summary()

‘model = create_model()’ calls the finished ‘create_model()’ function and returns a new, compiled model.

The output of ‘model.summary()’ is shown below.

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 512)               401920    
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                5130      
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________

**If you want to learn more about the results above, the MNIST dataset, or building models, please see this tutorial** https://www.tensorflow.org/beta/tutorials/keras/basic_classification
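The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name introduced only for this check.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 512)   # dense
output = dense_params(512, 10)    # dense_1
dropout = 0                       # Dropout has no trainable parameters

print(hidden)                     # 401920
print(output)                     # 5130
print(hidden + dropout + output)  # 407050
```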

III. Saving and Loading Checkpoints

Checkpoints let you load a trained model and use it directly. You can also create a checkpoint partway through training and resume from that point.

Section ‘III.’ first covers ‘how to create and save checkpoints during training’, and section ‘IV.’ then covers ‘how to save and use a fully trained model’.

1. Saving Weights During Training

First, let's save the weights during training. Here the weights can be thought of as the model's state.

checkpoint_path = "training_1/cp.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)

# Create a callback that saves the model's weights
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)

# Train the model with the new callback
model.fit(train_images,
          train_labels,
          epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[cp_callback])  # Pass callback to training

# This may generate warnings related to saving the state of the optimizer.
# These warnings (and similar warnings throughout this notebook)
# are in place to discourage outdated usage, and can be ignored.

Running the code above saves TensorFlow checkpoint files in the directory you specified. The checkpoint is updated at the end of every epoch.

Running the following command

!ls {checkpoint_dir}

produces output like this.

checkpoint           cp.ckpt.data-00001-of-00002
cp.ckpt.data-00000-of-00002  cp.ckpt.index

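This confirms that the checkpoint was saved.

2. Saving a weights-only model and applying it to a new, untrained model

Unfortunately, unlike a saved Hangul (HWP) document or a Medium post, a checkpoint cannot hold everything about a model. By analogy, a checkpoint is like the summary we write after reading a long text. So how can you use a saved model to build a new model of your own?

If the new model has exactly the same architecture as the original, the weights can be shared, even though it is a different instance! You can't charge a Galaxy Note 10 with an iPhone charger, but you can charge a Galaxy Note 10 with a Galaxy S9 charger, right? In the same way, as long as the structure matches (like the charger), the weights can be shared.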
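The "same structure" requirement can be pictured as a shape check on the weight tensors. The sketch below is a simplified illustration, not how Keras matches weights internally; `can_share_weights` and the 256-unit variant are hypothetical names and values introduced here.

```python
# Weight shapes of the model defined by create_model():
# Dense(512) on 784 inputs, then Dense(10).
saved_shapes = [(784, 512), (512,), (512, 10), (10,)]

# A new instance with the same architecture has identical shapes.
same_arch_shapes = [(784, 512), (512,), (512, 10), (10,)]

# A hypothetical variant with a 256-unit hidden layer does not.
other_arch_shapes = [(784, 256), (256,), (256, 10), (10,)]


def can_share_weights(saved, target):
    """Return True if every saved tensor has a same-shaped slot in target."""
    return len(saved) == len(target) and all(s == t for s, t in zip(saved, target))


print(can_share_weights(saved_shapes, same_arch_shapes))   # True
print(can_share_weights(saved_shapes, other_arch_shapes))  # False
```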

Now let's build a new, untrained model and evaluate it on the test set.

# Create a basic model instance
model = create_model()

# Evaluate the model
loss, acc = model.evaluate(test_images, test_labels)
print("Untrained model, accuracy: {:5.2f}%".format(100*acc))

Because the model has not been trained yet, its accuracy will be very low, close to chance level (roughly 10% for ten digit classes).
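For a sense of what "very low" means with ten classes, the sketch below measures the accuracy of uniformly random guesses. This is only a rough reference point, assuming an untrained network behaves about like random guessing; the labels and guesses are synthetic, not MNIST.

```python
import random

random.seed(0)
num_classes = 10
num_examples = 1000

# Hypothetical labels and guesses, both drawn uniformly at random.
labels = [random.randrange(num_classes) for _ in range(num_examples)]
guesses = [random.randrange(num_classes) for _ in range(num_examples)]

accuracy = sum(l == g for l, g in zip(labels, guesses)) / num_examples
print("Random guessing, accuracy: {:5.2f}%".format(100 * accuracy))
```

The printed value should land near 10%, which is the baseline to compare the restored model against.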

Now let's load the checkpoint we saved earlier and evaluate the model again.

# Load the weights
model.load_weights(checkpoint_path)

# Re-evaluate the model
loss, acc = model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Compare the accuracy before and after loading the weights.

In other words, compare it with the results achieved earlier by the following training run:

model.fit(train_images, 
          train_labels,  
          epochs=10,
          validation_data=(test_images,test_labels),
          callbacks=[cp_callback])

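3. Checkpoint Callback Options

As you may have guessed, the checkpoint callback offers a few options. When studying for an exam or reading a paper, if several pages keep covering the same concept, you have probably skipped ahead thinking, “I'll read further on and come back to this later.” Checkpoints can be used just as efficiently: you can give each one its own name and control how often they are saved (every epoch, or every few epochs).

Now let's train a new model and save a uniquely named checkpoint once every 5 epochs.

3.1. Defining the checkpoint path and checkpoint directory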

# Include the epoch in the file name (uses `str.format`)
checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
# previously, checkpoint_path = "training_1/cp.ckpt"
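The `{epoch:04d}` placeholder is ordinary Python string formatting, so its effect can be checked without TensorFlow. A minimal sketch, using the same path template and a few example epoch numbers:

```python
checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"

# `{epoch:04d}` zero-pads the epoch number to four digits.
print(checkpoint_path.format(epoch=0))    # training_2/cp-0000.ckpt
print(checkpoint_path.format(epoch=5))    # training_2/cp-0005.ckpt
print(checkpoint_path.format(epoch=50))   # training_2/cp-0050.ckpt

# Zero-padded names sort in epoch order even as plain strings.
names = [checkpoint_path.format(epoch=e) for e in (10, 5, 50)]
print(sorted(names)[-1])                  # training_2/cp-0050.ckpt
```

Without the zero padding, a plain string sort would place "cp-5" after "cp-10", which is one reason the padded form is convenient.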

3.2. Creating a callback that saves the model's weights

# Create a callback that saves the model's weights every 5 epochs
# (`period` is deprecated in later TensorFlow releases in favor of `save_freq`)
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    verbose=1,
    save_weights_only=True,
    period=5)
# previously, there was no period defined

Steps 3.3 and 3.4 repeat the earlier workflow with the new options: train and save, then create a new model and load the saved weights.

3.3. Creating a new model and training it with the new callback

# Create a new model instance
model = create_model()

# Save the weights using the `checkpoint_path` format
model.save_weights(checkpoint_path.format(epoch=0))

# Train the model with the new callback
model.fit(train_images, 
          train_labels,
          epochs=50, 
          callbacks=[cp_callback],
          validation_data=(test_images,test_labels),
          verbose=0)

Now let's look at the resulting checkpoints, find the most recent one, and name it ‘latest’.

!ls {checkpoint_dir}

latest = tf.train.latest_checkpoint(checkpoint_dir)
latest

Running the code above gives:

'training_2/cp-0050.ckpt'
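`tf.train.latest_checkpoint` finds this path from the `checkpoint` state file that TensorFlow writes into the directory. As a rough illustration of the same idea, the sketch below picks the highest-numbered checkpoint by parsing file names instead. The helper `latest_by_epoch` is hypothetical (not a TensorFlow API), and the demo uses empty placeholder files in a temporary directory.

```python
import os
import re
import tempfile


def latest_by_epoch(checkpoint_dir):
    """Return the 'cp-XXXX.ckpt' prefix with the highest epoch, or None."""
    pattern = re.compile(r"^(cp-(\d{4})\.ckpt)\.index$")
    best = None
    for name in os.listdir(checkpoint_dir):
        match = pattern.match(name)
        if match and (best is None or int(match.group(2)) > best[0]):
            best = (int(match.group(2)), match.group(1))
    return os.path.join(checkpoint_dir, best[1]) if best else None


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for epoch in (0, 5, 10, 50):
        open(os.path.join(tmp, "cp-{:04d}.ckpt.index".format(epoch)), "w").close()
    found = latest_by_epoch(tmp)
    print(os.path.basename(found))  # cp-0050.ckpt
```

Note that the returned value is a path prefix rather than a single file, which matches how the prefix 'training_2/cp-0050.ckpt' is passed to `load_weights`.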

3.4. Rebuilding the model and loading the latest checkpoint

# Create a new model instance
model = create_model()

# Load the previously saved weights
model.load_weights(latest)

# Re-evaluate the model
loss, acc = model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Is the accuracy different, or about the same?

4. Saving weights manually

# Save the weights
model.save_weights('./checkpoints/my_checkpoint')
# previously, model.save_weights(checkpoint_path.format(epoch=0))

# Create a new model instance
model = create_model()

# Restore the weights
model.load_weights('./checkpoints/my_checkpoint')
# previously, model.load_weights(checkpoint_path)

# Evaluate the model
loss, acc = model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

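IV. Saving the Entire Trained Model

What if you want to save the whole model? The checkpoint approach stores only the weights, but saving the entire model to a file lets you restore it exactly as it was without redefining it. With the whole model saved, you can always pick up from where you last stopped, which is very convenient.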
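The difference between a weights-only file and a whole-model file can be shown with a toy analogy. The sketch below uses plain JSON and made-up `config` and `weights` dictionaries; it is not the Keras HDF5 layout (which, per the code comments later in this section, also carries the optimizer), only an illustration of why the second file needs no separate model definition.

```python
import json
import os
import tempfile

# Toy stand-ins (not Keras objects): a tiny "architecture" and its "weights".
config = {"layers": [{"type": "Dense", "units": 2, "inputs": 3}]}
weights = {"kernel": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], "bias": [0.0, 0.0]}

with tempfile.TemporaryDirectory() as tmp:
    # Weights-only file: the reader must already know the architecture.
    weights_file = os.path.join(tmp, "weights.json")
    with open(weights_file, "w") as f:
        json.dump(weights, f)

    # Whole-model file: architecture and weights travel together.
    model_file = os.path.join(tmp, "model.json")
    with open(model_file, "w") as f:
        json.dump({"config": config, "weights": weights}, f)

    with open(weights_file) as f:
        weights_only = json.load(f)
    with open(model_file) as f:
        whole_model = json.load(f)

print(sorted(weights_only))             # ['bias', 'kernel']
print(sorted(whole_model))              # ['config', 'weights']
print(whole_model["config"] == config)  # True
```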

1. Saving the model as an HDF5 file

Saving a model as an HDF5 file is quite simple.

# Create a new model instance
model = create_model()

# Train the model
model.fit(train_images, train_labels, epochs=5)

# Save the entire model to a HDF5 file
model.save('my_model.h5')

2. Recreating the model from the saved file

This is the approach to use when loading and reusing a model that you or someone else has already built.

# Recreate the exact same model, including its weights and the optimizer
new_model = keras.models.load_model('my_model.h5')

# Show the model architecture
new_model.summary()

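The result matches the summary in II. Basic Setup, 3. Defining the Model; only the automatically numbered model and layer names differ: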

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_10 (Dense)             (None, 512)               401920    
_________________________________________________________________
dropout_5 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_11 (Dense)             (None, 10)                5130      
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________

We can now evaluate the restored model's accuracy again.

loss, acc = new_model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

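V. Wrapping Up

This story covered how to save models with TensorFlow 2.0. I hope it helps you save and load your own TensorFlow models.

I am not an expert who never makes mistakes, so there may be gaps here, but thank you to everyone who read this. Questions and kind corrections are always a great encouragement.

Have a nice day.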

[Source] https://financial-engineering.medium.com/%EC%83%88%EB%A1%9C%EC%9A%B4-%ED%85%90%EC%84%9C%ED%94%8C%EB%A1%9C%EC%9A%B02-0-%EB%A7%8C%EB%93%A0-%EB%AA%A8%EB%8D%B8%EC%9D%84-%EC%A0%80%EC%9E%A5%ED%95%98%EA%B3%A0-%EB%B6%88%EB%9F%AC%EC%98%A4%EA%B8%B0-5da506b59e13
