A Study of Image Recognition Based on Wavelet Decomposition and Convolutional Neural Networks

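This article is a "hello world" of image recognition: a deliberately simple exercise whose goal is to explore how the wavelet transform affects CNN-based image recognition. The dataset is equally simple: MNIST, which consists of 70,000 grayscale images of handwritten digits from 0 to 9, each 28 x 28 pixels, split into 60,000 training images and 10,000 test images.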

First, import the required modules.

import random
import timeit

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import pywt
import sklearn
from sklearn import metrics
from sklearn.metrics import confusion_matrix
import tensorflow as tf
import tensorflow.keras
import keras
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten, Dropout, BatchNormalization, MaxPooling2D
from keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.utils import to_categorical

Next, load the data.

# Seeds Python's random module only; NumPy and TensorFlow keep their own seeds.
random.seed(666)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

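Because the images are already grayscale and all have the same size, the pixel values can be normalized directly.

# Scale the uint8 pixel values from [0, 255] to [0, 1].
X_train = x_train / 255.0
X_test = x_test / 255.0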

Visualize part of the data.

# Show the first 15 training images with their labels.
fig = plt.figure(figsize=(20, 8))
for i, a in enumerate(X_train[:15]):
    ax = fig.add_subplot(3, 5, i + 1)
    ax.imshow(a, cmap="Greys")
    ax.set_title('Value = ' + str(y_train[i]), fontsize=10)
    ax.set_xticks([])
    ax.set_yticks([])
fig.tight_layout()
plt.show()

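The first experiment classifies the original images with a convolutional neural network. Start by preparing the inputs and outputs.

# Add a trailing channel axis: (n, 28, 28) -> (n, 28, 28, 1).
X_train_conv = X_train.reshape(len(X_train), X_train[0].shape[0], X_train[0].shape[1], 1)
X_test_conv = X_test.reshape(len(X_test), X_test[0].shape[0], X_test[0].shape[1], 1)
# One-hot encode the digit labels.
y_train_conv = to_categorical(y_train)
y_test_conv = to_categorical(y_test)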
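To make the input and output preparation concrete without downloading MNIST, here is a minimal NumPy sketch of the same two steps. The helper names `to_channels_last` and `one_hot` and the random stand-in data are assumptions for illustration only; the article itself uses `reshape` and Keras's `to_categorical`.

```python
import numpy as np


def to_channels_last(images):
    """Add a trailing channel axis: (n, h, w) -> (n, h, w, 1)."""
    return images.reshape(len(images), images.shape[1], images.shape[2], 1)


def one_hot(labels, num_classes):
    """Return an (n, num_classes) array with a single 1 in each row."""
    return np.eye(num_classes, dtype=np.float32)[labels]


# Hypothetical stand-in data: 4 random 28 x 28 "images" scaled to [0, 1].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8) / 255.0
fake_labels = np.array([3, 0, 9, 3])

batch = to_channels_last(fake_images)
targets = one_hot(fake_labels, num_classes=10)
print(batch.shape, targets.shape)
```

The trailing axis of size 1 is the single grayscale channel that `Conv2D` expects, and each row of `targets` matches the 10-way softmax output of the network.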

Then build a simple convolutional neural network.

model = Sequential()
model.add(Conv2D(128, kernel_size=3, activation='relu',
                 input_shape=X_train_conv[0].shape, padding='valid'))
model.add(Conv2D(128, kernel_size=3, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, kernel_size=3, activation='relu'))
model.add(Conv2D(32, kernel_size=3, activation='relu'))
model.add(Conv2D(64, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(10, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
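As a rough check on what `model.summary()` should report, the spatial size after each layer can be traced by hand. The sketch below assumes 3 x 3 convolutions with 'valid' padding and stride 1, and 2 x 2 max pooling, as in the network above; the helper `trace_spatial_size` is a hypothetical name introduced here, not part of Keras.

```python
def trace_spatial_size(input_size, layers):
    """Return the side length of the feature map after each layer.

    layers is a list of (kind, width) pairs, where kind is "conv"
    (valid padding, stride 1) or "pool" (non-overlapping windows).
    """
    sizes = [input_size]
    size = input_size
    for kind, width in layers:
        if kind == "conv":
            size = size - width + 1
        elif kind == "pool":
            size = size // width
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        sizes.append(size)
    return sizes


# The baseline network: two convs, one pool, three convs.
# BatchNormalization and Dropout do not change the shape, so they are omitted.
baseline_layers = [("conv", 3), ("conv", 3), ("pool", 2),
                   ("conv", 3), ("conv", 3), ("conv", 3)]
baseline_sizes = trace_spatial_size(28, baseline_layers)

# The last convolution has 64 filters, so Flatten sees side * side * 64 values.
flattened_features = baseline_sizes[-1] ** 2 * 64
print(baseline_sizes, flattened_features)
```

The same trace explains why the later networks for 14 x 14 and 7 x 7 inputs use fewer convolutional layers: each valid 3 x 3 convolution removes two pixels per side, so a small input runs out of spatial extent quickly.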

Stop training when the model stops improving, and save the best model.

# Some newer Keras releases only accept a '.keras' or '.h5' filename here;
# adjust the extension if ModelCheckpoint rejects it.
filepath = 'best_model.hdf5'
earlyStopping = EarlyStopping(monitor='val_loss', patience=10, verbose=0, mode='min')
checkpoint = ModelCheckpoint(filepath=filepath, monitor='val_loss', verbose=1,
                             save_best_only=True, mode='min')

start = timeit.default_timer()
model_history = model.fit(X_train_conv, y_train_conv, epochs=50, batch_size=128,
                          validation_split=0.3, callbacks=[earlyStopping, checkpoint])
stop = timeit.default_timer()
print('Time to train the model: ', stop - start)
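The interplay of the two callbacks can be illustrated with a simplified, pure-Python sketch of patience-based stopping. It is an approximation of the idea, not the Keras implementation: the function name and the loss values are made up for illustration, and options such as `min_delta` are ignored.

```python
def best_epoch_with_patience(val_losses, patience):
    """Return (best_epoch, stopped_at) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; the best epoch is the one a checkpoint with
    save_best_only=True would have kept.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    stopped_at = len(val_losses) - 1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                stopped_at = epoch
                break
    return best_epoch, stopped_at


# Hypothetical validation losses: the minimum is reached at epoch 2.
demo_losses = [0.9, 0.5, 0.4, 0.45, 0.41, 0.42, 0.43]
result = best_epoch_with_patience(demo_losses, patience=3)
print(result)
```

With `patience = 10` as in the article, training can run up to ten epochs past the best one, which is why the saved checkpoint rather than the final in-memory model is reloaded for evaluation.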

Take a look at the training curves.

plt.plot(model_history.history['accuracy'])
plt.plot(model_history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

plt.plot(model_history.history['loss'])
plt.plot(model_history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

Load the best model and check its accuracy on the test set.

model = keras.models.load_model(filepath)
acc = model.evaluate(X_test_conv, y_test_conv)  # returns [loss, accuracy]
print(acc[1] * 100)

Plot the confusion matrix.

preds = model.predict(X_test_conv)
preds = np.argmax(preds, axis=1)  # class index with the highest probability

f, ax = plt.subplots(figsize=(8, 8))
sns.heatmap(confusion_matrix(y_test, preds), annot=True, linewidths=0.01,
            cmap="Greens", linecolor="gray", fmt='.0f', ax=ax)
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion matrix for original images")
plt.show()
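For readers unfamiliar with the layout of the heatmap, the following NumPy sketch builds a confusion matrix by hand on toy labels. The function `confusion_counts` and the demo arrays are assumptions for illustration; the article relies on scikit-learn's `confusion_matrix`, which uses the same row-is-true, column-is-predicted layout.

```python
import numpy as np


def confusion_counts(true_labels, predicted_labels, num_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair repeats.
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix


# Toy labels for a 3-class problem (made up for this example).
y_true_demo = np.array([0, 1, 2, 2, 1, 0, 2])
y_pred_demo = np.array([0, 2, 2, 2, 1, 0, 1])

cm = confusion_counts(y_true_demo, y_pred_demo, num_classes=3)
accuracy = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal
print(cm)
print(accuracy)
```

Off-diagonal entries show which digits get confused with which, which is the information the misclassified examples in the next step make visible.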

Visualize some examples that the model misclassified.

errors = y_test != preds  # boolean mask of misclassified test images
img_errors = X_test[errors, :]
correct_labels = y_test[errors]
incorrect_labels = preds[errors]

fig = plt.figure(figsize=(20, 8))
for i, a in enumerate(img_errors[:15]):
    ax = fig.add_subplot(3, 5, i + 1)
    ax.imshow(a, cmap="Greys")
    ax.set_title('Correct = ' + str(correct_labels[i]) +
                 ' Predict = ' + str(incorrect_labels[i]), fontsize=10)
    ax.set_xticks([])
    ax.set_yticks([])
fig.tight_layout()
plt.show()

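Next, decompose the images with the simple Haar wavelet. Taking the first image as an example, a one-level Haar decomposition yields four images, each half the size of the original along each side (28 x 28 -> 14 x 14): the approximation image and the horizontal, vertical, and diagonal detail images.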

plt.imshow(X_train[0], cmap="Greys")
plt.show()

titles = ['Original', 'Approximation', 'Horizontal detail', 'Vertical detail', 'Diagonal detail']
# One-level 2D Haar decomposition: approximation plus three detail sub-bands.
coeffs2 = pywt.dwt2(X_train[0], 'haar', mode='periodization')
LL0, (LH0, HL0, HH0) = coeffs2

fig = plt.figure(figsize=(12, 3))
for i, a in enumerate([X_train[0], LL0, LH0, HL0, HH0]):
    ax = fig.add_subplot(1, 5, i + 1)
    ax.imshow(a, interpolation="nearest", cmap=plt.cm.gray)
    ax.set_title(titles[i], fontsize=10)
    ax.set_xticks([])
    ax.set_yticks([])
fig.tight_layout()
plt.show()
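To show what the one-level Haar decomposition computes, here is a minimal NumPy sketch that works on 2 x 2 pixel blocks. It is an illustration under stated assumptions, not PyWavelets' implementation: the function name `haar_dwt2` is made up, the image sides are assumed to be even, and the sign and naming conventions of the detail sub-bands may differ from those of `pywt.dwt2`.

```python
import numpy as np


def haar_dwt2(image):
    """One-level orthonormal 2D Haar transform of an image with even sides.

    Returns the approximation and a (horizontal, vertical, diagonal)
    tuple of detail sub-bands, each half the input size per side.
    """
    image = np.asarray(image, dtype=float)
    a = image[0::2, 0::2]  # top-left pixel of every 2 x 2 block
    b = image[0::2, 1::2]  # top-right
    c = image[1::2, 0::2]  # bottom-left
    d = image[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0      # scaled block average
    horizontal = (a + b - c - d) / 2.0  # difference between rows
    vertical = (a - b + c - d) / 2.0    # difference between columns
    diagonal = (a - b - c + d) / 2.0    # diagonal difference
    return approx, (horizontal, vertical, diagonal)


# Hypothetical stand-in for a normalized 28 x 28 MNIST image.
rng = np.random.default_rng(1)
demo_image = rng.random((28, 28))
demo_approx, (demo_h, demo_v, demo_d) = haar_dwt2(demo_image)

# The transform is orthonormal, so the total energy is preserved.
energy_in = np.sum(demo_image ** 2)
energy_out = sum(np.sum(band ** 2) for band in (demo_approx, demo_h, demo_v, demo_d))
print(demo_approx.shape, np.isclose(energy_in, energy_out))
```

The approximation is a scaled 2 x 2 block average, which is why it looks like a downsampled copy of the digit, while the three detail sub-bands carry the edges that the approximation discards.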

Then apply the transform to all images.

LL_train = []
LL_test = []
# Keep only the approximation sub-band of each image.
for img in X_train:
    LLi, (LHi, HLi, HHi) = pywt.dwt2(img, 'haar', mode='periodization')
    LL_train.append(LLi)
for img in X_test:
    LLi, (LHi, HLi, HHi) = pywt.dwt2(img, 'haar', mode='periodization')
    LL_test.append(LLi)

Train a convolutional neural network on the first-level approximation images.

LL_train_conv = np.array(LL_train).reshape(len(LL_train), LL_train[0].shape[0], LL_train[0].shape[1], 1)
LL_test_conv = np.array(LL_test).reshape(len(LL_test), LL_test[0].shape[0], LL_test[0].shape[1], 1)

# Same design as the baseline, with one fewer convolution for the 14 x 14 input.
model_1 = Sequential()
model_1.add(Conv2D(128, kernel_size=3, activation='relu',
                   input_shape=LL_train_conv[0].shape, padding='valid'))
model_1.add(Conv2D(128, kernel_size=3, activation='relu'))
model_1.add(BatchNormalization())
model_1.add(Dropout(0.2))
model_1.add(MaxPooling2D((2, 2)))
model_1.add(Conv2D(64, kernel_size=3, activation='relu'))
model_1.add(Conv2D(32, kernel_size=3, activation='relu'))
model_1.add(Flatten())
model_1.add(Dense(10, activation='relu'))
model_1.add(Dense(100, activation='relu'))
model_1.add(Dense(10, activation='softmax'))
model_1.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model_1.summary()

Train the model.

# Some newer Keras releases only accept a '.keras' or '.h5' filename here.
filepath = 'best_model_wv.hdf5'
earlyStopping = EarlyStopping(monitor='val_loss', patience=10, verbose=0, mode='min')
checkpoint = ModelCheckpoint(filepath=filepath, monitor='val_loss', verbose=1,
                             save_best_only=True, mode='min')

start = timeit.default_timer()
history_model1 = model_1.fit(LL_train_conv, y_train_conv, epochs=50, batch_size=128,
                             validation_split=0.3, callbacks=[earlyStopping, checkpoint])
stop = timeit.default_timer()
print('Time to train the model: ', stop - start)

Plot the training curves.

plt.plot(history_model1.history['accuracy'])
plt.plot(history_model1.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

plt.plot(history_model1.history['loss'])
plt.plot(history_model1.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

# Reload the best checkpoint and report the test accuracy in percent.
model_1 = keras.models.load_model(filepath)
acc_LL = model_1.evaluate(LL_test_conv, y_test_conv)
print(acc_LL[1] * 100)

Plot the confusion matrix.

preds_LL = model_1.predict(LL_test_conv)
preds_LL = np.argmax(preds_LL, axis=1)

f, ax = plt.subplots(figsize=(8, 8))
sns.heatmap(confusion_matrix(y_test, preds_LL), annot=True, linewidths=0.01,
            cmap="Greens", linecolor="gray", fmt='.0f', ax=ax)
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion matrix for Haar wavelet approximations")
plt.show()

Visualize some examples that the model misclassified.

errors_2 = y_test != preds_LL
img_errors_2 = X_test[errors_2, :]  # shown at the original 28 x 28 resolution
correct_labels_2 = y_test[errors_2]
incorrect_labels_2 = preds_LL[errors_2]

fig = plt.figure(figsize=(20, 8))
for i, a in enumerate(img_errors_2[:15]):
    ax = fig.add_subplot(3, 5, i + 1)
    ax.imshow(a, cmap="Greys")
    ax.set_title('Correct = ' + str(correct_labels_2[i]) +
                 ' Predict = ' + str(incorrect_labels_2[i]), fontsize=10)
    ax.set_xticks([])
    ax.set_yticks([])
fig.tight_layout()
plt.show()

Next, apply a two-level Haar wavelet decomposition to the images. The second-level approximation image is 7 x 7.

plt.imshow(LL0, cmap="Greys")
plt.show()

titles = ['First approximation', 'Second approximation', 'Second horizontal detail',
          'Second vertical detail', 'Second diagonal detail']
# Decompose the first-level approximation once more: 14 x 14 -> 7 x 7.
coeffs2_2 = pywt.dwt2(LL0, 'haar', mode='periodization')
LL0_2, (LH0_2, HL0_2, HH0_2) = coeffs2_2

fig = plt.figure(figsize=(12, 3))
for i, a in enumerate([LL0, LL0_2, LH0_2, HL0_2, HH0_2]):
    ax = fig.add_subplot(1, 5, i + 1)
    ax.imshow(a, interpolation="nearest", cmap=plt.cm.gray)
    ax.set_title(titles[i], fontsize=10)
    ax.set_xticks([])
    ax.set_yticks([])
fig.tight_layout()
plt.show()
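The size reduction from 28 x 28 to 14 x 14 to 7 x 7 can be reproduced with a small NumPy sketch that applies the Haar approximation step repeatedly. The helper names `haar_approx` and `approximation_pyramid` and the random stand-in image are assumptions for illustration, and the sketch only handles inputs whose side lengths are even at every level it is asked for.

```python
import numpy as np


def haar_approx(image):
    """Haar approximation of an even-sided image: half the sum of each 2 x 2 block."""
    return (image[0::2, 0::2] + image[0::2, 1::2]
            + image[1::2, 0::2] + image[1::2, 1::2]) / 2.0


def approximation_pyramid(image, levels):
    """Return the approximation images of levels 1..levels."""
    current = np.asarray(image, dtype=float)
    pyramid = []
    for _ in range(levels):
        current = haar_approx(current)
        pyramid.append(current)
    return pyramid


# Hypothetical stand-in for a normalized 28 x 28 MNIST image.
rng = np.random.default_rng(2)
demo_image = rng.random((28, 28))
level_1, level_2 = approximation_pyramid(demo_image, levels=2)

# After two levels every coefficient is 4 times the mean of a 4 x 4 block.
block_means = demo_image.reshape(7, 4, 7, 4).mean(axis=(1, 3))
print(level_1.shape, level_2.shape, np.allclose(level_2, 4.0 * block_means))
```

One consequence worth keeping in mind is that the coefficients are no longer confined to [0, 1]: for pixel values in [0, 1], first-level approximations lie in [0, 2] and second-level ones in [0, 4].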

Apply the two-level wavelet decomposition to all images.

LL_train_2 = []
LL_test_2 = []
# Decompose the first-level approximations again and keep the new approximation.
for img in LL_train:
    LLi, (LHi, HLi, HHi) = pywt.dwt2(img, 'haar', mode='periodization')
    LL_train_2.append(LLi)
for img in LL_test:
    LLi, (LHi, HLi, HHi) = pywt.dwt2(img, 'haar', mode='periodization')
    LL_test_2.append(LLi)

LL_train_2_conv = np.array(LL_train_2).reshape(len(LL_train_2), LL_train_2[0].shape[0], LL_train_2[0].shape[1], 1)
LL_test_2_conv = np.array(LL_test_2).reshape(len(LL_test_2), LL_test_2[0].shape[0], LL_test_2[0].shape[1], 1)

# A smaller network: the 7 x 7 input only leaves room for two valid 3 x 3 convolutions.
model_2 = Sequential()
model_2.add(Conv2D(128, kernel_size=3, activation='relu',
                   input_shape=LL_train_2_conv[0].shape, padding='valid'))
model_2.add(BatchNormalization())
model_2.add(Dropout(0.2))
model_2.add(Conv2D(32, kernel_size=3, activation='relu'))
model_2.add(Flatten())
model_2.add(Dense(10, activation='relu'))
model_2.add(Dense(100, activation='relu'))
model_2.add(Dense(10, activation='softmax'))
model_2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model_2.summary()

# Some newer Keras releases only accept a '.keras' or '.h5' filename here.
filepath = 'best_model_wv_2.hdf5'
earlyStopping = EarlyStopping(monitor='val_loss', patience=10, verbose=0, mode='min')
checkpoint = ModelCheckpoint(filepath=filepath, monitor='val_loss', verbose=1,
                             save_best_only=True, mode='min')

start = timeit.default_timer()
history_model2 = model_2.fit(LL_train_2_conv, y_train_conv, epochs=50, batch_size=128,
                             validation_split=0.3, callbacks=[earlyStopping, checkpoint])
stop = timeit.default_timer()
print('Time to train the model: ', stop - start)

# Training curves for the second-level model.
plt.plot(history_model2.history['accuracy'])
plt.plot(history_model2.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

plt.plot(history_model2.history['loss'])
plt.plot(history_model2.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

# Reload the best checkpoint and report the test accuracy in percent.
model_2 = keras.models.load_model(filepath)
acc_LL_2 = model_2.evaluate(LL_test_2_conv, y_test_conv)
print(acc_LL_2[1] * 100)

preds_LL_2 = model_2.predict(LL_test_2_conv)
preds_LL_2 = np.argmax(preds_LL_2, axis=1)

f, ax = plt.subplots(figsize=(8, 8))
sns.heatmap(confusion_matrix(y_test, preds_LL_2), annot=True, linewidths=0.01,
            cmap="Greens", linecolor="gray", fmt='.0f', ax=ax)
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion matrix for Haar wavelet second approximations")
plt.show()

# Misclassified examples for the second-level model, shown at the original resolution.
errors_3 = y_test != preds_LL_2
img_errors_3 = X_test[errors_3, :]
correct_labels_3 = y_test[errors_3]
incorrect_labels_3 = preds_LL_2[errors_3]

fig = plt.figure(figsize=(20, 8))
for i, a in enumerate(img_errors_3[:15]):
    ax = fig.add_subplot(3, 5, i + 1)
    ax.imshow(a, cmap="Greys")
    ax.set_title('Correct = ' + str(correct_labels_3[i]) +
                 ' Predict = ' + str(incorrect_labels_3[i]), fontsize=10)
    ax.set_xticks([])
    ax.set_yticks([])
fig.tight_layout()
plt.show()

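For the full article, see:

A Study of Image Recognition Based on Wavelet Decomposition and Convolutional Neural Networks, an article by 哥廷根數學學派 on Zhihu: https://zhuanlan.zhihu.com/p/554876956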