Deep Residual Shrinkage Networks for Image Recognition in Keras

2020-1-15 10:05 | Category: Robotics / AI
Essentially, the deep residual shrinkage network (DRSN) is a kind of convolutional neural network, and a variant of the deep residual network (ResNet). Its core idea is that, during feature learning in deep models, removing redundant information is very important, and soft thresholding is a highly flexible way to delete such redundant information.
1. Deep Residual Networks
When introducing deep residual shrinkage networks, it is natural to start from deep residual networks. The figure below shows the basic module of a deep residual network, which consists of some nonlinear layers (the residual path) and a cross-layer identity shortcut. The identity shortcut is the core of ResNet and a key guarantee of its excellent performance.
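As a minimal sketch of such a module in Keras (the layer sizes here are hypothetical and purely illustrative), the identity shortcut simply adds the block input back onto the output of the residual path:

from keras.layers import Input, Conv2D, Activation, add
from keras.models import Model

# A minimal residual module: output = input + F(input)
inputs = Input(shape=(28, 28, 16))                              # hypothetical shape
residual = Conv2D(16, 3, padding='same', activation='relu')(inputs)
residual = Conv2D(16, 3, padding='same')(residual)
outputs = add([residual, inputs])                               # identity shortcut
model = Model(inputs, outputs)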

2. Deep Residual Shrinkage Networks
A deep residual shrinkage network is a deep residual network whose residual path is "shrunk". Here, "shrinkage" refers to soft thresholding.

Soft thresholding is a core step in many signal denoising methods. It sets features that are close to zero (that is, whose absolute value is below some threshold τ) to zero, so that all features in the interval [-τ, τ] become zero, while the remaining features, those farther from zero, are also shrunk toward zero.
Considered together with the bias b of the preceding convolutional layer, the zeroed-out interval becomes [-τ+b, τ+b]. Since both τ and b are learned automatically, soft thresholding can, from this point of view, set the features of an arbitrary interval to zero. It is thus a flexible way of deleting features within a certain value range, and can also be understood as a more flexible nonlinear mapping.
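In equation form, soft thresholding computes y = sign(x) · max(|x| − τ, 0). A minimal NumPy sketch (the function name is ours, for illustration only):

import numpy as np

def soft_threshold(x, tau):
    # Set features in [-tau, tau] to 0; shrink the rest toward 0 by tau
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

# Features within [-0.5, 0.5] become 0; the others move 0.5 closer to 0
print(soft_threshold(np.array([-2.0, -0.3, 0.1, 1.5]), 0.5))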
From another perspective, the preceding two convolutional layers, two batch normalizations, and two activation functions transform the features that carry redundant information into values close to zero, and the useful features into values far from zero. A set of thresholds is then learned automatically, and soft thresholding removes the redundant features while keeping the useful ones.
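The thresholds come from a small, attention-like sub-network. A simplified sketch of the mechanism (the input shape is hypothetical, and the full block in the code below adds an extra Dense-BN-ReLU stage before the sigmoid):

from keras.layers import Input, Dense, GlobalAveragePooling2D, Lambda, multiply
from keras import backend as K
from keras.models import Model

x = Input(shape=(28, 28, 8))                       # hypothetical feature map
x_abs = Lambda(lambda t: K.abs(t))(x)              # |x|
abs_mean = GlobalAveragePooling2D()(x_abs)         # per-channel mean of |x|
scales = Dense(8, activation='sigmoid')(abs_mean)  # learned scaling in (0, 1)
thres = multiply([abs_mean, scales])               # one threshold per channel
model = Model(x, thres)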
By stacking a number of these basic modules, a complete deep residual shrinkage network can be constructed, as shown in the figure below:

3. Image Recognition and Keras Implementation
Although the deep residual shrinkage network was originally applied to vibration-based fault diagnosis, it is in fact a general-purpose feature learning method, and may well be useful in many tasks (computer vision, speech, text).
Below is an MNIST handwritten-digit recognition program based on a deep residual shrinkage network (the program is very simple and is for reference only):
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Dec 28 23:24:05 2019
Implemented using TensorFlow 1.0.1 and Keras 2.2.1
M. Zhao, S. Zhong, X. Fu, et al., Deep Residual Shrinkage Networks for Fault Diagnosis,
IEEE Transactions on Industrial Informatics, 2019, DOI: 10.1109/TII.2019.2943898
@author: me
"""
from __future__ import print_function
import keras
import numpy as np
from keras.datasets import mnist
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, GlobalAveragePooling2D
from keras.optimizers import Adam
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras.layers.core import Lambda
K.set_learning_phase(1)

# Input image dimensions
img_rows, img_cols = 28, 28

# The data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

# Add random noise (use the data's own shape so both data formats work)
x_train = x_train.astype('float32') / 255. + 0.5*np.random.random(x_train.shape)
x_test = x_test.astype('float32') / 255. + 0.5*np.random.random(x_test.shape)
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# Convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

def abs_backend(inputs):
    return K.abs(inputs)

def expand_dim_backend(inputs):
    # (batch, C) -> (batch, 1, 1, C) so it broadcasts over the feature map
    return K.expand_dims(K.expand_dims(inputs, 1), 1)

def sign_backend(inputs):
    return K.sign(inputs)

def pad_backend(inputs, in_channels, out_channels):
    # Zero-pad the channel axis; go through a temporary 5-D view because
    # K.spatial_3d_padding expects a 5-D tensor
    pad_dim = (out_channels - in_channels)//2
    inputs = K.expand_dims(inputs, -1)
    inputs = K.spatial_3d_padding(inputs, ((0, 0), (0, 0), (pad_dim, pad_dim)), 'channels_last')
    return K.squeeze(inputs, -1)

# Residual shrinkage block
def residual_shrinkage_block(incoming, nb_blocks, out_channels, downsample=False,
                             downsample_strides=2):
    residual = incoming
    in_channels = incoming.get_shape().as_list()[-1]
    for i in range(nb_blocks):
        identity = residual
        if not downsample:
            downsample_strides = 1
        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, strides=(downsample_strides, downsample_strides),
                          padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)
        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)
        # Calculate the per-channel global mean of |x|
        residual_abs = Lambda(abs_backend)(residual)
        abs_mean = GlobalAveragePooling2D()(residual_abs)
        # Calculate scaling coefficients in (0, 1)
        scales = Dense(out_channels, activation=None, kernel_initializer='he_normal',
                       kernel_regularizer=l2(1e-4))(abs_mean)
        scales = BatchNormalization()(scales)
        scales = Activation('relu')(scales)
        scales = Dense(out_channels, activation='sigmoid', kernel_regularizer=l2(1e-4))(scales)
        scales = Lambda(expand_dim_backend)(scales)
        # Thresholds = mean(|x|) * scaling coefficients, one per channel
        thres = keras.layers.multiply([abs_mean, scales])
        # Soft thresholding: sign(x) * max(|x| - thres, 0)
        sub = keras.layers.subtract([residual_abs, thres])
        zeros = keras.layers.subtract([sub, sub])  # zero tensor of matching shape
        n_sub = keras.layers.maximum([sub, zeros])
        residual = keras.layers.multiply([Lambda(sign_backend)(residual), n_sub])
        # Downsampling (it is important to use a pool_size of (1, 1))
        if downsample_strides > 1:
            identity = AveragePooling2D(pool_size=(1, 1), strides=(2, 2))(identity)
        # Zero-padding to match channels (it is important to use zero padding
        # rather than a 1x1 convolution)
        if in_channels != out_channels:
            identity = Lambda(pad_backend, arguments={'in_channels': in_channels,
                              'out_channels': out_channels})(identity)
        residual = keras.layers.add([residual, identity])
    return residual

# Define and train a model
inputs = Input(shape=input_shape)
net = Conv2D(8, 3, padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(inputs)
net = residual_shrinkage_block(net, 1, 8, downsample=True)
net = BatchNormalization()(net)
net = Activation('relu')(net)
net = GlobalAveragePooling2D()(net)
outputs = Dense(10, activation='softmax', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(net)
model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=100, epochs=5, verbose=1, validation_data=(x_test, y_test))

# Get results
K.set_learning_phase(0)
DRSN_train_score = model.evaluate(x_train, y_train, batch_size=100, verbose=0)
print('Train loss:', DRSN_train_score[0])
print('Train accuracy:', DRSN_train_score[1])
DRSN_test_score = model.evaluate(x_test, y_test, batch_size=100, verbose=0)
print('Test loss:', DRSN_test_score[0])
print('Test accuracy:', DRSN_test_score[1])

For comparison, the code for an ordinary deep residual network is as follows:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Dec 28 23:19:03 2019
Implemented using TensorFlow 1.0 and Keras 2.2.1
K. He, X. Zhang, S. Ren, J. Sun, Deep Residual Learning for Image Recognition, CVPR, 2016.
@author: me
"""
from __future__ import print_function
import numpy as np
import keras
from keras.datasets import mnist
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, GlobalAveragePooling2D
from keras.optimizers import Adam
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras.layers.core import Lambda
K.set_learning_phase(1)

# Input image dimensions
img_rows, img_cols = 28, 28

# The data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

# Add random noise (use the data's own shape so both data formats work)
x_train = x_train.astype('float32') / 255. + 0.5*np.random.random(x_train.shape)
x_test = x_test.astype('float32') / 255. + 0.5*np.random.random(x_test.shape)
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# Convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

def pad_backend(inputs, in_channels, out_channels):
    # Zero-pad the channel axis; go through a temporary 5-D view because
    # K.spatial_3d_padding expects a 5-D tensor
    pad_dim = (out_channels - in_channels)//2
    inputs = K.expand_dims(inputs, -1)
    inputs = K.spatial_3d_padding(inputs, ((0, 0), (0, 0), (pad_dim, pad_dim)), 'channels_last')
    return K.squeeze(inputs, -1)

# Ordinary residual block
def residual_block(incoming, nb_blocks, out_channels, downsample=False,
                   downsample_strides=2):
    residual = incoming
    in_channels = incoming.get_shape().as_list()[-1]
    for i in range(nb_blocks):
        identity = residual
        if not downsample:
            downsample_strides = 1
        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, strides=(downsample_strides, downsample_strides),
                          padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)
        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)
        # Downsampling (it is important to use a pool_size of (1, 1))
        if downsample_strides > 1:
            identity = AveragePooling2D(pool_size=(1, 1), strides=(2, 2))(identity)
        # Zero-padding to match channels (it is important to use zero padding
        # rather than a 1x1 convolution)
        if in_channels != out_channels:
            identity = Lambda(pad_backend, arguments={'in_channels': in_channels,
                              'out_channels': out_channels})(identity)
        residual = keras.layers.add([residual, identity])
    return residual

# Define and train a model
inputs = Input(shape=input_shape)
net = Conv2D(8, 3, padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(inputs)
net = residual_block(net, 1, 8, downsample=True)
net = BatchNormalization()(net)
net = Activation('relu')(net)
net = GlobalAveragePooling2D()(net)
outputs = Dense(10, activation='softmax', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(net)
model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=100, epochs=5, verbose=1, validation_data=(x_test, y_test))

# Get results
K.set_learning_phase(0)
resnet_train_score = model.evaluate(x_train, y_train, batch_size=100, verbose=0)
print('Train loss:', resnet_train_score[0])
print('Train accuracy:', resnet_train_score[1])
resnet_test_score = model.evaluate(x_test, y_test, batch_size=100, verbose=0)
print('Test loss:', resnet_test_score[0])
print('Test accuracy:', resnet_test_score[1])
Notes:
(1) The deep residual shrinkage network is structurally more complex than an ordinary deep residual network and may be harder to train.
(2) The program uses only one basic module; on more complex datasets, more modules can be added as appropriate (see the sketch after these notes).
(3) If you run into TypeError: softmax() got an unexpected keyword argument 'axis', open tensorflow_backend.py and change the first axis in return tf.nn.softmax(x, axis=axis) to dim, i.e. return tf.nn.softmax(x, dim=axis).
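As a rough sketch for note (2), a deeper variant could stack several calls to the residual_shrinkage_block function defined in the listing above (reusing its imports and input_shape); the block counts and channel widths here are illustrative, not from the original experiments:

# Illustrative stacking: three stages, each downsampling and widening
inputs = Input(shape=input_shape)
net = Conv2D(8, 3, padding='same', kernel_initializer='he_normal',
             kernel_regularizer=l2(1e-4))(inputs)
net = residual_shrinkage_block(net, 1, 8, downsample=True)    # 28x28 -> 14x14
net = residual_shrinkage_block(net, 1, 16, downsample=True)   # 14x14 -> 7x7
net = residual_shrinkage_block(net, 1, 32, downsample=True)   # 7x7 -> 4x4
net = BatchNormalization()(net)
net = Activation('relu')(net)
net = GlobalAveragePooling2D()(net)
outputs = Dense(10, activation='softmax', kernel_initializer='he_normal',
                kernel_regularizer=l2(1e-4))(net)
model = Model(inputs=inputs, outputs=outputs)

The identity paths handle the channel growth automatically: when out_channels exceeds the incoming channel count, the block zero-pads the shortcut to match.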
Reference:
M. Zhao, S. Zhong, X. Fu, et al., Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2019, DOI: 10.1109/TII.2019.2943898