In Keras, ResNet50 has a strange pattern


As you know, in a CNN only the Convolution and BatchNormalization layers have weights, and they are usually arranged in the pattern Conv – BN – ReLU – Conv – BN – ReLU.
But as you can see below, the structure is unusual:

conv2_block1_0_conv/kernel:0
conv2_block1_0_conv/bias:0
conv2_block1_3_conv/kernel:0
conv2_block1_3_conv/bias:0
conv2_block1_1_bn/gamma:0
conv2_block1_1_bn/beta:0
conv2_block1_1_bn/moving_mean:0
conv2_block1_1_bn/moving_variance:0
conv2_block1_3_bn/gamma:0
conv2_block1_3_bn/beta:0
conv2_block1_3_bn/moving_mean:0
conv2_block1_3_bn/moving_variance:0

You can find this result by:

model = tf.keras.applications.ResNet50()
# The unusual pattern begins at index 18.
model.weights[18]

I recommend stepping through this in your IDE's debugger; it makes the structure easier to inspect.
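If you are not using an IDE, here is a quick sketch that reproduces the listing (assuming the stock tf.keras.applications.ResNet50; the exact index where the pattern changes can vary between Keras versions):

import tensorflow as tf

model = tf.keras.applications.ResNet50()

# Print the weight names around the point where the pattern changes.
for i in range(18, 30):
    print(i, model.weights[i].name)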

In the lines below, ResNet50 defines a stack_fn function that creates the layers:

def ResNet50():
  ...
  def stack_fn(x):
    x = stack1(x, 64, 3, stride1=1, name="conv2")
    x = stack1(x, 128, 4, name="conv3")
    x = stack1(x, 256, 6, name="conv4")
    return stack1(x, 512, 3, name="conv5")
  ...

In the code below, stack1 simplifies the repeated residual blocks. Only the first block of each stack is created with conv_shortcut=True (the default), so only it gets the _0_conv/_0_bn shortcut layers:

def stack1(x, filters, blocks, stride1=2, name=None):
  x = block1(x, filters, stride=stride1, name=name + '_block1')
  for i in range(2, blocks + 1):
    x = block1(x, filters, conv_shortcut=False, name=name + '_block' + str(i))
  return x
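As a quick check, the sketch below lists every layer that belongs to the first block of the conv2 stage (again assuming the stock tf.keras.applications.ResNet50):

import tensorflow as tf

model = tf.keras.applications.ResNet50()

# Print all layers of conv2_block1, in graph order.
for layer in model.layers:
    if layer.name.startswith('conv2_block1'):
        print(layer.name)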

In the structure below, block1 is the residual block of ResNet50 (backend and layers are the usual Keras backend and layers modules):

def block1(x, filters, kernel_size=3, stride=1, conv_shortcut=True, name=None):

  bn_axis = 3 if backend.image_data_format() == 'channels_last' else 1

  if conv_shortcut:
    shortcut = layers.Conv2D(
        4 * filters, 1, strides=stride, name=name + '_0_conv')(x)
    shortcut = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + '_0_bn')(shortcut) 
  else:
    shortcut = x

  x = layers.Conv2D(filters, 1, strides=stride, name=name + '_1_conv')(x)
  x = layers.BatchNormalization(
      axis=bn_axis, epsilon=1.001e-5, name=name + '_1_bn')(x)
  x = layers.Activation('relu', name=name + '_1_relu')(x)

  x = layers.Conv2D(
      filters, kernel_size, padding='SAME', name=name + '_2_conv')(x)
  x = layers.BatchNormalization(
      axis=bn_axis, epsilon=1.001e-5, name=name + '_2_bn')(x)
  x = layers.Activation('relu', name=name + '_2_relu')(x)

  x = layers.Conv2D(4 * filters, 1, name=name + '_3_conv')(x)
  x = layers.BatchNormalization(
      axis=bn_axis, epsilon=1.001e-5, name=name + '_3_bn')(x)

  x = layers.Add(name=name + '_add')([shortcut, x]) 
  x = layers.Activation('relu', name=name + '_out')(x)
  return x
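To see how two Conv2D layers can sit next to each other in a listing even though they are parallel, here is a stripped-down sketch of the same branching pattern (the demo_* names are hypothetical, not from ResNet50):

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(56, 56, 64))

# Shortcut path: one projection conv taken straight from the block input.
shortcut = layers.Conv2D(256, 1, name='demo_0_conv')(inputs)

# Residual path: the main stack of convs, also starting from the block input.
x = layers.Conv2D(64, 1, name='demo_1_conv')(inputs)
x = layers.Conv2D(256, 1, name='demo_3_conv')(x)

# The two paths only meet at the Add layer.
outputs = layers.Add(name='demo_add')([shortcut, x])

tf.keras.Model(inputs, outputs).summary()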

My question is: why is the order of the weights in the model instance different from the actual structure?

Update
Sorry, I might have misunderstood your question previously.
As shown in the picture below, there seem to be two contiguous conv layers, and I assume this is what you meant. However, they are in fact not contiguous.
ResNet has a branching (residual) structure, which means it is not sequential. But model.summary() prints the layers sequentially, so look at the last column ("Connected to"): it tells you which layer feeds each layer. This is how TensorFlow represents parallel structures in a sequential printout.
[Image: ResNet summary]

For example, conv2_block1_0_conv is connected to pool1_pool, and conv2_block1_3_conv is connected to conv2_block1_2_relu.

This means that although they are printed side by side, they are not contiguous; they are parallel structures!

conv2_block1_0_conv and conv2_block1_0_bn are on the shortcut path, while conv2_block1_3_conv and conv2_block1_3_bn are on the residual path.
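If you prefer to verify this programmatically instead of reading the summary, one sketch (relying on the inbound-node records that a Functional model's get_config() exposes) is:

import tensorflow as tf

model = tf.keras.applications.ResNet50()
config = model.get_config()

# Each layer entry in a Functional model's config records its inbound layers.
for layer_conf in config['layers']:
    if layer_conf['name'] in ('conv2_block1_0_conv', 'conv2_block1_3_conv'):
        print(layer_conf['name'], '<-', layer_conf['inbound_nodes'])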

Please feel free to comment if you have more questions on this part, or open a new post if you have other questions.


model.weights returns the weights of a model (which is self-explanatory from the name).

Conv – BN – ReLU – Conv – BN – ReLU are layers.
Conv stands for a Convolutional layer, BN for Batch Normalization, and ReLU is the activation.

To get a list of layers, use model.layers (which returns a list of Layer objects). If you simply want to see a summary of the model structure, use model.summary().
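A short sketch that prints the first few Layer objects:

import tensorflow as tf

model = tf.keras.applications.ResNet50()

# model.layers returns Layer objects, in graph order.
for layer in model.layers[:7]:
    print(layer.name, type(layer).__name__)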

For example, ResNet50().summary() gives (partial output)

Model: "resnet50"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 224, 224, 3) 0
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_1[0][0]
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64) 9472        conv1_pad[0][0]
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 112, 112, 64) 256         conv1_conv[0][0]
__________________________________________________________________________________________________
conv1_relu (Activation)         (None, 112, 112, 64) 0           conv1_bn[0][0]
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D)       (None, 114, 114, 64) 0           conv1_relu[0][0]
__________________________________________________________________________________________________
pool1_pool (MaxPooling2D)       (None, 56, 56, 64)   0           pool1_pad[0][0]
__________________________________________________________________________________________________
conv2_block1_1_conv (Conv2D)    (None, 56, 56, 64)   4160        pool1_pool[0][0]
__________________________________________________________________________________________________
conv2_block1_1_bn (BatchNormali (None, 56, 56, 64)   256         conv2_block1_1_conv[0][0]
__________________________________________________________________________________________________
conv2_block1_1_relu (Activation (None, 56, 56, 64)   0           conv2_block1_1_bn[0][0]