Understanding the Dense Layer in Keras

This notebook describes the dense (fully connected) layer using TensorFlow.

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

def reset_seed(seed=313):
    tf.keras.backend.clear_session()
    tf.random.set_seed(seed)
    np.random.seed(seed)

np.set_printoptions(linewidth=100, suppress=True)

print(tf.__version__)

2.7.0

print(np.__version__)

1.21.6

Set some global parameters

input_features = 2
batch_size = 10
dense_units = 5

Define the input to the model

in_np = np.random.randint(0, 100, size=(batch_size,input_features))
print(in_np)

[[73 84]
 [34 76]
 [83 33]
 [95  2]
 [15 20]
 [ 6 87]
 [25 10]
 [54 65]
 [ 3 26]
 [27 38]]

Build a model consisting of a single dense layer

reset_seed()


ins = Input(shape=(input_features,), name='my_input')
out = Dense(dense_units, use_bias=False, name='my_output')(ins)
model = Model(inputs=ins, outputs=out)

out_np = model.predict(in_np)

print(out_np)

[[-24.545158    60.223736   -22.586353    10.208422     5.8342476 ]
 [-23.865839    29.79804    -19.818281    31.742483    -6.619419  ]
 [ -6.831863    65.5096      -9.919023   -34.138214    22.459442  ]
 [  4.241456    73.28466     -2.3332765  -65.252594    34.763397  ]
 [ -5.9672885   12.504654    -5.3318644    4.102663     0.50515246]
 [-29.023615     8.747902   -22.052912    59.456474   -19.799788  ]
 [ -2.0781016   19.734663    -3.0028474  -10.238509     6.7496395 ]
 [-19.122025    44.684822   -17.429634     9.646704     3.5908635 ]
 [ -8.611273     3.544132    -6.613761    16.921026    -5.469105  ]
 [-11.415466    22.603212   -10.101664     8.848475     0.40289795]]

print(out_np.shape)

(10, 5)

We can get all layers of the model as a list

print(model.layers)

[<keras.engine.input_layer.InputLayer object at 0x70633c3aa0a0>, <keras.layers.core.dense.Dense object at 0x7062ea5161f0>]

or a specific layer by its name

dense_layer = model.get_layer('my_output')

The input to the dense layer must have the following shape, where None stands for the batch dimension, which can be of any size

print(dense_layer.input_shape)

(None, 2)

The output from the dense layer will have the shape

print(dense_layer.output_shape)

(None, 5)

A dense layer usually has two variables: the weight matrix (kernel) and the bias. Since we set use_bias=False, no bias is shown.

print(dense_layer.weights)

[<tf.Variable 'my_output/kernel:0' shape=(2, 5) dtype=float32, numpy=
array([[ 0.0517453 ,  0.77041924, -0.0192523 , -0.7022766 ,  0.37126076],
       [-0.3371734 ,  0.04741824, -0.252154  ,  0.7318406 , -0.25318795]], dtype=float32)>]

The kernel has the shape (input_size, units). dense_layer.weights returns a list whose first element is the kernel. We can convert it to a NumPy array.

dense_w = dense_layer.weights[0].numpy()
print(dense_w.shape)

(2, 5)

print(dense_w)

[[ 0.0517453   0.77041924 -0.0192523  -0.7022766   0.37126076]
 [-0.3371734   0.04741824 -0.252154    0.7318406  -0.25318795]]
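The kernel shape also determines how many trainable values the layer holds. The following is a minimal sketch in plain Python; `dense_param_count` is a hypothetical helper introduced here for illustration and is not part of Keras.

```python
def dense_param_count(input_size, units, use_bias=True):
    """Count the trainable values of a dense layer (hypothetical helper)."""
    kernel_params = input_size * units      # kernel has shape (input_size, units)
    bias_params = units if use_bias else 0  # bias, if present, has shape (units,)
    return kernel_params + bias_params


print(dense_param_count(2, 5, use_bias=False))  # 10, the layer built above
print(dense_param_count(2, 5, use_bias=True))   # 15, the same layer with a bias
```

For a built Keras layer, the same number should be reported by model.summary().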

The output of our model, which consists of a single dense layer, is simply the matrix product of the input and the weight matrix, as verified below.

np.matmul(in_np, dense_w)

array([[-24.54515922,  60.22373641, -22.5863533 ,  10.2084204 ,   5.83424747],
       [-23.86583853,  29.79804015, -19.81828165,  31.74248242,  -6.61941862],
       [ -6.83186275,  65.50959873,  -9.91902268, -34.13821661,  22.45944077],
       [  4.24145627,  73.28466427,  -2.33327651, -65.25259459,  34.7633965 ],
       [ -5.96728861,  12.50465333,  -5.33186436,   4.1026634 ,   0.50515234],
       [-29.02361423,   8.74790204, -22.05291116,  59.45647359, -19.79978746],
       [ -2.07810163,  19.73466337,  -3.00284743, -10.23850858,   6.74963951],
       [-19.12202519,  44.68482435, -17.42963374,   9.64670396,   3.59086412],
       [ -8.61127257,   3.54413188,  -6.61376071,  16.92102611,  -5.46910453],
       [-11.41546631,  22.60321248, -10.10166383,   8.84847534,   0.40289831]])

Compare the above output with the model’s output obtained earlier.
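Rather than comparing the two arrays by eye, the check can be automated. The following is a minimal sketch that uses only NumPy: the kernel values and the first two input and output rows are copied from the printed outputs above, and the names `kernel`, `x`, `reported` and `manual` are introduced here for illustration. A tolerance is needed because the model computes in float32, while NumPy promotes the integer input to float64.

```python
import numpy as np

# Kernel as printed by dense_layer.weights above, shape (input_size, units) = (2, 5).
kernel = np.array([[ 0.0517453, 0.77041924, -0.0192523, -0.7022766,  0.37126076],
                   [-0.3371734, 0.04741824, -0.252154,   0.7318406, -0.25318795]])

# First two rows of the input batch and of the model output printed above.
x = np.array([[73, 84],
              [34, 76]])
reported = np.array([[-24.545158, 60.223736, -22.586353, 10.208422,  5.8342476],
                     [-23.865839, 29.79804,  -19.818281, 31.742483, -6.619419]])

# One matrix product reproduces the layer output (no bias, no activation).
manual = np.matmul(x, kernel)
print(np.allclose(manual, reported, atol=1e-4))
```

With the model in memory, the same idea applies to the full arrays, e.g. np.allclose(out_np, np.matmul(in_np, dense_w), atol=1e-4).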

Using Bias

By default, the Dense layer in TensorFlow uses a bias as well, and initializes it with zeros.

reset_seed()

ins = Input(shape=(input_features,), name='my_input')
out = Dense(dense_units, use_bias=True, name='my_output')(ins)
model = Model(inputs=ins, outputs=out)

out_np = model.predict(in_np)
print(out_np.shape)
print(out_np)

(10, 5)
[[-24.545158    60.223736   -22.586353    10.208422     5.8342476 ]
 [-23.865839    29.79804    -19.818281    31.742483    -6.619419  ]
 [ -6.831863    65.5096      -9.919023   -34.138214    22.459442  ]
 [  4.241456    73.28466     -2.3332765  -65.252594    34.763397  ]
 [ -5.9672885   12.504654    -5.3318644    4.102663     0.50515246]
 [-29.023615     8.747902   -22.052912    59.456474   -19.799788  ]
 [ -2.0781016   19.734663    -3.0028474  -10.238509     6.7496395 ]
 [-19.122025    44.684822   -17.429634     9.646704     3.5908635 ]
 [ -8.611273     3.544132    -6.613761    16.921026    -5.469105  ]
 [-11.415466    22.603212   -10.101664     8.848475     0.40289795]]

dense_layer = model.get_layer('my_output')
print(dense_layer.weights)

[<tf.Variable 'my_output/kernel:0' shape=(2, 5) dtype=float32, numpy=
array([[ 0.0517453 ,  0.77041924, -0.0192523 , -0.7022766 ,  0.37126076],
       [-0.3371734 ,  0.04741824, -0.252154  ,  0.7318406 , -0.25318795]], dtype=float32)>, <tf.Variable 'my_output/bias:0' shape=(5,) dtype=float32, numpy=array([0., 0., 0., 0., 0.], dtype=float32)>]

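The bias vector above was all zeros and thus had no effect on the model output, even though the equation for the dense layer is now $$ y = Ax + b $$ where $A$ is the kernel, $x$ the input and $b$ the bias. (With the inputs stored as rows, as here, the product is computed as np.matmul(x, A).) We can initialize the bias vector with ones and see the output.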

reset_seed()

ins = Input(shape=(input_features,), name='my_input')
out = Dense(dense_units, use_bias=True, bias_initializer='ones', name='my_output')(ins)
model = Model(inputs=ins, outputs=out)

out_np = model.predict(in_np)
print(out_np.shape)
print(out_np)

(10, 5)
[[-23.545158   61.223736  -21.586353   11.208422    6.8342476]
 [-22.865839   30.79804   -18.818281   32.742485   -5.619419 ]
 [ -5.831863   66.5096     -8.919023  -33.138214   23.459442 ]
 [  5.241456   74.28466    -1.3332765 -64.252594   35.763397 ]
 [ -4.9672885  13.504654   -4.3318644   5.102663    1.5051525]
 [-28.023615    9.747902  -21.052912   60.456474  -18.799788 ]
 [ -1.0781016  20.734663   -2.0028474  -9.238509    7.7496395]
 [-18.122025   45.684822  -16.429634   10.646704    4.590863 ]
 [ -7.611273    4.544132   -5.613761   17.921026   -4.469105 ]
 [-10.415466   23.603212   -9.101664    9.848475    1.402898 ]]

dense_layer = model.get_layer('my_output')
print(dense_layer.weights)

[<tf.Variable 'my_output/kernel:0' shape=(2, 5) dtype=float32, numpy=
array([[ 0.0517453 ,  0.77041924, -0.0192523 , -0.7022766 ,  0.37126076],
       [-0.3371734 ,  0.04741824, -0.252154  ,  0.7318406 , -0.25318795]], dtype=float32)>, <tf.Variable 'my_output/bias:0' shape=(5,) dtype=float32, numpy=array([1., 1., 1., 1., 1.], dtype=float32)>]

We can verify that the model’s output is obtained following the equation we wrote above.

dense_layer = model.get_layer('my_output')
dense_w = dense_layer.weights[0].numpy()
np.matmul(in_np, dense_w) + np.ones(dense_units)

array([[-23.54515922,  61.22373641, -21.5863533 ,  11.2084204 ,   6.83424747],
       [-22.86583853,  30.79804015, -18.81828165,  32.74248242,  -5.61941862],
       [ -5.83186275,  66.50959873,  -8.91902268, -33.13821661,  23.45944077],
       [  5.24145627,  74.28466427,  -1.33327651, -64.25259459,  35.7633965 ],
       [ -4.96728861,  13.50465333,  -4.33186436,   5.1026634 ,   1.50515234],
       [-28.02361423,   9.74790204, -21.05291116,  60.45647359, -18.79978746],
       [ -1.07810163,  20.73466337,  -2.00284743,  -9.23850858,   7.74963951],
       [-18.12202519,  45.68482435, -16.42963374,  10.64670396,   4.59086412],
       [ -7.61127257,   4.54413188,  -5.61376071,  17.92102611,  -4.46910453],
       [-10.41546631,  23.60321248,  -9.10166383,   9.84847534,   1.40289831]])
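The addition in the check above works because NumPy broadcasts the bias vector of shape (units,) across every row of the (batch_size, units) product. A small sketch with made-up numbers shows this; the arrays `xa` and `bias` are illustrative and not taken from the model.

```python
import numpy as np

# Made-up stand-in for the product of input and kernel, shape (batch, units) = (3, 2).
xa = np.array([[-24.5, 60.2],
               [-23.9, 29.8],
               [ -6.8, 65.5]])
bias = np.ones(2)  # shape (units,)

broadcast = xa + bias                  # the bias is added to every row
explicit = xa + np.tile(bias, (3, 1))  # same result with the bias repeated per row
print(np.array_equal(broadcast, explicit))  # True
```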

Using an Activation Function

We can add non-linearity to the output of the dense layer with the activation keyword argument. A common activation function is ReLU, which sets all negative values to zero. In this case the equation of the dense layer becomes $$ y = \alpha (Ax + b) $$ where $\alpha$ is the non-linearity applied.

reset_seed()

ins = Input(shape=(input_features,), name='my_input')
out = Dense(dense_units, use_bias=True, bias_initializer='ones',
            activation='relu', name='my_output')(ins)
model = Model(inputs=ins, outputs=out)

out_np = model.predict(in_np)
print(out_np.shape)
print(out_np)

(10, 5)
[[ 0.        61.223736   0.        11.208422   6.8342476]
 [ 0.        30.79804    0.        32.742485   0.       ]
 [ 0.        66.5096     0.         0.        23.459442 ]
 [ 5.241456  74.28466    0.         0.        35.763397 ]
 [ 0.        13.504654   0.         5.102663   1.5051525]
 [ 0.         9.747902   0.        60.456474   0.       ]
 [ 0.        20.734663   0.         0.         7.7496395]
 [ 0.        45.684822   0.        10.646704   4.590863 ]
 [ 0.         4.544132   0.        17.921026   0.       ]
 [ 0.        23.603212   0.         9.848475   1.402898 ]]

We can again verify that the above output from dense layer follows the equation that we wrote above.

def relu(X):
    return np.maximum(0, X)


dense_layer = model.get_layer('my_output')
dense_w = dense_layer.weights[0].numpy()
relu(np.matmul(in_np, dense_w) + np.ones(dense_units))

array([[ 0.        , 61.22373641,  0.        , 11.2084204 ,  6.83424747],
       [ 0.        , 30.79804015,  0.        , 32.74248242,  0.        ],
       [ 0.        , 66.50959873,  0.        ,  0.        , 23.45944077],
       [ 5.24145627, 74.28466427,  0.        ,  0.        , 35.7633965 ],
       [ 0.        , 13.50465333,  0.        ,  5.1026634 ,  1.50515234],
       [ 0.        ,  9.74790204,  0.        , 60.45647359,  0.        ],
       [ 0.        , 20.73466337,  0.        ,  0.        ,  7.74963951],
       [ 0.        , 45.68482435,  0.        , 10.64670396,  4.59086412],
       [ 0.        ,  4.54413188,  0.        , 17.92102611,  0.        ],
       [ 0.        , 23.60321248,  0.        ,  9.84847534,  1.40289831]])
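The three variants seen so far (kernel only, kernel plus bias, and kernel plus bias plus activation) can be collected into one function. The following is a NumPy sketch of the forward pass, not Keras' actual implementation; `dense_forward` is a name introduced here, and the kernel values and the input row are copied from the outputs printed earlier.

```python
import numpy as np


def dense_forward(x, kernel, bias=None, activation=None):
    """NumPy sketch of a dense layer: activation(x @ kernel + bias)."""
    y = np.matmul(x, kernel)
    if bias is not None:
        y = y + bias           # broadcast over the batch dimension
    if activation is not None:
        y = activation(y)      # applied element-wise
    return y


def relu(z):
    return np.maximum(0, z)


# Kernel as printed by dense_layer.weights, shape (2, 5).
kernel = np.array([[ 0.0517453, 0.77041924, -0.0192523, -0.7022766,  0.37126076],
                   [-0.3371734, 0.04741824, -0.252154,   0.7318406, -0.25318795]])
x = np.array([[73, 84]])  # first row of the input batch

y = dense_forward(x, kernel, bias=np.ones(5), activation=relu)
print(y)
```

The printed row should agree, up to float32 rounding, with the first row of the model output shown above.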

Customizing Weights

We can set the weights and bias of a dense layer to values of our choice. This is useful, for example, when we want to initialize the weights and bias with values that we already have.

custom_dense_weights = np.array([[1, 2, 3 , 4,  5],
                                 [6, 7, 8 , 9 , 10]], dtype=np.float32)
custom_bias = np.array([0., 0., 0., 0., 0.])

reset_seed()

ins = Input(shape=(input_features,), name='my_input')

dense_lyr = Dense(dense_units, use_bias=True, bias_initializer='ones', name='my_output')
out = dense_lyr(ins)

model = Model(inputs=ins, outputs=out)

dense_lyr.set_weights([custom_dense_weights, custom_bias])

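The method set_weights must be called after the layer has been built, that is, after it has been called on an input (here dense_lyr(ins)), because the weight variables do not exist before that. Its argument is a list containing the weight matrix followed by the bias vector.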

out_np = model.predict(in_np)
print(out_np.shape)
print(out_np)

WARNING:tensorflow:5 out of the last 5 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7062db7f75e0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
(10, 5)
[[ 577.  734.  891. 1048. 1205.]
 [ 490.  600.  710.  820.  930.]
 [ 281.  397.  513.  629.  745.]
 [ 107.  204.  301.  398.  495.]
 [ 135.  170.  205.  240.  275.]
 [ 528.  621.  714.  807.  900.]
 [  85.  120.  155.  190.  225.]
 [ 444.  563.  682.  801.  920.]
 [ 159.  188.  217.  246.  275.]
 [ 255.  320.  385.  450.  515.]]

dense_layer = model.get_layer('my_output')
dense_w = dense_layer.weights[0].numpy()
print(dense_w)

[[ 1.  2.  3.  4.  5.]
 [ 6.  7.  8.  9. 10.]]

Verify that the output from the dense layer is just the matrix multiplication plus the (zero) bias.

np.matmul(in_np, custom_dense_weights) + np.zeros(dense_units)

array([[ 577.,  734.,  891., 1048., 1205.],
       [ 490.,  600.,  710.,  820.,  930.],
       [ 281.,  397.,  513.,  629.,  745.],
       [ 107.,  204.,  301.,  398.,  495.],
       [ 135.,  170.,  205.,  240.,  275.],
       [ 528.,  621.,  714.,  807.,  900.],
       [  85.,  120.,  155.,  190.,  225.],
       [ 444.,  563.,  682.,  801.,  920.],
       [ 159.,  188.,  217.,  246.,  275.],
       [ 255.,  320.,  385.,  450.,  515.]])
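With integer weights, each entry can also be checked by hand: output unit j is the sum of every input feature multiplied by the matching kernel entry in column j. The sketch below spells this out for the first input row; the names `row` and `by_hand` are introduced here for illustration.

```python
import numpy as np

custom_dense_weights = np.array([[1, 2, 3, 4, 5],
                                 [6, 7, 8, 9, 10]], dtype=np.float32)
row = np.array([73, 84])  # first row of in_np

# Unit j receives row[0] * W[0, j] + row[1] * W[1, j], e.g. 73 * 1 + 84 * 6 = 577.
by_hand = np.array([row[0] * custom_dense_weights[0, j] + row[1] * custom_dense_weights[1, j]
                    for j in range(custom_dense_weights.shape[1])])
print(by_hand)
print(np.array_equal(by_hand, np.matmul(row, custom_dense_weights)))  # True
```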

Reducing Dimensions

A dense layer can be used to reduce the last dimension of its input. In the following example the shape is reduced from (10, 20, 30) to (10, 20, 1).

input_shape = 20, 30
in_np = np.random.randint(0, 100, size=(batch_size,*input_shape))

reset_seed()


ins = Input(input_shape, name='my_input')
out = Dense(1, use_bias=False, name='my_output')(ins)
model = Model(inputs=ins, outputs=out)
out_np = model.predict(in_np)
print('input shape: {}\n output shape: {}'.format(in_np.shape, out_np.shape))

WARNING:tensorflow:6 out of the last 6 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7062e883e430> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
input shape: (10, 20, 30)
 output shape: (10, 20, 1)
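The shape change follows from the kernel acting only on the last axis: for this input the kernel has the shape (30, 1), and the leading axes are carried along unchanged. The following NumPy sketch mimics this with random stand-in values; `x` and `kernel` are illustrative and are not the weights of the model above.

```python
import numpy as np

rng = np.random.default_rng(313)
x = rng.integers(0, 100, size=(10, 20, 30)).astype(np.float64)
kernel = rng.normal(size=(30, 1))  # stand-in kernel, shape (last_dim, units)

# matmul treats x as a stack of (20, 30) matrices and multiplies each by the kernel.
out = np.matmul(x, kernel)
print(out.shape)  # (10, 20, 1)

# Equivalent view: flatten the leading axes, multiply, then restore them.
flat = np.matmul(x.reshape(-1, 30), kernel).reshape(10, 20, 1)
print(np.allclose(out, flat))
```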
