15. Parameter Management#

We start by focusing on an MLP with one hidden layer.

using Flux

x = rand(Float32,4,2)
model = @autosize (size(x)[1],) Chain(Dense(_=>8),relu,Dense(_=>1))
size(model(x))
(1, 2)
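
The @autosize macro infers each layer’s input dimension from the shape of x. Since x has four features, the model above should be equivalent to writing the sizes out by hand, as in the following sketch (model_explicit is just a throwaway name used here).

model_explicit = Chain(Dense(4=>8),relu,Dense(8=>1))
size(model_explicit(x))
(1, 2)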

15.1. Parameter Access#

We can inspect the parameters of the second fully connected layer as follows.

weight, bias = Flux.params(model[3])
@show weight
@show bias;
weight = Float32[0.5881851 -0.7515991 -0.1395958 0.52311933 0.6160638 -0.07015736 -0.78449243 0.56064373]
bias = Float32[0.0]

We can see that this fully connected layer contains two parameters, corresponding to that layer’s weights and biases, respectively.
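
If we only need one specific parameter, we can also reach it directly through the layer’s fields: a Dense layer stores its parameters in weight and bias. The following sketch just prints their sizes, which should match the arrays shown above.

println(size(model[3].weight))
println(size(model[3].bias))
(1, 8)
(1,)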

15.1.1. All Parameters at Once#

When we need to perform operations on all parameters, accessing them one by one can grow tedious. The situation becomes especially unwieldy when we work with more complex, e.g., nested, modules, since we would need to recurse through the entire tree to extract each sub-module’s parameters. Below we demonstrate accessing the parameters of all layers.

params_vec = []
for (index,layer) in enumerate(model)
    params = Flux.params(layer)
    if length(params)!=0
        push!(params_vec,("$(index).weight",size(params[1])))
        push!(params_vec,("$(index).bias",size(params[2])))
    end
end
params_vec
4-element Vector{Any}:
 ("1.weight", (8, 4))
 ("1.bias", (8,))
 ("3.weight", (1, 8))
 ("3.bias", (1,))

15.2. Tied Parameters#

Often, we want to share parameters across multiple layers. Let’s see how to do this elegantly. In the following we allocate a fully connected layer and then reuse that same layer in two positions of the network, so that both positions refer to the same parameters. We first verify that the two occurrences hold identical weights, then modify one entry and check that the change is visible in both.

shared = Dense(8=>8)
model = Chain(Dense(8=>8),relu,shared,relu,shared,relu,Dense(8=>1))
println(Flux.params(model[3])[1] == Flux.params(model[5])[1]) 
Flux.params(model[3])[1][1,1] = 1
println(Flux.params(model[3])[1][1,1] == Flux.params(model[5])[1][1,1]) 
true
true
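
Since shared is literally the same layer object in both positions, the two occurrences do not merely hold equal values; they reference the very same array, which we can confirm with ===.

println(model[3].weight === model[5].weight)
true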