16. File I/O

16.1. Loading and Saving Vectors

For individual vectors, we can directly invoke the serialize and deserialize functions to write and read them, respectively. Both functions require that we supply a file name, and serialize additionally takes as input the variable to be saved.

using Serialization

x = [1:4...]
serialize("x-file",x)
4-element Vector{Int64}:
 1
 2
 3
 4

We can now read the data from the stored file back into memory.

x2 = deserialize("x-file")
4-element Vector{Int64}:
 1
 2
 3
 4
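
Since deserialize reconstructs the stored object, the round trip preserves both the values and the element type. As a quick check (using the x and x2 defined above):

x2 == x
true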

We can store a list of vectors and read them back into memory. Note that collecting x and y into a single array promotes the integer entries of x to Float64, which is why the loaded values below are all floating point.

y = zeros(4)
serialize("x-files",[x,y])
x2,y2 = deserialize("x-files")
2-element Vector{Vector{Float64}}:
 [1.0, 2.0, 3.0, 4.0]
 [0.0, 0.0, 0.0, 0.0]
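
Serialization is not limited to arrays of a single element type: any Julia object can be written the same way. As a small sketch (the file name xy-file is just an illustrative choice), storing the pair as a tuple instead of an array avoids the promotion noted above, so x keeps its integer entries:

serialize("xy-file", (x, y))
x3, y3 = deserialize("xy-file")
([1, 2, 3, 4], [0.0, 0.0, 0.0, 0.0])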

We can even write and read a dictionary that maps from strings to vectors. This is convenient when we want to read or write all the weights in a model.

mydict = Dict{String,Vector{Number}}("x"=>x,"y"=>y)  # Vector{Number} admits both the Int and Float vectors
serialize("mydict",mydict)
mydict2 = deserialize("mydict")
Dict{String, Vector{Number}} with 2 entries:
  "x" => [1, 2, 3, 4]
  "y" => [0.0, 0.0, 0.0, 0.0]

16.2. Loading and Saving Model Parameters

Saving individual weight vectors is useful, but it gets very tedious if we want to save (and later load) an entire model. After all, we might have hundreds of parameter groups sprinkled throughout. For this reason Flux, together with the JLD2 package, provides built-in functionality to load and save entire networks. An important detail to note is that this saves the model parameters and not the entire model. For example, if we have a 3-layer MLP, we need to specify the architecture separately. The reason is that the models themselves can contain arbitrary code, so they cannot be serialized as naturally. Thus, to reinstate a model, we need to generate the architecture in code and then load the parameters from disk. Let’s start with our familiar MLP.

using Flux

X = rand(Float32,20,2)

# MLP() builds the network; @autosize fills in each _ from the input
# width size(X, 1) = 20, so no layer input size is hard-coded.
MLP() = @autosize (size(X, 1),) Chain(Dense(_=>256),relu,Dense(_=>10))
model = MLP()
Y = model(X)
10×2 Matrix{Float32}:
 -0.269618   -0.174497
  0.109138   -0.03599
 -0.0990227   0.00134618
 -0.0878646   0.0274887
  0.582191    0.44179
 -0.335405   -0.183537
 -0.168548   -0.328809
  0.133812   -0.0234257
  0.434612    0.216745
  0.203886    0.0880216
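
The @autosize macro used above substitutes each _ with a size inferred by propagating the given input shape through the layers, so we never hard-code input dimensions. A minimal illustration (the exact printed form may vary with the Flux version):

# The _ is replaced by the input width 20 inferred from the shape (20,).
@autosize (20,) Dense(_ => 5)
Dense(20 => 5)      # 105 parameters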

Next, we store the parameters of the model in a file named “mlp.params”.

using JLD2

jldsave("mlp.params"; model_state = Flux.state(model))
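
It is worth a look at what Flux.state actually captures: on the Flux version assumed here, it returns a nested NamedTuple that mirrors the model, with the trainable arrays at the leaves and non-array fields such as the relu activation stored as empty placeholders. For example, the first Dense layer’s weight matrix can be reached as:

state = Flux.state(model)
size(state.layers[1].weight)
(256, 20)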

To recover the model, we instantiate a clone of the original MLP model. Instead of randomly initializing the model parameters, we read the parameters stored in the file directly.

model_state = JLD2.load("mlp.params", "model_state");
model = MLP()                                # a fresh model with random parameters
clone = Flux.loadmodel!(model, model_state)  # overwrite them with the saved state
Chain(
  Dense(20 => 256),                     # 5_376 parameters
  NNlib.relu,
  Dense(256 => 10),                     # 2_570 parameters
)                   # Total: 4 arrays, 7_946 parameters, 31.289 KiB.

Since both instances have the same model parameters, they should produce identical outputs for the same input X. Let’s verify this.

Y_clone = clone(X)
Y_clone == Y
true