Applying steps to validation data as well or only to training data?

I have an image classification problem and I want to use the pretrained model EfficientNetB0 (Keras applications: https://keras.io/api/applications/efficientnet/#efficientnetb0-function and the paper: https://arxiv.org/abs/1905.11946) with weights from ImageNet for this classification problem.

I downloaded and imported the model and its weights as follows:

!pip install git+https://github.com/qubvel/segmentation_models

import efficientnet.keras as efn

efnB0_model = efn.EfficientNetB0(include_top=False, weights="imagenet", input_shape=(224, 224, 3))
efnB0_model.trainable = False

I exclude the top of this model because it does not fit the problem I am analyzing; that is why I set include_top to False. Moreover, I don’t want to train the parameters of the EfficientNet, hence efnB0_model.trainable = False.

Now, I would like to apply the efnB0_model to the training data X_train once (it is not necessary to do this more than once, because the weights are already trained and should not change) and keep all these bottom layers frozen afterwards.

Therefore, I created a DataGenerator_X.

# Imports assumed by this snippet: Sequence from Keras, imread e.g. from scikit-image
import math
import numpy as np
from keras.utils import Sequence
from skimage.io import imread

class DataGenerator_X(Sequence):

    def __init__(self, x_set, batch_size):
        self.x = x_set
        self.batch_size = batch_size

    def __len__(self):
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        # Build one batch of images from the file paths and rescale to [0, 1]
        batch_x = self.x[idx * self.batch_size : (idx + 1) * self.batch_size]
        batch_x = [imread(file_name) for file_name in batch_x]
        batch_x = np.array(batch_x)
        batch_x = batch_x * 1. / 255
        return batch_x

This DataGenerator_X creates batches of images and rescales them by 1/255 (the input scaling preferred by EfficientNet).

After that, I apply this DataGenerator_X to X_train (which contains the file paths to my images).

training_generator = DataGenerator_X(X_train, batch_size=32) 

Then I run the frozen bottom of the EfficientNet-B0 model on X_train once and store the resulting features (X_after_efn), so I don’t have to recompute them every epoch, since the bottom layers of the model are all frozen. I do that with this code:

X_after_efn_train = efnB0_model.predict(training_generator, verbose=1) 

What I am wondering is whether I have to apply all these steps to the validation data as well. So, do I also have to create a validation_generator and X_after_efn_val?

1 Answer(s)

In short, yes. You’ll have to create a validation_generator and then run efnB0_model.predict() on it as well.

The main reason is that predict() only takes a single set of input samples at a time (see the input argument x under the predict method: https://keras.io/api/models/model_training_apis/#predict-method).

On the bright side, it’s not too much work since you can use the same generator.

val_generator = DataGenerator_X(X_val, batch_size=32)
X_after_efn_val = efnB0_model.predict(val_generator, verbose=1)
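One detail worth noting: with include_top=False, predict() returns 4D feature maps rather than class vectors — for 224×224 inputs to EfficientNet-B0, the shape is roughly (n_samples, 7, 7, 1280). Before training a dense head on the cached features, you would typically pool or flatten them. A minimal numpy sketch with simulated shapes (this is stand-in data, not the model's actual output):

```python
import numpy as np

# Simulated stand-in for the model output: (n_samples, 7, 7, 1280)
rng = np.random.default_rng(0)
X_after_efn_train = rng.random((8, 7, 7, 1280))

# Global average pooling: collapse the two spatial axes,
# leaving one 1280-dimensional feature vector per image.
pooled = X_after_efn_train.mean(axis=(1, 2))
print(pooled.shape)  # (8, 1280)
```

In Keras the same effect can be had inside the model via a GlobalAveragePooling2D layer on top of the frozen base.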

You can also check the ‘using the bottleneck features’ section of this blog post, which does something similar: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
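The blog's bottleneck-feature approach boils down to: extract the frozen features once, then fit a small, cheap classifier on them. In Keras that top would be a Dense softmax layer; the numpy sketch below stands in for it with a hand-rolled softmax regression on simulated, pooled features (all shapes and data are assumed, not real model output):

```python
import numpy as np

# Simulated pooled bottleneck features (1280-d, as EfficientNet-B0 would give)
rng = np.random.default_rng(0)
n_train, n_features, n_classes = 64, 1280, 3
X_feat = rng.normal(size=(n_train, n_features))
y = rng.integers(0, n_classes, size=n_train)

# One-hot targets for a softmax head
Y = np.eye(n_classes)[y]

# Linear classifier: the role a Dense(n_classes, activation="softmax")
# layer would play when trained on the cached features
W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)
lr = 0.1

for _ in range(200):
    logits = X_feat @ W + b
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    grad = probs - Y                             # softmax cross-entropy gradient
    W -= lr * X_feat.T @ grad / n_train
    b -= lr * grad.mean(axis=0)

train_acc = (np.argmax(X_feat @ W + b, axis=1) == y).mean()
print(train_acc)  # training accuracy on the simulated set
```

Because only this small head is trained, fitting it on cached features is far cheaper than pushing every image through the full EfficientNet each epoch — which is exactly why caching X_after_efn_train and X_after_efn_val once pays off.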

Answered on July 16, 2020.