Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1365 members, 1673 posts since 2016

All this - lost like tears in rain.

Data science, deep learning, sometimes a bit of philosophy and math. No bs.


Posts by tag «cv»:

snakers4 (spark_comment_bot), June 21, 14:13

Playing with multi-GPU small batch-sizes

If you train semantic segmentation (SemSeg) with a big model on large images (HD, FullHD), you may face a situation where only one image fits on each GPU.

This also matters if your train/test split is far from ideal, or if you are using pre-trained ImageNet encoders for a SemSeg task - in both cases you cannot really update your batch-norm parameters.

Also, AFAIK, all the major deep-learning frameworks:

(0) do not offer an option to freeze batch norm during training (batch norm contains two sets of parameters: the learnable affine weights, and the running statistics, which are updated during training and used at inference);

(1) calculate batch-norm statistics for each GPU separately.

All of this may mean that your models severely underperform at inference in these situations.
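A minimal PyTorch sketch (my own illustration, not from the post) of point (0) - batch norm carries learnable affine parameters plus running statistics that move in train() mode and are frozen and used in eval() mode:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)
x = torch.randn(2, 3, 8, 8) + 5.0  # shifted input so the running mean visibly moves

bn.train()
bn(x)  # train mode: updates running_mean / running_var
mean_after_train = bn.running_mean.clone()

bn.eval()
bn(x)  # eval mode: uses the running stats, does not touch them
mean_after_eval = bn.running_mean.clone()

print(torch.allclose(mean_after_train, mean_after_eval))  # True - eval() left the stats alone
print([name for name, _ in bn.named_parameters()])        # ['weight', 'bias'] - the learnable set
```

The running statistics live in buffers (`running_mean`, `running_var`), not in `parameters()`, which is why setting `requires_grad = False` alone does not freeze them.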


What you can do:

(0) Sync batch norm. I believe that to do it properly you would have to modify the framework you are using, but there is a PyTorch implementation done for CVPR 2018, with an accompanying explanation. If its multi-GPU model wrappers can be used with arbitrary models, then we are in the money.

(1) Use affine=False in your batch-norm layers. But in this case ImageNet initialization probably will not help - you will have to train your model from scratch.

(2) Freeze your encoder's batch-norm parameters completely (though I am not sure common recipes do this fully - they do not seem to freeze the running-mean/variance buffers); this probably also needs m.eval() or m.trainable = False or something like this.

(3) Use the recent Group Norm from Facebook - its statistics do not depend on batch size at all.

This is a finicky topic - please share your experiences and tests in the comments.



Like this post or have something to say => tell us more in the comments or donate!

How to train with frozen BatchNorm?

Since PyTorch does not support syncBN, I hope to freeze the mean/var of the BN layers while training. The mean/var from the pretrained model are used, while weight/bias remain learnable. In this way, the calculation of bottom_grad in BN will differ from that of the normal training mode. However, we do not find any flag in the function below to mark this difference. pytorch/torch/csrc/cudnn/BatchNorm.cpp void cudnn_batch_norm_backward( THCState* state, cudnnHandle_t handle, cudnnDataType_t dataType, THVo...
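The question above asks for exactly the middle ground: running mean/var frozen at their pretrained values, weight/bias still learnable. In PyTorch this can be sketched without touching the C++ backward - put only the BN modules into eval mode (a helper I wrote for illustration, not a built-in flag):

```python
import torch
import torch.nn as nn

def set_bn_eval(model: nn.Module) -> None:
    """Only BN layers go to eval mode: running_mean/running_var stay fixed
    at their (e.g. ImageNet-pretrained) values, while weight/bias keep
    requires_grad=True and still receive gradients through the frozen stats."""
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.eval()

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

model.train()
set_bn_eval(model)  # must come after every model.train() call

x = torch.randn(1, 3, 8, 8)  # batch size 1 - the small-batch case from the post
stats_before = model[1].running_mean.clone()
loss = model(x).mean()
loss.backward()
opt.step()
# running stats untouched, affine params received gradients
```

In eval mode BN normalizes with the stored running statistics, so the backward pass for weight/bias differs from train mode exactly as the question describes - no framework-level flag is needed.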

snakers4 (Alexander), January 11, 05:49

Trick for image preprocessing - histogram equalization