Now they stack ... normalization!
Tough to choose between BN / LN / IN?
Now a "stacked" version exists: it computes all three and learns to combine them with attention-style softmax weights!
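In a nutshell, it computes IN, LN, and BN statistics in parallel and mixes them with per-layer learned weights. A minimal sketch of the idea, assuming the usual B x C x H x W layout (my own simplified code, not the authors' implementation; running statistics for inference are omitted):

import torch
import torch.nn as nn

class SwitchNormSketch(nn.Module):
    # Simplified Switchable Normalization: mix IN / LN / BN statistics
    # with learned softmax weights, then apply the usual affine transform.
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(1, num_features, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, num_features, 1, 1))
        # one importance weight per normalizer: [IN, LN, BN]
        self.mean_weight = nn.Parameter(torch.ones(3))
        self.var_weight = nn.Parameter(torch.ones(3))

    def forward(self, x):  # x: B x C x H x W
        mean_in = x.mean((2, 3), keepdim=True)     # per sample, per channel
        var_in = x.var((2, 3), keepdim=True, unbiased=False)
        mean_ln = x.mean((1, 2, 3), keepdim=True)  # per sample
        var_ln = x.var((1, 2, 3), keepdim=True, unbiased=False)
        mean_bn = x.mean((0, 2, 3), keepdim=True)  # per channel, whole batch
        var_bn = x.var((0, 2, 3), keepdim=True, unbiased=False)

        mw = torch.softmax(self.mean_weight, dim=0)
        vw = torch.softmax(self.var_weight, dim=0)
        mean = mw[0] * mean_in + mw[1] * mean_ln + mw[2] * mean_bn
        var = vw[0] * var_in + vw[1] * var_ln + vw[2] * var_bn

        return (x - mean) / (var + self.eps).sqrt() * self.weight + self.bias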
Also, their 1D implementation does not work out of the box, but you can hack their 2D layer (which expects B x C x H x W input) to handle 1D (B x C x W) data =)
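A sketch of that hack, assuming the repo's 2D layer is called SwitchNorm2d (the wrapper name here is mine): add a dummy spatial dimension, run the 2D layer, squeeze the dimension back out.

import torch.nn as nn

class SwitchNorm1dHack(nn.Module):
    # Wraps any B x C x H x W normalization layer for B x C x W data.
    def __init__(self, norm2d: nn.Module):
        super().__init__()
        self.norm2d = norm2d  # e.g. SwitchNorm2d(num_features) from the repo

    def forward(self, x):       # x: B x C x W
        x = x.unsqueeze(2)      # -> B x C x 1 x W
        x = self.norm2d(x)
        return x.squeeze(2)     # -> B x C x W

Usage: SwitchNorm1dHack(SwitchNorm2d(64)) on a B x 64 x W tensor; gradients flow through the reshapes unchanged.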
Paper: "Differentiable Learning-to-Normalize via Switchable Normalization", https://arxiv.org/abs/1806.10779. Code: switchablenorms/Switchable-Normalization.