Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1797 members, 1726 posts since 2016

All this - lost like tears in rain.

Data science, ML, a bit of philosophy and math. No bs.

Our website
- spark-in.me
Our chat
- t.me/joinchat/Bv9tjkH9JHYvOr92hi5LxQ
DS courses review
- goo.gl/5VGU5A
- goo.gl/YzVUKf

March 07, 15:42

Our experiments with Transformers, BERT and generative language pre-training

TLDR

For morphologically rich languages, pre-trained Transformers are not a silver bullet. From a layman's perspective they are not feasible unless someone invests huge computational resources both into sub-word tokenization methods that work well and into actually training these large networks.

On the other hand we have definitively shown that:

- Starting a transformer with an EmbeddingBag initialized via FastText works and is relatively feasible (see the sketch after this list);

- On complicated tasks such a transformer significantly outperforms training from scratch (as well as naive models) and shows decent results compared to state-of-the-art specialized models;

- Pre-training worked, but it overfitted more than FastText initialization, and given the complexity required for such pre-training it is not useful;
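
A minimal sketch of what "EmbeddingBag initialized via FastText" can look like in PyTorch. The model file and the toy vocabulary are hypothetical placeholders, not the code from the post:

import torch
import torch.nn as nn
import fasttext  # pip install fasttext

# Hypothetical inputs: a pretrained FastText binary and a small token vocabulary
ft = fasttext.load_model('cc.ru.300.bin')
vocab = ['привет', 'мир', 'трансформер']

# Build the weight matrix; FastText composes word vectors from char n-grams,
# so even rare word forms of a morphologically rich language get a sensible vector
weights = torch.zeros(len(vocab), ft.get_dimension())
for i, token in enumerate(vocab):
    weights[i] = torch.from_numpy(ft.get_word_vector(token))

# Plug it in as the (trainable) input layer of the transformer instead of a random init
emb = nn.EmbeddingBag.from_pretrained(weights, mode='mean', freeze=False)

The point of EmbeddingBag here is that a token can be fed as a bag of sub-word pieces and still start from a meaningful averaged FastText vector.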

spark-in.me/post/bert-pretrain-ru

All in all this was a relatively large gamble that did not pay off - on a more down-to-earth task where we hoped the Transformer would excel, it did not.

#deep_learning

Complexity / generalization / computational cost in modern applied NLP for morphologically rich languages. Towards a new state of the art?


An approach to ranking search results with no annotation

Just a small article with a novel idea:

- Instead of training a network with CE (softmax cross-entropy), just train it with BCE (per-class binary cross-entropy);

- Mine the additional supervision from the inner structure of your domain (tags, matrix decomposition methods, heuristics, etc.) - see the sketch after this list;
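
A rough sketch of the BCE idea, assuming weak per-document tags mined from the domain (all names hypothetical, this is not the author's exact pipeline): train a multi-label head with BCEWithLogitsLoss and rank search candidates by the sigmoid score of the tag the query maps to.

import torch
import torch.nn as nn

num_tags = 128                       # hypothetical number of weak labels / tags
encoder = nn.Sequential(             # stand-in for any text encoder
    nn.Linear(300, 256), nn.ReLU(), nn.Linear(256, num_tags)
)
criterion = nn.BCEWithLogitsLoss()   # per-tag binary targets instead of one-hot CE

def train_step(doc_vectors, weak_tag_targets, optimizer):
    # doc_vectors: (batch, 300) features, weak_tag_targets: (batch, num_tags) in {0, 1}
    logits = encoder(doc_vectors)
    loss = criterion(logits, weak_tag_targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def rank(candidate_vectors, query_tag_idx):
    # Rank candidates by the probability of the tag the query maps to
    with torch.no_grad():
        scores = torch.sigmoid(encoder(candidate_vectors))[:, query_tag_idx]
    return torch.argsort(scores, descending=True)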

spark-in.me/post/classifier-result-sorting

Works best if your ontology is relatively simple.

#deep_learning

Learning to rank search results without annotation

Solving the search ranking problem