Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1369 members, 1636 posts since 2016

All this - lost like tears in rain.

Data science, deep learning, sometimes a bit of philosophy and math. No bs.

Our website
- spark-in.me
Our chat
- goo.gl/WRm93d
DS courses review
- goo.gl/5VGU5A
- goo.gl/YzVUKf

snakers4 (Alexander), November 19, 13:59

Forwarded from Loss function porn:
"80 years of AI research. Epic battle between connectionist (~neural networks) and symbolic (~rule based) methods. Who will win?"
👤 @OriolVinyalsML (twitter)
📉 @loss_function_porn

snakers4 (Alexander), November 15, 08:09

DS/ML digest 29

spark-in.me/post/2018_ds_ml_digest_29

#digest

#deep_learning

#data_science

2018 DS/ML digest 29

2018 DS/ML digest 29 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), November 14, 08:36

An intro to RL

Though published by OpenAI with TF, this is simply amazing:

- spinningup.openai.com/en/latest/spinningup/rl_intro.html

#rl

snakers4 (Alexander), November 12, 19:26

youtu.be/zL6ltnSKf9k

This AI Learned To Isolate Speech Signals
The paper "Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation " is available here: looking-to-li...

snakers4 (Alexander), November 12, 09:09

When it is colder, under full load GPUs run at 70C

snakers4 (Alexander), November 10, 18:32

Towards Data Science

Our article was accepted to their publication:

- towardsdatascience.com/building-client-routing-semantic-search-in-the-wild-14db04687c7e

Also, once you have published there, you can publish your work on TDS on a recurring basis =)

I doubt that this will be properly distributed to all 130k of their subs, but nevertheless this is a milestone.

#data_science

Building client routing / semantic search in the wild

A comparison of novel NLP techniques within an applied business setting


snakers4 (Alexander), November 10, 10:30

Playing with Transformer

TLDR - use only pre-trained.

On classification tasks it performed the same as classic models.

On seq2seq it was much worse time / memory wise. Inference is faster though.

#nlp

snakers4 (Alexander), November 09, 13:48

Fast-text trained on a random mix of Russian Wikipedia / Taiga / Common Crawl

On our benchmarks it was marginally better than fast-text trained on Araneum from Rusvectors.

Download link

goo.gl/g6HmLU

Params

Standard params - (3,6) n-grams + vector dimensionality is 300.
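
To make the (3,6) n-grams setting concrete, here is a minimal sketch of how subword n-grams are usually extracted in the fastText approach: the word is wrapped in `<` and `>` boundary markers and every character n-gram with length from 3 to 6 is collected. The function name `char_ngrams` is hypothetical and this is not the library's code; the real implementation also hashes the n-grams into buckets.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return character n-grams of a word wrapped in '<' and '>' boundary markers."""
    token = f"<{word}>"
    ngrams = []
    for n in range(min_n, max_n + 1):
        # slide a window of size n over the wrapped word
        for start in range(len(token) - n + 1):
            ngrams.append(token[start:start + n])
    return ngrams


if __name__ == "__main__":
    # 3-grams only, to keep the output short
    print(char_ngrams("where", 3, 3))  # ['<wh', 'whe', 'her', 'ere', 're>']
```

The vector of a word is then built from the vectors of these n-grams, which is why such models can still produce embeddings for misspelled or unseen words.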

Usage:

import fastText as ft

ft_model_big = ft.load_model('model')

And then just refer to

github.com/facebookresearch/fastText/blob/master/python/fastText/FastText.py

#nlp

snakers4 (Alexander), November 06, 13:45

DS/ML digest 28

Google open sources pre-trained BERT ... with 102 languages ...

spark-in.me/post/2018_ds_ml_digest_28

#digest

#deep_learning

#data_science

2018 DS/ML digest 28

2018 DS/ML digest 28 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), November 06, 12:38

A small saga about keeping GPUs cool

(1) 1-2 GPUs with blower fans (or turbo fans) in a full tower

-- idle - 40-45C

-- full load - 80-85C

(2) 3-4 GPUs with blower fans (or turbo fans) in a full tower

-- idle - 45-55C

-- full load - 85-95C

Also with 3-4+ GPUs your room starts to heat up significantly + even without full fan speed / overclocking the sound is not very pleasant.

Solutions:

(0) Add a corrugated air duct to dump heat outside: minus 3-5C under load;

(1) Add a high-pressure fan to blow between the GPUs: minus 3-5C under load;

(2) Place the tower on the balcony: minus 3-5C under load;

In the end it is possible to achieve <75C under full load on 4 or even 6 GPUs.

#deep_learning

snakers4 (Alexander), November 05, 14:59

Forwarded from Just links:

DropBlock: A regularization method for convolutional networks arxiv.org/abs/1810.12890

Forwarded from Just links:

github.com/Randl/DropBlock-pytorch

Randl/DropBlock-pytorch

Implementation of DropBlock in Pytorch. Contribute to Randl/DropBlock-pytorch development by creating an account on GitHub.


snakers4 (Alexander), November 03, 10:04

Also reposts on additional platforms

- Habr - habr.com/post/428674/

Please support us if you have an account.

Building client routing / semantic search at Profi.ru

Building client routing / semantic search and clustering arbitrary external corpuses at Profi.ru TLDR This is a very short executive summary (or a teaser) about...


snakers4 (Alexander), November 03, 09:40

Building client routing / semantic search and clustering arbitrary external corpuses at Profi.ru

A brief executive summary about what we achieved at Profi.ru.

If you have similar experience or have anything similar to share - please do not hesitate to contact me.

Also we are planning to extend this article into a small series, if it gains momentum. So please like / share the article if you like it.

spark-in.me/post/profi-ru-semantic-search-project

#nlp

#data_science

#deep_learning

Building client routing / semantic search and clustering arbitrary external corpuses at Profi.ru

Building client routing / semantic search and clustering arbitrary external corpuses at Profi.ru Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), October 30, 08:04

Forwarded from Админим с Буквой:

Google launches reCaptcha v3

youtu.be/tbvxFW4UJdU

#news

Introducing reCAPTCHA v3
reCAPTCHA v3 is a new version that detects abusive traffic on your website without user friction. It returns a score for each request you send to reCAPTCHA a...

snakers4 (Alexander), October 27, 17:58

www.youtube.com/watch?v=F-00NhYUnH4

This AI Learned How To Generate Human Appearance
Pick up cool perks on our Patreon page: › www.patreon.com/TwoMinutePapers The paper "A Variational U-Net for Conditional Appearance and Shape Generat...

snakers4 (Alexander), October 27, 10:16

Canonical one-hot encoding one-liner in PyTorch

Or 2 liner, whatever)

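# src is the input tensor of indexes, shape (batch, indexes)

trg_oh = torch.FloatTensor(src.size(0), src.size(1), self.tgt_vocab).zero_().to(self.device)

# the index passed to scatter_ must have the same number of dims as trg_oh

trg_oh.scatter_(2, src.unsqueeze(2), 1)

#deep_learning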
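
A self-contained sketch of the same scatter-based one-hot encoding, in case you want to try it outside of a model class. The function name `one_hot_scatter` and the toy tensor are assumptions for illustration. Newer PyTorch versions also ship `torch.nn.functional.one_hot`, which is used here only to check the result.

```python
import torch
import torch.nn.functional as F


def one_hot_scatter(indices, vocab_size):
    """One-hot encode a (batch, seq_len) LongTensor into (batch, seq_len, vocab_size)."""
    one_hot = torch.zeros(
        indices.size(0), indices.size(1), vocab_size, device=indices.device
    )
    # scatter_ writes 1.0 along dim 2 at the positions given by the index tensor,
    # which must have the same number of dimensions as the target tensor
    one_hot.scatter_(2, indices.unsqueeze(2), 1.0)
    return one_hot


src = torch.tensor([[0, 2], [1, 3]])  # toy batch: 2 sequences of 2 token ids
encoded = one_hot_scatter(src, vocab_size=4)

# sanity check against the built-in helper
assert torch.equal(encoded, F.one_hot(src, num_classes=4).float())
```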

snakers4 (Alexander), October 26, 12:31

A sticker pack for our channel / group

We decided to draw a sticker pack for our telegram channel / group with @birdborn

Please help select the best stickers!

Please vote here:

goo.gl/forms/dPPUADKEM4Zq1YkI2

(Russian)

Which stickers should we start with?

Choosing the top stickers to start drawing with!


snakers4 (Alexander), October 24, 09:11

Concurrent Spatial and Channel Squeeze & Excitation in Fully Convolutional Networks

- Essentially attention for semseg models: channel-wise attention, spatial attention, and a mix of both

- Paper arxiv.org/abs/1803.02579

- Implementation www.kaggle.com/c/tgs-salt-identification-challenge/discussion/66178
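
A rough PyTorch sketch of such a block, to show how little code it takes. This is not the code from the Kaggle thread: the class name `SCSEBlock`, the `reduction` value, and combining the two branches by simple addition are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class SCSEBlock(nn.Module):
    """Channel (cSE) and spatial (sSE) attention applied to the same feature map."""

    def __init__(self, channels, reduction=2):
        super().__init__()
        # cSE: squeeze spatially with global pooling, then weight each channel
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # sSE: squeeze channels with a 1x1 conv, then weight each pixel
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # assumed combination: sum of the two recalibrated feature maps
        return x * self.channel_gate(x) + x * self.spatial_gate(x)


block = SCSEBlock(channels=8)
features = torch.randn(2, 8, 16, 16)  # (batch, channels, height, width)
recalibrated = block(features)
```

The output has the same shape as the input, so the block can be dropped after any encoder or decoder stage of a segmentation network.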

#deep_learning

snakers4 (Alexander), October 23, 07:04

Do you read digests?

anonymous poll

Yes – 37

👍👍👍👍👍👍👍 61%

Love them – 12

👍👍 20%

No – 11

👍👍 18%

I have an idea how to improve them (PM me) – 1

▫️ 2%

👥 61 people voted so far.

snakers4 (Alexander), October 23, 06:28

DS/ML digest 27

NLP in the focus again!

spark-in.me/post/2018_ds_ml_digest_27

Also your humble servant learned how to do proper NMT =)

#digest

#deep_learning

#data_science

2018 DS/ML digest 27

2018 DS/ML digest 27 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), October 22, 11:26

In case of github failure

They have a blog with current statuses

status.github.com/messages

snakers4 (Alexander), October 22, 05:43

Amazing articles about image hashing

Also a python library

- Library github.com/JohannesBuchner/imagehash

- Articles:

fullstackml.com/wavelet-image-hash-in-python-3504fdd282b5

www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html
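
To give a feel for the simplest of these hashes, here is a dependency-free sketch of an average hash. It assumes the image has already been resized to a tiny grayscale grid (usually 8x8, here 2x2 for brevity) by some image library. The function names are hypothetical and this is not the imagehash library's code.

```python
def average_hash(pixels):
    """Compute an average hash from a 2D list of grayscale values.

    Each pixel contributes one bit: 1 if it is brighter than the mean, else 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hashes."""
    return bin(hash_a ^ hash_b).count("1")


original = [[10, 200], [220, 30]]
brightened = [[20, 210], [230, 40]]  # same picture, slightly brighter
inverted = [[245, 55], [35, 225]]    # a very different picture

print(hamming_distance(average_hash(original), average_hash(brightened)))  # 0
print(hamming_distance(average_hash(original), average_hash(inverted)))    # 4
```

A small Hamming distance means the images are probably near-duplicates, which is what makes such hashes handy for deduplication.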

#data_science

#computer_vision

JohannesBuchner/imagehash

A Python Perceptual Image Hashing Module. Contribute to JohannesBuchner/imagehash development by creating an account on GitHub.


Text iterators in PyTorch

Looks like PyTorch has some handy data-processing / loading tools for text models - torchtext.readthedocs.io.

It is explained here - bastings.github.io/annotated_encoder_decoder/ - how to use them with pack_padded_sequence and pad_packed_sequence to boost PyTorch NLP models substantially.
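
A minimal sketch of the packing / unpacking round trip on its own, without torchtext. The tensor sizes and the GRU are arbitrary assumptions for illustration. The point is that the RNN skips the padded positions, so no compute is spent on padding and the final hidden state corresponds to the last real token of each sequence.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# padded batch of 3 sequences: (batch, max_len, features)
batch = torch.randn(3, 5, 4)
# true lengths, sorted in descending order as pack_padded_sequence expects by default
lengths = torch.tensor([5, 3, 2])

gru = nn.GRU(input_size=4, hidden_size=6, batch_first=True)

# pack: the RNN will only process the real (non-padded) time steps
packed = pack_padded_sequence(batch, lengths, batch_first=True)
packed_out, hidden = gru(packed)

# unpack: back to a padded tensor, with zeros at the padded positions
output, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
```

Here `output` has shape (3, 5, 6), and `hidden[0, i]` equals `output[i, lengths[i] - 1]` for every sequence in the batch.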

#nlp

#deep_learning

snakers4 (Alexander), October 19, 18:34

youtu.be/uEJ71VlUmMQ

Detecting Faces (Viola Jones Algorithm) - Computerphile
Deep learning is used for everything these days, but this face detection algorithm is so neat its still in use today. Dr Mike Pound on the Viola/Jones algori...

snakers4 (Alexander), October 17, 14:41

Forwarded from Sava Kalbachou:

twitter.com/fchollet/status/1052228463300493312

François Chollet

Here is the same dynamic RNN implemented in 4 different frameworks (TensorFlow/Keras, MXNet/Gluon, Chainer, PyTorch). Can you tell which is which?


I guess PyTorch is in the bottom left corner, but realistically the author of this snippet did a lot of import A as B

snakers4 (Alexander), October 16, 05:15

Google's super resolution zoom

Finally Google made something interesting

www.youtube.com/watch?v=z-ZJqd4eQrc

ai.googleblog.com/2018/10/see-better-and-further-with-super-res.html

Super Res Zoom

snakers4 (Alexander), October 16, 03:47

Mixed precision distributed training ImageNet example in PyTorch

github.com/NVIDIA/apex/blob/master/examples/imagenet/main.py

#deep_learning

NVIDIA/apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch - NVIDIA/apex


snakers4 (Alexander), October 15, 17:11

Looks like mixed precision training ... is solved in PyTorch

Lol - and I could not find it

github.com/NVIDIA/apex/tree/master/apex/amp

#deep_learning

NVIDIA/apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch - NVIDIA/apex


snakers4 (Alexander), October 15, 16:56

An Open source alternative to Mendeley

Looks like Zotero is also cross-platform and open-source

Also you can import the whole Mendeley library with 1 button push:

www.zotero.org/support/kb/mendeley_import

#data_science

kb:mendeley import [Zotero Documentation]

Zotero is a free, easy-to-use tool to help you collect, organize, cite, and share research.


www.youtube.com/watch?v=KJAnSyB6mME

PyTorch developer conference part 1
Sessions on Applied Research in Industry & Developer Education. Talks from @karpathy (@Tesla), @ctnzr (@nvidia), @ftzo (Pyro/@UberEng), @MarkNeumannnn (@alle...

snakers4 (Alexander), October 15, 09:33

DS/ML digest 26

More interesting NLP papers / material ...

spark-in.me/post/2018_ds_ml_digest_26

#digest

#deep_learning

#data_science

2018 DS/ML digest 26

2018 DS/ML digest 26 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), October 12, 19:11

www.youtube.com/watch?v=kBFMsY5ZP0o

This AI Senses Humans Through Walls
Pick up cool perks on our Patreon page: › www.patreon.com/TwoMinutePapers Crypto and PayPal links are available below. Thank you very much for your g...