Spark in me - Internet, data science, math, deep learning, philosophy

snakers4 @ telegram, 1317 members, 1587 posts since 2016

All this - lost like tears in rain.

Data science, deep learning, sometimes a bit of philosophy and math. No bs.

Our website
- spark-in.me
Our chat
- goo.gl/WRm93d
DS courses review
- goo.gl/5VGU5A
- goo.gl/YzVUKf

Posts by tag «deep_learning»:

snakers4 (Alexander), September 20, 16:06

DS/ML digest 24

Key topics of this one:

- New method to calculate phrase/n-gram/sentence embeddings for rare and OOV words;

- So many releases from Google;

spark-in.me/post/2018_ds_ml_digest_24

If you like our digests, you can support the channel via:

- Sharing / reposting;

- Giving an article a decent comment / a thumbs-up;

- Buying me a coffee (links on the digest);

#digest

#deep_learning

#data_science

2018 DS/ML digest 24

2018 DS/ML digest 24 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), September 06, 06:18

SENet

- arxiv.org/abs/1709.01507;

- A 2017 Imagenet winner;

- Mostly a ResNet-152-inspired network;

- Transfers well (ResNet);

- Introduces the Squeeze-and-Excitation (SE) block, which adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels;

- Intuitively - convolutions meet the attention mechanism;

- SE block:

- pics.spark-in.me/upload/aa50a2559f56faf705ad6639ac973a38.jpg

- Reduction ratio r is set to 16 in all experiments;

- Results:

- pics.spark-in.me/upload/db2c98330744a6fd4dab17259d5f9d14.jpg
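For illustration, a minimal PyTorch sketch of an SE block as I read the paper (channel count and names here are placeholders):

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    # Squeeze: global average pool; Excite: bottleneck MLP + sigmoid gates
    def __init__(self, channels, r=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // r),   # squeeze to C / r
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),   # back to C
            nn.Sigmoid(),                         # per-channel weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))                    # B x C channel descriptors
        w = self.fc(s).view(b, c, 1, 1)
        return x * w                              # recalibrate channel responses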

#deep_learning

snakers4 (Alexander), September 06, 05:57

Chainer - a predecessor of PyTorch

Looks like

- PyTorch was based not only on Torch, but also its autograd was forked from Chainer;

- Chainer looks like PyTorch ... but built by an independent Japanese group rather than Facebook;

- A quick glance through the docs confirms that PyTorch and Chainer APIs look 90% identical (both numpy inspired, but using different back-ends);

- Open Images 2nd place was taken by people using Chainer with 512 GPUs;

- I have yet to confirm myself that PyTorch can work with a cluster (but other people have done it) github.com/eladhoffer/convNet.pytorch;

www.reddit.com/r/MachineLearning/comments/7lb5n1/d_chainer_vs_pytorch/

docs.chainer.org/en/stable/comparison.html

#deep_learning

eladhoffer/convNet.pytorch

ConvNet training using pytorch. Contribute to eladhoffer/convNet.pytorch development by creating an account on GitHub.


Also - thanks to all the DO referral link supporters - hosting for my website is now finally free (at least for the next ~6 months)!

Also, today I published my 200th post on spark-in.me. Ofc not all of these are proper long articles, but nevertheless it's cool.

snakers4 (Alexander), September 06, 05:48

DS/ML digest 23

The key topic of this one - sheer insanity:

- vid2vid

- unsupervised NMT

spark-in.me/post/2018_ds_ml_digest_23

If you like our digests, you can support the channel via:

- Sharing / reposting;

- Giving an article a decent comment / a thumbs-up;

- Buying me a coffee (links on the digest);

Let's spread the right DS/ML ideas together.

#digest

#deep_learning

#data_science

2018 DS/ML digest 23

2018 DS/ML digest 23 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), September 03, 06:27

Training a MNASNET from scratch ... and failing

As a small side hobby we tried training Google's new mobile network from scratch and failed:

- spark-in.me/post/mnasnet-fail-alas

- github.com/snakers4/mnasnet-pytorch

Maybe you know how to train it properly?

Also you can now upvote articles on spark-in.me! =)

#deep_learning

Training your own MNASNET

Training your own MNASNET Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), September 02, 06:22

A small hack to spare PyTorch memory when resuming training

When you resume from a checkpoint, consider adding this to save GPU memory:

del checkpoint

torch.cuda.empty_cache()
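In context, a sketch (model, optimizer and the checkpoint keys/path here are placeholders, not a fixed convention):

checkpoint = torch.load('checkpoint.pth')
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])

# the dict still holds a full copy of every tensor - drop the reference
# and release the cached blocks back to the GPU
del checkpoint
torch.cuda.empty_cache()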

#deep_learning

snakers4 (Alexander), August 31, 13:59

DS/ML digest 22

spark-in.me/post/2018_ds_ml_digest_22

#digest

#deep_learning

#data_science

2018 DS/ML digest 22

2018 DS/ML digest 22 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), August 31, 13:38

ADAMW to be integrated into upstream PyTorch?

github.com/pytorch/pytorch/pull/3740
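The gist of the change, schematically (my sketch of the decoupled update from Loshchilov & Hutter; adam_update is a hypothetical stand-in for the moment-based step, not the PR's actual code):

# classic Adam + weight_decay: L2 is folded into the gradient,
# so the decay term passes through the adaptive moment estimates
grad = grad + weight_decay * p
p = p - lr * adam_update(grad)

# AdamW: decay is decoupled and applied directly to the weights
p = p - lr * adam_update(grad) - lr * weight_decay * p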

#deep_learning

Fixing Weight Decay Regularization in Adam by jingweiz · Pull Request #3740 · pytorch/pytorch

Hey, We added SGDW and AdamW in optim, according to the new ICLR submission from Loshchilov and Hutter: Fixing Weight Decay Regularization in Adam. We also found some inconsistency of the current i...


snakers4 (Alexander), August 29, 08:16

Crowd-AI maps repo

Just opened my repo for crowd AI maps 2018.

I did not pursue this competition till the end, so it is not polished and the .md is not updated. Use it at your own risk!

github.com/snakers4/crowdai-maps-2018

spark-in.me/post/a-small-case-for-search-of-structure-within-your-data

#deep_learning

snakers4/crowdai-maps-2018

CrowdAI mapping challenge 2018 solution. Contribute to snakers4/crowdai-maps-2018 development by creating an account on GitHub.


snakers4 (Alexander), August 26, 08:16

A small bug in PyTorch to numpy conversions

Well, maybe a feature =)

When you do something like this

a.permute(0, 2, 3, 1).numpy()

or

a.view((same shape here)).numpy()

you may expect behaviour similar to np.reshape.

But no. Looks like now all of these functions are implemented in C and they produce artifacts. To avoid artifacts call:

a.permute(0, 2, 3, 1).contiguous().numpy()

a.view((same shape here)).contiguous().numpy()

It is also interesting because some PyTorch layers raise an error when you do not call .contiguous().
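A quick repro sketch of the fix:

import torch

a = torch.arange(24).view(1, 2, 3, 4)
b = a.permute(0, 2, 3, 1)        # a non-contiguous view over the same storage
print(b.is_contiguous())         # False
c = b.contiguous()               # materializes the permuted memory layout
arr = c.numpy()                  # now safe to hand off to numpy
print(arr.shape)                 # (1, 3, 4, 2)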

#deep_learning

snakers4 (Alexander), August 21, 13:31

2018 DS/ML digest 21

spark-in.me/post/2018_ds_ml_digest_21

#digest

#deep_learning

#nlp

2018 DS/ML digest 21

2018 DS/ML digest 21 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), August 20, 10:57

PyTorch - a brief state of sparse operations

TLDR - they are not there yet

- github.com/pytorch/pytorch/issues/9674

If you would like to see them implemented faster, write here

- github.com/pytorch/pytorch/issues/10043
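For context, basic COO construction does already work (shown with the current API name, torch.sparse_coo_tensor; older builds used torch.sparse.FloatTensor):

import torch

i = torch.tensor([[0, 1, 1],     # row indices of the non-zero entries
                  [2, 0, 2]])    # column indices
v = torch.tensor([3.0, 4.0, 5.0])
s = torch.sparse_coo_tensor(i, v, (2, 3))
print(s.to_dense())
# tensor([[0., 0., 3.],
#         [4., 0., 5.]])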

#deep_learning

The state of sparse Tensors #9674

This note tries to summarize the current state of sparse tensor in pytorch. It describes important invariance and properties of sparse tensor, and various things need to be fixed (e.g. empty sparse tensor). It also shows some details of ...


snakers4 (Alexander), August 19, 13:20

Nice down-to-earth post about Titan V vs 1080 Ti

medium.com/@u39kun/titan-v-vs-1080-ti-head-to-head-battle-of-the-best-desktop-gpus-on-cnns-d55a19866b7c

Also it seems that new generation nvidia GPUs will have 12 GB of RAM tops.

1080Ti is the best option for CNNs now.

#deep_learning

Titan V vs 1080 Ti — Head-to-head battle of the best desktop GPUs on CNNs. Is Titan V worth it? 110 TFLOPS! no brainer, right?

NVIDIA’s Titan V is the latest “desktop” GPU built upon the Volta architecture boasting 110 “deep learning” TFLOPS in the spec sheet. That…


snakers4 (Alexander), August 13, 05:24

Float16 / half training in PyTorch

Tried to do it in the most obvious way + some hacks from here

discuss.pytorch.org/t/resnet18-throws-exception-on-conversion-to-half-floats/6696

Did anybody do it successfully on real models?
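For reference, the "obvious way" plus the usual hack looks roughly like this (a sketch, not a verified recipe; keeping batch norm in fp32 is the commonly suggested fix for unstable running stats):

import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True).cuda().half()

# keep batch-norm layers in fp32 - their running stats are fragile in fp16
for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        m.float()

x = torch.randn(4, 3, 224, 224).cuda().half()
out = model(x)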

#deep_learning

Resnet18 throws exception on conversion to half floats

Hey, I have tried to launch the following code: from torchvision import models resnet = models.resnet18(pretrained=True).cpu() resnet.half() and have got an exception: libc++abi.dylib: terminating with uncaught exception of type std::invalid_argument: Unsupported tensor type Sounds like the halftensor type is not registered properly. But not sure why it’s the case. Using pytorch 0.2.0 py36_1cu75 soumith Any advice how to fix it?


snakers4 (Alexander), August 12, 11:15

2018 DS/ML digest 20

spark-in.me/post/2018_ds_ml_digest_20

#deep_learning

#digest

#data_science

2018 DS/ML digest 20

2018 DS/ML digest 20 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), July 31, 18:33

Autofocus for semseg?

arxiv.org/abs/1805.08403

I have not seen people for whom DeepLab worked... In my tests dilated convolutions performed the same... though some claim they help on high-res images with small objects...

Ideas:

(0) Autofocus layer, a novel module that enhances the multi-scale processing of CNNs by learning to select the ‘appropriate’ scale for identifying different objects in an image

(1) Layer description

pics.spark-in.me/upload/2f562fb9d12d76c36fa8777713de9716.jpg

(2) Implementation github.com/yaq007/Autofocus-Layer/blob/master/models.py

I believe this will work best for 3D images
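My rough 2D sketch of the idea (the paper's code is 3D; the key points are one shared kernel applied at several dilation rates plus a small attention branch weighting the scales per pixel):

import torch
import torch.nn as nn
import torch.nn.functional as F

class AutofocusSketch(nn.Module):
    def __init__(self, ch, dilations=(1, 2, 4)):
        super().__init__()
        self.dilations = dilations
        self.weight = nn.Parameter(torch.empty(ch, ch, 3, 3))  # shared kernel
        nn.init.kaiming_normal_(self.weight)
        self.att = nn.Conv2d(ch, len(dilations), kernel_size=3, padding=1)

    def forward(self, x):
        w = F.softmax(self.att(x), dim=1)          # B x K x H x W scale weights
        out = 0
        for k, d in enumerate(self.dilations):
            y = F.conv2d(x, self.weight, padding=d, dilation=d)
            out = out + y * w[:, k:k + 1]          # blend the K dilation rates
        return out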

#deep_learning

snakers4 (Alexander), July 31, 05:47

2018 DS/ML digest 19

Market / data / libraries

(0) 32k CT lesion image dataset open-sourced

- goo.gl/CUQwnv

- nihcc.app.box.com/v/DeepLesion

(1) A new Distill article about Differentiable Image Parameterizations

- Usually images are parametrized as RGB values (normalized)

- Idea - use different (learnable) parametrization

- distill.pub/2018/differentiable-parameterizations/

- Parametrizing the resulting image with a Fourier transform makes it possible to use different architectures for style transfer distill.pub/2018/differentiable-parameterizations/#figure-style-transfer-diagram

- Working with transparent images

(2) Lip reading with 40% Word Error Rate arxiv.org/pdf/1807.05162.pdf

(3) Joint auto architecture + hyper-param search arxiv.org/pdf/1807.06906.pdf (*)

(4) rl-navigation.github.io/deployable/

(5) New CNN architectures from ICML www.facebook.com/icml.imls/videos/429607650887089/ (*)

(6) Jupyter notebook widget for text annotation github.com/natasha/ipyannotate

(7) A bit more debunking of auto-ml by fast.ai www.fast.ai/2018/07/23/auto-ml-3/

(8) A small intro to Bayes methods alexanderdyakonov.wordpress.com/2018/07/30/%d0%b1%d0%b0%d0%b9%d0%b5%d1%81%d0%be%d0%b2%d1%81%d0%ba%d0%b8%d0%b9-%d0%bf%d0%be%d0%b4%d1%85%d0%be%d0%b4/

(9) Criminal face recognition with 20% false positives - www.nytimes.com/2018/07/26/technology/amazon-aclu-facial-recognition-congress.html

(10) Denoising images without noiseless ground truth news.developer.nvidia.com/ai-can-now-fix-your-grainy-photos-by-only-looking-at-grainy-photos/?ncid=--45511

NLP

(0) Autoencoders for text habr.com/company/antiplagiat/blog/418173/ - no clear conclusion?

(1) RNN use cases overview indico.cern.ch/event/722319/contributions/3001310/attachments/1661268/2661638/IML-Sequence.pdf

(2) ACL 2018 notes ruder.io/acl-2018-highlights/

Hardware

(0) Edge embeddable TPU devices aiyprojects.withgoogle.com/edge-tpu ?

(1) GeForce 11* finally coming soon? Prices for 1080Ti are falling now...

#digest

#deep_learning

NIH Clinical Center releases dataset of 32,000 CT images

Lesion data may make it easier for scientific community to identify tumor growth or new disease


snakers4 (Alexander), July 31, 04:42

Airbus ship detection challenge

On the surface this looks like a challenging and interesting competition:

- www.kaggle.com/c/airbus-ship-detection

- Train / test sets - 14G / 12G

- Downside - Kaggle and very fragile metric

- Upside - a separate significant prize for fast algorithms!

- 768x768 images seem reasonable

#deep_learning

#data_science

Airbus Ship Detection Challenge

Find ships on satellite images as quickly as possible


snakers4 (Alexander), July 30, 05:48

The reality of human face recognition

There is a lot of hype related to the surveillance state / 1984 / Chinese offline cameras.

Cannot help but feature this amazing article from Russian engineers (RU):

habr.com/company/recognitor/blog/418127/

#deep_learning

The truth and lies of face recognition systems

Perhaps no other technology today is surrounded by so many myths, lies and incompetence. Journalists covering the technology lie, and...


snakers4 (Alexander), July 28, 05:13

New Keras version

github.com/keras-team/keras/releases/tag/2.2.1

No real major changes...

#deep_learning

keras-team/keras

Deep Learning for humans. Contribute to keras-team/keras development by creating an account on GitHub.


snakers4 (Alexander), July 27, 03:17

The truth about ML courses

cv-blog.ru/?p=238

#deep_learning

snakers4 (Alexander), July 23, 06:26

My post on open images stage 1

For posterity

Please comment

spark-in.me/post/playing-with-google-open-images

#deep_learning

#data_science

Solving class imbalance on Google open images

In this article I propose an approach to solving severe class imbalance on Google open images. Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), July 23, 05:15

2018 DS/ML digest 18

Highlights of the week

(0) RL flaws

thegradient.pub/why-rl-is-flawed/

thegradient.pub/how-to-fix-rl/

(1) An intro to AUTO-ML

www.fast.ai/2018/07/16/auto-ml2/

(2) Overview of advances in ML in last 12 months

www.stateof.ai/

Market / applied stuff / papers

(0) New Nvidia Jetson released

www.phoronix.com/scan.php?page=news_item&px=NVIDIA-Jetson-Xavier-Dev-Kit

(1) Medical CV project in Russia - 90% is data gathering

cv-blog.ru/?p=217

(2) Differentiable architecture search

arxiv.org/pdf/1806.09055.pdf

-- 1800 GPU days of reinforcement learning (RL) (Zoph et al., 2017)

-- 3150 GPU days of evolution (Real et al., 2018)

-- 4 GPU days to achieve SOTA on CIFAR => transferable to Imagenet with 26.9% top-1 error

(3) Some basic thoughts about hyper-param tuning

engineering.taboola.com/hitchhikers-guide-hyperparameter-tuning/

(4) FB extending fact checking to mark similar articles

www.poynter.org/news/rome-facebook-announces-new-strategies-combat-misinformation

(5) Architecture behind Alexa choosing skills goo.gl/dWmXZf

- Char-level RNN + Word-level RNN

- Shared encoder, but attention is personalized

(6) An overview of contemporary NLP techniques

medium.com/@ageitgey/natural-language-processing-is-fun-9a0bff37854e

(7) RNNs in particle physics?

indico.cern.ch/event/722319/contributions/3001310/attachments/1661268/2661638/IML-Sequence.pdf?utm_campaign=Revue%20newsletter&utm_medium=Newsletter&utm_source=NLP%20News

(8) Google cloud provides PyTorch images

twitter.com/i/web/status/1016515749517582338

NLP

(0) Use embeddings for positions - no brainer

twitter.com/i/web/status/1018789622103633921

(1) Chatbots were a hype train - lol

medium.com/swlh/chatbots-were-the-next-big-thing-what-happened-5fc49dd6fa61

The vast majority of bots are built using decision-tree logic, where the bot’s canned response relies on spotting specific keywords in the user input.

Interesting links

(0) Reasons to use OpenStreetMap

www.openstreetmap.org/user/jbelien/diary/44356

(1) Google deploys its internet balloons

goo.gl/d5cv6U

(2) Amazing problem solving

nevalalee.wordpress.com/2015/11/27/the-hotel-bathroom-puzzle/

(3) Nice flame thread about whether CS / ML is science or just engineering, etc

twitter.com/RandomlyWalking/status/1017899452378550273

#deep_learning

#data_science

#digest

RL’s foundational flaw

RL as classically formulated has lately accomplished many things - but that formulation is unlikely to tackle problems beyond games. Read on to see why!


snakers4 (Alexander), July 22, 08:55

Playing with open-images

Did a benchmark of multi-class classification models and approaches useful in general for multi-tier classifiers.

The basic idea is - follow the graph structure of class dependencies - train a good multi-class classifier => train coarse semseg models for each big cluster.

What worked

- Using SOTA classifiers from imagenet

- Pre-training with a frozen encoder (otherwise the model performs worse)

- Best performing architecture so far - ResNet152 (a couple of others to try as well)

- Different resolutions => bucket them => divide into 3 major aspect-ratio clusters (2:1, 1:2, 1:1)

- Using adaptive pooling for the different aspect-ratio clusters (a quick sketch below)
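A quick sketch of the adaptive-pooling point (sizes are hypothetical; nn.AdaptiveAvgPool2d squeezes any input resolution to a fixed grid, so one head serves all aspect-ratio clusters):

import torch
import torch.nn as nn

n_classes = 10                            # hypothetical
pool = nn.AdaptiveAvgPool2d(1)            # any H x W -> B x C x 1 x 1
head = nn.Linear(2048, n_classes)

# e.g. feature maps from a 2:1 and a 1:2 cluster have different shapes,
# but the same head works for both after adaptive pooling
for f in (torch.randn(2, 2048, 8, 16), torch.randn(2, 2048, 16, 8)):
    logits = head(pool(f).flatten(1))
    print(logits.shape)                   # torch.Size([2, 10]) either way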

What did not work or did not significantly improve results

- Oversampling

- Using modest or minor augs (10% or 25% of images augmented)

What did not work

- Using 1xN + Nx1 convolutions instead of pooling - too heavy

- Using some minimal avg. pooling (like 16x16), then using different 1xN + Nx1 convolutions for different clusters - performed mostly worse than just adaptive pooling

Yet to try

- Focal loss

- Oversampling + augs

#deep_learning

snakers4 (Alexander), July 21, 11:02

Found an amazing explanation of Python's super() here

stackoverflow.com/a/27134600

Understanding Python super() with __init__() methods

I'm trying to understand the use of super(). From the looks of it, both child classes can be created, just fine. I'm curious to know about the actual difference between the following 2 child clas...
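The one-line takeaway from that answer, for reference (Python 3 syntax):

class Base:
    def __init__(self):
        print('Base init')

class Child(Base):
    def __init__(self):
        super().__init__()   # follows the MRO, no hard-coded parent class
        print('Child init')

Child()                      # prints 'Base init', then 'Child init'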


Playing with focal loss for multi-class classification

Playing with this Loss

gist.github.com/snakers4/5739ade67e54230aba9bd8a468a3b7be

If anyone has a better option - please PM me / or comment in the gist.
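For reference, a minimal multi-class focal loss sketch (my own take on Lin et al., not the gist's exact code; gamma is the usual knob):

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # FL(p_t) = -(1 - p_t)^gamma * log(p_t), averaged over the batch
    log_p = F.log_softmax(logits, dim=1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # true-class log-prob
    pt = log_pt.exp()
    return (-(1 - pt) ** gamma * log_pt).mean()

logits = torch.randn(8, 5)                  # batch of 8, 5 classes
targets = torch.randint(0, 5, (8,))
print(focal_loss(logits, targets))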

#deep_learning

#data_science

Multi class classification focal loss

Multi class classification focal loss . GitHub Gist: instantly share code, notes, and snippets.


snakers4 (Alexander), July 21, 07:51

Yet another Kaggle competition with high prizes and a relatively easy task

www.kaggle.com/c/tgs-salt-identification-challenge

#deep_learning

TGS Salt Identification Challenge

Segment salt deposits beneath the Earth's surface


snakers4 (Alexander), July 18, 05:39

Lazy failsafe in PyTorch Data Loader

Sometimes you train a model and testing all the combinations of augmentations / keys / params in your dataloader is too difficult. Or the dataset is too large, so it would take some time to check it properly.

In such cases I usually used some kind of failsafe try/except.

But looks like an even simpler approach works:

if img is None:
    # do not return anything
    pass
else:
    return img
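In context, a minimal sketch (the paths list is a placeholder; since the default collate cannot handle Nones, I assume a custom collate_fn that silently drops them):

import cv2
import torch
from torch.utils.data import Dataset, DataLoader
from torch.utils.data.dataloader import default_collate

class LazyFailsafeDataset(Dataset):
    def __init__(self, paths):
        self.paths = paths                  # list of image file paths

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = cv2.imread(self.paths[idx])   # None for a broken / missing file
        if img is None:
            # do not return anything
            return None
        return torch.from_numpy(img).permute(2, 0, 1).float()

def skip_none_collate(batch):
    # drop the failed samples before collating
    return default_collate([b for b in batch if b is not None])

paths = ['img_0.jpg', 'img_1.jpg']          # hypothetical
loader = DataLoader(LazyFailsafeDataset(paths), batch_size=32,
                    collate_fn=skip_none_collate)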

#deep_learning

#pytorch

snakers4 (Alexander), July 17, 08:51

Colab SeedBank

- TF is everywhere (naturally) - but at least they use keras

- On the other hand - all of the files are (at least now) downloadable via .ipynb or .py

- So - it may be a good place to look for boilerplate code

Also some interesting facts that are not mentioned openly:

- Looks like they use Tesla K80s, which in practice are 2.5-3x slower than a 1080 Ti

(medium.com/initialized-capital/benchmarking-tensorflow-performance-and-cost-across-different-gpu-options-69bd85fe5d58)

- Full screen notebook format is clearly inspired by Jupyter plugins

- Ofc there is a time limit for GPU scripts and GPU availability is not guaranteed (reported by people who used it)

- Personally - it looks a bit like slow instances from FloydHub - time limitations / slow GPU etc/etc

In a nutshell - perfect source of boilerplate code + playground for new people.

#deep_learning

Benchmarking Tensorflow Performance and Cost Across Different GPU Options

Machine learning practitioners— from students to professionals — understand the value of moving their work to GPUs . Without one, certain…


snakers4 (Alexander), July 17, 08:32