Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1365 members, 1673 posts since 2016

All this - lost like tears in rain.

Data science, deep learning, sometimes a bit of philosophy and math. No bs.

Our website
Our chat
DS courses review

Posts by tag «digest»:

snakers4 (Alexander), January 15, 08:33

First 2019 DS / ML digest

No particular highlights - though maybe the ML industrialization trend is here to stay?




2019 DS/ML digest 01


snakers4 (Alexander), December 19, 08:16

DS/ML digest 32


- A way to replace softmax in NMT;

- Large visual reasoning dataset;

- PyText;




2018 DS/ML digest 32


snakers4 (Alexander), December 09, 07:59

DS/ML digest 31

Highlights of the week:

- PyTorch 1.0 released;

- Drawing with GANs;

- BERT explained;




2018 DS/ML digest 31


snakers4 (Alexander), November 28, 11:55

DS/ML digest 30




2018 DS/ML digest 30


snakers4 (Alexander), November 15, 08:09

DS/ML digest 29




2018 DS/ML digest 29


snakers4 (Alexander), November 06, 13:45

DS/ML digest 28

Google open sources pre-trained BERT ... with 102 languages ...




2018 DS/ML digest 28


snakers4 (Alexander), October 23, 06:28

DS/ML digest 27

NLP is in focus again!

Also your humble servant learned how to do proper NMT =)




2018 DS/ML digest 27


snakers4 (Alexander), October 15, 09:33

DS/ML digest 26

More interesting NLP papers / material ...




2018 DS/ML digest 26


snakers4 (Alexander), September 28, 11:12

DS/ML digest 25




2018 DS/ML digest 25


snakers4 (Alexander), September 20, 16:06

DS/ML digest 24

Key topics of this one:

- New method to calculate phrase/n-gram/sentence embeddings for rare and OOV words;

- So many releases from Google;

If you like our digests, you can support the channel via:

- Sharing / reposting;

- Giving an article a decent comment / a thumbs-up;

- Buying me a coffee (links on the digest);




2018 DS/ML digest 24


snakers4 (Alexander), September 06, 05:48

DS/ML digest 23

The key topic of this one - this is insanity

- vid2vid

- unsupervised NMT


Let's spread the right DS/ML ideas together.




2018 DS/ML digest 23


snakers4 (Alexander), August 31, 13:59

DS/ML digest 22




2018 DS/ML digest 22


snakers4 (Alexander), August 21, 13:31

2018 DS/ML digest 21




2018 DS/ML digest 21


snakers4 (Alexander), August 12, 11:15

2018 DS/ML digest 20




2018 DS/ML digest 20


snakers4 (Alexander), July 31, 05:47

2018 DS/ML digest 19

Market / data / libraries

(0) 32k lesions image dataset open-sourced



(1) A new Distill article about Differentiable Image Parameterizations

- Usually images are parametrized as RGB values (normalized)

- Idea - use different (learnable) parametrization


- Parametrizing the resulting image with a Fourier transform makes it possible to use different architectures with style transfer

- Working with transparent images

(2) Lip reading with 40% Word Error Rate

(3) Joint auto architecture + hyper param search (*)


(5) New CNN architectures from ICML (*)

(6) Jupyter notebook widget for text annotation

(7) A bit more debunking of auto-ml by

(8) A small intro to Bayes methods

(9) Criminal face recognition 20% false positives -

(10) Denoising images without noiseless ground-truth


(0) Autoencoders for text - no clear conclusion?

(1) RNN use cases overview

(2) ACL 2018 notes


(0) Edge embeddable TPU devices ?

(1) GeForce 11* finally coming soon? Prices for 1080Ti are falling now...



NIH Clinical Center releases dataset of 32,000 CT images

Lesion data may make it easier for scientific community to identify tumor growth or new disease

snakers4 (Alexander), July 23, 05:15

2018 DS/ML digest 18

Highlights of the week

(0) RL flaws

(1) An intro to AUTO-ML

(2) Overview of advances in ML in last 12 months

Market / applied stuff / papers

(0) New Nvidia Jetson released

(1) Medical CV project in Russia - 90% is data gathering

(2) Differentiable architecture search

-- 1800 GPU days of reinforcement learning (RL) (Zoph et al., 2017)

-- 3150 GPU days of evolution (Real et al., 2018)

-- 4 GPU days to achieve SOTA in CIFAR => transferable to ImageNet with 26.9% top-1 error

(3) Some basic thoughts about hyper-param tuning

(4) FB extending fact checking to mark similar articles

(5) Architecture behind Alexa choosing skills

- Char-level RNN + Word-level RNN

- Shared encoder, but attention is personalized

(6) An overview of contemporary NLP techniques

(7) RNNs in particle physics?

(8) Google cloud provides PyTorch images


(0) Use embeddings for positions - no brainer

(1) Chatbots were a hype train - lol

The vast majority of bots are built using decision-tree logic, where the bot’s canned response relies on spotting specific keywords in the user input.

Interesting links

(0) Reasons to use OpenStreetMap

(1) Google deploys its internet balloons

(2) Amazing problem solving

(3) Nice flame thread about CS / ML is not science / just engineering etc




RL’s foundational flaw

RL as classically formulated has lately accomplished many things - but that formulation is unlikely to tackle problems beyond games. Read on to see why!

snakers4 (spark_comment_bot), July 13, 05:22

2018 DS/ML digest 17

Highlights of the week

(0) Troubling trends with ML scholars

(1) NLP close to its ImageNet stage?

Papers / posts / articles

(0) Working with multi-modal data

- concatenation-based conditioning

- conditional biasing or scaling ("residual" connections)

- sigmoidal gating

- all in all this approach seems like a mixture of attention / gating for multi-modal problems
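
The three conditioning schemes listed above can be sketched in a few lines. This is a minimal pure-Python illustration and not code from the linked article; the function names and toy vectors are hypothetical, and in a real model the scale, shift and gate values would be predicted from the conditioning input by a learned layer.

```python
import math

def concat_conditioning(features, condition):
    """Concatenation-based conditioning: append the conditioning vector."""
    return list(features) + list(condition)

def scale_and_bias(features, gamma, beta):
    """Conditional scaling and biasing: a per-feature affine transform."""
    return [g * x + b for x, g, b in zip(features, gamma, beta)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoidal_gating(features, gate_logits):
    """Sigmoidal gating: multiply each feature by a gate in (0, 1)."""
    return [x * sigmoid(z) for x, z in zip(features, gate_logits)]

features = [1.0, -2.0, 0.5]  # made-up feature vector
print(concat_conditioning(features, [0.1, 0.9]))
print(scale_and_bias(features, gamma=[2.0, 1.0, 0.0], beta=[0.0, 1.0, 3.0]))
print(sigmoidal_gating(features, gate_logits=[0.0, 10.0, -10.0]))
```

Gating with a large negative logit switches a feature almost off, which is why this family of methods looks close to attention.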

(1) Glow, a reversible generative model which uses invertible 1x1 convolutions

(2) Facebook's moonshots - I kind of do not understand much here


(3) RL concept flaws?


(4) Intriguing failures of convolutions - this is fucking amazing

(5) People are only STARTING to apply ML to reasoning

Yet another online book on Deep Learning

(1) Kind of standard!/book/grokking-deep-learning/chapter-1/v-10/1

Libraries / code

(0) Data version control continues to develop




Like this post or have something to say => tell us more in the comments or donate!

Troubling Trends in Machine Learning Scholarship

By Zachary C. Lipton* & Jacob Steinhardt* *equal authorship Originally presented at ICML 2018: Machine

snakers4 (Alexander), July 04, 07:57

2018 DS/ML digest 15

What I filtered through this time

Market / news

(0) Letters by big company employees against using ML for weapons

- Microsoft

- Amazon

(1) Facebook open sources Dense Pose (essentially this is Mask-RCNN)


Papers / posts / NLP

(0) One more blog post about text / sentence embeddings

- key idea different weighting

(1) One more sentence embedding calculation method

- ?

(2) Posts explaining NLP embeddings

- - some basics - SVD / Word2Vec / GloVe

-- SVD improves embedding quality (as compared to one-hot encoding)?

-- use log-weighting, use TF-IDF weighting (the above weighting)

- - word embedding properties

-- dimensions vs. embedding quality
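
As a rough illustration of the log-weighting and TF-IDF weighting mentioned above, here is a minimal sketch. The exact formula used in the linked posts is not given here, so the variant below (log-scaled term frequency times log inverse document frequency) is an assumption, and the toy documents are made up.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = (1 + log(tf)) * log(N / df): log-scaled term frequency
    times inverse document frequency (one common variant, assumed here).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (1.0 + math.log(count)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

docs = [["cat", "sat", "cat"], ["dog", "sat"], ["cat", "dog", "bird"]]
for doc_weights in tfidf(docs):
    print(doc_weights)
```

Terms that occur in every document get a weight of zero, while rare terms are boosted.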

(3) Spacy + Cython = 100x speed boost - - good to know about this as a last resort

- described use-case

you are pre-processing a large training set for a DeepLearning framework like pyTorch/TensorFlow

or you have a heavy processing logic in your DeepLearning batch loader that slows down your training

(4) Once again stumbled upon this -

(5) Papers

- Simple NLP embedding baseline

- NLP decathlon for question answering

- Debiasing embeddings

- Once again transfer learning in NLP by OpenAI -





snakers4 (Alexander), July 02, 04:51

2018 DS/ML digest 14

Amazing article - why you do not need ML


- I personally love plain-vanilla SQL and in 90% of cases people under-use it

- I even wrote 90% of my JSON API on our blog in pure PostgreSQL xD

Practice / papers

(0) Interesting papers from CVPR

(1) Some down-to-earth obstacles to ML deployment

(2) Using synthetic data for CNNs (by Nvidia) -

(3) This puzzles me - so much effort and engineering spent on something ... strange and useless -

On paper they do a cool thing - investigate transfer learning between different domains, but in practice it is done on TF and there is no clear conclusion of any kind

(4) VAE + real datasets - only small Imagenet (64x64)

(5) Understanding the speed of models deployed on mobile -

(6) A brief overview of multi-modal methods

Visualizations / explanations

(0) Amazing website with ML explanations

(1) PCA and linear VAEs are close




No, you don't need ML/AI. You need SQL

A while ago, I did a Twitter thread about the need to use traditional and existing tools to solve everyday business problems other than jumping on new buzzwords, sexy and often times complicated technologies.

snakers4 (Alexander), June 28, 07:43

2018 DS/ML digest 13

Blog posts / articles:

(0) Google notes on CNN generalization -

(1) Google on teaching robots in a virtual environment and then transferring models to reality -

(2) Google's object tracking via image colorization -

(3) Interesting articles about VAEs:

- A small intro into VAEs

- A small intuitive intro (super super cool and intuitive)

- KL divergence explained

- A more formal write-up

- In (RU)

- Converting a FC layer into a conv layer

- A post by Fchollet

A good in-depth write-up on object detection:


- finally a decent explanation of YOLO parametrization

- best comparison of YOLO and SSD ever -

Papers with interesting abstracts (just good to know such things exist)

- Low-bit CNNs -

- Automated Meta ML -

- Idea - use ResNet blocks for boosting -

- 2D-discrete-Fourier transform (2D-DFT) to encode rotational invariance in neural networks -

- Smallify the CNNs -

- BLEU review as a metric - conclusion - it is good on average to measure MT performance -

"New" ideas in SemSeg:

- UNET + conditional VAE

- Dilated convolutions for large satellite images - it looks like this works only if you have high resolution with small objects



How Can Neural Network Similarity Help Us Understand Training and Generalization?

Posted by Maithra Raghu, Google Brain Team and Ari S. Morcos, DeepMind In order to solve tasks, deep neural networks (DNNs) progressively...

snakers4 (Alexander), June 23, 12:10

Interesting links about Internet

- Ben Evans' digest -

- China plans to track cars -

- Ben Evans - content is not king anymore - distribution / eco-system are

- Google opens AI center in Ghana -

- (RU) A funny case on censorship in Russia - funny article deleted from habr -

-- It kind of clearly shows that you cannot safely post anything to habr

- India + WhatsApp + lynch mobs -

- Tor foundation about web-tracking and Facebook -

- Docker image jacking for crypto-mining -

- Ethereum - 75% transactions automated bots -

- (RU) - analyzing fake elections in Russia - 3-10M votes are fake -


2018 DS/ML digest 12

As usual, this is whatever I found really interesting / worth reading.

Implementations / papers / ideas


You can count bees well with UNet -


A really super cool idea - use affine transformations in 3D to stack augmentations on the level of transformation matrices

(3D augs are costly)


- both for rotation and scaling

- note a couple of things for easier understanding:

-- there is an offset in transformations - because the coordinate center is not in "center"

-- zoom essentially scales unit vectors after applying the offset

- 3Blue1Brown videos about linear algebra -
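
The idea of stacking augmentations at the level of transformation matrices can be sketched as follows. This is a minimal pure-Python illustration with 4x4 homogeneous matrices, not the code from the linked post; the helper names and the 64^3 volume size are assumptions.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(dx, dy, dz):
    return [[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]]

def zoom(factor):
    return [[factor, 0, 0, 0], [0, factor, 0, 0],
            [0, 0, factor, 0], [0, 0, 0, 1]]

def rotation_z(angle_rad):
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def about_center(transform, center):
    """Shift the origin to `center`, apply `transform`, then shift back."""
    cx, cy, cz = center
    return matmul(translation(cx, cy, cz),
                  matmul(transform, translation(-cx, -cy, -cz)))

def transform_point(matrix, point):
    vec = [point[0], point[1], point[2], 1]
    return tuple(sum(matrix[i][k] * vec[k] for k in range(4))
                 for i in range(3))

center = (32, 32, 32)  # hypothetical center of a 64^3 volume
# Rotation and zoom are stacked into ONE matrix, so a costly 3D
# resampling step would only have to run once.
combined = about_center(matmul(rotation_z(math.pi / 2), zoom(2.0)), center)
print(transform_point(combined, center))        # the center stays in place
print(transform_point(combined, (33, 32, 32)))  # a neighbour is rotated and scaled
```

The offset mentioned above is the pair of translations in `about_center`: without it, rotation and zoom would act around the array corner rather than the middle of the volume.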


A top solution from Google's Landmark Challenge -


- ensemble of features / skip connections from a CNN (ResNeXt)


- use KNN + augment the extracted features by averaging with similar images

- query expansion (use the fact that different crops of the same landmark remain the same landmark)
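
A minimal sketch of the nearest-neighbour feature averaging / query expansion step, under stated assumptions: the descriptors are toy 2-d vectors, cosine similarity is used for ranking, and all names are hypothetical. It is not the code of the actual solution.

```python
import math

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine(a, b):
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def top_k(query, database, k):
    """Indices of the k database descriptors most similar to the query."""
    ranked = sorted(range(len(database)),
                    key=lambda i: cosine(query, database[i]), reverse=True)
    return ranked[:k]

def expand_query(query, database, k=2):
    """Average the query with its k nearest neighbours, then re-normalize."""
    neighbours = [database[i] for i in top_k(query, database, k)]
    vectors = [query] + neighbours
    mean = [sum(col) / len(vectors) for col in zip(*vectors)]
    return normalize(mean)

database = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy descriptors
query = [0.8, 0.3]
print(top_k(query, database, k=2))
print(expand_query(query, database, k=2))
```

The expanded descriptor is then used for a second search, which tends to pull in more views of the same landmark.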


(RU) A super cool series about interesting clustering algorithms

- Affinity propagation




- (spoiler - in practice use awesome HDBSCAN library)


Brief review of image super-resolution techniques


- In a nutshell try in this order FCN CNNs, auto-encoders with skip connections or GANs


SOTA NLP by OpenAI

Key ideas

- Train a transformer language model on a large corpus in an unsupervised way

- Fine-tune on a smaller task

- Profit


- "Our approach requires an expensive pre-training step - 1 month on 8 GPUs" (probably this should be discounted somewhat)

- TF and unreadable enterprise code


One more claimed SOTA word embedding set


A cool github page by Sebastian Ruder to track major NLP tasks



Amazing visual explanations of how decision trees work


- it explains visually how overfitting occurs in decision tree models


CIFAR T-SNE can be done in real-time on the GPU + tensorflow.js integration

- Blog

- Website

- Arxiv -

- Demo -

(2) Why people fail to use d3.js -


(0) Nice idea - use available tools and videos to collect datasets




snakers4 (Alexander), June 12, 10:48

Interesting links about Internet

- Ben Evans' digest -

- Why it took so much time to create previews for Wikipedia -

- Google postulating its AI principles?

- Google product alternatives - - I personally started to switch to more open-source stuff lately, but Docs and Android have no real options

- The future of ML in embedded devices - (sound ideas, but a post is by an evangelist)

- Yahoo messenger shutting down (20 years!) - - hi ICQ

- Microsoft Buys GitHub for $7.5 Billion - a16z write-up -

- NYC medallions dropped 5x in price -

- JD covers villages in China with drone delivery already -


snakers4 (spark_comment_bot), June 06, 07:55

2018 DS/ML digest 11



New Andrew Ng paper on radiology datasets

YouTube 8M Dataset post

As mentioned before - this is more or less blatant TF marketing

New papers / models / architectures

(0) Google RL search for optimal augmentations

- Blog, paper

- Finally Google paid attention to augmentations

- 83.54% top1 accuracy on ImageNet

- Discrete search problem: each policy consists of 5 sub-policies, with each operation associated with two hyperparameters: probability and magnitude

- Training regime cosine decay for 200 epochs

- Top accuracy on ImageNet

- Best policy

- Typical examples of augmentations
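
The policy structure described above (5 sub-policies, each operation with a probability and a magnitude) can be sketched like this. It is a toy illustration only: the "image" is a list of pixel intensities, the two operations are stand-ins, every number in `POLICY` is made up, and the two-operations-per-sub-policy layout is an assumption not stated in the notes above.

```python
import random

# Toy "image": a list of pixel intensities in [0, 255].
def brighten(pixels, magnitude):
    return [min(255, p + 10 * magnitude) for p in pixels]

def invert(pixels, magnitude):
    return [255 - p for p in pixels]  # magnitude is unused for this op

OPS = {"brighten": brighten, "invert": invert}

# A policy is a list of sub-policies; each sub-policy is a sequence of
# (operation name, probability, magnitude) triples. Values are made up.
POLICY = [
    [("brighten", 0.8, 3), ("invert", 0.1, 0)],
    [("invert", 0.5, 0), ("brighten", 0.4, 5)],
    [("brighten", 0.2, 9), ("brighten", 0.6, 1)],
    [("invert", 0.3, 0), ("invert", 0.3, 0)],
    [("brighten", 1.0, 2), ("invert", 0.0, 0)],
]

def apply_policy(pixels, policy, rng):
    """Pick one sub-policy at random, then apply each op with its probability."""
    sub_policy = rng.choice(policy)
    for name, probability, magnitude in sub_policy:
        if rng.random() < probability:
            pixels = OPS[name](pixels, magnitude)
    return pixels

rng = random.Random(0)  # seeded only to make the sketch repeatable
print(apply_policy([0, 100, 250], POLICY, rng))
```

Because the choices are discrete (which operation, which probability bucket, which magnitude bucket), finding a good policy becomes a discrete search problem.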


Training CNNs with less data

Key idea - with clever selection of data you can decrease annotation costs 2-3x


Regularized Evolution for Image Classifier Architecture Search (AmoebaNet)

- The first controlled comparison of the two search algorithms (genetic and RL)

- Mobile-size ImageNet (top-1 accuracy = 75.1% with 5.1M parameters)

- ImageNet (top-1 accuracy = 83.1%)

Evolution vs. RL at Large-Compute Scale

• Evolution and RL do equally well on accuracy

• Both are significantly better than Random Search

• Evolution is faster

But the proper description of the architecture is nowhere to be seen...

Libraries / code / frameworks

(0) OpenCV installation for Ubuntu18 from source (if you need e.g. video support)

News / market

(0) Idea adversarial filters for apps -

(1) A list of 30 best practices for amateur ML / DL specialists -

- Some ideas about tackling naive NLP problems

- PyTorch allegedly supports just freezing bn layers

- Also a neat idea I tried with inception nets - assign different learning rates to larger models when fine-tuning them

(2) Stumbled upon a reference on NAdam as optimizer as being a bit better than Adam

It is also described in this popular article

(3) Barcode reader via OpenCV




snakers4 (Alexander), June 05, 14:42

A very useful combination in tmux

You can resize your panes via pressing

- first ctrl+b

- hold ctrl

- press arrow keys several times while holding ctrl


- profit



Digest about Internet

(0) Ben Evans Internet digest -

(1) GitHub purchased by Microsoft -

-- If you want to migrate - there are guides already -

(2) And a post on how Microsoft kind of ruined Skype -

-- focus on b2b

-- lack of focus, constant redesigns, faltering service

(3) No drop in FB usage after its controversies -

(4) Facebook allegedly employs 1200 moderators for Germany -

(5) Looks like many Linux networking tools have been outdated for years



snakers4 (spark_comment_bot), May 21, 06:21

2018 DS/ML digest 11

Cool thing this week

(0) ML vs. compute study since 2012 - chart / link


(0) Once again about Google Duplex

(1) Google announcements from Google IO

-- Email autocomplete

We encode the subject and previous email by averaging the word embeddings in each field. We then join those averaged embeddings, and feed them to the target sequence RNN-LM at every decoding step, as the model diagram below shows.

-- Learning Semantic Textual Similarity from Conversations blog, paper. Something along the lines of Sentence2Vec, but for conversations, self-supervised, uses attention and embedding averaging

-- Google Clips device + interesting moment estimation on the device. Looks like MobileNet distillation into a small network with some linear models on top
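
The email autocomplete quote above (average the word embeddings per field, join the averages, feed them to the language model at every decoding step) can be sketched as follows. The embedding table, token lists and function names are hypothetical, and the RNN-LM itself is left out; only the construction of its inputs is shown.

```python
def average(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def encode_context(embeddings, subject_tokens, previous_email_tokens):
    """Average word embeddings per field, then join the two averages."""
    subject = average([embeddings[t] for t in subject_tokens])
    previous = average([embeddings[t] for t in previous_email_tokens])
    return subject + previous  # list concatenation

def decoder_inputs(embeddings, context, target_tokens):
    """Language model input at each step: token embedding + fixed context."""
    return [embeddings[t] + context for t in target_tokens]

# Toy 2-d embedding table; all values are made up.
embeddings = {"lunch": [1.0, 0.0], "friday": [0.0, 1.0],
              "see": [0.5, 0.5], "you": [1.0, 1.0]}
context = encode_context(embeddings, ["lunch", "friday"], ["see", "you"])
print(context)
for step_input in decoder_inputs(embeddings, context, ["see", "you"]):
    print(step_input)
```

The context vector has a fixed size regardless of how long the subject or the previous email is, which keeps the per-step cost of decoding constant.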

Libraries / tools / papers

(0) SaaS NLP annotation tool

(1) CNNs allegedly can reconstruct low light images? Blog, paper, Looks cool AF

(2) Cool thing to try in a new project - a postgres RESTful API wrapper - such things require a lot of care, but can eliminate a lot of useless work for small projects.

For my blog I had to write a simple business tier layer myself. I doubt that I could use this w/o overengineering because, for example, I constructed open-graph tags in SQL queries

Job / job market

(0) (RU) Realistic IT immigration story


(0) Last week open images dataset was updated. I downloaded the small one for the sake of images. Though the download process itself is a bit murky





snakers4 (spark_comment_bot), May 13, 11:25

2018 DS/ML digest 10


(0) Some moonshots by Google in working with electronic health records

(1) Google duplex - a narrow domain bot that makes calls for you

(2) Nature wants to make its ML journal ... paid

(3) Stanford DAWNBench - training ImageNet encoders as quickly and cheaply as possible

(4) Facebook achieves 85% on ImageNet by training on 1bn images on 336 GPUs in a week

(5) Learning the models of the surrounding world based on a DOOM like game

Practice / libraries / code

(0) A smarter and new way to ensemble CNNs

- Traditional approach - ensemble CNNs with different architectures - and just vote / average / apply linear regression on top

- Newer approach - use Cyclic Learning rate

- Even newer approach - model snapshot ensembling

- Stochastic Weight Averaging

-- store running average of the models

-- train one model with CLR

-- at the end of each lr update (or epoch) - do a running average of the models with some weights

-- the gist of the method is located on this line

-- I do understand why they update the batch-norm params, but I do not understand why it cannot be done by just running 1 training epoch

- Papers on CNN ensembling 1 2 3
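
The running-average step of Stochastic Weight Averaging described above can be sketched without any framework. This is a minimal illustration, assuming weights are stored as plain dicts of float lists and snapshots are taken with equal weight; the parameter names and numbers are made up, and it is not the line from the linked implementation.

```python
def update_running_average(average, weights, n_averaged):
    """Fold a new weight snapshot into the running average.

    `average` and `weights` map parameter names to lists of floats;
    `n_averaged` is how many snapshots `average` already contains.
    """
    return {
        name: [(a * n_averaged + w) / (n_averaged + 1)
               for a, w in zip(average[name], weights[name])]
        for name in weights
    }

# Hypothetical snapshots taken at the end of each learning-rate cycle.
snapshots = [
    {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]},
    {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]},
    {"layer.weight": [5.0, 0.0], "layer.bias": [2.0]},
]

swa = snapshots[0]
for n, snapshot in enumerate(snapshots[1:], start=1):
    swa = update_running_average(swa, snapshot, n)
print(swa)
```

Only one extra copy of the weights is kept, unlike snapshot ensembling where every snapshot has to be stored and run at inference time. The batch-norm statistics of the averaged model are not covered by this sketch and would have to be recomputed separately.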

(1) (RU) Small amount of technical details, but face-detection + face hashing works in retail (+human operator) given an HD camera

(2) (RU) Pose estimation

(3) Numpy autograd

"New" papers worth mentioning

(0) SqueezeNext

- Module comparison

- Key changes

(i) more aggressive channel reduction by incorporating a two-stage squeeze module

(ii) separable 3 × 3 convolutions

(iii) element-wise addition skip connection similar to ResNet

- Performance

(1) GANs to generate full-body anime characters in different poses


(0) (does not work in Firefox) Visualizing encoder-decoder networks for translation





Deep Learning for Electronic Health Records

Posted by Alvin Rajkomar MD, Research Scientist and Eyal Oren PhD, Product Manager, Google AI When patients get admitted to a hospital, th...

snakers4 (Alexander), May 10, 05:39


Interesting links about Internet

(0) Ben Evans

Russia / CIS

(0) Telegram has a new proxy setting in alpha, though no proper stand-alone solutions are published

(1) Western media now cover Telegram

Global / tech

(0) Xiaomi to file for an IPO - US$10 - US$100bn

(1) Yet another drag and drop ML that will (m?) fail - - this is so American

(2) Now all "major" apps heavily feature "stories" as main mobile format -

Yet another reason to quit all social media and just use professional apps / messaging

Add up all this bs => this is the reason normal people do not use social media for real now

(3) Tesla most shorted tech company now - xD


(0) YouTube - 1.8bn users with 1+ login

(1) WhatsApp m70bn messages per day (vs. 20bn max with SMS)



snakers4 (Alexander), May 01, 16:52

2018 DS/ML digest 9

Market / libraries

(0) Tensorflow + Swift - wtf -

(1) Geektimes / going international -

(2) A service for renting GPUs ... from people

- Reddit

- Link

- Looks LXC based (afaik - the only user friendly alternative to Docker)

- Cool in theory, no idea how secure this is - we can assume it is as secure as providing a docker container to a stranger

- They did not reply to me within a week

(3) A friend sent me a list of ... yet more new PyTorch NLP libraries

- (AllenNLP is the biggest library like this)

- I believe that such libraries are more or less useless for real tasks, but cool to know they exist

(4) New SpaceNet 4?

(5) A new super cool competition on Kaggle about particle physics?

Tutorials / basics

(0) Bias vs. Variance (RU)

(1) Yet another magic Jupyter guideline collection -

Real world ML applications

(0) Resnet + object detection (RU) - people without helmets 90% accuracy -

(1) about using embeddings with Tabular data -

Very similar to our approach on electricity

I personally do not recommend using their library by any means

(2) Comparing Google TPU vs. V100 with ResNet50 -

- speed -

- pricing -

- but ... buying GPUs is much cheaper

(3) Other blog posts about embeddings + tabular data

- Sales prediction

- Taxi drive prediction

MLP + classification + embeddings - /

(4) Albu's solution to SpaceNet - augmentations

CNN overview

Neural network part:

Split data to 4 folds randomly but the same number of each city tiles in every fold

Use resnet34 as encoder and unet-like decoder (conv-relu-upsample-conv-relu) with skip connection from every layer of network. Loss function: 0.8*binary_cross_entropy + 0.2*(1 – dice_coeff). Optimizer – Adam with default params.

Train on image crops 512*512 with batch size 11 for 30 epochs (8 times more images in one epoch)

Train 20 epochs with lr 1e-4

Train 5 epochs with lr 2e-5

Train 5 epochs with lr 4e-6

Predict on full image with padding 22 on borders (1344*1344).

Merge folds by mean
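
The loss quoted in the write-up above, 0.8*binary_cross_entropy + 0.2*(1 - dice_coeff), can be sketched in plain Python. This is a minimal illustration over flattened per-pixel probabilities, not the original training code; the clipping constant `eps` and the `smooth` term in the Dice coefficient are assumptions added for numerical stability.

```python
import math

def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean BCE over flattened per-pixel probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)

def dice_coeff(probs, targets, smooth=1.0):
    """Soft Dice coefficient; `smooth` is an assumed stabilizer."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    return (2.0 * intersection + smooth) / (sum(probs) + sum(targets) + smooth)

def combined_loss(probs, targets):
    """0.8 * BCE + 0.2 * (1 - Dice)."""
    return (0.8 * binary_cross_entropy(probs, targets)
            + 0.2 * (1.0 - dice_coeff(probs, targets)))

perfect = combined_loss([1.0, 0.0, 1.0], [1, 0, 1])
poor = combined_loss([0.1, 0.9, 0.2], [1, 0, 1])
print(perfect, poor)
```

The BCE term gives a per-pixel gradient, while the Dice term rewards overlap of the whole mask, which helps when building pixels are a small share of the tile.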

Jobs / job market

(0) Developers by country by scraping GitHub -

- developers count vs. GDP R^2 = 84%

- developers count vs. population - R^2 = 50%


(0) Interactive tool for visualizing convolutions -


(0) Open Images v4 open-sourced


- the dataset itself

- categories





swift - Swift for TensorFlow documentation repository.

snakers4 (Alexander), April 15, 08:06

2018 DS/ML digest 8

As usual my short bi-weekly (or less) digest of everything that passed my BS detector

Market / blog posts

(0) about the importance of accessibility in ML -

(1) Some interesting news about market, mostly self-driving cars (the rest is crap) -

(2) US$600m investment into Chinese face recognition -

Libraries / frameworks / tools

(0) New 5 point face detector in Dlib for face alignment task -

(1) Finally a more proper comparison of XGB / LightGBM / CatBoost - (also see my thoughts here //

(3) CNNs on FPGAs by ZFTurbo



(4) Data version control - looks cool



-- but I will not use it - because proper logging and treating data as immutable solves the issue

-- looks like over-engineering for the sake of overengineering (unless you create 100500 datasets per day)


(0) TF Playground to see how the simplest CNNs work -


(0) Looks like GAN + ResNet + Unet + content loss - can easily solve simpler tasks like deblurring

(1) You can apply dilated convolutions to NLP tasks -

(2) High level overview of face detection in -

(3) Alternatives to DWT and Mask-RCNN / RetinaNet?

- Has anybody tried anything here?


(0) A more disciplined approach to training CNNs - (LR regime, hyper param fitting etc)

(1) GANs for image compression -

(2) Paper reviews from ODS - mostly moonshots, but some are interesting



(3) SqueezeNext - the new SqueezeNet -




snakers4 (Alexander), April 07, 11:52

Internet digest

- Ben Evans -

- About autonomous cars - - autonomy will vary based on the route / conditions / situation / use case

- FB delays its speaker -

- Foxconn buys Belkin

- Amazon music > 10m subs -

- The Economist about ML in business -

- Apple to make its own chips -