Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1326 members, 1561 posts since 2016

All this - lost like tears in rain.

Data science, deep learning, sometimes a bit of philosophy and math. No bs.

Our website
Our chat
DS courses review

snakers4 (Alexander), August 17, 11:45

Found all of IPython's rich display capabilities in one place

Notebook on nbviewer

Check out this Jupyter notebook!

snakers4 (Alexander), August 16, 06:00

Google updates its transformer


Moving Beyond Translation with the Universal Transformer

Posted by Stephan Gouws, Research Scientist, Google Brain Team and Mostafa Dehghani, University of Amsterdam PhD student and Google Research...

snakers4 (Alexander), August 15, 02:18

NVIDIA's AI Makes Amazing Slow-Mo Videos
The paper "Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation" is available here:

snakers4 (Alexander), August 13, 11:04

Yet another crowd GPU rent service?

Create Instance | Console

Search available instances, configure launch settings, create instances

snakers4 (Alexander), August 13, 05:24

Float16 / half training in PyTorch

Tried to do it in the most obvious way + some hacks from here

Did anybody do it successfully on real models?


Resnet18 throws exception on conversion to half floats

Hey, I have tried to launch the following code:

from torchvision import models
resnet = models.resnet18(pretrained=True).cpu()
resnet.half()

and have got an exception:

libc++abi.dylib: terminating with uncaught exception of type std::invalid_argument: Unsupported tensor type

Sounds like the halftensor type is not registered properly. But not sure why it’s the case. Using pytorch 0.2.0 py36_1cu75 soumith. Any advice how to fix it?

snakers4 (Alexander), August 13, 04:34

Untar all the archives in the folder, deleting them

find . -name '*.tar' -execdir tar -xvf '{}' \; -execdir rm '{}' \;
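Where a shell one-liner is inconvenient, roughly the same idea can be sketched in Python with the standard library. This is a minimal sketch, not an exact equivalent of the find command: the function name untar_and_delete is hypothetical, and an archive is deleted only if its extraction raised no exception.

```python
import tarfile
from pathlib import Path


def untar_and_delete(root="."):
    """Extract every *.tar under root next to the archive, then delete the archive."""
    extracted = []
    # Build the full list first so files created by extraction are not re-scanned.
    for archive in sorted(Path(root).rglob("*.tar")):
        with tarfile.open(archive) as tar:
            if hasattr(tarfile, "data_filter"):
                # Safer extraction on Python versions that support filters.
                tar.extractall(path=archive.parent, filter="data")
            else:
                tar.extractall(path=archive.parent)
        # Reached only if extraction raised no exception.
        archive.unlink()
        extracted.append(archive)
    return extracted
```

Calling untar_and_delete(".") walks the current folder recursively, like find does.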


snakers4 (Alexander), August 12, 11:15

2018 DS/ML digest 20




2018 DS/ML digest 20. Author's articles - Blog

snakers4 (Alexander), August 12, 06:32

A small post on faster distance calculation

- Post

- Medium

Please support, if you like the posts!

Speeding up word distance calculation 100x

My plain approach to faster and more practical cosine distance calculation. Author's articles - Blog

snakers4 (Alexander), August 12, 05:21

New publication format

Decided to try a new, more streamlined, fast and automated approach to publishing slightly longer posts.

(0) Write a note in md format

(1) Transform to HTML automatically => post on

(2) Repost to medium via automated import

(3) Repost to Reddit / (if they start accepting English articles) via md


(4) Profit - 4 publications at the cost and time of one)

Decided to start by porting the 2 latest articles to Medium



Please tell me what you think in the comments!

Also, md can be converted to almost any format using pandoc)
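The md-to-anything step can be scripted. Below is a minimal sketch, assuming pandoc is installed and on PATH; the helper names and the file names post.md / post.html are hypothetical, and only pandoc's standard -f, -t and -o options are used.

```python
import shutil
import subprocess


def pandoc_command(src, dst, to_format):
    """Build a pandoc command line converting a markdown file to another format."""
    return ["pandoc", src, "-f", "markdown", "-t", to_format, "-o", dst]


def convert(src, dst, to_format):
    """Run pandoc if it is installed; return True if the conversion ran."""
    if shutil.which("pandoc") is None:
        return False  # pandoc is not on PATH, nothing to do
    subprocess.run(pandoc_command(src, dst, to_format), check=True)
    return True
```

For example, convert("post.md", "post.html", "html") would cover step (1) above.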


Playing with Crowd-AI mapping challenge — or how to improve your CNN performance with self-supervised techniques

Originally published at on July 15, 2018.

snakers4 (Alexander), August 12, 05:01

Pre-trained ShuffleNet on PyTorch, if anybody needs

Forwarded from Just links:


ShuffleNetV2-pytorch - Implementation of ShuffleNetV2 for pytorch

snakers4 (Alexander), August 10, 11:39

Using numba

Looks like ... it just works when it works.

For example, this cosine distance calculation function runs about 10x faster.

import numba
import numpy as np

# CPU is numba's default target, so it is not passed explicitly
@numba.jit(nopython=True)
def fast_cosine(u, v):
    m = u.shape[0]
    udotv = 0.0
    u_norm = 0.0
    v_norm = 0.0
    for i in range(m):
        # skip positions where either vector has a missing value
        if (np.isnan(u[i])) or (np.isnan(v[i])):
            continue
        udotv += u[i] * v[i]
        u_norm += u[i] * u[i]
        v_norm += v[i] * v[i]
    u_norm = np.sqrt(u_norm)
    v_norm = np.sqrt(v_norm)
    # a zero vector has no direction; treat the distance as 0
    if (u_norm == 0) or (v_norm == 0):
        ratio = 1.0
    else:
        ratio = udotv / (u_norm * v_norm)
    return 1 - ratio

Also, it looks like they recently got support from NumFOCUS


Sponsored Projects | pandas, NumPy, Matplotlib, Jupyter, + more - NumFOCUS

Explore NumFOCUS Sponsored Projects, including: pandas, NumPy, Matplotlib, Jupyter, rOpenSci, Julia, Bokeh, PyMC3, Stan, nteract, SymPy, FEniCS, PyTables...
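When jitting a function like fast_cosine, it helps to keep a slow but obvious reference to check results against on small inputs. A dependency-free sketch, assuming the same conventions as the function above (NaN positions are skipped, a zero vector gives distance 0); the name cosine_distance_reference is hypothetical.

```python
import math


def cosine_distance_reference(u, v):
    """Plain-Python cosine distance that skips positions where either value is NaN."""
    udotv = u_sq = v_sq = 0.0
    for a, b in zip(u, v):
        if math.isnan(a) or math.isnan(b):
            continue
        udotv += a * b
        u_sq += a * a
        v_sq += b * b
    if u_sq == 0.0 or v_sq == 0.0:
        return 0.0  # same convention as fast_cosine: ratio = 1.0, so distance 0
    return 1.0 - udotv / (math.sqrt(u_sq) * math.sqrt(v_sq))
```

Comparing both functions on a handful of random vectors (with a small tolerance) is a cheap sanity check before trusting the faster version.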

snakers4 (Alexander), August 09, 15:10

Thanks to everybody who used our DO affiliate links!

They finally started to vest)

Affiliate links:

There were a couple of simplistic guides:

- Socks5

- OpenVPN


DigitalOcean: Cloud Computing, Simplicity at Scale

Providing developers and businesses a reliable, easy-to-use cloud computing platform of virtual servers (Droplets), object storage ( Spaces), and more.

Amazing image examples from open images dataset

Forwarded from Just links:

snakers4 (Alexander), August 08, 02:57

New android

Neural Networks API 1.1

Android 9 adds an updated version of the Neural networks API, to extend Android's support for accelerated on-device machine learning. Neural Networks 1.1 adds support for nine new ops -- Pad, BatchToSpaceND, SpaceToBatchND, Transpose, Strided Slice, Mean, Div, Sub, and Squeeze. A typical way to take advantage of the APIs is through TensorFlow Lite.

Introducing Android 9 Pie

After more than a year of development and months of testing by early adopters, we're ready to launch Android 9 Pie, the latest release of Android, to the world. Android 9 harnesses the power of machine learning to make your phone smarter, simpler, and tailored to you. Read all about the new consumer features here. For developers, Android 9 includes many new ways to enhance your apps and build new experiences to drive engagement.

snakers4 (Alexander), August 07, 05:51

Proper commits / using git

If you follow this guide and do proper commits

Then you will get this on GitHub as a reward

[Insert comment here about Microsoft's acquisition of Github]


snakers4 (Alexander), August 07, 04:04


Wrote a couple of posts about UMAP before.

Since last time, they extended their docs and published a paper:

- How it works (topology) - I kind of understand 50% of this

- Paper (have not read yet)

What I really like about the UMAP author - he answers questions on the forums / invested a lot of time into explaining how UMAP and HDBSCAN work / built stellar docs and is overall a nice guy.

What I really like in practice - this combination works really well:




umap - Uniform Manifold Approximation and Projection

snakers4 (Alexander), August 07, 01:27

DeepMind Has A Superhuman Level Quake 3 AI Team
The paper "Human-level performance in first-person multiplayer games with pop...

snakers4 (Alexander), August 06, 10:44

NLP - naive preprocessing

A friend has sent me a couple of gists



Useful boilerplate



GitHub is where people build software. More than 28 million people use GitHub to discover, fork, and contribute to over 85 million projects.

snakers4 (Alexander), August 06, 05:16

snakers4 (Alexander), August 03, 02:23

snakers4 (Alexander), August 01, 18:05

Finally found a decent python module building guide


The Definitive Guide to Python import Statements | Chris Yeh

Stanford University, Class of 2018

snakers4 (Alexander), July 31, 18:33

Autofocus for semseg?

I have not seen people for whom DeepLab worked...and in my tests dilated convolutions were the same...though some claim they help with high-res images with small objects...


(0) Autofocus layer, a novel module that enhances the multi-scale processing of CNNs by learning to select the ‘appropriate’ scale for identifying different objects in an image

(1) Layer description

(2) Implementation

I believe this will work best for 3D images


snakers4 (Alexander), July 31, 05:53

Yet another python tricks book


Python Training by Dan Bader –

Dan Bader helps Python developers become more awesome. His tutorials, videos, and trainings have reached over half a million developers around the world.

snakers4 (Alexander), July 31, 05:47

2018 DS/ML digest 19

Market / data / libraries

(0) 32k lesions image dataset open-sourced



(1) A new Distill article about Differentiable Image Parameterizations

- Usually images are parametrized as RGB values (normalized)

- Idea - use different (learnable) parametrization


- Parametrizing the resulting image with a Fourier transform makes it possible to use different architectures for style transfer

- Working with transparent images

(2) Lip reading with 40% Word Error Rate

(3) Joint auto architecture + hyper param search (*)


(5) New CNN architectures from ICML (*)

(6) Jupyter notebook widget for text annotation

(7) A bit more debunking of auto-ml by

(8) A small intro to Bayes methods

(9) Criminal face recognition 20% false positives -

(10) Denoising images w/o noiseless ground-truth


(0) Autoencoders for text - no clear conclusion?

(1) RNN use cases overview

(2) ACL 2018 notes


(0) Edge embeddable TPU devices ?

(1) GeForce 11* finally coming soon? Prices for 1080Ti are falling now...



NIH Clinical Center releases dataset of 32,000 CT images

Lesion data may make it easier for scientific community to identify tumor growth or new disease

snakers4 (Alexander), July 31, 05:18

Some interesting NLP related ideas from ACL 2018


- bag-of-embeddings is surprisingly good at capturing sentence-level properties, among other results

- language models are bad at modelling numerals; the authors propose several strategies to improve them

- current state-of-the-art models fail to capture many simple inferences

- LSTM representations, even though they have been trained on one task, are not task-specific. They are often predictive of unintended aspects such as demographics in the data

- Word embedding-based methods exhibit competitive or even superior performance
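As an illustration of the bag-of-embeddings idea mentioned above, here is a minimal sketch under one common reading of the term: a sentence is represented by the average of its word vectors. The function name and the toy 2-dimensional vectors are made up for illustration and do not come from the ACL papers.

```python
def bag_of_embeddings(tokens, embeddings):
    """Average the vectors of known tokens; word order is ignored by construction."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None  # no known tokens, no representation
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


# Toy 2-dimensional vectors, made up for illustration only.
toy_embeddings = {"cats": [1.0, 0.0], "chase": [0.0, 1.0], "mice": [1.0, 1.0]}

a = bag_of_embeddings(["cats", "chase", "mice"], toy_embeddings)
b = bag_of_embeddings(["mice", "chase", "cats"], toy_embeddings)
```

Here a and b are identical, which shows what the representation throws away: word order.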

Four common ways to introduce linguistic information into models:

- Via a pipeline-based approach, where linguistic categories are used as features;

- Via data augmentation, where the data is augmented with linguistic categories;

- Via multi-task learning;


ACL 2018 Highlights: Understanding Representations

This post reviews two themes of ACL 2018: 1) gaining a better understanding what models capture and 2) to expose them to more challenging settings.

snakers4 (Alexander), July 31, 04:42

Airbus ship detection challenge

On the surface, this looks like a challenging and interesting competition:


- Train / test sets - 14G / 12G

- Downside - Kaggle and a very fragile metric

- Upside - a separate significant prize for fast algorithms!

- 768x768 images seem reasonable



Airbus Ship Detection Challenge

Find ships on satellite images as quickly as possible

snakers4 (Alexander), July 30, 05:48

The reality of human face recognition

There is a lot of hype related to the surveillance state / 1984 / Chinese offline cameras.

Cannot help but feature this amazing article from Russian engineers (RU):


Truth and lies of face recognition systems (Правда и ложь систем распознавания лиц)

Perhaps there is no other technology today surrounded by so many myths, lies and incompetence. Journalists who write about the technology lie, ... lie...

snakers4 (Alexander), July 29, 16:50

DeepMind's AI Learns The Piano From The Masters of The Past
The paper "The challenge of realistic music generation: modelling raw audio at scale" is available here: drive.googl...

snakers4 (Alexander), July 28, 05:13

New Keras version

No real major changes...



keras - Deep Learning for humans

snakers4 (Alexander), July 27, 03:17

The truth about ML courses


snakers4 (Alexander), July 25, 15:13