Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1227 members, 1357 posts since 2016

All this - lost like tears in rain.

Internet, data science, math, deep learning, philosophy. No bs.

Our website
Our chat
DS courses review

snakers4 (Alexander), February 11, 06:46

Datashader Revealing the Structure of Genuinely Big Data | SciPy 2016 | James A Bednar
Current plotting tools are inadequate for revealing the distributions of large, complex datasets, both because of technical limitations and because the resul...

snakers4 (Alexander), February 10, 17:31

So, by chance I got to talk in person to the Vice President of GameWorks at Nvidia =)

All of this should be taken with a grain of salt. I am not endorsing Nvidia.

- In the public part of the speech he spoke about public Nvidia research projects - most notable / fresh was Nvidia Holodeck - their VR environment

- Key insight - even though Rockstar forbade using GTA images for deep learning, he believes that artificial images used for annotation will be the future of ML, because game engines and OSes are the most complicated software ever

Obviously, I asked some interesting questions afterwards =) Most notably about the GPU market and its forces

- GameWorks = 200 people doing AR / VR / CNN research

- The biggest team at Nvidia is 2000 people - drivers

- Ofc he refused to say when the new generation of GPUs will be released, or whether the rumour that their current generation GPUs are no longer being produced is true

- He says they are mostly a software company focusing on drivers

- Each generation cycle takes 3 years, Nvidia has only one architecture per generation, all the CUDA / ML stuff was planned in 2012-2014

- A rumour about Google TPU. Google has an internal quota - allegedly (!) they cannot buy more GPUs than TPUs, but TPUs are 1% utilized and allegedly they lure Nvidia people to optimize their GPUs to make sure they use this quota efficiently

- AMD R&D spend on both CPU and GPU is less than Nvidia spend on GPU

- He says that the newest AMD cards have 30-40% more FLOPs, but they are compared against previous-generation consumer GT cards on synthetic tests. AMD does not have a 2000-person driver team...

- He says that Intel has 3-5 new architectures in the works - which may be a problem


snakers4 (Alexander), February 10, 13:55

Forwarded from Linuxgram:

The 5 Coolest Things About VLC 3.0

VLC Chromecast support arrives in VLC 3.0, as do many other features! In this post we take a look at 5 changes that make this VLC release worth downloading.

snakers4 (Alexander), February 10, 10:47

Some idiomatic pandas for loading several dataframes at once quickly

import pandas as pd

def date_to_months(df, date_col, new_col):
    # Parse the timestamp column and derive a 'YYYY_MM' month key from it
    df[date_col] = pd.to_datetime(df[date_col])
    df[new_col] = df[date_col].dt.strftime('%Y_%m')
    return df

def clean_hdfs_artifacts(df):
    # Drop repeated header rows: rows whose first cell equals the first column name
    df = df[df[df.columns[0]] != df.columns[0]]
    return df

files = ['../data/photo_like_profile_2017_final.csv', '../data/photo_like_profile_2018_final.csv']

# Read each file and apply the transformations in one method chain
likes_pr = [(pd.read_csv(fp)
             .pipe(date_to_months, 'photo_like_timestamp', 'like_month')
            ) for fp in files]

likes_pr = pd.concat(likes_pr)

snakers4 (Alexander), February 10, 09:02

Took a first stab at playing with XGB on GPU and updated my Dockerfile


It may not work, but the links / snippet below may help




# compile with GPU support

git clone --recursive &&

cd xgboost &&

mkdir build &&

cd build &&

cmake .. -DUSE_CUDA=ON &&

make -j &&

cd ../ &&

# install python package

cd python-package &&

python3 install &&

cd ../

# test all is ok


python3 tests/benchmark/

Also LightGBM depends on old drivers, and does not work (yet) with nvidia-390 on Ubuntu.


Dockerfile update

snakers4 (Alexander), February 10, 07:31

(!) For new people on the channel:

- This channel is a practitioner's channel on the following topics: Internet, Data Science, math, deep learning, philosophy

- Focus is on data science and deep learning

- Don't get in a twist if your opinion differs. You are welcome to contact me via telegram @snakers41 and email -

- No bs and ads

- Every week or two (or three) I review some materials and do ML / Internet digests (I used to do digests of digests, but now I have no time for that)

Give us a rating:



- Buy me a coffee

- Direct donations - - 5011673505 (paste this agreement number)

- Yandex -

Key / major links:

Our website


Our chat


DS courses review

(if you are a beginner)



GAN papers review



- (RU) habr

- article

- code


- article

- code

- code for TopCoder

Telegram Channels Bot

Discover the best channels 📢 available on Telegram. Explore charts, rate ⭐️ and enjoy updates!

snakers4 (Alexander), February 10, 07:13

So we started publishing articles / code / solutions to the recent SpaceNet3 challenge. A Russian article on will also be published soon.

- The original article

- The original code release

... and Jeremy Howard from retweeted our solution, lol



But to get some idea of the pain the TopCoder platform inflicts on contestants, you can read

- Data Download guide

- Final testing guide

- Code release for their verification process




How we participated in SpaceNet three Road Detector challenge

This article tells about our SpaceNet Challenge participation, semantic segmentation in general and transforming masks into graphs

snakers4 (Alexander), February 08, 09:13

lesson 11 notes:

- Links

-- Video


- Semantic embeddings + imagenet can be powerful, but not deployable per se

- Training nets on smaller images usually works

- Comparing activation functions

- lr annealing

- linear learnable colour swap trick

- adding Batchnorm

- replacing max-pooling with avg_pooling

- lr vs batch-size

- dealing with noisy labels

- FC / max-pooling layer models are better for transfer-learning?

- size vs. flops vs. speed

- cyclical learning rate paper

- Some nice intuitions about mean shift clustering





Lesson 11: Cutting Edge Deep Learning for Coders
We’ve covered a lot of different architectures, training algorithms, and all kinds of other CNN tricks during this course—so you might be wondering: what sho...

Meta research on the CNNs

(also this amazing post)

An Analysis of Deep Neural Network Models for Practical Applications

Key findings:

(1) power consumption is independent of batch size and architecture;

(2) accuracy and inference time are in a hyperbolic relationship;

(3) energy constraint = upper bound on the maximum achievable accuracy and model complexity;

(4) the number of operations is a reliable estimate of the inference time


- Accuracy and param number -

- Param efficiency -

Also a summary of architectural patterns



Deep Learning Scaling is Predictable, Empirically




- various empirical learning curves show robust power-law region

- new architectures slightly shift learning curves downwards

- model architecture exploration should be feasible with small training data sets

- it can be difficult to ensure that training data is large enough to see the power-law learning curve region

- irreducible error region

- each new hardware generation with improved FLOP rate can provide a predictable step function improvement in relative DL model accuracy
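
The power-law region can be illustrated with a small sketch. It assumes the learning curve has the form error(m) = a * m^b with b < 0 (the irreducible error term is ignored) and fits a and b by least squares in log-log space. The data points are synthetic, generated from a = 5 and b = -0.35 purely for illustration; they are not results from the paper, and `fit_power_law` is a hypothetical helper name.

```python
import math


def fit_power_law(sizes, errors):
    """Fit errors ~ a * sizes**b via ordinary least squares on log-log data."""
    xs = [math.log(m) for m in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    b = cov / var                      # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log-log space = log(a)
    return a, b


# Synthetic learning curve (assumed values, not taken from the paper)
sizes = [10 ** k for k in range(3, 8)]
errors = [5.0 * m ** -0.35 for m in sizes]

a, b = fit_power_law(sizes, errors)
print(round(a, 3), round(b, 3))

# Extrapolate: predicted error with 10x more data than the largest set
predicted = a * (10 ** 8) ** b
print(predicted)
```

On real curves the fit would only be meaningful inside the power-law region, i.e. after the small-data region and before the irreducible error floor.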



Neural Network Architectures

Deep neural networks and Deep Learning are powerful and popular algorithms. And a lot of their success lays in the careful design of the…

snakers4 (Alexander), February 08, 05:20

Looks useless...but so cool!

Maybe in 1-2 years Reinforcement Learning will become a thing



IMPALA - a new and efficient distributed architecture capable of solving many tasks at the same time in DeepMind Lab. - blog - paper - the new DMLab-30 environments @GitHub

snakers4 (Alexander), February 07, 18:46

DeepMind Control Suite | Two Minute Papers #226
The paper "DeepMind Control Suite" and its source code is available here: We wo...

snakers4 (Alexander), February 07, 14:09

Following our blog post, we also posted a Russian translation of the Jungle competition write-up to habrahabr




The Pri-matrix Factorization competition on DrivenData with 1 TB of data: how we took 3rd place (translation)

Hi, Habr! Here is a translation of the article "Animal detection in the jungle - 1TB+ of data, 90%+ accuracy and 3rd place in the competition". Or...

snakers4 (Alexander), February 07, 13:04

A bokeh-based library to visualize huge datasets





datashader - Turns even the largest data into images, accurately.

snakers4 (Alexander), February 07, 09:58

Internet digest

- Ben Evans -

- Ben Evans about smart home hype -

- Google closing Google Fiber -

- Amazon tracks warehouse slackers with wristbands -

- Apple music overtaking Spotify -

- Why people like infinite scroll

- Netflix personalizes artwork -

- Self-driving trucks => more local trucking jobs



snakers4 (Alexander), February 06, 13:40

If by any chance you need to pass positional args from bash (```sh /param/one /param/two```) on to python, this will help.

- bash

python3 --params "$@"

- python

import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    # nargs='*' collects every value after --params into a list
    parser.add_argument('--params', nargs='*', dest='params', help='topcoder args', default=argparse.SUPPRESS)
    args = parser.parse_args()
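
A minimal self-contained sketch of the same pattern, so it can be tried without a shell wrapper. The helper name `parse_params` and the example paths are hypothetical; only `argparse` from the standard library is used.

```python
import argparse


def parse_params(argv):
    """Collect everything after --params into a list (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--params', nargs='*', dest='params',
                        help='positional args forwarded from bash',
                        default=argparse.SUPPRESS)
    return parser.parse_args(argv)


# Simulates a call with: --params /param/one /param/two
args = parse_params(['--params', '/param/one', '/param/two'])
print(args.params)

# With default=argparse.SUPPRESS the attribute is absent when the flag is omitted
empty = parse_params([])
print(hasattr(empty, 'params'))
```

Because of `argparse.SUPPRESS`, code that reads `args.params` should check for the attribute first (or use `getattr(args, 'params', [])`) when the flag may be missing.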



snakers4 (Alexander), February 06, 09:49

Interesting thread on Kaggle

End of Data Science bubble?

snakers4 (Alexander), February 06, 05:23

We are starting to publish our code / solutions / articles from recent competitions (Jungle and SpaceNet three).

This time the code will be more polished / idiomatic, so that you can learn something from it!

Jungle competition

- It was finally verified that we indeed took 3rd place)


Blog posts



- An adaptation for will be coming soon

Code release and architecture:

- Code

- Architecture

-- 1st place (kudos to Dmytro) - simple and nice

-- Ours

-- 2nd place - 4-5 levels of stacking

Please comment under posts / share / buy us a coffee!

- Buy a coffee

- Rate our channel tg://resolve?domain=tchannelsbot&start=snakers4




Pri-matrix Factorization

Chimp&See has collected nearly 8,000 hours of footage reflecting chimpanzee habitats from camera traps across Africa. Your challenge is to build a model that identifies the wildlife in these videos.

snakers4 (Alexander), February 05, 15:46

Interesting thoughts from Deep Learning Summit 2018

- Brief history of Deep Learning


- Malware detection - code => CNN - supposedly 90% accuracy

- Uber uses LSTMs to model week-level driver data on city level

- Uber metrics

-- 4 Billion Trips in 2017

-- 15 Million Uber trips per day

-- 75 Million monthly active riders

-- 600+ cities across 78 countries

- Uber fraud prevention

-- Anomaly detection (when account is stolen) - city2vec, LSTM + MLP, tops 70% precision / 50-60% recall

-- Card image recognition and detection => via TF CNN models

- Facebook

-- The following models are deployed on mobile - object detection, style transfer, mask rcnn

-- Low bit-width networks (<8 bits)

- Google

-- Unsupervised video next action prediction ~ 70% accuracy

-- SoundNet - classifying sounds - 70% accuracy

- GAN applications

-- Simulated environments and training data

-- Missing data

-- Semi-supervised learning

-- Multiple correct answers

-- Realistic generation tasks

- Slack

-- used to learn embeddings


Schedule | Deep Learning Summit

RE•WORK events combine entrepreneurship, technology and science to solve some of the world's greatest challenges using emerging technology. We showcase the opportunities of exponentially accelerating technologies and their impact on business and society.

snakers4 (Alexander), February 05, 14:57

We also managed to get into top-10 in SpaceNet3 Road Detection challenge


(Final confirmation awaits)

Here is a sneak peek of our solution


A blog post + repo will follow





Flowchart Maker & Online Diagram Software is a free online diagramming application and flowchart maker . You can use it to create UML, entity relationship, org charts, BPMN and BPM, database schema and networks. Also possible are telecommunication network, workflow, flowcharts, maps overlays and GIS, electronic circuit and social network diagrams.

snakers4 (Alexander), February 05, 10:51

LDA is another technique used for topic mining (like NMF) but based on probabilistic graphical models


Topic Modeling with Scikit Learn

Latent Dirichlet Allocation (LDA) is a algorithms used to discover the topics that are present in a corpus. A few open source libraries…

snakers4 (Alexander), February 03, 13:26

Pi is IRRATIONAL: simplest proof on toughest test
In the last video of 2017 I showed you Lambert’s long but easy-to-motivate 1761 proof that pi is irrational. For today’s video Marty and I have tried to stre...

snakers4 (Alexander), February 02, 08:42

Very interesting high quality dataset with satellite images + LB


Just scientific interest.



snakers4 (Alexander), February 02, 05:17

2 links for quick lookup of shell commands

- quick replacement to man pages

- explanation of each flag in a shell command


snakers4 (Alexander), February 02, 04:58

A more concise alternative to nvidia-smi

watch --color -n1.0 gpustat --color

Installation:

pip3 install gpustat

Also you can use python bindings for GPU drivers, but I managed to find only drivers for python2.



snakers4 (Alexander), February 01, 11:25

2017 DS/ML digest 2


- One more RL library (last year saw 1 or 2)

- Speech recognition from facebook -

- Even better speech generation than WaveNet - - I cannot tell it apart from a human

Industry (overdue news)

- Nvidia does not like its consumer GPUs deployed in data centers

- Clarifai kills forevery

- Google search and gorillas vs. black people -

Blog posts

- Baidu - dataset size vs. accuracy (log-scale)




- New Youtube actions dataset -


Papers - current topic - meta learning / CNN optimization and tricks

- Systematic evaluation of CNN advances on the ImageNet




- Cyclical Learning Rates for Training Neural Networks





- Large batch => train Imagenet in 15 mins


- Practical analysis of CNNs





snakers4 (Alexander), February 01, 09:31

Cyclical Learning rates are not merged in Pytorch yet, but they are in the PR stage




Adds Cyclical Learning Rates by thomasjpfan · Pull Request #2016 · pytorch/pytorch

Adds feature requested in #1909. Mimics the parameters from Since Cyclical Learning Rate (CLR) requires updating the learning rate after every batch, I added batc...
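
For reference, the basic "triangular" policy from the cyclical learning rate paper fits in a few lines of plain Python. This is a sketch of the schedule only, not the API proposed in the pull request; the function name and the parameter values are assumptions for illustration.

```python
import math


def triangular_lr(iteration, base_lr, max_lr, step_size):
    """Learning rate for a given batch iteration under the triangular policy.

    step_size is the number of iterations in half a cycle.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)


# The rate is recomputed after every batch, not after every epoch
schedule = [triangular_lr(i, base_lr=0.001, max_lr=0.006, step_size=4)
            for i in range(9)]
print(schedule)
```

The rate climbs linearly from `base_lr` to `max_lr` over `step_size` iterations and then descends back over the next `step_size`, after which the cycle repeats.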

snakers4 (Alexander), February 01, 02:29

We Stole Tampons from the Cashier-less Amazon Go Store
Just how good is the security at the new Amazon Go store? The answer may surprise you... (but not really - it's pretty damn good) Receive an additional $25 c...

ML in action

snakers4 (Alexander), January 31, 12:24

Andrew Ng officially launches his $175M AI Fund

As the founder of the Google Brain deep learning project and co-founder of Coursera, Andrew Ng was one of the most recognizable names in the machine learning community when he became Baidu’s…

snakers4 (Alexander), January 31, 10:40

Interesting intuitions to understand the mean-shift algorithm


Downside - sklearn implementation is slow, you will have to write your own GPU implementation.


Mean Shift Clustering Overview

An overview of mean shift clustering (one of my favorite algorithms) and some of its strengths and weaknesses.
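
To make the idea concrete, here is a minimal pure-Python sketch of mean shift with a flat (uniform) kernel: every point is repeatedly moved to the mean of all original points within `bandwidth` of it until it stops moving. It is a toy illustration, neither the sklearn implementation nor a GPU one; the function name, the bandwidth and the data are assumptions.

```python
import math


def mean_shift(points, bandwidth, n_iter=50, tol=1e-6):
    """Shift each point to the mean of its neighbours until convergence."""
    modes = [tuple(p) for p in points]
    for _ in range(n_iter):
        moved = 0.0
        new_modes = []
        for m in modes:
            # Flat kernel: neighbours are original points within the bandwidth
            near = [p for p in points if math.dist(m, p) <= bandwidth]
            centre = tuple(sum(c) / len(near) for c in zip(*near))
            moved = max(moved, math.dist(m, centre))
            new_modes.append(centre)
        modes = new_modes
        if moved < tol:
            break
    return modes


# Two well separated toy blobs (made-up data)
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
modes = mean_shift(data, bandwidth=1.0)

# Points from the same blob converge to the same mode
labels = [0 if math.dist(m, modes[0]) < 0.5 else 1 for m in modes]
print(labels)
```

The inner loop compares every mode against every point, so the cost per iteration is quadratic in the number of points, which is what makes a batched GPU version attractive for large datasets.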

snakers4 (Alexander), January 31, 07:14

Internet digest

- Ben Evans -

- RNNs + band names -

- Soldiers + fitness trackers = military bases -

- Google's new unit - security and ML -

- Apple produces TV content -

- Some bs rumours about Telegram ICO size -

- Twitter is plagued by bot-farms -

-- Easy to detect via similar registration dates -

- Podcast about financial innovations in the US -



Jeremy Fiance

recurrent neural network, trained on band names, generates fake @Coachella lineup - reminding us most band names are gibberish

snakers4 (Alexander), January 30, 11:01

Some advice from the author on using the UMAP algorithm properly



Multi CPU / GPU capabilities? #37

@lmcinnes As you may have guessed I have several CPUs and GPUs at hand and I work with high-dimensional data. Now I am benching a 500k * 5k => 500k * 2 vector vs. PCA (I need a high level clusterin...