Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1322 members, 1482 posts since 2016

All this - lost like tears in rain.

Data science, deep learning, sometimes a bit of philosophy and math. No bs.

Our website
Our chat
DS courses review

snakers4 (Alexander), April 27, 07:51

Forwarded from Админим с Буквой:

Ubuntu 18.04 LTS release

Ubuntu 18.04 "Bionic Beaver" has been released. It is a long-term support (LTS) release, with updates provided for 5 years. Installation images are available for Ubuntu Desktop, Ubuntu Server, Ubuntu Cloud, Kubuntu, Ubuntu Budgie, Lubuntu, Ubuntu Studio, Ubuntu Kylin, Ubuntu MATE and Xubuntu.

snakers4 (Alexander), April 26, 19:31

AI Learns Real-Time 3D Face Reconstruction | Two Minute Papers #245
The paper "Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network" and its source code is available here:

snakers4 (Alexander), April 26, 04:24

On the surface looks like an interesting competition

Well, I said that about Power Laws - but then it turned out otherwise.

So far I can see CV, NLP and tables in one mix.


Avito Demand Prediction Challenge

Predict demand for an online classified ad

snakers4 (Alexander), April 25, 06:13

PyTorch 0.4 released


(1) Tensor / Variable merged

(2) Zero-dimensional Tensors

(3) dtypes

(4) migration guide



pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration

snakers4 (Alexander), April 24, 18:00

Internet digest

(0) Ben Evans -

ML / industry

(1) FB to design its own FPGAs / ML chips - ?

(2) Google willing to replicate iMessage, again

-- No mention of Telegram - but all Google's attempts are aeons behind Telegram

-- Google willing to go the hardest route - a standard enforced on the carrier + replace the messaging app

-- All of the previous attempts kind of did not work

(3) Facebook media backlash -

(4) Who makes LIDARs -

(5) Tesla over automation -


(1) British Telecom to switch to VOIP -

(2) Flickr purchased -


Facebook has a new job posting calling for chip designers

Facebook has posted a job opening looking for an expert in ASIC and FPGA, two custom silicon designs that companies can gear toward specific use cases — particularly in machine learning and artific…

snakers4 (Alexander), April 24, 10:36

Stupid errors

Found out why my model on DS Bowl generalized poorly.

I forgot to re-create the optimizer instance after unfreezing the encoder.

optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()),

# Only finetunable params
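A toy illustration of why this bites (a sketch only: Param and ToyOptimizer are hypothetical stand-ins, not PyTorch classes). It assumes the optimizer turns the iterable it receives into a list once, at construction time, so a filter on requires_grad is evaluated only at that moment:

```python
class Param:
    """Toy stand-in for a model parameter (hypothetical, not a PyTorch class)."""

    def __init__(self, name, requires_grad):
        self.name = name
        self.requires_grad = requires_grad


class ToyOptimizer:
    """Toy stand-in for an optimizer: keeps the params it was given at creation."""

    def __init__(self, params):
        # The filter() iterator is consumed here, exactly once.
        self.params = list(params)


def trainable(params):
    """Same filter as in the post: only params that currently require grad."""
    return filter(lambda p: p.requires_grad, params)


encoder = [Param("encoder.w", requires_grad=False)]  # frozen at first
decoder = [Param("decoder.w", requires_grad=True)]
model_params = encoder + decoder

optimizer = ToyOptimizer(trainable(model_params))

# Unfreeze the encoder for fine-tuning.
for p in encoder:
    p.requires_grad = True

# The old optimizer still knows only about the decoder.
stale_names = [p.name for p in optimizer.params]

# Re-creating the optimizer picks up the newly trainable params.
optimizer = ToyOptimizer(trainable(model_params))
fresh_names = [p.name for p in optimizer.params]

print(stale_names, fresh_names)
```

Without the re-creation step, the unfrozen encoder gets gradients but is never updated.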


Add optimizer re-creation on model encoder unfreeze · snakers4/[email protected]

ds_bowl_2018 - Kaggle Data Science Bowl 2018

snakers4 (Alexander), April 24, 06:47

Using a subset of GPUs without restarting the docker container

If you have multiple CUDA GPUs visible within your container, but you do not want to use them all (and PyTorch API is wobbly there now), then the docs advise you to

import os
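The rest of the snippet is missing above. A minimal sketch of one common way to do this, assuming the CUDA_VISIBLE_DEVICES environment variable is the intended mechanism (the device ids "0,1" are a made-up example):

```python
import os

# Must run before the first CUDA call (safest: before importing the framework),
# because the variable is read when CUDA is initialized.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # number GPUs the way nvidia-smi does
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"      # hypothetical choice: expose only GPUs 0 and 1

visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print("GPUs exposed to this process:", visible)
```

Inside the process the exposed GPUs are renumbered from 0, so the ids used in the code refer to positions in this list, not to physical ids.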





snakers4 (Alexander), April 24, 04:37


keras - Deep Learning for humans

snakers4 (Alexander), April 24, 02:54

Predictive models running on clientside?

Looks like soon you will have not only to use AdBlock but also restrict WebGL usage


Tensorflow.js Explained
Tensorflow.js is Google's new Javascript version of its popular Machine Learning library Tensorflow. This allows developers, hobbyists, and researchers to bu...

snakers4 (Alexander), April 22, 15:02

DWT article on

DS Bowl article is live on


Please support us with your likes.


Applying Deep Watershed Transform in the Kaggle Data Science Bowl 2018 competition

Applying Deep Watershed Transform in the Kaggle Data Science Bowl 2018 competition. We present a translation of the linked article and the original dockerized code.

snakers4 (Alexander), April 22, 14:46

AI Photo Translation | Two Minute Papers #243
The paper "Toward Multimodal Image-to-Image Translation" and its source code is available here: Our Patreon page: https...

snakers4 (Alexander), April 22, 13:53

SOCKS5 proxy configuration on Vultr

As of now, Vultr has not (yet) been blocked, probably because it is less known in the CIS. If you missed our Digital Ocean SOCKS5 configuration guide, you can follow this one.

For us, both DO and Vultr work as of now.


You can use our referral links to create accounts



If you like the above guides, consider buying us a coffee




Playing with a simple SOCKS5 proxy server on Vultr and Ubuntu 16.04

Start your own proxy server. Author's articles - Blog -

Readable list comprehensions in Python

My list and dictionary comprehensions usually look like s**t


Examples of readable comprehension formatting from SO
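One layout that tends to read well is to put the expression, the for clause and the if clause each on its own line. A small sketch (the records list is made-up example data):

```python
# Hypothetical example data.
records = [
    {"name": "cat_01.png", "label": "cat", "score": 0.91},
    {"name": "dog_07.png", "label": "dog", "score": 0.42},
    {"name": "cat_13.png", "label": "cat", "score": 0.77},
]

# List comprehension: expression / for / if on separate lines.
confident_names = [
    record["name"]
    for record in records
    if record["score"] > 0.5
]

# Dict comprehension, same layout.
label_by_name = {
    record["name"]: record["label"]
    for record in records
    if record["score"] > 0.5
}

print(confident_names)
print(label_by_name)
```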

snakers4 (Alexander), April 22, 11:40

snakers4 (Alexander), April 21, 14:49

For Windows users who use a legacy machine as a thin client to access Linux servers

Old habits die slowly. I use an old but powerful Windows machine, and so far doing everything on remote servers was ok, until I needed to commit to GitHub using ssh-agent forwarding.

But my key is stored locally and I do not want to use git bash or any Windows-based software, because it sucks. Also, having a single source of truth on a remote Linux machine is better anyway. But I cannot store my key on the remote machine.

There is a solution - ssh-agent forwarding. In a nutshell:

- Install Pageant, add your identity locally (.ppk private key file)

- Check "allow agent forwarding" in PuTTY

- Follow the below guides to check that all works

- profit
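A quick sanity check to run on the remote machine, as a sketch. It assumes the server is OpenSSH, which sets the SSH_AUTH_SOCK environment variable in the session when an agent is forwarded; the helper name is made up:

```python
import os


def agent_forwarding_active(environ=None):
    """Return True if the session exposes an SSH agent socket path.

    Hypothetical helper: it only checks that SSH_AUTH_SOCK is set and non-empty,
    not that the agent actually holds a key.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("SSH_AUTH_SOCK"))


if agent_forwarding_active():
    print("Agent socket found - forwarding looks active")
else:
    print("No agent socket - check the PuTTY setting and reconnect")
```

If the socket is there, running ssh-add -l in the same session should list the key loaded into Pageant.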


How To Use Pageant to Streamline SSH Key Authentication with PuTTY | DigitalOcean

Pageant is a PuTTY authentication agent. It holds your private keys in memory so that you can use them whenever you are connecting to a server. It eliminates the need to explicitly specify the relevant key to each Linux user account if you use more th

snakers4 (Alexander), April 21, 06:24

snakers4 (Alexander), April 20, 15:12

Forwarded from Админим с Буквой:

A bit about bash scripting

Sometimes you need to write some data to a file while preserving line breaks and indentation. There are several ways to do this.

1) with echo

echo -e 'This is first string\nAnd this is second' > /path/to/file

The only advantage of this method is that it is a one-liner. In practice it is basically unreadable.

You can, of course, split this one-liner across several lines:

echo "


$variable - why not?

str N


This method is fine in principle until you have to escape quotes in the text.

2) with cat

cat > ceph.conf << EOF


mon_host = xx.xx.xx.xx:6789

auth_cluster_required = cephx

auth_service_required = cephx

auth_client_required = cephx

EOF


Unlike the first method, variables are still interpolated, and quotes do not need to be escaped.


snakers4 (Alexander), April 20, 05:10

A note on CDNs and protecting your website against censorship



- Using a free / cheap CDN service can enable you to protect your domain hosted resource from censorship

- Unless CDN servers will be blocked (but I guess the CDN has more servers, than you, right?)

So, I host on Digital Ocean. And I do not want to move or set up a CDN myself. I read in the news that Google abandoned some of its proxying tools because of such censorship events. Interesting.

I knew that services like Cloudflare (**CDN**) forward your traffic somehow, but I was not sure what IP is actually seen by the user and whether all of the traffic is forwarded. Then I read their FAQ


It says

After a visitor's browser has done the initial DNS lookup, it begins making requests to retrieve the actual content of a website. These requests are directed to the IP address that was returned from the DNS lookup. Before Cloudflare, that address would have been With Cloudflare as the authoritative nameserver, the new address is Cloudflare’s data center at will serve as much of your website as it can from its local storage, and ask your web server at for any part of your website it doesn’t already have stored locally. The Cloudflare data center at will then provide your complete website to the visitor, so the visitor never talks directly to your web server at

So I tried their free-tier service (paid service starts from US$20-200, which is too steep) and it just works, though SSL certificates were issued ~90 mins after I changed my nameservers. It is as easy as:

- Backup your DNS settings somewhere

- Import them to Cloudflare

- Change the name servers in your domain registrar's control panel

- 90 mins and ... profit

Now I cannot see my direct DO server IP when I resolve my DNS:

$ dig +short
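The same check can be scripted. A minimal sketch, assuming you know your origin server IP. The function names are made up, and the hostname and addresses in the usage note are placeholders from the documentation ranges:

```python
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return {info[4][0] for info in infos}


def origin_is_hidden(hostname, origin_ip, resolve=resolve_ipv4):
    """True if public DNS for hostname does not return the origin server IP."""
    return origin_ip not in resolve(hostname)
```

Usage: origin_is_hidden("example.com", "203.0.113.10") should return True once the name servers have switched to the CDN. Note that this only checks the DNS answer; the origin can still be reachable directly by anyone who already knows its IP.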



snakers4 (Alexander), April 20, 04:58

Useful Python abstractions / sugar / patterns

I already shared a book about patterns, which covers mostly high-level / more complicated patterns. But for writing ML code, a simple imperative / functional programming style is sometimes ok.

So - I will be posting about simple and really powerful python tips I am learning now.

This time I found out about map and filter, which are super useful for data preprocessing:


items = [1, 2, 3, 4, 5]

squared = list(map(lambda x: x**2, items))

Filter

number_list = range(-5, 5)

less_than_zero = list(filter(lambda x: x < 0, number_list))

print(less_than_zero)

Also found this book -
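For preprocessing, the map and filter calls shown earlier in this post chain naturally. A small sketch on made-up data (raw_lines is a hypothetical example):

```python
# Hypothetical raw input: labels with stray whitespace, blanks and mixed case.
raw_lines = ["  Cat ", "", "DOG", "   ", "bird\n"]

# In Python 3 both map and filter are lazy: nothing is computed until list() is called.
non_blank = filter(lambda s: s.strip(), raw_lines)            # drop empty / whitespace-only items
cleaned = list(map(lambda s: s.strip().lower(), non_blank))   # normalize what is left

print(cleaned)
```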



snakers4 (Alexander), April 19, 15:36

Given the current situation ... which post / guide would you like next?

DS / ML related (back log of hobby projects)! – 47

👍👍👍👍👍👍👍 69%

OpenVPN + Docker – 12

👍👍 18%

Dante proxy + Arubacloud + DigitalOcean + Vultr + Docker – 9

👍 13%

👥 68 people voted so far.

snakers4 (Alexander), April 18, 13:37

Nice ideas about unit testing ML code


How to unit test machine learning code.

Note: The popularity of this post has inspired me to write a machine learning test library. Go check it out!

snakers4 (Alexander), April 17, 19:14

Andrew Ng released the first 4 chapters of his new book

So far it does not look very technical



Download Ng_MLY01.pdf 1.52 MB

snakers4 (Alexander), April 17, 08:50

DS Bowl 2018 top solution


This is really interesting...their approach to separation is cool

snakers4 (Alexander), April 17, 07:39

Nice realistic article about bias in embeddings by Google



Text Embedding Models Contain Bias. Here's Why That Matters.

Human data encodes human biases by default. Being aware of this is a good start, and the conversation around how to handle it is ongoing. At Google, we are actively researching unintended bias analysis and mitigation strategies because we are committed to making products that work well for everyone. In this post, we'll examine a few text embedding models, suggest some tools for evaluating certain forms of bias, and discuss how these issues matter when building applications.

snakers4 (Alexander), April 17, 06:30

Also what is interesting, despite the fact that geektimes blocked my SOCKS proxy post and the fact that marketing based web-sites stole it (in Russian), I received the following feedback:

- 3 people thanked me in the ODS channel

- 3 people thanked me via email

- 2 people thanked me in geektimes PM

It is also interesting that my referral link was hit 165 times and ~50 people registered =)


So if you missed the fun

- Post

- Referral link

- Note that the final config is in the comments and here (thanks to and its admin)

sudo apt update && sudo apt upgrade


dpkg -i dante-server_1.4.2+dfsg-2build1_amd64.deb

echo '

logoutput: syslog /var/log/danted.log

internal: eth0 port = 1080

external: eth0

socksmethod: username

user.privileged: root

user.unprivileged: nobody

client pass {

from: to:

log: error

}


socks pass {

from: to:

command: connect

log: error

socksmethod: username

}' > /etc/danted.conf

# basic ufw installation

sudo apt-get install ufw

sudo ufw status


sudo ufw allow ssh

sudo ufw allow proto tcp from any to any port 1080

sudo ufw status numbered

sudo ufw enable

sudo systemctl enable danted

sudo useradd --shell /usr/sbin/nologin av_socks && sudo passwd av_socks
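To confirm from another machine that the proxy answers and offers username/password authentication (what socksmethod: username asks for), you can send the SOCKS5 greeting from RFC 1928 by hand. A minimal sketch: the function names are made up, and the host you pass in is your own server:

```python
import socket

SOCKS_VERSION = 0x05
NO_AUTH = 0x00
USERNAME_PASSWORD = 0x02
NO_ACCEPTABLE_METHOD = 0xFF


def build_greeting(methods):
    """Client greeting: version, number of methods, then the method ids."""
    return bytes([SOCKS_VERSION, len(methods), *methods])


def parse_method_selection(reply):
    """Server reply is two bytes: version and the chosen method id."""
    if len(reply) != 2 or reply[0] != SOCKS_VERSION:
        raise ValueError("not a SOCKS5 method-selection reply")
    return reply[1]


def proxy_accepts_username_auth(host, port=1080, timeout=5.0):
    """Connect, offer only username/password auth, and see if the server picks it."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_greeting([USERNAME_PASSWORD]))
        reply = conn.recv(2)
    return parse_method_selection(reply) == USERNAME_PASSWORD
```

This only tests the handshake; it does not send the username and password or open a connection through the proxy.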

So, thanks to bykvaadm for his feedback and support and to everybody else.




Also someone just bought us a coffee


Please consider supporting us for more quality content

Usually it takes several hours (to a month if it is about a competition) to write and does not pay well

And when people steal your content to put their refcodes in it, it's painful (

Buy Alexander Veysov a Coffee -

A practitioner in the field of Data Science / Deep Learning

snakers4 (Alexander), April 16, 10:17

A draft of the article about DS Bowl 2018 on Kaggle.

This time this was a lottery.

Good that I did not really spend much time, but this time I learned a lot about watershed and some other instance segmentation methods!

The article is accompanied by a dockerized PyTorch code release on GitHub:



This is a beta, you are welcome to comment and respond.





Applying Deep Watershed Transform to Kaggle Data Science Bowl 2018 (dockerized solution)

In this article I will describe my solution to the DS Bowl 2018 and why it was a lottery and post a link to my dockerized solution. Author's articles - Blog -

snakers4 (Alexander), April 15, 09:49


(0), Leslie N. Smith US Naval Research Laboratory

(1) Will serve as a good intuition starter if you have little experience (!)

(2) Some nice ideas:

- The test/validation loss is a good indicator of the network’s convergence - especially in early epochs

- The amount of regularization must be balanced for each dataset and architecture

- The practitioner’s goal is obtaining the highest performance while minimizing the needed computational time

(smaller batch - less stability and faster convergence)

- Optimal momentum value(s) will improve network training

(3) The author does not study the difference between SGD and Adam in depth =( Adam kind of solves much of his pains

(4) In my practice the following approach works best:

- Aggressive training with Adam to find the optimal LR

- Apply various LR decay regimes to determine the optimal one

- Use low LR or CLR in the end to converge to a lower value (possible overfitting)

- Test on test / delayed test end-to-end

- In my experience - a strong model with good params will start with test/val set loss much lower / target metric much higher than on the train set

- In some applications, if your CNN is memory intensive, you just opt for the largest batch possible (usually >6-8 works)

- Also there is no mention of augmentations - they usually help reduce overfitting much better than hyper parameters
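For reference, the CLR mentioned above can be sketched in a few lines. This is the basic triangular policy from Leslie Smith's cyclical learning rate work, written framework-free. The function name and the numbers in the demo are made-up examples, not values from the paper or from my runs:

```python
import math


def triangular_clr(iteration, base_lr, max_lr, step_size):
    """Triangular cyclical LR: ramps linearly base_lr -> max_lr -> base_lr.

    step_size is the number of iterations in half a cycle.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Hypothetical settings, just to show the shape of the schedule.
for it in (0, 50, 100, 150, 200):
    print(it, triangular_clr(it, base_lr=1e-4, max_lr=1e-3, step_size=100))
```

The LR starts at base_lr, peaks at max_lr after step_size iterations and is back at base_lr after 2 * step_size, then the cycle repeats.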


Nice read about systemctl


How To Use Systemctl to Manage Systemd Services and Units | DigitalOcean

Systemd is an init system and system manager that is widely becoming the new standard for Linux machines. While there is considerable controversy as to whether systemd is an improvement over the init systems it is replacing, the majority of distributi

snakers4 (Alexander), April 15, 08:06

2018 DS/ML digest 8

As usual my short bi-weekly (or less) digest of everything that passed my BS detector

Market / blog posts

(0) about the importance of accessibility in ML -

(1) Some interesting news about market, mostly self-driving cars (the rest is crap) -

(2) US$600m investment into Chinese face recognition -

Libraries / frameworks / tools

(0) New 5 point face detector in Dlib for face alignment task -

(1) Finally a more proper comparison of XGB / LightGBM / CatBoost - (also see my thoughts here)

(3) CNNs on FPGAs by ZFTurbo



(4) Data version control - looks cool



-- but I will not use it - because proper logging and treating data as immutable solves the issue

-- looks like over-engineering for the sake of overengineering (unless you create 100500 datasets per day)


(0) TF Playground to see how the simplest CNNs work -


(0) Looks like GAN + ResNet + Unet + content loss - can easily solve simpler tasks like deblurring

(1) You can apply dilated convolutions to NLP tasks -

(2) High level overview of face detection in -

(3) Alternatives to DWT and Mask-RCNN / RetinaNet?

- Has anybody tried anything here?


(0) A more disciplined approach to training CNNs - (LR regime, hyper param fitting etc)

(1) GANs for image compression -

(2) Paper reviews from ODS - mostly moonshots, but some are interesting



(3) SqueezeNext - the new SqueezeNet -




snakers4 (Alexander), April 15, 07:36

So, I used to use Chromium based Opera.

Now I switched to the new Firefox, which is fast, boasts a lot of security extensions and looks also clean and nice. Their mobile apps are a bit unpolished, but also good.

Looks like they rewrote rendering from scratch - because a year ago it was slow.

snakers4 (Alexander), April 14, 12:25

Found an applied channel (RU) about security and admin stuff

Looks professional


- also the channel's admin posted some useful remarks here



Админим с Буквой

A channel about system administration, DevOps and a bit of infosec. For all questions, contact @bykva. Flood and discussion.

snakers4 (Alexander), April 14, 09:05

Our post is live on Russian reddit - geektimes


Please support if you have a valid account!


Simple step-by-step setup of a SOCKS5 proxy server on Ubuntu 16 in 10-15 minutes

Simple step-by-step setup of a SOCKS5 proxy server on Ubuntu 16. This article is a translation of the article from here. The author's style and manner of speech have been smoothed out, but...