Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1812 members, 1759 posts since 2016

All this - lost like tears in rain.

Data science, ML, a bit of philosophy and math. No bs.


Posts by tag «linux»:

snakers4 (Alexander), August 19, 10:14

Sampler - visualization for any shell command

A cool mix between glances and prometheus



A tool for shell commands execution, visualization and alerting. Configured with a simple YAML file. - sqshq/sampler

snakers4 (Alexander), April 30, 09:27

Tricky rsync flags

Rsync is the best program ever.

I find these flags the most useful:

--ignore-existing (skips files that already exist on the destination)
--update (skips files that are newer on the destination, judged by modification time)
--size-only (treats files as unchanged if their sizes match, ignoring timestamps)
-e 'ssh -p 22 -i /path/to/private/key' (use a custom ssh port and identity)

The first three flags sometimes get confusing.
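
The difference between the first three flags comes down to the rule that decides whether a file is skipped. Here is a toy Python model of that rule. It is a sketch of how I read the rsync docs, not rsync itself, and `FileInfo` and `would_transfer` are made-up names:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileInfo:
    size: int     # file size in bytes
    mtime: float  # modification time, seconds since epoch


def would_transfer(
    src: FileInfo,
    dst: Optional[FileInfo],
    *,
    ignore_existing: bool = False,
    update: bool = False,
    size_only: bool = False,
) -> bool:
    """Toy model: would the source file be copied to the destination?"""
    if dst is None:
        return True   # nothing on the destination yet, always copy
    if ignore_existing:
        return False  # --ignore-existing: never touch files that already exist
    if update and dst.mtime > src.mtime:
        return False  # --update: the destination copy is newer, keep it
    if size_only:
        return src.size != dst.size  # --size-only: timestamps are ignored
    # Default quick check: copy when size or modification time differs
    return src.size != dst.size or src.mtime != dst.mtime


src = FileInfo(size=100, mtime=2000.0)
same_size_older = FileInfo(size=100, mtime=1000.0)
print(would_transfer(src, same_size_older))                  # default quick check
print(would_transfer(src, same_size_older, size_only=True))  # with --size-only
```

Running the real command with --dry-run first is still the safest way to check what a given combination of flags will do.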


More about STT, also from us ... soon)

Forwarded from Yuri Baburov:

The second experimental guest lecture of the course.

One of the course's seminar instructors, Yuri Baburov, will talk about speech recognition and working with audio.

May 1 at 8:40 Moscow time (12:40 Novosibirsk time, 10:40 pm on April 30 PST).

Deep Learning на пальцах 11 - Audio and Speech Recognition (Yuri Baburov)


snakers4 (Alexander), April 22, 11:44

Cool docker function

View aggregate load stats by container


docker stats

Description Display a live stream of container(s) resource usage statistics Usage docker stats [OPTIONS] [CONTAINER...] Options Name, shorthand Default Description --all , -a Show all containers (default shows just running)...

2019 DS / ML digest 9

Highlights of the week

- Stack Overflow survey;

- Unsupervised STT (ofc not!);

- A mix between detection and semseg?;



2019 DS/ML digest 09


snakers4 (Alexander), March 04, 08:46

Tracking your hardware ... for data science

For a long time I thought that if you really want to track all your servers' metrics you need Zabbix (which is very complicated).

A friend recommended an amazing tool to me


It installs and runs literally in minutes.

If you want to auto-start it properly, there are even (slightly older) Ubuntu packages and systemd examples


Dockerized metric exporters for GPUs by Nvidia


It also has extensive alerting features, but they are hard to get started with, because there is no minimal example




Monitoring Linux host metrics with the Node Exporter | Prometheus

An open-source monitoring system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach.

snakers4 (Alexander), February 17, 10:22

A bit of lazy Sunday admin stuff

Monitoring your CPU temperature with email notifications

- Change CPU temp to any metric you like

- Rolling log

- Sends an email only once when the metric becomes critical (you can add another email for when the metric becomes non-critical again)
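
The "email only once" logic is a small state machine. A minimal Python sketch of it, assuming a plain threshold check (this is not the code from the gist; `ThresholdAlert` and its parameters are made up for illustration, and actually sending the email is left to the caller):

```python
from collections import deque
from typing import Deque, Optional


class ThresholdAlert:
    """Reports a state change only once, not on every critical reading."""

    def __init__(self, threshold: float, log_size: int = 100) -> None:
        self.threshold = threshold
        self.critical = False
        # Rolling log: old readings fall off once log_size is reached
        self.log: Deque[float] = deque(maxlen=log_size)

    def update(self, value: float) -> Optional[str]:
        """Store a reading; return 'critical' / 'recovered' on a state change."""
        self.log.append(value)
        now_critical = value >= self.threshold
        event = None
        if now_critical and not self.critical:
            event = "critical"   # first critical reading: notify once
        elif not now_critical and self.critical:
            event = "recovered"  # optional second notification
        self.critical = now_critical
        return event


alert = ThresholdAlert(threshold=80.0, log_size=3)
events = [alert.update(t) for t in (60.0, 85.0, 90.0, 70.0)]
print(events)
print(list(alert.log))
```

Any metric works in place of the temperature, as long as it can be compared against a threshold.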

Setting up a GPU box on Ubuntu 18.04 from scratch



Plain temperature monitoring in Ubuntu 18.04

Plain temperature monitoring in Ubuntu 18.04. GitHub Gist: instantly share code, notes, and snippets.

snakers4 (Alexander), January 04, 04:08

Linux subsystem in Windows 10

It works and installs in literally 2 clicks (run one command in PowerShell and then one-click install your Linux distro of choice from the Windows Store (yes, this is very funny indeed))!

Why would you need this?

To create and back up files with one command, for example =)

Something like this becomes reality on Windows:

# bash only: $(<file) reads the folder list; folder names must not contain spaces
cd /mnt/d/ && \
TIME=$(date +%b-%d-%y) && \
FILENAME=working_files_tar-$TIME.tar.gz && \
INCREMENTAL_FILE=backup_data.snar && \
FOLDERS=$(<folders_backup.txt) && \
echo 'Using folderlist' $FOLDERS && \
tar --create --gzip --verbose --listed-incremental=$INCREMENTAL_FILE --file=$FILENAME $FOLDERS

Also, you may add rsync or scp and you are good to go!

Also other potential use cases:

- You are somehow vendor locked (I depend on proprietary drivers for my Thunderbolt port to attach an external GPU), or are just used to Windows' windows (or are just too lazy to install Linux);

- You need one particular Linux program or you need to quickly test something / do not want to bother replicating your environment under Windows (yes, you can also run Docker, but there will be some learning curve);

- You run all of your programs remotely and use your Windows machine as a thin client, but sometimes you need git / bash / rsync, e.g. to download movies from your personal NAS;


snakers4 (Alexander), December 29, 2018

Environment setup for DS / ML / DL

Some time ago I made a small guide for setting up an environment on a blank Ubuntu machine.

It works for both CV and NLP.

If you like this, please tell me, I will add newer things:

- nvtop;

- CUDA10 with PyTorch 1.0;

- Scripts for managing GPU fan speed;




Contribute to snakers4/gpu-box-setup development by creating an account on GitHub.

snakers4 (Alexander), November 25, 2018

Creating a new user

With the hack from the previous post (getting your public key from Github with wget), user creation is as easy as:

sudo useradd $USER && \
sudo adduser $USER $GROUP && \
sudo mkdir -p /home/$USER/.ssh/ && \
sudo touch /home/$USER/.ssh/authorized_keys && \
sudo chown -R $USER:$USER /home/$USER/.ssh/ && \
sudo wget -O -$USER.keys | sudo tee -a /home/$USER/.ssh/authorized_keys

snakers4 (Alexander), November 25, 2018

Getting your public key from Github ... with wget!

I kind of saw it when installing Ubuntu 18 from scratch. But it is super awesome!

wget -O - >> test

Just replace test with your authorized_keys file and profit!


snakers4 (Alexander), October 08, 2018

Going from millions of points of data to billions on a single machine

In my experience pandas works fine with tables up to 50-100m rows.

Of course, plain indexing / caching (i.e. pre-processing all of your data in chunks and indexing it somehow) and / or clever map/reduce-style optimizations work.

But sometimes it is just good to know that such things exist:

- for large data-frames + some nice visualizations;

- for large visualizations;

- Also you can use Dask for these purposes I guess;
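
The chunked map/reduce idea mentioned above can be sketched with the standard library alone. This is a toy illustration, not a benchmark; the `chunks` and `count_by_key` helpers and the sample data are made up. With pandas, `read_csv(..., chunksize=N)` yields DataFrame chunks in the same spirit:

```python
import csv
import io
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, List


def chunks(rows: Iterable, size: int) -> Iterator[List]:
    """Yield lists of at most `size` items without loading everything at once."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def count_by_key(rows: Iterable[dict], key: str, chunk_size: int = 2) -> Counter:
    """Map: count each chunk separately. Reduce: merge the partial counts."""
    total: Counter = Counter()
    for chunk in chunks(rows, chunk_size):
        partial = Counter(row[key] for row in chunk)  # map step
        total.update(partial)                         # reduce step
    return total


# Tiny in-memory stand-in for a file that would not fit into RAM
SAMPLE = "user,amount\na,1\nb,2\na,3\nc,4\na,5\n"
reader = csv.DictReader(io.StringIO(SAMPLE))
counts = count_by_key(reader, "user")
print(counts)
```

Only one chunk plus the running aggregate is in memory at any time, which is what lets the same machine go well past the table sizes where a single data-frame stops being comfortable.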


Python3 nvidia driver bindings in glances

They used to have only python2 ones.

If you update your drivers and glances, you will get a nice GPU memory / load indicator within glances.

So convenient.


snakers4 (Alexander), August 13, 2018

Untar all the archives in the folder, deleting them

find . -name '*.tar' -execdir tar -xvf '{}' \; -execdir rm '{}' \;


snakers4 (Alexander), August 09, 2018

Thanks to everybody who used our DO / affiliate links!

They finally started to vest)

Affiliate links:

There were a couple of simplistic guides:

- Socks5

- OpenVPN


DigitalOcean: Cloud Computing, Simplicity at Scale

Providing developers and businesses a reliable, easy-to-use cloud computing platform of virtual servers (Droplets), object storage ( Spaces), and more.

Amazing image examples from open images dataset

Forwarded from Just links:

snakers4 (Alexander), July 08, 2018

Yet another proxy - shadowsocks

If someone needs another proxy guide, someone with an Arabic username shared some alternative advice for proxy configuration

- (wait a bit till link resolves)



Playing with a simple SOCKS5 proxy server on Digital Ocean and Ubuntu 16

This article tells you how to start your SOCKS5 proxy with zero to little experience

snakers4 (Alexander), June 18, 2018

Playing with renewing SSL certificates + Cloudflare

I am using certbot, which makes SSL certificate installation for any web server literally a one-liner (there are a couple of guides).

It also has an amazing command certbot renew for renewing your certificates.

Unsurprisingly, it does not work when you have Cloudflare enabled. The solution in my case was as easy as:

- falling back to registrar's name-servers (luckily, my registrar stores its old DNS zone settings)

- certbot renew

- reverting to Cloudflare's DNS servers

- also, in this case when using VPN I did not have to wait for DNS records to propagate - it was instant


How To Use Certbot Standalone Mode for Let's Encrypt Certificates | DigitalOcean

Certbot offers a variety of ways to validate your domain, fetch certificates, and automatically configure Apache and Nginx. In this tutorial, we'll discuss Certbot's standalone mode and how to use it to secure other types of services, such as a mail s

snakers4 (Alexander), June 05, 2018

A very useful combination in tmux

You can resize your panes by pressing

- first ctrl+b

- hold ctrl

- press arrow keys several times while holding ctrl


- profit



Digest about Internet

(0) Ben Evans Internet digest -

(1) GitHub purchased by Microsoft -

-- If you want to migrate - there are guides already -

(2) And a post on how Microsoft kind of ruined Skype -

-- focus on b2b

-- lack of focus, constant redesigns, faltering service

(3) No drop in FB usage after its controversies -

(4) Facebook allegedly employs 1200 moderators for Germany -

(5) Looks like many Linux networking tools have been outdated for years



snakers4 (Alexander), May 18, 2018

Using ncdu with exclude

A really good extension of standard du

sudo ncdu --exclude /exclude_folder /

Useful when something is mounted in /media or /mnt


snakers4 (Alexander), April 30, 2018

A small saga about OpenVPN


(0) Purchase a cheap VDS from a noname provider with decent bandwidth => install OpenVPN => forget about problems => share with friends and family;

(1) This guide just works (do not be afraid of its length - it is just verbose);

(2) I tested it with DigitalOcean and;

From a financial standpoint US$1-5 per month per 3-5 users without any 3rd party services seems to be a bargain.

Hosting options:

(0) With DO it just works (just follow the guide step by step). But the cheapest VDS (which is overkill for this) costs US$5 per month. If you use my link - - you will get US$10 for free;

(1) Tested it with Follow my link, if you would like to support us - A decent VPS can be found in Amsterdam for as cheap as US$5-8 for 3 months. Be careful - their UX is a bit misleading at times - (!!!) the country choice does not seem to flow from one menu to another (!!!). This seems to be more than enough -;

(2) If you want to search yourself - go here - - the best 2 options seem to be VirMach and hostus, but the former is sold out; caveats:

(0) If you would like to follow the DO guide but use hostus, then for the cheapest options do not forget to enable this in the admin;

(1) VPS provisioning time there is 0-8 hours. In my case it was ~40 mins;

(2) I also faced this bug;

What if I have a problem with ssh keys on windows?

(0) This will give you some basic info about managing Linux servers;

(1) Here we explain how to use Putty and ssh keys on Windows (also just google it);

Why OpenVPN:

(0) Seems to be the most well-known open-source VPN software with easily accessible clients for all major platforms;

(1) I know people who used it;


(0) - seems to be newer and cooler, but I do not know living people who reported actually using it;



How To Set Up an OpenVPN Server on Ubuntu 16.04 | DigitalOcean

Want secure and protected access to the Internet from your smartphone or laptop when connected to an untrusted network such as the WiFi of a hotel or cafe? A Virtual Private Network (VPN) allows you to...

snakers4 (Alexander), April 22, 2018

SOCKS5 proxy configuration on Vultr

As of now, Vultr has not (yet) been blocked, probably because it is less known in the CIS. If you missed our Digital Ocean SOCKS5 configuration guide, then you can follow this guide.

For us, both DO and Vultr work as of now.


You can use our referral links to create accounts



If you like the above guides, consider buying us a coffee




Playing with a simple SOCKS5 proxy server on Vultr and Ubuntu 16.04

Start your own proxy server

Readable list comprehensions in Python

My list and dictionary comprehensions usually look like s**t


Examples of readable comprehension formatting from SO
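
One common convention, sketched here on made-up data (the SO examples may format things differently), is to put the expression, each `for` and each `if` on its own line:

```python
records = [
    {"name": "alice", "active": True, "scores": [3, 5]},
    {"name": "bob", "active": False, "scores": []},
    {"name": "carol", "active": True, "scores": [4]},
]

# Hard to scan: everything crammed into one line
names_dense = [rec["name"].title() for rec in records if rec["active"]]

# Easier to scan: one clause per line (expression, for, if)
names = [
    rec["name"].title()
    for rec in records
    if rec["active"]
]

# The same layout works for dictionary comprehensions
totals = {
    rec["name"]: sum(rec["scores"])
    for rec in records
    if rec["scores"]
}

print(names)
print(totals)
```

Both versions build the same list. The multi-line one just makes the filter and the loop visible at a glance and keeps diffs small when a clause changes.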

snakers4 (Alexander), April 21, 2018

For Windows users who use their legacy machine as a thin client to access Linux servers

Old habits die hard. I use an old but powerful Windows machine, and so far doing everything on remote servers was ok, until I needed to commit to github using ssh agent forwarding.

But my key is stored locally and I do not want to use git bash or any windows based software, because it sucks. Also having a single source of truth on a remote Linux machine is better anyway. But I cannot store my key on the remote machine.

There is a solution - ssh-agent forwarding. In a nutshell:

- Install pageant, add your identity locally (.ppk private key file)

- Check "Allow agent forwarding" in PuTTY

- Follow the below guides to check that all works

- profit


How To Use Pageant to Streamline SSH Key Authentication with PuTTY | DigitalOcean

Pageant is a PuTTY authentication agent. It holds your private keys in memory so that you can use them whenever you are connecting to a server. It eliminates the need to explicitly specify the relevant key to each Linux user account if you use more th

snakers4 (Alexander), April 17, 2018

Also what is interesting, despite the fact that geektimes blocked my SOCKS proxy post and the fact that marketing based web-sites stole it (in Russian), I received the following feedback:

- 3 people thanked me in the ODS channel

- 3 people thanked me via email

- 2 people thanked me in geektimes PM

This is also interesting - my referral link was hit 165 times and ~50 people registered =)


So if you missed the fun

- Post

- Referral link

- Note that the final config is in the comments and here (thanks to and its admin)

sudo apt update && sudo apt upgrade

sudo dpkg -i dante-server_1.4.2+dfsg-2build1_amd64.deb
echo '
logoutput: syslog /var/log/danted.log
internal: eth0 port = 1080
external: eth0

socksmethod: username
user.privileged: root
user.unprivileged: nobody

client pass {
from: to:
log: error
}

socks pass {
from: to:
command: connect
log: error
socksmethod: username
}' | sudo tee /etc/danted.conf > /dev/null

# basic ufw installation
sudo apt-get install ufw
sudo ufw status
sudo ufw allow ssh
sudo ufw allow proto tcp from any to any port 1080
sudo ufw status numbered
sudo ufw enable

sudo systemctl enable danted

sudo useradd --shell /usr/sbin/nologin av_socks && sudo passwd av_socks

So, thanks to bykvaadm for his feedback and support and to everybody else.




Also someone just bought us a coffee


Please consider supporting us for more quality content

Usually it takes several hours (to a month if it is about a competition) to write and does not pay well

And when people steal your content to put their refcodes in it, it's painful (

Buy Alexander Veysov a Coffee -

A practitioner in the field of Data Science / Deep Learning

snakers4 (Alexander), April 15, 2018


(0), Leslie N. Smith US Naval Research Laboratory

(1) Will serve as a good intuition starter if you have little experience (!)

(2) Some nice ideas:

- The test/validation loss is a good indicator of the network’s convergence - especially in early epochs

- The amount of regularization must be balanced for each dataset and architecture

- The practitioner’s goal is obtaining the highest performance while minimizing the needed computational time

(smaller batch - less stability and faster convergence)

- Optimal momentum value(s) will improve network training

(3) The author does not study the difference between SGD and Adam in depth =( Adam kind of solves much of his pains

(4) In my practice the following approach works best:

- Aggressive training with Adam to find the optimal LR

- Apply various LR decay regimes to determine the optimal one

- Use low LR or CLR in the end to converge to a lower value (possible overfitting)

- Test on test / delayed test end-to-end

- In my experience - a strong model with good params will start with test/val set loss much lower / target metric much higher than on the train set

- In some applications, if your CNN is memory intensive, you just opt for the largest batch possible (usually >6-8 works)

- Also there is no mention of augmentations - they usually help reduce overfitting much better than hyper parameters
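
For reference, CLR above stands for cyclical learning rates. A minimal sketch of the triangular schedule as I understand it from Smith's CLR work, in plain Python with no framework (the function name and the example values are my own choices):

```python
import math


def triangular_lr(iteration: int, base_lr: float, max_lr: float, step_size: int) -> float:
    """Triangular cyclical LR: rises from base_lr to max_lr over step_size
    iterations, then falls back over the next step_size iterations."""
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# One full cycle with step_size=4: up for 4 iterations, down for 4
schedule = [triangular_lr(i, base_lr=1e-3, max_lr=1e-2, step_size=4) for i in range(9)]
for i, lr in enumerate(schedule):
    print(i, round(lr, 5))
```

In a real training loop the returned value would be written into the optimizer's learning rate before each batch. Which bounds and step size work best depends on the model and the dataset.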


Nice read about systemctl


How To Use Systemctl to Manage Systemd Services and Units | DigitalOcean

Systemd is an init system and system manager that is widely becoming the new standard for Linux machines. While there is considerable controversy as to whether systemd is an improvement over the init systems it is replacing, the majority of distributi

snakers4 (Alexander), April 14, 2018

Found an applied channel (RU) about security and admin stuff

Looks professional


- also the channel's admin posted some useful remarks here



Админим с Буквой

A channel about system administration, DevOps and a bit of infosec. Direct all questions to @bykva. Flood and discussion.

snakers4 (Alexander), April 13, 2018

So I briefly dug into running a containerized GPU accelerated GUI app (I want to be able to run some apps I do not really want on my host).

Docker kind of works for this purpose, but I found working guides for nvidia-docker, not nvidia-docker2.

Looks like if you want to run a Linux host with a Linux container - then LXD is a good option. It is high level and seems to have an easy API to use. I will report whether it works for me.

- Guide

- LXD vs Docker

- Extensive LXD tutorial


How to run graphics-accelerated GUI apps in LXD containers on your Ubuntu desktop

Update June 2018: See updated post How to easily run graphics-accelerated GUI apps in LXD containers on your Ubuntu desktop  which describes how to use LXD profiles to simplify the creation of cont…

Soviet arcade video games museum (pics)


Yet another curious museum

Continuing the museum theme, this time we take a look at Moscow. Only we are heading not to the Kremlin, but to a place that is more interesting to me.…

snakers4 (Alexander), February 11, 2018

A note on reusing my old hard-drives from mdadm raid10 array in a new raid0 array after buying more hard drives

Ideally this command should remove the superblock from old disks

sudo mdadm --zero-superblock /dev/sdc

But in practice the raid arrays started properly on reboot only after something of this sort:

dd bs=512 count=63 if=/dev/zero of=/dev/sda

This happened to both old raid10 disks and a disk that was used as plain storage. Magic.

Ofc you can shred and fill the whole disk with zeros, but it takes a lot of time...

sudo shred -v -n1 -z /dev/sda


How To Create RAID Arrays with mdadm on Ubuntu 16.04 | DigitalOcean

Linux's mdadm utility can be used to turn a group of underlying storage devices into different types of RAID arrays. This provides various advantages depending on which RAID level is used. This guide will cover how to set up devices in the most common

snakers4 (Alexander), February 02, 2018

2 links for quickly looking up shell commands

- quick replacement to man pages

- explanation of each flag in a shell command


Simplified, community-driven man pages!

snakers4 (Alexander), February 02, 2018

A more concise alternative to nvidia-smi

watch --color -n1.0 gpustat --color

pip3 install gpustat

Also you can use python bindings for GPU drivers, but I managed to find only bindings for python2.



snakers4 (Alexander), January 24, 2018

If, after updating your packages, your dockerized application suddenly fails to see the GPU(s), then you should migrate to nvidia-docker-2.

Do not forget to read the section "Removing nvidia-docker 1.0".

In my case all was solved simply by copy-pasting their commands:

# migrate to NVIDIA docker 2
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge nvidia-docker

curl -s -L | sudo apt-key add -
curl -s -L | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

sudo apt-get install nvidia-docker2
sudo pkill -SIGHUP dockerd

# testing
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi



Build and run Docker containers leveraging NVIDIA GPUs - NVIDIA/nvidia-docker

snakers4 (Alexander), January 03, 2018

Some controversial stuff about Docker

Hold on to your containers...


056: Is Docker Dead, Tech Bros, Kubernetes, Tools Galore, and More!

DevOps, Cloud Native, Open Source, and the 'ish in between.

snakers4 (Alexander), December 21, 2017

For those who want to rsync a large number of files over ssh with a key.

Do not run the command until you have read the docs for all of the flags.

sudo rsync --dry-run --stats --ignore-existing --size-only -rvz -e 'ssh -p PORT -i /home/USER/.ssh/PRIVATE_KEY' --progress USER@REMOTE_HOST:/path/to/remote/folder /path/to/local/folder/


snakers4 (Alexander), November 27, 2017


If you want to give a third party (a friend, a colleague, your girlfriend, a member of your team) a full-fledged environment on your machine, and they either lack super admin skills or simply should not have root access or access to places they should not go, there is a very simple way to do it:

- Spin up Docker and install sshd right in the Dockerfile

- The container should have your favourite software + jupyter notebook with all the bells and whistles

- Mount folders and disks, and set limits on RAM, GPUs and CPU in docker run (or nvidia docker run)

- Also add the installation of glances to the Dockerfile

- After starting the container, exec into it and start sshd

- Naturally, do not forget to forward the ports in Docker and in your network infrastructure, and to set a password or an ssh key when building the image from the Dockerfile

- The result: you give your colleague a URL + a key for jupyter notebook + ssh access into the container. Inside it they are king and god and can see the load and their own processes (glances + nvidia-smi), but cannot do anything bad to the system at all, because you are the one who starts the container and mounts the folders

And there is no need to fiddle with virtual machines, and GPUs are passed through perfectly!
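
A rough sketch of what the docker run step from the list above could look like, assembled in Python so that the limits live in one place. The image name, container name, ports, paths and default limits are all hypothetical; `--runtime=nvidia` and `NVIDIA_VISIBLE_DEVICES` assume the nvidia-docker2 setup:

```python
import shlex
from typing import List


def docker_run_args(
    image: str,
    name: str,
    memory: str = "16g",             # RAM limit
    cpus: float = 4.0,               # CPU limit
    gpu_ids: str = "0",              # GPUs visible inside the container
    ssh_port: int = 2222,            # host port forwarded to sshd in the container
    jupyter_port: int = 8888,        # host port forwarded to jupyter notebook
    data_dir: str = "/path/to/data"  # host folder mounted into the container
) -> List[str]:
    """Build the argument list for a resource-limited `docker run`."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "--runtime=nvidia",
        "-e", f"NVIDIA_VISIBLE_DEVICES={gpu_ids}",
        "--memory", memory,
        "--cpus", str(cpus),
        "-p", f"{ssh_port}:22",
        "-p", f"{jupyter_port}:8888",
        "-v", f"{data_dir}:/data",
        image,
    ]


args = docker_run_args("ds-env:latest", "colleague_env")
print(shlex.join(args))  # paste into a shell, or pass `args` to subprocess.run
```

The guest only sees the mounted folder and the listed GPUs, and everything else on the host stays out of reach.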


