Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1818 members, 1744 posts since 2016

All this - lost like tears in rain.

Data science, ML, a bit of philosophy and math. No bs.

Our website
- http://spark-in.me
Our chat
- https://t.me/joinchat/Bv9tjkH9JHbxiV5hr91a0w
DS courses review
- http://goo.gl/5VGU5A
- https://goo.gl/YzVUKf

Posts by tag «digest»:

snakers4 (Alexander), May 27, 08:43

2019 DS / ML digest 11

Highlights of the week(s)

- New attention block for CV;

- Reducing the amount of data for CV 10x?;

- Brain-to-CNN interfaces start popping up in the mainstream;

spark-in.me/post/2019_ds_ml_digest_11

#digest

#deep_learning

2019 DS/ML digest 11

2019 DS/ML digest 11 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), May 14, 03:40

2019 DS / ML digest 10

Highlights of the week(s)

- New MobileNet;

- New PyTorch release;

- Practical GANs?;

spark-in.me/post/2019_ds_ml_digest_10

#digest

#deep_learning

2019 DS/ML digest 10

2019 DS/ML digest 10 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), April 22, 11:44

Cool docker function

View aggregate load stats by container

docs.docker.com/engine/reference/commandline/stats/

#linux

docker stats

Description Display a live stream of container(s) resource usage statistics Usage docker stats [OPTIONS] [CONTAINER...] Options Name, shorthand Default Description --all , -a Show all containers (default shows just running)...


2019 DS / ML digest 9

Highlights of the week

- Stack Overlow survey;

- Unsupervised STT (ofc not!);

- A mix between detection and semseg?;

spark-in.me/post/2019_ds_ml_digest_09

#digest

#deep_learning

2019 DS/ML digest 09

2019 DS/ML digest 09 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), April 09, 06:00

2019 DS / ML digest number 8

Highlights of the week

- Transformer from Facebook with sub-word information;

- How to generate endless sentiment annotation;

- 1M breast cancer images;

spark-in.me/post/2019_ds_ml_digest_08

#digest

#deep_learning

2019 DS/ML digest 08

2019 DS/ML digest 08 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), March 25, 05:31

Good old OLS regression

I needed some quick boilerplate to create an OLS regression with confidence intervals for a very plain task.

Found some nice statsmodels examples here:

www.statsmodels.org/devel/examples/notebooks/generated/ols.html

#data_science

2019 DS / ML digest number 7

Highlights of the week

- NN normalization techniques (not batch norm);

- Jetson nano for US$99 released;

- A bitter lesson in AI;

spark-in.me/post/2019_ds_ml_digest_07

#digest

#deep_learning

2019 DS/ML digest 07

2019 DS/ML digest 06 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), March 18, 06:18

6th 2019 DS / ML digest

Highlights of the week

- Cool python features;

- Google's on-device STT;

- Why Facebook invested so much in PyTorch 1.0;

spark-in.me/post/2019_ds_ml_digest_06

#digest

#data_science

#deep_learning

2019 DS/ML digest 06

2019 DS/ML digest 06 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), March 06, 10:31

5th 2019 DS / ML digest

Highlights of the week

- New Adam version;

- POS tagging and semantic parsing in Russian;

- ML industrialization again;

spark-in.me/post/2019_ds_ml_digest_05

#digest

#data_science

#deep_learning

2019 DS/ML digest 05

2019 DS/ML digest 05 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), February 18, 09:24

4th 2019 DS / ML digest

Highlights of the week

- OpenAI controversy;

- BERT pre-training;

- Using transformer for conversational challenges;

spark-in.me/post/2019_ds_ml_digest_04

#digest

#data_science

#deep_learning

2019 DS/ML digest 04

2019 DS/ML digest 04 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), February 08, 10:11

Third 2019 DS / ML digest

Highlights of the week

- quaternions;

- ODEs;

spark-in.me/post/2019_ds_ml_digest_03

#digest

#data_science

#deep_learning

2019 DS/ML digest 03

2019 DS/ML digest 03 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), January 31, 09:41

Second 2019 DS / ML digest

Highlight of the week - Facebook's LASER.

spark-in.me/post/2019_ds_ml_digest_02

#digest

#data_science

#deep_learning

2019 DS/ML digest 02

2019 DS/ML digest 02 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), January 15, 08:33

First 2019 DS / ML digest

No particular highlights - just maybe ML industrialization vector is here to stay?

spark-in.me/post/2019_ds_ml_digest_01

#digest

#deep_learning

#data_science

2019 DS/ML digest 01

2019 DS/ML digest 01 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), December 19, 2018

DS/ML digest 32

Highlights:

- A way to replace softmax in NMT;

- Large visual reasoning dataset;

- PyText;

spark-in.me/post/2018_ds_ml_digest_32

#digest

#deep_learning

#data_science

2018 DS/ML digest 32

2018 DS/ML digest 32 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), December 09, 2018

DS/ML digest 31

Highlights of the week:

- PyTorch 1.0 released;

- Drawing with GANs;

- BERT explained;

spark-in.me/post/2018_ds_ml_digest_31

#digest

#deep_learning

#data_science

2018 DS/ML digest 31

2018 DS/ML digest 31 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), November 28, 2018

DS/ML digest 30

spark-in.me/post/2018_ds_ml_digest_30

#digest

#deep_learning

#data_science

2018 DS/ML digest 30

2018 DS/ML digest 30 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), November 15, 2018

DS/ML digest 29

spark-in.me/post/2018_ds_ml_digest_29

#digest

#deep_learning

#data_science

2018 DS/ML digest 29

2018 DS/ML digest 29 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), November 06, 2018

DS/ML digest 28

Google open sources pre-trained BERT ... with 102 languages ...

spark-in.me/post/2018_ds_ml_digest_28

#digest

#deep_learning

#data_science

2018 DS/ML digest 28

2018 DS/ML digest 28 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), October 23, 2018

DS/ML digest 27

NLP in the focus again!

spark-in.me/post/2018_ds_ml_digest_27

Also your humble servant learned how to do proper NMT =)

#digest

#deep_learning

#data_science

2018 DS/ML digest 27

2018 DS/ML digest 27 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), October 15, 2018

DS/ML digest 26

More interesting NLP papers / material ...

spark-in.me/post/2018_ds_ml_digest_26

#digest

#deep_learning

#data_science

2018 DS/ML digest 26

2018 DS/ML digest 26 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), September 28, 2018

DS/ML digest 25

spark-in.me/post/2018_ds_ml_digest_25

#digest

#deep_learning

#data_science

2018 DS/ML digest 25

2018 DS/ML digest 25 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), September 20, 2018

DS/ML digest 24

Key topics of this one:

- New method to calculate phrase/n-gram/sentence embeddings for rare and OOV words;

- So many releases from Google;

spark-in.me/post/2018_ds_ml_digest_24

If you like our digests, you can support the channel via:

- Sharing / reposting;

- Giving an article a decent comment / a thumbs-up;

- Buying me a coffee (links on the digest);

#digest

#deep_learning

#data_science

2018 DS/ML digest 24

2018 DS/ML digest 24 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), September 06, 2018

DS/ML digest 23

The key topic of this one - is this is insanity

- vid2vid

- unsupervised NMT

spark-in.me/post/2018_ds_ml_digest_23

If you like our digests, you can support the channel via:

- Sharing / reposting;

- Giving an article a decent comment / a thumbs-up;

- Buying me a coffee (links on the digest);

Let's spread the right DS/ML ideas together.

#digest

#deep_learning

#data_science

2018 DS/ML digest 23

2018 DS/ML digest 23 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), August 31, 2018

DS/ML digest 22

spark-in.me/post/2018_ds_ml_digest_22

#digest

#deep_learning

#data_science

2018 DS/ML digest 22

2018 DS/ML digest 22 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), August 21, 2018

2018 DS/ML digest 21

spark-in.me/post/2018_ds_ml_digest_21

#digest

#deep_learning

#nlp

2018 DS/ML digest 21

2018 DS/ML digest 21 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), August 12, 2018

2018 DS/ML digest 20

spark-in.me/post/2018_ds_ml_digest_20

#deep_learning

#digest

#data_science

2018 DS/ML digest 20

2018 DS/ML digest 20 Статьи автора - http://spark-in.me/author/snakers41 Блог - http://spark-in.me


snakers4 (Alexander), July 31, 2018

2018 DS/ML digest 19

Market / data / libraries

(0) 32k lesions image dataset open-sourced

- goo.gl/CUQwnv

- nihcc.app.box.com/v/DeepLesion

(1) A new Distill article about Differentiable Image Parameterizations

- Usually images are parametrized as RGB values (normalized)

- Idea - use different (learnable) parametrization

- distill.pub/2018/differentiable-parameterizations/

- Parametrizing resulting image with fourier transform enables to use different architectures with style transfer distill.pub/2018/differentiable-parameterizations/#figure-style-transfer-diagram

- Working with transparent images

(2) Lip reading with 40% Word Error Rate arxiv.org/pdf/1807.05162.pdf

(3) Joing auto architecture + hyper param search arxiv.org/pdf/1807.06906.pdf (*)

(4) rl-navigation.github.io/deployable/

(5) New CNN architectures from ICML www.facebook.com/icml.imls/videos/429607650887089/%20 (*)

(6) Jupiter notebook widget for text annotaion github.com/natasha/ipyannotate

(7) A bit more debunking of auto-ml by fast.ai www.fast.ai/2018/07/23/auto-ml-3/

(8) A small intro to Bayes methods alexanderdyakonov.wordpress.com/2018/07/30/%d0%b1%d0%b0%d0%b9%d0%b5%d1%81%d0%be%d0%b2%d1%81%d0%ba%d0%b8%d0%b9-%d0%bf%d0%be%d0%b4%d1%85%d0%be%d0%b4/

(9) Criminal face recognition 20% false positives - www.nytimes.com/2018/07/26/technology/amazon-aclu-facial-recognition-congress.html?

(10) Denoising images wo noiseless ground-truth news.developer.nvidia.com/ai-can-now-fix-your-grainy-photos-by-only-looking-at-grainy-photos/?ncid=--45511

NLP

(0) Autoencoders for text habr.com/company/antiplagiat/blog/418173/ - no clear conclusion?

(1) RNN use cases overview indico.cern.ch/event/722319/contributions/3001310/attachments/1661268/2661638/IML-Sequence.pdf

(2) ACL 2018 notes ruder.io/acl-2018-highlights/

Hardware

(0) Edge embeddable TPU devices aiyprojects.withgoogle.com/edge-tpu ?

(1) GeForce 11* finally coming soon? Prices for 1080Ti are falling now...

#digest

#deep_learning

NIH Clinical Center releases dataset of 32,000 CT images

Lesion data may make it easier for scientific community to identify tumor growth or new disease


snakers4 (Alexander), July 23, 2018

2018 DS/ML digest 18

Highlights of the week

(0) RL flaws

thegradient.pub/why-rl-is-flawed/

thegradient.pub/how-to-fix-rl/

(1) An intro to AUTO-ML

www.fast.ai/2018/07/16/auto-ml2/

(2) Overview of advances in ML in last 12 months

www.stateof.ai/

Market / applied stuff / papers

(0) New Nvidia Jetson released

www.phoronix.com/scan.php?page=news_item&px=NVIDIA-Jetson-Xavier-Dev-Kit

(1) Medical CV project in Russia - 90% is data gathering

cv-blog.ru/?p=217

(2) Differentiable architecture search

arxiv.org/pdf/1806.09055.pdf

-- 1800 GPU days of reinforcement learning (RL) (Zoph et al., 2017)

-- 3150 GPU days of evolution (Real et al., 2018)

-- 4 GPU days to achieve SOTA in CIFAR => transferrable to Imagenet with 26.9% top-1 error

(3) Some basic thoughts about hyper-param tuning

engineering.taboola.com/hitchhikers-guide-hyperparameter-tuning/

(4) FB extending fact checking to mark similar articles

www.poynter.org/news/rome-facebook-announces-new-strategies-combat-misinformation

(5) Architecture behind Alexa choosing skills goo.gl/dWmXZf

- Char-level RNN + Word-level RNN

- Shared encoder, but attention is personalized

(6) An overview of contemporary NLP techniques

medium.com/@ageitgey/natural-language-processing-is-fun-9a0bff37854e

(7) RNNs in particle physics?

indico.cern.ch/event/722319/contributions/3001310/attachments/1661268/2661638/IML-Sequence.pdf?utm_campaign=Revue%20newsletter&utm_medium=Newsletter&utm_source=NLP%20News

(8) Google cloud provides PyTorch images

twitter.com/i/web/status/1016515749517582338

NLP

(0) Use embeddings for positions - no brainer

twitter.com/i/web/status/1018789622103633921

(1) Chatbots were a hype train - lol

medium.com/swlh/chatbots-were-the-next-big-thing-what-happened-5fc49dd6fa61

The vast majority of bots are built using decision-tree logic, where the bot’s canned response relies on spotting specific keywords in the user input.
Interesting links

(0) Reasons to use OpenStreetMap

www.openstreetmap.org/user/jbelien/diary/44356

(1) Google deployes its internet ballons

goo.gl/d5cv6U

(2) Amazing problem solving

nevalalee.wordpress.com/2015/11/27/the-hotel-bathroom-puzzle/

(3) Nice flame thread about CS / ML is not science / just engineering etc

twitter.com/RandomlyWalking/status/1017899452378550273

#deep_learning

#data_science

#digest

RL’s foundational flaw

RL as classically formulated has lately accomplished many things - but that formulation is unlikely to tackle problems beyond games. Read on to see why!


snakers4 (spark_comment_bot), July 13, 2018

2018 DS/ML digest 17

Highlights of the week

(0) Troubling trends with ML scholars

approximatelycorrect.com/2018/07/10/troubling-trends-in-machine-learning-scholarship/

(1) NLP close to its ImageNet stage?

thegradient.pub/nlp-imagenet/

Papers / posts / articles

(0) Working with multi-modal data distill.pub/2018/feature-wise-transformations/

- concatenation-based conditioning

- conditional biasing or scaling ("residual" connections)

- sigmoidal gating

- all in all this approach seems like a mixture of attention / gating for multi-modal problems

(1) Glow, a reversible generative model which uses invertible 1x1 convolutions

blog.openai.com/glow/

(2) Facebooks moonshots - I kind of do not understand much here

- research.fb.com/facebook-research-at-icml-2018/

(3) RL concept flaws?

- thegradient.pub/why-rl-is-flawed/

(4) Intriguing failures of convolutions

eng.uber.com/coordconv/ - this is fucking amazing

(5) People are only STARTING to apply ML to reasoning

deepmind.com/blog/measuring-abstract-reasoning/

Yet another online book on Deep Learning

(1) Kind of standard livebook.manning.com/#!/book/grokking-deep-learning/chapter-1/v-10/1

Libraries / code

(0) Data version control continues to develop dvc.org/features

#deep_learning

#data_science

#digest

Like this post or have something to say => tell us more in the comments or donate!

Troubling Trends in Machine Learning Scholarship

By Zachary C. Lipton* & Jacob Steinhardt* *equal authorship Originally presented at ICML 2018: Machine


snakers4 (Alexander), July 04, 2018

2018 DS/ML digest 15

What I filtered through this time

Market / news

(0) Letters by big company employees against using ML for weapons

- Microsoft

- Amazon

(1) Facebook open sources Dense Pose (eseentially this is Mask-RCNN)

- research.fb.com/facebook-open-sources-densepose/

Papers / posts / NLP

(0) One more blog post about text / sentence embeddings goo.gl/Zm8C2c

- key idea different weighting

(1) One more sentence embedding calculation method

- openreview.net/pdf?id=SyK00v5xx ?

(2) Posts explaing NLP embeddings

- www.offconvex.org/2015/12/12/word-embeddings-1/ - some basics - SVD / Word2Vec / GloVe

-- SVD improves embedding quality (as compared to ohe)?

-- use log-weighting, use TF-IDF weighting (the above weighting)

- www.offconvex.org/2016/02/14/word-embeddings-2/ - word embedding properties

-- dimensions vs. embedding quality www.cs.princeton.edu/~arora/pubs/LSAgraph.jpg

(3) Spacy + Cython = 100x speed boost - goo.gl/9TwVqu - good to know about this as a last resort

- described use-case

you are pre-processing a large training set for a DeepLearning framework like pyTorch/TensorFlow
or you have a heavy processing logic in your DeepLearning batch loader that slows down your training

(4) Once again stumbled upon this - blog.openai.com/language-unsupervised/

(5) Papers

- Simple NLP embedding baseline goo.gl/nGujzS

- NLP decathlon for question answering goo.gl/6HHi7q

- Debiasing embeddings arxiv.org/abs/1806.06301

- Once again transfer learning in NLP by open-AI - goo.gl/82VR4U

#deep_learning

#digest

#data_science

Download full.pdf 0.04 MB

snakers4 (Alexander), July 02, 2018

2018 DS/ML digest 14

Amazing article - why you do not need ML

- cyberomin.github.io/startup/2018/07/01/sql-ml-ai.html

- I personally love plain-vanilla SQL and in 90% of cases people under-use it

- I even wrote 90% of my JSON API on our blog in pure PostgreSQL xD

Practice / papers

(0) Interesting papers from CVPR towardsdatascience.com/the-10-coolest-papers-from-cvpr-2018-11cb48585a49

(1) Some down-to-earth obstacles to ML deploy habr.com/company/hh/blog/415437/

(2) Using synthetic data for CNNs (by Nvidia) - arxiv.org/pdf/1804.06516.pdf

(3) This puzzles me - so much effort and engineering spent on something ... strange and useless - taskonomy.stanford.edu/index.html

On paper they do a cool thing - investigate transfer learning between different domains, but in practice it is done on TF and there is no clear conclusion of any kind

(4) VAE + real datasets siavashk.github.io/2016/02/22/autoencoder-imagenet/ - only small Imagenet (64x64)

(5) Understanding the speed of models deployed on mobile - machinethink.net/blog/how-fast-is-my-model/

(6) A brief overview of multi-modal methods medium.com/mlreview/multi-modal-methods-image-captioning-from-translation-to-attention-895b6444256e

Visualizations / explanations

(0) Amazing website with ML explanations explained.ai/

(1) PCA and linear VAEs are close pvirie.wordpress.com/2016/03/29/linear-autoencoders-do-pca/

#deep_learning

#digest

#data_science

No, you don't need ML/AI. You need SQL

A while ago, I did a Twitter thread about the need to use traditional and existing tools to solve everyday business problems other than jumping on new buzzwords, sexy and often times complicated technologies.


snakers4 (Alexander), June 28, 2018

2018 DS/ML digest 13

Blog posts / articles:

(0) Google notes on CNN generalization - goo.gl/XS4KAw

(1) Google to teaching robots in virtual environment and then trasferring models to reality - goo.gl/aAYCqE

(2) Google's object tracking via image colorization - goo.gl/xchvBQ

(2) Interesting articles about VAEs:

- A small intro into VAEs habr.com/company/otus/blog/358946/

- A small intuitive intro (super super cool and intuitive)

towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf

- KL divergence explained

www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained

- A more formal write-up arxiv.org/abs/1606.05908

- In (RU) habr.com/company/otus/blog/358946/

- Converting a FC layer into a conv layer cs231n.github.io/convolutional-networks/#convert

- A post by Fchollet blog.keras.io/building-autoencoders-in-keras.html

A good in-depth write-up on object detection:

- machinethink.net/blog/object-detection/

- finally a decent explanation of YOLO parametrization machinethink.net/images/object-detection/[email protected]

- best comparison of YOLO and SSD ever - machinethink.net/images/object-detection/[email protected]

Papers with interesting abstracts (just good to know sich things exist)

- Low-bit CNNs - ai.intel.com/nervana/wp-content/uploads/sites/53/2018/06/ELQ_CameraReady_CVPR2018.pdf

- Automated Meta ML - arxiv.org/abs/1806.06927

- Idea - use ResNet blocks for boosting - arxiv.org/abs/1706.04964

- 2D-discrete-Fourier transform (2D-DFT) to encode rotational invariance in neural networks - arxiv.org/abs/1805.12301

- Smallify the CNNs - arxiv.org/abs/1806.03723

- BLEU review as a metric - conclusion - it is good on average to measure MT performance - www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00322

"New" ideas in SemSeg:

- UNET + conditional VAE arxiv.org/abs/1806.05034

- Dilated convolutions for larget satellite images arxiv.org/abs/1709.00179 - looks like that this works only if you have high resolution with small objects

#digest

#deep_learning

How Can Neural Network Similarity Help Us Understand Training and Generalization?

Posted by Maithra Raghu, Google Brain Team and Ari S. Morcos, DeepMind In order to solve tasks, deep neural networks (DNNs) progressively...


older first