Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1797 members, 1726 posts since 2016

All this - lost like tears in rain.

Data science, ML, a bit of philosophy and math. No bs.

Our website
- spark-in.me
Our chat
- t.me/joinchat/Bv9tjkH9JHYvOr92hi5LxQ
DS courses review
- goo.gl/5VGU5A
- goo.gl/YzVUKf

Posts by tag «digest»:

snakers4 (Alexander), April 09, 06:00

2019 DS / ML digest number 8

Highlights of the week

- Transformer from Facebook with sub-word information;

- How to generate endless sentiment annotation;

- 1M breast cancer images;

spark-in.me/post/2019_ds_ml_digest_08

#digest

#deep_learning

2019 DS/ML digest 08

2019 DS/ML digest 08 Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), March 25, 05:31

Good old OLS regression

I needed some quick boilerplate to create an OLS regression with confidence intervals for a very plain task.

Found some nice statsmodels examples here:

www.statsmodels.org/devel/examples/notebooks/generated/ols.html
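For reference, a dependency-free sketch of the same idea: one-variable OLS with confidence intervals. This is not the statsmodels code from the notebook above. The function name `ols_with_ci` and the toy data are made up for illustration, and the interval uses a normal quantile (an assumption that is only reasonable for larger samples), whereas statsmodels reports intervals based on Student's t.

```python
import math
from statistics import NormalDist, mean


def ols_with_ci(x, y, alpha=0.05):
    """Fit y = a + b * x by least squares.

    Returns (a, b, ci_a, ci_b), where the ci_* values are (low, high) tuples.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope
    a = y_bar - b * x_bar         # intercept
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_b = math.sqrt(s2 / sxx)
    se_a = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    # ASSUMPTION: normal quantile instead of the exact Student's t quantile.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return a, b, (a - z * se_a, a + z * se_a), (b - z * se_b, b + z * se_b)


# Toy data, roughly y = 1 + 2x plus noise (hypothetical values).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
a, b, ci_a, ci_b = ols_with_ci(x, y)
print(f"intercept={a:.3f} CI={ci_a}, slope={b:.3f} CI={ci_b}")
```

For small samples the normal quantile makes the intervals too narrow, so for real work the statsmodels route is the safer one.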

#data_science

2019 DS / ML digest number 7

Highlights of the week

- NN normalization techniques (not batch norm);

- Jetson nano for US$99 released;

- A bitter lesson in AI;

spark-in.me/post/2019_ds_ml_digest_07

#digest

#deep_learning

2019 DS/ML digest 07



snakers4 (Alexander), March 18, 06:18

6th 2019 DS / ML digest

Highlights of the week

- Cool python features;

- Google's on-device STT;

- Why Facebook invested so much in PyTorch 1.0;

spark-in.me/post/2019_ds_ml_digest_06

#digest

#data_science

#deep_learning

2019 DS/ML digest 06



snakers4 (Alexander), March 06, 10:31

5th 2019 DS / ML digest

Highlights of the week

- New Adam version;

- POS tagging and semantic parsing in Russian;

- ML industrialization again;

spark-in.me/post/2019_ds_ml_digest_05

#digest

#data_science

#deep_learning

2019 DS/ML digest 05



snakers4 (Alexander), February 18, 09:24

4th 2019 DS / ML digest

Highlights of the week

- OpenAI controversy;

- BERT pre-training;

- Using transformer for conversational challenges;

spark-in.me/post/2019_ds_ml_digest_04

#digest

#data_science

#deep_learning

2019 DS/ML digest 04



snakers4 (Alexander), February 08, 10:11

Third 2019 DS / ML digest

Highlights of the week

- quaternions;

- ODEs;

spark-in.me/post/2019_ds_ml_digest_03

#digest

#data_science

#deep_learning

2019 DS/ML digest 03



snakers4 (Alexander), January 31, 09:41

Second 2019 DS / ML digest

Highlight of the week - Facebook's LASER.

spark-in.me/post/2019_ds_ml_digest_02

#digest

#data_science

#deep_learning

2019 DS/ML digest 02



snakers4 (Alexander), January 15, 08:33

First 2019 DS / ML digest

No particular highlights - just maybe ML industrialization vector is here to stay?

spark-in.me/post/2019_ds_ml_digest_01

#digest

#deep_learning

#data_science

2019 DS/ML digest 01



snakers4 (Alexander), December 19, 2018

DS/ML digest 32

Highlights:

- A way to replace softmax in NMT;

- Large visual reasoning dataset;

- PyText;

spark-in.me/post/2018_ds_ml_digest_32

#digest

#deep_learning

#data_science

2018 DS/ML digest 32



snakers4 (Alexander), December 09, 2018

DS/ML digest 31

Highlights of the week:

- PyTorch 1.0 released;

- Drawing with GANs;

- BERT explained;

spark-in.me/post/2018_ds_ml_digest_31

#digest

#deep_learning

#data_science

2018 DS/ML digest 31



snakers4 (Alexander), November 28, 2018

DS/ML digest 30

spark-in.me/post/2018_ds_ml_digest_30

#digest

#deep_learning

#data_science

2018 DS/ML digest 30



snakers4 (Alexander), November 15, 2018

DS/ML digest 29

spark-in.me/post/2018_ds_ml_digest_29

#digest

#deep_learning

#data_science

2018 DS/ML digest 29



snakers4 (Alexander), November 06, 2018

DS/ML digest 28

Google open sources pre-trained BERT ... with 102 languages ...

spark-in.me/post/2018_ds_ml_digest_28

#digest

#deep_learning

#data_science

2018 DS/ML digest 28



snakers4 (Alexander), October 23, 2018

DS/ML digest 27

NLP in the focus again!

spark-in.me/post/2018_ds_ml_digest_27

Also your humble servant learned how to do proper NMT =)

#digest

#deep_learning

#data_science

2018 DS/ML digest 27



snakers4 (Alexander), October 15, 2018

DS/ML digest 26

More interesting NLP papers / material ...

spark-in.me/post/2018_ds_ml_digest_26

#digest

#deep_learning

#data_science

2018 DS/ML digest 26



snakers4 (Alexander), September 28, 2018

DS/ML digest 25

spark-in.me/post/2018_ds_ml_digest_25

#digest

#deep_learning

#data_science

2018 DS/ML digest 25



snakers4 (Alexander), September 20, 2018

DS/ML digest 24

Key topics of this one:

- New method to calculate phrase/n-gram/sentence embeddings for rare and OOV words;

- So many releases from Google;

spark-in.me/post/2018_ds_ml_digest_24

If you like our digests, you can support the channel via:

- Sharing / reposting;

- Giving an article a decent comment / a thumbs-up;

- Buying me a coffee (links on the digest);

#digest

#deep_learning

#data_science

2018 DS/ML digest 24



snakers4 (Alexander), September 06, 2018

DS/ML digest 23

The key topic of this one - this is insanity

- vid2vid

- unsupervised NMT

spark-in.me/post/2018_ds_ml_digest_23

If you like our digests, you can support the channel via:

- Sharing / reposting;

- Giving an article a decent comment / a thumbs-up;

- Buying me a coffee (links on the digest);

Let's spread the right DS/ML ideas together.

#digest

#deep_learning

#data_science

2018 DS/ML digest 23



snakers4 (Alexander), August 31, 2018

DS/ML digest 22

spark-in.me/post/2018_ds_ml_digest_22

#digest

#deep_learning

#data_science

2018 DS/ML digest 22



snakers4 (Alexander), August 21, 2018

2018 DS/ML digest 21

spark-in.me/post/2018_ds_ml_digest_21

#digest

#deep_learning

#nlp

2018 DS/ML digest 21



snakers4 (Alexander), August 12, 2018

2018 DS/ML digest 20

spark-in.me/post/2018_ds_ml_digest_20

#deep_learning

#digest

#data_science

2018 DS/ML digest 20



snakers4 (Alexander), July 31, 2018

2018 DS/ML digest 19

Market / data / libraries

(0) 32k lesions image dataset open-sourced

- goo.gl/CUQwnv

- nihcc.app.box.com/v/DeepLesion

(1) A new Distill article about Differentiable Image Parameterizations

- Usually images are parametrized as RGB values (normalized)

- Idea - use different (learnable) parametrization

- distill.pub/2018/differentiable-parameterizations/

- Parametrizing the resulting image with a Fourier transform makes it possible to use different architectures for style transfer distill.pub/2018/differentiable-parameterizations/#figure-style-transfer-diagram

- Working with transparent images

(2) Lip reading with 40% Word Error Rate arxiv.org/pdf/1807.05162.pdf

(3) Joint auto architecture + hyper-param search arxiv.org/pdf/1807.06906.pdf (*)

(4) rl-navigation.github.io/deployable/

(5) New CNN architectures from ICML www.facebook.com/icml.imls/videos/429607650887089/%20 (*)

(6) Jupyter notebook widget for text annotation github.com/natasha/ipyannotate

(7) A bit more debunking of auto-ml by fast.ai www.fast.ai/2018/07/23/auto-ml-3/

(8) A small intro to Bayes methods alexanderdyakonov.wordpress.com/2018/07/30/%d0%b1%d0%b0%d0%b9%d0%b5%d1%81%d0%be%d0%b2%d1%81%d0%ba%d0%b8%d0%b9-%d0%bf%d0%be%d0%b4%d1%85%d0%be%d0%b4/

(9) Criminal face recognition 20% false positives - www.nytimes.com/2018/07/26/technology/amazon-aclu-facial-recognition-congress.html?

(10) Denoising images without noiseless ground-truth news.developer.nvidia.com/ai-can-now-fix-your-grainy-photos-by-only-looking-at-grainy-photos/?ncid=--45511

NLP

(0) Autoencoders for text habr.com/company/antiplagiat/blog/418173/ - no clear conclusion?

(1) RNN use cases overview indico.cern.ch/event/722319/contributions/3001310/attachments/1661268/2661638/IML-Sequence.pdf

(2) ACL 2018 notes ruder.io/acl-2018-highlights/

Hardware

(0) Edge embeddable TPU devices aiyprojects.withgoogle.com/edge-tpu ?

(1) GeForce 11* finally coming soon? Prices for 1080Ti are falling now...

#digest

#deep_learning

NIH Clinical Center releases dataset of 32,000 CT images

Lesion data may make it easier for scientific community to identify tumor growth or new disease


snakers4 (Alexander), July 23, 2018

2018 DS/ML digest 18

Highlights of the week

(0) RL flaws

thegradient.pub/why-rl-is-flawed/

thegradient.pub/how-to-fix-rl/

(1) An intro to AUTO-ML

www.fast.ai/2018/07/16/auto-ml2/

(2) Overview of advances in ML in last 12 months

www.stateof.ai/

Market / applied stuff / papers

(0) New Nvidia Jetson released

www.phoronix.com/scan.php?page=news_item&px=NVIDIA-Jetson-Xavier-Dev-Kit

(1) Medical CV project in Russia - 90% is data gathering

cv-blog.ru/?p=217

(2) Differentiable architecture search

arxiv.org/pdf/1806.09055.pdf

-- 1800 GPU days of reinforcement learning (RL) (Zoph et al., 2017)

-- 3150 GPU days of evolution (Real et al., 2018)

-- 4 GPU days to achieve SOTA on CIFAR => transferable to ImageNet with 26.9% top-1 error

(3) Some basic thoughts about hyper-param tuning

engineering.taboola.com/hitchhikers-guide-hyperparameter-tuning/

(4) FB extending fact checking to mark similar articles

www.poynter.org/news/rome-facebook-announces-new-strategies-combat-misinformation

(5) Architecture behind Alexa choosing skills goo.gl/dWmXZf

- Char-level RNN + Word-level RNN

- Shared encoder, but attention is personalized

(6) An overview of contemporary NLP techniques

medium.com/@ageitgey/natural-language-processing-is-fun-9a0bff37854e

(7) RNNs in particle physics?

indico.cern.ch/event/722319/contributions/3001310/attachments/1661268/2661638/IML-Sequence.pdf?utm_campaign=Revue%20newsletter&utm_medium=Newsletter&utm_source=NLP%20News

(8) Google cloud provides PyTorch images

twitter.com/i/web/status/1016515749517582338

NLP

(0) Use embeddings for positions - no brainer

twitter.com/i/web/status/1018789622103633921

(1) Chatbots were a hype train - lol

medium.com/swlh/chatbots-were-the-next-big-thing-what-happened-5fc49dd6fa61

The vast majority of bots are built using decision-tree logic, where the bot’s canned response relies on spotting specific keywords in the user input.

Interesting links

(0) Reasons to use OpenStreetMap

www.openstreetmap.org/user/jbelien/diary/44356

(1) Google deploys its internet balloons

goo.gl/d5cv6U

(2) Amazing problem solving

nevalalee.wordpress.com/2015/11/27/the-hotel-bathroom-puzzle/

(3) Nice flame thread about CS / ML is not science / just engineering etc

twitter.com/RandomlyWalking/status/1017899452378550273

#deep_learning

#data_science

#digest

RL’s foundational flaw

RL as classically formulated has lately accomplished many things - but that formulation is unlikely to tackle problems beyond games. Read on to see why!


snakers4 (spark_comment_bot), July 13, 2018

2018 DS/ML digest 17

Highlights of the week

(0) Troubling trends with ML scholars

approximatelycorrect.com/2018/07/10/troubling-trends-in-machine-learning-scholarship/

(1) NLP close to its ImageNet stage?

thegradient.pub/nlp-imagenet/

Papers / posts / articles

(0) Working with multi-modal data distill.pub/2018/feature-wise-transformations/

- concatenation-based conditioning

- conditional biasing or scaling ("residual" connections)

- sigmoidal gating

- all in all this approach seems like a mixture of attention / gating for multi-modal problems

(1) Glow, a reversible generative model which uses invertible 1x1 convolutions

blog.openai.com/glow/

(2) Facebook's moonshots - I kind of do not understand much here

- research.fb.com/facebook-research-at-icml-2018/

(3) RL concept flaws?

- thegradient.pub/why-rl-is-flawed/

(4) Intriguing failures of convolutions

eng.uber.com/coordconv/ - this is fucking amazing

(5) People are only STARTING to apply ML to reasoning

deepmind.com/blog/measuring-abstract-reasoning/
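The conditioning variants listed under item (0) above (concatenation, conditional biasing / scaling, sigmoidal gating) can be sketched in a few lines. This is a toy illustration on plain Python lists, not code from the Distill article. The function names are made up, and in a real model `gamma`, `beta` and `gate_logits` would be predicted by a network from the conditioning input rather than passed in by hand.

```python
import math


def concat_condition(features, condition):
    """Concatenation-based conditioning: append the condition to the features."""
    return list(features) + list(condition)


def film(features, gamma, beta):
    """Conditional scaling and biasing: per-feature affine transform."""
    return [g * f + b for f, g, b in zip(features, gamma, beta)]


def sigmoid_gate(features, gate_logits):
    """Sigmoidal gating: scale each feature by a value in (0, 1)."""
    return [f / (1.0 + math.exp(-z)) for f, z in zip(features, gate_logits)]


# Hypothetical feature vector and conditioning parameters.
features = [1.0, 2.0]
print(concat_condition(features, [0.3]))
print(film(features, gamma=[2.0, 0.5], beta=[0.0, 1.0]))
print(sigmoid_gate(features, gate_logits=[0.0, 0.0]))
```

Scaling only, biasing only and gating are all special cases of one per-feature operation, which is why the whole family reads like a mix of attention and gating.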

Yet another online book on Deep Learning

(1) Kind of standard livebook.manning.com/#!/book/grokking-deep-learning/chapter-1/v-10/1

Libraries / code

(0) Data version control continues to develop dvc.org/features

#deep_learning

#data_science

#digest

Like this post or have something to say => tell us more in the comments or donate!

Troubling Trends in Machine Learning Scholarship

By Zachary C. Lipton* & Jacob Steinhardt* *equal authorship Originally presented at ICML 2018: Machine


snakers4 (Alexander), July 04, 2018

2018 DS/ML digest 15

What I filtered through this time

Market / news

(0) Letters by big company employees against using ML for weapons

- Microsoft

- Amazon

(1) Facebook open sources Dense Pose (essentially this is Mask-RCNN)

- research.fb.com/facebook-open-sources-densepose/

Papers / posts / NLP

(0) One more blog post about text / sentence embeddings goo.gl/Zm8C2c

- key idea - different weighting

(1) One more sentence embedding calculation method

- openreview.net/pdf?id=SyK00v5xx ?

(2) Posts explaining NLP embeddings

- www.offconvex.org/2015/12/12/word-embeddings-1/ - some basics - SVD / Word2Vec / GloVe

-- SVD improves embedding quality (as compared to ohe)?

-- use log-weighting, use TF-IDF weighting (the above weighting)

- www.offconvex.org/2016/02/14/word-embeddings-2/ - word embedding properties

-- dimensions vs. embedding quality www.cs.princeton.edu/~arora/pubs/LSAgraph.jpg

(3) Spacy + Cython = 100x speed boost - goo.gl/9TwVqu - good to know about this as a last resort

- described use-case

you are pre-processing a large training set for a DeepLearning framework like pyTorch/TensorFlow
or you have a heavy processing logic in your DeepLearning batch loader that slows down your training

(4) Once again stumbled upon this - blog.openai.com/language-unsupervised/

(5) Papers

- Simple NLP embedding baseline goo.gl/nGujzS

- NLP decathlon for question answering goo.gl/6HHi7q

- Debiasing embeddings arxiv.org/abs/1806.06301

- Once again transfer learning in NLP by open-AI - goo.gl/82VR4U

#deep_learning

#digest

#data_science


snakers4 (Alexander), July 02, 2018

2018 DS/ML digest 14

Amazing article - why you do not need ML

- cyberomin.github.io/startup/2018/07/01/sql-ml-ai.html

- I personally love plain-vanilla SQL and in 90% of cases people under-use it

- I even wrote 90% of my JSON API on our blog in pure PostgreSQL xD

Practice / papers

(0) Interesting papers from CVPR towardsdatascience.com/the-10-coolest-papers-from-cvpr-2018-11cb48585a49

(1) Some down-to-earth obstacles to ML deploy habr.com/company/hh/blog/415437/

(2) Using synthetic data for CNNs (by Nvidia) - arxiv.org/pdf/1804.06516.pdf

(3) This puzzles me - so much effort and engineering spent on something ... strange and useless - taskonomy.stanford.edu/index.html

On paper they do a cool thing - investigate transfer learning between different domains, but in practice it is done on TF and there is no clear conclusion of any kind

(4) VAE + real datasets siavashk.github.io/2016/02/22/autoencoder-imagenet/ - only small Imagenet (64x64)

(5) Understanding the speed of models deployed on mobile - machinethink.net/blog/how-fast-is-my-model/

(6) A brief overview of multi-modal methods medium.com/mlreview/multi-modal-methods-image-captioning-from-translation-to-attention-895b6444256e

Visualizations / explanations

(0) Amazing website with ML explanations explained.ai/

(1) PCA and linear VAEs are close pvirie.wordpress.com/2016/03/29/linear-autoencoders-do-pca/

#deep_learning

#digest

#data_science

No, you don't need ML/AI. You need SQL

A while ago, I did a Twitter thread about the need to use traditional and existing tools to solve everyday business problems other than jumping on new buzzwords, sexy and often times complicated technologies.


snakers4 (Alexander), June 28, 2018

2018 DS/ML digest 13

Blog posts / articles:

(0) Google notes on CNN generalization - goo.gl/XS4KAw

(1) Google teaching robots in a virtual environment and then transferring the models to reality - goo.gl/aAYCqE

(2) Google's object tracking via image colorization - goo.gl/xchvBQ

(3) Interesting articles about VAEs:

- A small intro into VAEs habr.com/company/otus/blog/358946/

- A small intuitive intro (super super cool and intuitive)

towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf

- KL divergence explained

www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained

- A more formal write-up arxiv.org/abs/1606.05908

- In (RU) habr.com/company/otus/blog/358946/

- Converting a FC layer into a conv layer cs231n.github.io/convolutional-networks/#convert

- A post by Fchollet blog.keras.io/building-autoencoders-in-keras.html
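Since KL divergence comes up in every VAE write-up above, here is a minimal sketch of the two forms you usually meet: the discrete definition and the closed-form KL between a diagonal Gaussian and a standard normal (the regularization term in the standard VAE loss). The function names are made up for illustration, and the discrete version assumes `q` is non-zero wherever `p` is non-zero.

```python
import math


def kl_discrete(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    ASSUMPTION: q_i > 0 wherever p_i > 0; terms with p_i == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def kl_gaussian_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over dimensions.

    Per dimension: 0.5 * (mu^2 + sigma^2 - 1 - log(sigma^2)).
    """
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_discrete(p, q), kl_discrete(q, p))  # not symmetric
print(kl_gaussian_to_standard_normal([1.0], [0.0]))
```

The asymmetry in the first print is the practical point to remember: KL(p || q) and KL(q || p) penalize different kinds of mismatch.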

A good in-depth write-up on object detection:

- machinethink.net/blog/object-detection/

- finally a decent explanation of YOLO parametrization machinethink.net/images/object-detection/[email protected]

- best comparison of YOLO and SSD ever - machinethink.net/images/object-detection/[email protected]

Papers with interesting abstracts (just good to know such things exist)

- Low-bit CNNs - ai.intel.com/nervana/wp-content/uploads/sites/53/2018/06/ELQ_CameraReady_CVPR2018.pdf

- Automated Meta ML - arxiv.org/abs/1806.06927

- Idea - use ResNet blocks for boosting - arxiv.org/abs/1706.04964

- 2D-discrete-Fourier transform (2D-DFT) to encode rotational invariance in neural networks - arxiv.org/abs/1805.12301

- Smallify the CNNs - arxiv.org/abs/1806.03723

- BLEU review as a metric - conclusion - it is good on average to measure MT performance - www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00322

"New" ideas in SemSeg:

- UNET + conditional VAE arxiv.org/abs/1806.05034

- Dilated convolutions for large satellite images arxiv.org/abs/1709.00179 - it looks like this works only if you have high resolution with small objects

#digest

#deep_learning

How Can Neural Network Similarity Help Us Understand Training and Generalization?

Posted by Maithra Raghu, Google Brain Team and Ari S. Morcos, DeepMind In order to solve tasks, deep neural networks (DNNs) progressively...


snakers4 (Alexander), June 23, 2018

Interesting links about Internet

- Ben Evans' digest - goo.gl/t9zG4y

- China plans to track cars - goo.gl/jeroFW

- Ben Evans - content is not king anymore - distribution / eco-system are goo.gl/ms2tQd

- Google opens AI center in Ghana - goo.gl/PRHBjq

- (RU) A funny case on censorship in Russia - funny article deleted from habr - sohabr.net/habr/post/414595/

-- It kind of clearly shows that you cannot safely post anything to habr

- India + WhatsApp + lynch mobs - goo.gl/tSBUCp

- Tor foundation about web-tracking and Facebook - goo.gl/H9DSuL

- Docker image jacking for crypto-mining - goo.gl/KrLLuQ

- Ethereum - 75% transactions automated bots - goo.gl/Q9BSNL

- (RU) - analyzing fake elections in Russia - 3-10M votes are fake - habr.com/post/358790/

#internet

2018 DS/ML digest 12

As usual, this is whatever I found really interesting / worth reading.

Implementations / papers / ideas

(0)

You can count bees well with UNet - matpalm.com/blog/counting_bees/

(1)

A really super cool idea - use affine transformations in 3D to stack augmentations on the level of transformation matrices

(3D augs are costly)

- gist.github.com/ematvey/5ca7df5d37c2f6a674390d42ef9e7d59

- both for rotation and scaling

- note a couple of things for easier understanding:

-- there is an offset in the transformations - because the coordinate center is not in the "center"

-- zoom essentially scales unit vectors after applying the offset

- 3Blue1Brown videos about linear algebra - www.youtube.com/watch?v=fNk_zzaMoSs
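A minimal sketch of the matrix-stacking idea, assuming 4x4 homogeneous matrices and a rotation about the z axis only. This is not the code from the gist above and all helper names are made up. The point is that rotation and zoom are multiplied into one matrix first, so the expensive 3D resampling has to run only once.

```python
import math


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def scaling(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]


def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def compose_about_center(center, transforms):
    """Stack transforms (applied first to last) around the volume center.

    The offset is needed because the coordinate origin sits in a corner of
    the volume, not in its center.
    """
    cx, cy, cz = center
    m = translation(-cx, -cy, -cz)       # move the center to the origin
    for t in transforms:
        m = matmul(t, m)                 # left-multiply: applied after m
    return matmul(translation(cx, cy, cz), m)  # move the center back


def apply(m, point):
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))


center = (16.0, 16.0, 16.0)  # hypothetical 32x32x32 volume
m = compose_about_center(center, [rotation_z(math.pi / 2), scaling(2.0)])
print(apply(m, (17.0, 16.0, 16.0)))
```

The voxel one step along x from the center ends up two steps along y from it: rotated by 90 degrees, then zoomed 2x, with the center itself left in place.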

(2)

A top solution from Google's Landmark Challenge - goo.gl/pkZULZ

Essentially

- ensemble of features / skip connections from a CNN (ResNeXt)

- KNN

- use KNN + augment the extracted features by averaging with similar images

- query expansion (use the fact that different crops of the same landmark remain the same landmark)
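The KNN + averaging step can be sketched as follows. This is a toy version on plain lists, not the solution's actual code. It assumes the feature vectors are already L2-normalized (so a dot product is cosine similarity), and the function names are made up for illustration.

```python
import math


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_top_k(query, database, k):
    """Indices of the k most similar database vectors.

    ASSUMPTION: all vectors are L2-normalized, so dot product == cosine.
    """
    scores = [(sum(a * b for a, b in zip(query, d)), i)
              for i, d in enumerate(database)]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]


def expand_query(query, database, k):
    """Average the query with its k nearest neighbours and re-normalize."""
    neighbours = cosine_top_k(query, database, k)
    vectors = [query] + [database[i] for i in neighbours]
    mean_vec = [sum(col) / len(vectors) for col in zip(*vectors)]
    return l2_normalize(mean_vec)


# Hypothetical 2D "features": two similar images and one unrelated image.
database = [l2_normalize(v) for v in ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0])]
query = l2_normalize([1.0, 0.0])
expanded = expand_query(query, database, k=2)
print(cosine_top_k(expanded, database, k=2))
```

Running `expand_query` on every database vector instead of on the query gives the database-side variant, i.e. augmenting the extracted features by averaging them with similar images.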

(3)

(RU) A super cool series about interesting clustering algorithms

- Affinity propagation

-- habr.com/post/321216/

-- www.icmla-conference.org/icmla07/FreyDueckScience07.pdf

- DBSCAN habrahabr.ru/post/322034/

- (spoiler - in practice use awesome HDBSCAN library)

(4)

Brief review of image super-resolution techniques

- habr.com/post/359016/

- In a nutshell, try in this order: FCN CNNs, auto-encoders with skip connections, or GANs

(5)

SOTA NLP by open-ai

blog.openai.com/language-unsupervised/

Key ideas

- Train a transformer language model on a large corpus in an unsupervised way

- Fine-tune on a smaller task

- Profit

Caveats

- "Our approach requires an expensive pre-training step - 1 month on 8 GPUs" (probably this should be discounted somewhat)

- TF and unreadable enterprise code

(6)

One more claimed SOTA word embedding set

allennlp.org/elmo

(7)

A cool github page by Sebastian Ruder to track major NLP tasks

github.com/sebastianruder/NLP-progress

Visualizations

(0)

Amazing visual explanations of how decision trees work

- www.r2d3.us/visual-intro-to-machine-learning-part-2/

- it explains visually how overfitting occurs in decision tree models

(1)

CIFAR T-SNE can be done in real-time on the GPU + tensorflow.js integration

- Blog goo.gl/Pk5Lq3

- Website goo.gl/1vpeFf

- Arxiv - arxiv.org/abs/1802.03680

- Demo - nicola17.github.io/tfjs-tsne-demo/

(2) Why people fail to use d3.js - goo.gl/hSt5dL

Datasets

(0) Nice idea - use available tools and videos to collect datasets

- goo.gl/HULsyH

- goo.gl/7AfRZZ

#digest

snakers4 (Alexander), June 12, 2018

Interesting links about Internet

- Ben Evans' digest - goo.gl/7NkYn6

- Why it took so much time to create previews for Wikipedia - goo.gl/xg7N99

- Google postulating its AI principles? blog.google/topics/ai/ai-principles/

- Google product alternatives - goo.gl/RmA76N - I personally started to switch to more open-source stuff lately, but Docs and Android have no real options

- The future of ML in embedded devices - goo.gl/PjWpKj (sound ideas, but a post is by an evangelist)

- Yahoo messenger shutting down (20 years!) - goo.gl/uhomds - hi ICQ

- Microsoft Buys GitHub for $7.5 Billion - a16z write-up - goo.gl/3znstT

- NYC medallions dropped 5x in price - goo.gl/Vi7pG6

- JD covers villages in China with drone delivery already - goo.gl/bMGKSY

#digest

snakers4 (spark_comment_bot), June 06, 2018

2018 DS/ML digest 11

Datasets

(0)

New Andrew Ng paper on radiology datasets

YouTube 8M Dataset post

As mentioned before - this is more or less blatant TF marketing

New papers / models / architectures

(0) Google RL search for optimal augmentations

- Blog, paper

- Finally Google paid attention to augmentations

- 83.54% top1 accuracy on ImageNet

- Discrete search problem: each policy consists of 5 sub-policies, with each operation associated with two hyperparameters: probability and magnitude

- Training regime - cosine decay for 200 epochs

- Top accuracy on ImageNet

- Best policy

- Typical examples of augmentations
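To make the policy structure above concrete, a minimal sketch: a policy is a list of 5 sub-policies, and each operation carries a probability and a magnitude. The three toy operations, the policy values and the choice of two operations per sub-policy are assumptions for illustration, as is picking one sub-policy at random per image. Images are plain lists of intensities in [0, 1].

```python
import random


def brighten(pixels, magnitude):
    return [min(1.0, p + 0.05 * magnitude) for p in pixels]


def darken(pixels, magnitude):
    return [max(0.0, p - 0.05 * magnitude) for p in pixels]


def invert(pixels, magnitude):
    return [1.0 - p for p in pixels]  # magnitude is ignored by this op


OPS = {"brighten": brighten, "darken": darken, "invert": invert}

# Hypothetical policy: 5 sub-policies of (operation, probability, magnitude).
POLICY = [
    [("brighten", 0.8, 4), ("invert", 0.1, 0)],
    [("darken", 0.6, 2), ("brighten", 0.3, 8)],
    [("invert", 0.2, 0), ("darken", 0.9, 5)],
    [("brighten", 0.5, 6), ("darken", 0.5, 6)],
    [("darken", 0.7, 3), ("invert", 0.4, 0)],
]


def apply_sub_policy(pixels, sub_policy, rng):
    """Apply each operation in order, each with its own probability."""
    for op_name, probability, magnitude in sub_policy:
        if rng.random() < probability:
            pixels = OPS[op_name](pixels, magnitude)
    return pixels


def apply_policy(pixels, policy, rng):
    """Pick one sub-policy uniformly at random and apply it."""
    return apply_sub_policy(pixels, rng.choice(policy), rng)


rng = random.Random(0)
print(apply_policy([0.2, 0.5, 0.9], POLICY, rng))
```

The search then boils down to choosing the operation names, probabilities and magnitudes, which is what makes it a discrete search problem.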

(1)

Training CNNs with less data

Key idea - with clever selection of data you can decrease annotation costs 2-3x

(2)

Regularized Evolution for Image Classifier Architecture Search (AmoebaNet)

- The first controlled comparison of the two search algorithms (genetic and RL)

- Mobile-size ImageNet (top-1 accuracy = 75.1% with 5.1M parameters)

- ImageNet (top-1 accuracy = 83.1%)

Evolution vs. RL at Large-Compute Scale

• Evolution and RL do equally well on accuracy

• Both are significantly better than Random Search

• Evolution is faster

But the proper description of the architecture is nowhere to be seen...
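The search loop itself is simple even if the resulting architecture is not spelled out. A toy sketch of regularized (aging) evolution as I understand it from the paper: sample a few individuals, mutate the best of the sample, add the child and remove the oldest individual rather than the worst. The bit-string "architectures" and the count-the-ones fitness are stand-ins for illustration, not anything from the paper.

```python
import collections
import random


def regularized_evolution(fitness, random_individual, mutate,
                          population_size, sample_size, cycles, rng):
    """Return the best (individual, fitness) pair seen during the search."""
    population = collections.deque()
    history = []
    while len(population) < population_size:
        individual = random_individual(rng)
        population.append((individual, fitness(individual)))
        history.append(population[-1])
    for _ in range(cycles):
        # Tournament: sample with replacement, the best of the sample is the parent.
        sample = [rng.choice(population) for _ in range(sample_size)]
        parent = max(sample, key=lambda pair: pair[1])
        child = mutate(parent[0], rng)
        population.append((child, fitness(child)))
        history.append(population[-1])
        population.popleft()  # aging: drop the oldest, not the worst
    return max(history, key=lambda pair: pair[1])


# Toy problem (assumption): maximize the number of ones in an 8-bit string.
N_BITS = 8


def random_bits(rng):
    return tuple(rng.randint(0, 1) for _ in range(N_BITS))


def flip_one_bit(bits, rng):
    i = rng.randrange(len(bits))
    return bits[:i] + (1 - bits[i],) + bits[i + 1:]


best, best_fitness = regularized_evolution(
    fitness=sum, random_individual=random_bits, mutate=flip_one_bit,
    population_size=10, sample_size=3, cycles=200, rng=random.Random(0))
print(best, best_fitness)
```

Removing the oldest individual is the "regularized" part: a model only survives if its descendants keep scoring well, not because of one lucky evaluation.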

Libraries / code / frameworks

(0) OpenCV installation for Ubuntu18 from source (if you need e.g. video support)

News / market

(0) Idea - adversarial filters for apps - goo.gl/L4Vne7

(1) A list of 30 best practices for amateur ML / DL specialists - forums.fast.ai/t/30-best-practices/12344

- Some ideas about tackling naive NLP problems

- PyTorch allegedly supports just freezing bn layers

- Also a neat idea I tried with inception nets - assign different learning rates to larger models when fine-tuning them

(2) Stumbled upon a reference to NAdam as an optimizer that is a bit better than Adam

It is also described in this popular article

(3) Barcode reader via OpenCV

#deep_learning

#digest

