Spark in me - Internet, data science, math, deep learning, philosophy

snakers4 @ telegram, 1319 members, 1513 posts since 2016

All this - lost like tears in rain.

Data science, deep learning, sometimes a bit of philosophy and math. No bs.

Our website
- spark-in.me
Our chat
- goo.gl/WRm93d
DS courses review
- goo.gl/5VGU5A
- goo.gl/YzVUKf

Posts by tag «data_science»:

snakers4 (Alexander), July 15, 08:54

Sometimes in supervised ML tasks leveraging the data structure in a self-supervised fashion really helps!

Playing with CrowdAI mapping competition

In my opinion it is a good testing ground for your SemSeg ideas - the dataset is really clean and balanced

spark-in.me/post/a-small-case-for-search-of-structure-within-your-data

#deep_learning

#data_science

#satellite_imaging

Playing with Crowd-AI mapping challenge - or how to improve your CNN performance with self-supervised techniques

In this article I describe a couple of neat optimizations / tricks / useful ideas that can be applied to many SemSeg / ML tasks. Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), July 13, 09:15

Tensorboard + PyTorch

6 months ago I looked at this - and it was messy

Now it looks really polished

github.com/lanpa/tensorboard-pytorch

#data_science

lanpa/tensorboard-pytorch

tensorboard-pytorch - tensorboard for pytorch (and chainer, mxnet, numpy, ...)
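
A minimal sketch of how this is typically wired into a PyTorch training loop (the tag name and the dummy model / loss below are just placeholders of mine, not from the repo docs):

# log PyTorch training scalars to TensorBoard via tensorboardX
import torch
from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir='runs/demo')   # event files end up here
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

for step in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    writer.add_scalar('train/loss', loss.item(), step)   # one curve in the TB UI

writer.close()
# view with: tensorboard --logdir runs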


snakers4 (spark_comment_bot), July 13, 05:22

2018 DS/ML digest 17

Highlights of the week

(0) Troubling trends in ML scholarship

approximatelycorrect.com/2018/07/10/troubling-trends-in-machine-learning-scholarship/

(1) NLP close to its ImageNet stage?

thegradient.pub/nlp-imagenet/

Papers / posts / articles

(0) Working with multi-modal data distill.pub/2018/feature-wise-transformations/

- concatenation-based conditioning

- conditional biasing or scaling ("residual" connections)

- sigmoidal gating

- all in all this approach seems like a mixture of attention / gating for multi-modal problems (a minimal bias / scale conditioning sketch is at the end of this digest)

(1) Glow, a reversible generative model which uses invertible 1x1 convolutions

blog.openai.com/glow/

(2) Facebook's moonshots - I kind of do not understand much here

- research.fb.com/facebook-research-at-icml-2018/

(3) RL concept flaws?

- thegradient.pub/why-rl-is-flawed/

(4) Intriguing failures of convolutions

eng.uber.com/coordconv/ - this is fucking amazing

(5) People are only STARTING to apply ML to reasoning

deepmind.com/blog/measuring-abstract-reasoning/

Yet another online book on Deep Learning

(1) Kind of standard livebook.manning.com/#!/book/grokking-deep-learning/chapter-1/v-10/1

Libraries / code

(0) Data version control continues to develop dvc.org/features
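
Re (0) in the papers section - a minimal PyTorch sketch of the conditional biasing / scaling idea (FiLM-style): a small net predicts a per-channel scale and bias from the conditioning input and applies them to the main feature map. All names and shapes here are illustrative, not from the distill.pub post:

import torch
import torch.nn as nn

class FeatureWiseConditioning(nn.Module):
    # predict per-channel (scale, bias) from a conditioning vector
    # and apply them to the features of the main network
    def __init__(self, cond_dim, num_channels):
        super().__init__()
        self.to_scale = nn.Linear(cond_dim, num_channels)
        self.to_bias = nn.Linear(cond_dim, num_channels)

    def forward(self, features, cond):
        # features: (N, C, H, W), cond: (N, cond_dim)
        scale = self.to_scale(cond).unsqueeze(-1).unsqueeze(-1)
        bias = self.to_bias(cond).unsqueeze(-1).unsqueeze(-1)
        return (1 + scale) * features + bias   # "residual" scaling around identity

film = FeatureWiseConditioning(cond_dim=16, num_channels=64)
x = torch.randn(8, 64, 32, 32)   # e.g. image features
c = torch.randn(8, 16)           # e.g. text / metadata embedding
out = film(x, c)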

#deep_learning

#data_science

#digest

Like this post or have something to say => tell us more in the comments or donate!

Troubling Trends in Machine Learning Scholarship

By Zachary C. Lipton* & Jacob Steinhardt* *equal authorship Originally presented at ICML 2018: Machine


snakers4 (Alexander), July 09, 09:04

2018 DS/ML digest 16

Papers / posts

(0) RL now solves Quake

venturebeat.com/2018/07/03/googles-deepmind-taught-ai-teamwork-by-playing-quake-iii-arena/

(1) A fast.ai post about AdamW

www.fast.ai/2018/07/02/adam-weight-decay/

-- Adam generally requires more regularization than SGD, so be sure to adjust your regularization hyper-parameters when switching from SGD to Adam

-- Amsgrad turns out to be very disappointing

-- Refresher article ruder.io/optimizing-gradient-descent/index.html#nadam

(2) How to tackle new classes in CV

petewarden.com/2018/07/06/what-image-classifiers-can-do-about-unknown-objects/

(3) A new word in GANs?

-- ajolicoeur.wordpress.com/RelativisticGAN/

-- arxiv.org/pdf/1807.00734.pdf

(4) Using deep learning representations for search

-- goo.gl/R1vhTh

-- library for fast approximate nearest-neighbour search in Python github.com/spotify/annoy (a tiny usage sketch is at the end of this digest)

(5) One more paper on GAN convergence

avg.is.tuebingen.mpg.de/publications/meschedericml2018

(6) Switchable normalization - adds a bit to ResNet50 + pre-trained models

github.com/switchablenorms/Switchable-Normalization

Datasets

(0) Disney starts to release datasets

www.disneyanimation.com/technology/datasets

Market / interesting links

(0) A motion to open-source GitHub

github.com/dear-github/dear-github/issues/304

(1) Allegedly GTX 1180s are starting to appear on sale in Asia (?)

(2) Some controversy regarding Andrew Ng and self-driving cars goo.gl/WNW4E3

(3) National AI strategies overviewed - goo.gl/BXDCD7

-- Canada C$135m

-- China has the largest strategy

-- Notably - countries like Finland also have one

(4) Amazon allegedly sells face recognition to the USA goo.gl/eDzekn
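
Re (4) in the papers section - a tiny usage sketch for spotify/annoy (approximate nearest neighbours over embedding vectors); the vectors below are random stand-ins for real CNN / text embeddings:

import numpy as np
from annoy import AnnoyIndex

dim = 128
index = AnnoyIndex(dim, 'angular')   # 'angular' ~ cosine distance

vectors = np.random.randn(10000, dim).astype('float32')
for i, vec in enumerate(vectors):
    index.add_item(i, vec)

index.build(10)                      # 10 trees; more trees = better recall, slower build
index.save('embeddings.ann')         # memory-mapped, cheap to reload

query = vectors[0]
print(index.get_nns_by_vector(query, 5))   # ids of the 5 nearest neighbours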

#data_science

#deep_learning

Google’s DeepMind taught AI teamwork by playing Quake III Arena

Google’s DeepMind today shared the results of training multiple AI systems to play Capture the Flag on Quake III Arena, a multiplayer first-person shooter game. The AI played nearly 450,000 g…


snakers4 (Alexander), July 08, 06:06

A new multi-threaded addition to the pandas stack?

Read about this some time ago (when this was just in development snakers41.spark-in.me/1850) - found essentially 3 alternatives

- just being clever about optimizing your operations + using what is essentially a multi-threaded map/reduce in pandas snakers41.spark-in.me/1981

- pandas on ray

- dask (overkill)

Links:

(0) rise.cs.berkeley.edu/blog/pandas-on-ray-early-lessons/

(1) www.reddit.com/comments/8wuz7e

(2) github.com/modin-project/modin

So... I ran a test in the notebook I had on hand. It works. More tests will be done in the future.

pics.spark-in.me/upload/2c7a2f8c8ce1dd7a86a54ec3a3dcf965.png
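
A minimal sketch of what the drop-in replacement looks like (per the modin README the only change is the import; the CSV path and column name below are placeholders):

# import pandas as pd           # the usual single-threaded way
import modin.pandas as pd       # same API, parallelized (Ray / pandas-on-ray under the hood)

df = pd.read_csv('some_big_file.csv')        # placeholder path
print(df.groupby('some_column').size())      # plain pandas code, unchanged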

#data_science

#pandas

Spark in me - Internet, data science, math, deep learning, philosophy

Pandas on Ray - RISE Lab https://rise.cs.berkeley.edu/blog/pandas-on-ray/


snakers4 (spark_comment_bot), July 07, 12:29

Playing with VAEs and their practical use

So, I played a bit with Variational Auto Encoders (VAE) and wrote a small blog post on this topic

spark-in.me/post/playing-with-vae-umap-pca

Please like, share and repost!

#deep_learning

#data_science

Like this post or have something to say => tell us more in the comments or donate!

Playing with Variational Auto Encoders - PCA vs. UMAP vs. VAE on FMNIST / MNIST

In this article I thoroughly compare the performance of VAE / PCA / UMAP embeddings on a simplistic domain - UMAP Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), July 04, 07:57

2018 DS/ML digest 15

What I filtered through this time

Market / news

(0) Letters by big company employees against using ML for weapons

- Microsoft

- Amazon

(1) Facebook open-sources DensePose (essentially this is Mask-RCNN)

- research.fb.com/facebook-open-sources-densepose/

Papers / posts / NLP

(0) One more blog post about text / sentence embeddings goo.gl/Zm8C2c

- key idea - different weighting (a tiny weighted-average embedding sketch is at the end of this digest)

(1) One more sentence embedding calculation method

- openreview.net/pdf?id=SyK00v5xx ?

(2) Posts explaining NLP embeddings

- www.offconvex.org/2015/12/12/word-embeddings-1/ - some basics - SVD / Word2Vec / GloVe

-- SVD improves embedding quality (as compared to one-hot encoding)?

-- use log-weighting, use TF-IDF weighting (the above weighting)

- www.offconvex.org/2016/02/14/word-embeddings-2/ - word embedding properties

-- dimensions vs. embedding quality www.cs.princeton.edu/~arora/pubs/LSAgraph.jpg

(3) Spacy + Cython = 100x speed boost - goo.gl/9TwVqu - good to know about this as a last resort

- described use-case

you are pre-processing a large training set for a deep learning framework like PyTorch / TensorFlow

or you have heavy processing logic in your deep learning batch loader that slows down your training

(4) Once again stumbled upon this - blog.openai.com/language-unsupervised/

(5) Papers

- Simple NLP embedding baseline goo.gl/nGujzS

- NLP decathlon for question answering goo.gl/6HHi7q

- Debiasing embeddings arxiv.org/abs/1806.06301

- Once again transfer learning in NLP by open-AI - goo.gl/82VR4U
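
A minimal sketch of the IDF-weighted average-of-word-vectors baseline mentioned above; word_vectors here is an assumed dict of pre-trained embeddings (random stand-ins below), not something taken from the linked posts:

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = ['the cat sat on the mat', 'dogs are great pets']
vocab = 'the cat sat on mat dogs are great pets'.split()
word_vectors = {w: np.random.randn(300) for w in vocab}   # stand-in for GloVe / word2vec

tfidf = TfidfVectorizer().fit(sentences)
idf = {w: tfidf.idf_[i] for w, i in tfidf.vocabulary_.items()}

def sentence_embedding(sentence, dim=300):
    # IDF-weighted average of word vectors - a surprisingly strong simple baseline
    vecs = [idf.get(w, 1.0) * word_vectors[w] for w in sentence.split() if w in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

print(np.stack([sentence_embedding(s) for s in sentences]).shape)   # (2, 300)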

#deep_learning

#digest

#data_science

Download full.pdf 0.04 MB

snakers4 (Alexander), July 02, 04:51

2018 DS/ML digest 14

Amazing article - why you do not need ML

- cyberomin.github.io/startup/2018/07/01/sql-ml-ai.html

- I personally love plain-vanilla SQL and in 90% of cases people under-use it

- I even wrote 90% of my JSON API on our blog in pure PostgreSQL xD

Practice / papers

(0) Interesting papers from CVPR towardsdatascience.com/the-10-coolest-papers-from-cvpr-2018-11cb48585a49

(1) Some down-to-earth obstacles to ML deploy habr.com/company/hh/blog/415437/

(2) Using synthetic data for CNNs (by Nvidia) - arxiv.org/pdf/1804.06516.pdf

(3) This puzzles me - so much effort and engineering spent on something ... strange and useless - taskonomy.stanford.edu/index.html

On paper they do a cool thing - investigate transfer learning between different domains, but in practice it is done on TF and there is no clear conclusion of any kind

(4) VAE + real datasets siavashk.github.io/2016/02/22/autoencoder-imagenet/ - only small Imagenet (64x64)

(5) Understanding the speed of models deployed on mobile - machinethink.net/blog/how-fast-is-my-model/

(6) A brief overview of multi-modal methods medium.com/mlreview/multi-modal-methods-image-captioning-from-translation-to-attention-895b6444256e

Visualizations / explanations

(0) Amazing website with ML explanations explained.ai/

(1) PCA and linear VAEs are close pvirie.wordpress.com/2016/03/29/linear-autoencoders-do-pca/

#deep_learning

#digest

#data_science

No, you don't need ML/AI. You need SQL

A while ago, I did a Twitter thread about the need to use traditional and existing tools to solve everyday business problems other than jumping on new buzzwords, sexy and often times complicated technologies.


snakers4 (Alexander), July 01, 11:48

Measuring feature importance properly

explained.ai/rf-importance/index.html

Once again stumbled upon an amazing article about measuring feature importance for any ML algorithms:

(0) Permutation importance - if re-training your model is costly, you can just shuffle a column and check how the validation metric degrades

(1) Drop column importance - drop a column, re-train a model, check performance metrics
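
A minimal permutation importance sketch along the lines of (0), on a toy sklearn model (the dataset and column names are made up):

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# toy data: y depends on f0 and f1, f2 is pure noise
rng = np.random.RandomState(42)
X = pd.DataFrame(rng.randn(1000, 3), columns=['f0', 'f1', 'f2'])
y = 3 * X['f0'] - 2 * X['f1'] + rng.randn(1000) * 0.1

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)
model = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_train, y_train)
baseline = r2_score(y_val, model.predict(X_val))

for col in X_val.columns:
    X_shuffled = X_val.copy()
    X_shuffled[col] = rng.permutation(X_shuffled[col].values)   # break the column, keep its distribution
    drop = baseline - r2_score(y_val, model.predict(X_shuffled))
    print(col, round(drop, 3))   # big drop = important feature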

Why it is useful / caveats

(0) If you really care about understanding your domain - feature importances are a must have

(1) All of this works only for powerful models

(2) Landmines include - correlated or duplicate variables, data normalization

Correlated variables

(0) For RF - correlated variables share permutation importance roughly proportionally to their correlation

(1) Drop column importance can behave unpredictably

I personally like engineering different kinds of features and doing ablation tests:

(0) Among feature sets, sharing similar purpose

(1) Within feature sets

#data_science

snakers4 (Alexander), June 10, 15:35

And now the habr.ru article is also live -

habr.com/post/413775/

Please support us with your likes!

#deep_learning

#data_science

Adversarial attacks in the Machines Can See 2018 competition

Or how I ended up on the winning team of the Machines Can See 2018 adversarial competition. The essence of any adversarial attack, explained by example...


snakers4 (Alexander), June 10, 06:50

An interesting idea from a CV conference

Imagine that you have some kind of algorithm, that is not exactly differentiable, but is "back-propable".

In this case you can have very convoluted logic in your "forward" statement (essentially something in between trees and dynamic programming) - for example a set of clever if-statements.

In this case you get the best of both worlds - your algorithm (you will have to re-implement it in your framework) and backprop + CNNs. Nice.

Ofc this works only for dynamic deep-learning frameworks.
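
A toy PyTorch illustration of the point (my own example, not code from the talk) - the forward pass branches on the data with plain Python if-statements, and autograd still backprops through whichever path was actually taken:

import torch
import torch.nn as nn

class BranchyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.small = nn.Linear(8, 1)
        self.big = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):
        # arbitrary python logic in forward - only the branch actually taken
        # ends up in the autograd graph
        if x.abs().mean() > 1.0:
            return self.big(x)
        return self.small(x)

model = BranchyModule()
x = torch.randn(4, 8, requires_grad=True)
loss = model(x).sum()
loss.backward()   # gradients flow through the executed branch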

#deep_learning

#data_science

Machines Can See 2018 adversarial competition

Happened to join forces with a team that won 2nd place in this competition

- spark-in.me/post/playing-with-mcs2018-adversarial-attacks

It was very entertaining and a new domain to me.

Read more materials:

- Our repo github.com/snakers4/msc-2018-final

- Our presentation drive.google.com/file/d/1P-4AdCqw81nOK79vU_m7IsCVzogdeSNq/view

- All presentations drive.google.com/file/d/1aIUSVFBHYabBRdolBRR-1RKhTMg-v-3f/view

#data_science

#deep_learning

#adversarial

Playing with adversarial attacks on Machines Can See 2018 competition

This article is about the MCS 2018 competition and my participation in it, adversarial attack methods and how our team won. Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), June 07, 14:13

snakers4 (Alexander), May 25, 07:29

New competitions on Kaggle

Kaggle has started a new competition with video ... which is one of those competitions (read between the lines - blatant marketing)

www.kaggle.com/c/youtube8m-2018

I.e.

- TensorFlow Record files

- Each of the top 5 ranked teams will receive $5,000 per team as a travel award - no real prizes

- The complete frame-level features take about 1.53TB of space (and yes, these are not videos, but extracted CNN features)

So, they are indeed using their platform to promote their business interests.

Released free datasets are really cool, but only when you can use them for transfer learning, which implies also seeing the underlying ground-level data (i.e. the images or videos).

#data_science

#deep_learning

The 2nd YouTube-8M Video Understanding Challenge

Can you create a constrained-size model to predict video labels?


snakers4 (Alexander), May 19, 14:32

A thorough and short guide to the Matplotlib API

A bit of history, a small look under the hood, and a logical explanation of how to use it best:

realpython.com/python-matplotlib-guide/
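
One of the main practical points there is the object-oriented interface (explicit Figure / Axes objects) instead of the implicit pyplot state machine - a minimal sketch:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))   # explicit Figure and Axes
axes[0].plot(x, np.sin(x), label='sin')
axes[0].set_title('line plot')
axes[0].legend()
axes[1].hist(np.random.randn(1000), bins=30)
axes[1].set_title('histogram')
fig.tight_layout()
fig.savefig('demo.png')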

#data_science

Python Plotting With Matplotlib (Guide) – Real Python

This article is a beginner-to-intermediate-level walkthrough on Python and matplotlib that mixes theory with example.


snakers4 (Alexander), May 15, 05:04

A great presentation about the current state of particle tracking + ML

Also Kaggle failed to share this for some reason

indico.cern.ch/event/702054/attachments/1606643/2561698/tr180307_davidRousseau_CERN_trackML-FINAL.pdf

Key problem - the current algorithm (a Kalman filter) faces time constraints

#data_science

snakers4 (Alexander), May 09, 06:55

A couple of articles about the harsh reality of DS / ML jobs

In a nutshell:

- politics

- unjustified decisions

- same as everywhere

- www.kdnuggets.com/2018/04/why-data-scientists-leaving-jobs.html

- www.rdisorder.eu/2017/09/13/most-difficult-thing-data-science-politics/

True story, especially about political decisions, fight for power, useless dashboards and data monkeys.

#data_science

The most difficult thing in data science: politics

Deep learning looks difficult to you? Come back after you get to know company politics, it will feel like a breeze...


snakers4 (Alexander), May 05, 13:13

Andrew Ng book

Is being released on a chapter-by-chapter basis

- mailchi.mp/4e8328c430be/machine-learning-yearning-chapters-1-1258557?e=fd8b9f9fc6

This book is not really technical, though - it's more or less a collection of advice on how to build ML models as a business process

Interesting idea - splitting your dev set into a black-box dev set and an eyeball dev set ... but this can be replaced by properly using Tensorboard when training...

#data_science

snakers4 (Alexander), May 01, 16:52

2018 DS/ML digest 9

Market / libraries

(0) Tensorflow + Swift - wtf - goo.gl/FDvLM4

(1) Geektimes / Habrahabr.ru going international - goo.gl/dbGNwD

(2) A service for renting GPUs ... from people

- Reddit goo.gl/HxQ54x

- Link vectordash.com/hosting/

- Looks LXC based (afaik - the only user friendly alternative to Docker)

- Cool in theory, no idea how secure this is - we can assume it is as secure as handing a Docker container to a stranger

- They did not reply to me within a week

(3) A friend sent me a list of ... yet more new PyTorch NLP libraries

- goo.gl/kasRfZ, goo.gl/XXnbJy (AllenNLP is the biggest library like this)

- I believe that such libraries are more or less useless for real tasks, but cool to know they exist

(4) New SpaceNet 4? goo.gl/CsSS6P

(5) A new super cool competition on Kaggle about particle physics? www.kaggle.com/c/trackml-particle-identification

Tutorials / basics

(0) Bias vs. Variance (RU) goo.gl/4Y7tH7

(1) Yet another magic Jupyter guideline collection - goo.gl/AFWMuq

Real world ML applications

(0) ResNet + object detection (RU) - detecting people w/o helmets with 90% accuracy - goo.gl/7xpQnE

(1) Fast.ai about using embeddings with Tabular data - www.fast.ai/2018/04/29/categorical-embeddings/

Very similar to our approach on electricity

I personally do not recommend using their library at all

(2) Comparing Google TPU vs. V100 with ResNet50 - goo.gl/s6dhsy

- speed - goo.gl/Pww2sm

- pricing - goo.gl/Rtkp8Q

- but ... buying GPUs is much cheaper

(3) Other blog posts about embeddings + tabular data

- Sales prediction blog.kaggle.com/2016/01/22/rossmann-store-sales-winners-interview-3rd-place-cheng-gui/

- Taxi drive prediction blog.kaggle.com/2015/07/27/taxi-trajectory-winners-interview-1st-place-team-%F0%9F%9A%95/

MLP + classification + embeddings - goo.gl/AMNGNG / arxiv.org/pdf/1508.00021.pdf

(4) Albu's solution to SpaceNet - augmentations github.com/SpaceNetChallenge/RoadDetector/tree/master/albu-solution/src/augmentations

CNN overview

Neural network part:

Split the data into 4 folds randomly, but with the same number of tiles from each city in every fold

Use resnet34 as the encoder and a unet-like decoder (conv-relu-upsample-conv-relu) with skip connections from every layer of the network. Loss function: 0.8*binary_cross_entropy + 0.2*(1 - dice_coeff). Optimizer - Adam with default params.

Train on 512*512 image crops with batch size 11 for 30 epochs (8 times more images in one epoch)

Train 20 epochs with lr 1e-4

Train 5 epochs with lr 2e-5

Train 5 epochs with lr 4e-6

Predict on full image with padding 22 on borders (1344*1344).

Merge folds by mean
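
A minimal PyTorch sketch of the loss from the recipe above (0.8 * BCE + 0.2 * (1 - dice)), assuming raw logits from the decoder and binary masks - my own re-statement, not Albu's actual code:

import torch
import torch.nn.functional as F

def bce_dice_loss(logits, targets, bce_weight=0.8, eps=1e-7):
    # logits: (N, 1, H, W) raw decoder outputs, targets: same shape, values in {0, 1}
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    intersection = (probs * targets).sum()
    dice = (2 * intersection + eps) / (probs.sum() + targets.sum() + eps)
    return bce_weight * bce + (1 - bce_weight) * (1 - dice)

logits = torch.randn(4, 1, 512, 512, requires_grad=True)
masks = (torch.rand(4, 1, 512, 512) > 0.5).float()
loss = bce_dice_loss(logits, masks)
loss.backward()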

Jobs / job market

(0) Developers by country by scraping GitHub - goo.gl/n8gnLi

- developers count vs. GDP prntscr.com/j9v80e R^2 = 84%

- developers count vs. population - R^2 = 50%

Visualization

(0) Interactive tool for visualizing convolutions - ezyang.github.io/convolution-visualizer/

Datasets

(0) Open Images v4 open-sourced

- research.googleblog.com/2018/04/announcing-open-images-v4-and-eccv-2018.html

- the dataset itself storage.googleapis.com/openimages/web/download.html

- categories storage.googleapis.com/openimages/2018_04/bbox_labels_600_hierarchy_visualizer/circle.html

#data_science

#deep_learning

#digest

tensorflow/swift

swift - Swift for TensorFlow documentation repository.


snakers4 (Alexander), May 01, 05:37

Playing with unsupervised learning in genetics

A small blog post on this topic

spark-in.me/post/playing-with-genetics

The first thing that springs to mind is an RNN, but what if there is no annotation and it is not known whether the data consists of valid sequences?

#data_science

Playing with genetic markers, clustering and visualization

Mesmerizing structures found in data: encoding, dimension reduction, clustering and visualization of a dataset with genetic markers. Author's articles - http://spark-in.me/author/yara_tchk Blog - http://spark-in.me


snakers4 (Alexander), April 28, 08:45

Using Mendeley to read papers

Looks like when you migrate to a new PC it can also migrate your literature library.

Nice.

#data_science

snakers4 (Alexander), April 27, 09:58

A handy snippet for `IOU` calculation

stackoverflow.com/questions/25349178/calculating-percentage-of-bounding-box-overlap-for-image-detector-evaluation
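
The gist of it - my own minimal re-statement of the usual formula (boxes as (x1, y1, x2, y2)), not a copy of the SO answer:

def bbox_iou(box_a, box_b):
    # boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2
    inter_x1 = max(box_a[0], box_b[0])
    inter_y1 = max(box_a[1], box_b[1])
    inter_x2 = min(box_a[2], box_b[2])
    inter_y2 = min(box_a[3], box_b[3])

    inter = max(0, inter_x2 - inter_x1) * max(0, inter_y2 - inter_y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])

    return inter / float(area_a + area_b - inter)

print(bbox_iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 25 / 175 ≈ 0.143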

#deep_learning

Calculating percentage of Bounding box overlap, for image detector evaluation

In testing an object detection algorithm in large images, we check our detected bounding boxes against the coordinates given for the ground truth rectangles. According to the Pascal VOC challenges,


Widen Jupyter editor to 100% wide screen

Just apply this CSS

#texteditor-container {
    width: 95%;
}

#data_science

snakers4 (Alexander), April 26, 04:24

On the surface looks like an interesting competition

Well, I said that about Power Laws - but then it turned out otherwise.

So far I can see CV, NLP and tables in one mix.

www.kaggle.com/c/avito-demand-prediction/

#data_science

Avito Demand Prediction Challenge

Predict demand for an online classified ad


snakers4 (Alexander), April 22, 15:02

DWT article on habrahabr.ru

DS Bowl article is live on habrahabr.ru

- habrahabr.ru/post/354040/

Please support us with your likes.

#data_science

Applying the Deep Watershed Transform in the Kaggle Data Science Bowl 2018 competition

Applying the Deep Watershed Transform in the Kaggle Data Science Bowl 2018. We present a translation of the linked article and the original dockerized code.


snakers4 (Alexander), April 22, 11:40

snakers4 (Alexander), April 20, 04:58

Useful Python abstractions / sugar / patterns

I already shared a book about patterns, which contains mostly high-level / more complicated patterns. But for writing ML code a simple imperative / functional programming style is sometimes ok.

So - I will be posting about simple and really powerful python tips I am learning now.

This time I found out about map and filter, which are super useful for data preprocessing:

Map

items = [1, 2, 3, 4, 5]

squared = list(map(lambda x: x**2, items))

Filter

number_list = range(-5, 5)

less_than_zero = list(filter(lambda x: x < 0, number_list))

print(less_than_zero)

Also found this book - book.pythontips.com/en/latest/map_filter.html

#python

#data_science

snakers4 (Alexander), April 17, 19:14

Andrew NG released the first 4 chapters of his new book

So far looks not really technical

- gallery.mailchimp.com/dc3a7ef4d750c0abfc19202a3/files/704291d2-365e-45bf-a9f5-719959dfe415/Ng_MLY01.pdf

#data_science

Download Ng_MLY01.pdf 1.52 MB

snakers4 (Alexander), April 17, 08:50

DS Bowl 2018 top solution

www.kaggle.com/c/data-science-bowl-2018/discussion/54741

#data_science

This is really interesting...their approach to separation is cool

snakers4 (Alexander), April 16, 10:17

A draft of the article about DS Bowl 2018 on Kaggle.

This time it was a lottery.

Good that I did not really spend much time, but this time I learned a lot about watershed and some other instance segmentation methods!

The article is accompanied by a dockerized PyTorch code release on GitHub:

- spark-in.me/post/playing-with-dwt-and-ds-bowl-2018

- github.com/snakers4/ds_bowl_2018

This is a beta, you are welcome to comment and respond.

Kudos!

#data_science

#deep_learning

#instance_segmentation

Applying Deep Watershed Transform to Kaggle Data Science Bowl 2018 (dockerized solution)

In this article I describe my solution to the DS Bowl 2018, why it was a lottery, and post a link to my dockerized solution. Author's articles - http://spark-in.me/author/snakers41 Blog - http://spark-in.me


snakers4 (Alexander), April 15, 08:06

2018 DS/ML digest 8

As usual my short bi-weekly (or less) digest of everything that passed my BS detector

Market / blog posts

(0) Fast.ai about the importance of accessibility in ML - www.fast.ai/2018/04/10/stanford-salon/

(1) Some interesting news about the market, mostly self-driving cars (the rest is crap) - goo.gl/VKLf48

(2) US$600m investment into Chinese face recognition - goo.gl/U4k2Mg

Libraries / frameworks / tools

(0) New 5 point face detector in Dlib for face alignment task - goo.gl/T73nHV

(1) Finally a more proper comparison of XGB / LightGBM / CatBoost - goo.gl/AcszWZ (also see my thoughts here snakers41.spark-in.me/1840)

(3) CNNs on FPGAs by ZFTurbo

-- www.youtube.com/watch?v=Lhnf596o0cc

-- github.com/ZFTurbo/Verilog-Generator-of-Neural-Net-Digit-Detector-for-FPGA

(4) Data version control - looks cool

-- dataversioncontrol.com

-- goo.gl/kx6Qdf

-- but I will not use it - because proper logging and treating data as immutable solve the issue

-- looks like over-engineering for the sake of over-engineering (unless you create 100500 datasets per day)

Visualizations

(0) TF Playground to see how the simplest neural networks work - goo.gl/cu7zTm

Applications

(0) Looks like GAN + ResNet + Unet + content loss - can easily solve simpler tasks like deblurring goo.gl/aviuNm

(1) You can apply dilated convolutions to NLP tasks - habrahabr.ru/company/ods/blog/353060/

(2) High level overview of face detection in ok.ru - goo.gl/fDUXa2

(3) Alternatives to DWT and Mask-RCNN / RetinaNet? medium.com/@barvinograd1/instance-embedding-instance-segmentation-without-proposals-31946a7c53e1

- Has anybody tried anything here?

Papers

(0) A more disciplined approach to training CNNs - arxiv.org/abs/1803.09820 (LR regime, hyper param fitting etc)

(1) GANs for image compression - arxiv.org/pdf/1804.02958.pdf

(2) Paper reviews from ODS - mostly moonshots, but some are interesting

-- habrahabr.ru/company/ods/blog/352508/

-- habrahabr.ru/company/ods/blog/352518/

(3) SqueezeNext - the new SqueezeNet - arxiv.org/abs/1803.10615

#digest

#data_science

#deep_learning

snakers4 (Alexander), April 12, 08:08

DS Bowl 2018 stage 2 data was released.

It has a completely different distribution from the stage 1 data.

How do you like them apples?

Looks like Kaggle admins really have no idea about dataset curation, or all of this is meant to mislead manual annotators.

Anyway - looks like random bs.

#data_science

#deep_learning