Managing your DS / ML environment neatly and in style
If you need a sophisticated environment for DS / ML / DL work, then maintaining a set of Docker images may be a good idea.
You can also tap into a vast community of amazing and well-maintained Dockerhub repositories (e.g. nvidia, pytorch).
But what if you have to do this for several people? And use it with a proper IDE via ssh?
Well-known features of Docker include copy-on-write and user "forwarding". With a naive approach, each user stores their own copies of the images, which take up quite some space.
You also have to make your ssh daemon work inside the container as a second service.
So I solved these "challenges" and created 2 public layers so far:
- Basic DS / ML layer - `FROM aveysov/ml_images:layer-0` - from dockerfile;
- DS / ML libraries - `FROM aveysov/ml_images:layer-0` - from dockerfile;
Your final dockerfile may look something like this, just pulling from any of those layers. Note that when building it, you will need to pass your UID as a build argument, e.g.:
`docker build --build-arg NB_UID=1000 -t av_final_layer -f Layer_final.dockerfile .`
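For illustration, a final user-specific layer on top of one of the public images might look like the sketch below. The base tag is taken from the list above, but the user name, paths, and notebook command are assumptions, not the exact contents of the original dockerfiles:

```dockerfile
# Sketch of a final, user-specific layer (details are assumptions)
FROM aveysov/ml_images:layer-0

# Accept the host user's UID at build time so that files created
# in mounted volumes keep the correct ownership on the host
ARG NB_UID=1000
ENV NB_USER=user

# Create a non-root user matching the host UID and switch to it
RUN useradd --create-home --uid ${NB_UID} ${NB_USER}
USER ${NB_USER}
WORKDIR /home/${NB_USER}

# Launch a notebook server by default
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--no-browser"]
```

This is why the `--build-arg NB_UID=1000` flag in the build command matters: it is consumed by the `ARG NB_UID` line.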
When launched, the container starts a notebook with extensions. You can just exec into the container itself to run scripts, or use an ssh daemon inside (do not forget to add your ssh key and run `service ssh start`).
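Concretely, the two options above look like this (the container name `ml_container` and the published port `2222` are placeholders for your own setup):

```shell
# Option 1: open an interactive shell inside the running container
docker exec -it ml_container /bin/bash

# Option 2: start the ssh daemon as a second service inside the
# container, then connect from your IDE via the mapped port
docker exec ml_container service ssh start
ssh -p 2222 user@localhost   # assumes container port 22 is published as 2222
```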
Using a public Dockerhub account for your private small-scale deploy
Also a lifehack: you can just use Dockerhub for your private stuff, as long as you separate the public part from the private part. Push the public part (i.e. libraries and frameworks) to Dockerhub. Your private Dockerfile will then be something like:
# The base image is the public part you pushed to Dockerhub
# (the image name here is a placeholder)
FROM your_account/your_public_image:latest
COPY your_app_folder your_app_folder
COPY app.py app.py
CMD ["python3", "app.py"]
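The workflow is then to keep this private Dockerfile on the deploy machine and build it there, pulling only the public base from Dockerhub, so your application code never leaves your infrastructure (image names below are placeholders):

```shell
# Build the private image locally; only the public base is pulled
docker build -t my_private_app .

# Run it; nothing private is ever pushed to Dockerhub
docker run -d --name my_private_app my_private_app
```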