Machine Learning Labs

Machine learning in practice can be an arduous task. It involves managing multiple iterations of code, processing input data, engineering features, training models, visualizing and tabulating results, performing analysis, and drawing on experience to reach conclusions and adapt the system in ways that might yield improved results.

In many cases you will feel that you have more promising ideas than compute resources to run the experiments needed to demonstrate them, and without meticulous book-keeping you may struggle to understand the behaviour of your model under different parameterizations. Worse, it may not be possible to reproduce your own results, as unmanaged changes to code and data become confused over the course of multiple experiments.

Resource management can also be a concern; the cost of high-end GPUs should act as a strong motivator to keep them occupied 24/7 wherever possible, whilst perhaps sharing them fairly with production and amongst multiple developers, or even accessing them securely through a third-party cloud service. The ability to integrate with the studio’s existing renderfarm might also be advantageous.

There are tools out there which attempt to address some of these concerns, for example helping to record data pre-processing pipelines and model parameterizations, version control model code, ensure a reproducible environment (perhaps via a container or virtual machine), monitor training, handle failures gracefully, perform hyper-parameter searches, and help make sense of multiple variations of the same experiment. Here are but a few, both open source and commercial, in their own words:

Airflow is a platform to programmatically author, schedule, and monitor workflows

When workflows are defined as code, they become more maintainable, versionable, testable, and collaborative.
Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.
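To make the idea of a workflow as a DAG of tasks concrete, here is a minimal sketch using only the Python standard library (`graphlib`, Python 3.9+). It is not Airflow's API: the `run_workflow` helper and the task names are hypothetical, and the sketch runs tasks one at a time in a single process rather than scheduling them on workers.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies):
    """Run each task once, after all of its upstream tasks have finished.

    tasks: dict mapping task name -> zero-argument callable
    dependencies: dict mapping task name -> set of upstream task names
    Returns the order in which the tasks were executed.
    """
    # static_order() yields every node after its predecessors and
    # raises graphlib.CycleError if the graph is not acyclic.
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name]()
    return order


# Hypothetical four-step pipeline; each task only records that it ran.
log = []
tasks = {
    "extract": lambda: log.append("extract"),
    "features": lambda: log.append("features"),
    "train": lambda: log.append("train"),
    "report": lambda: log.append("report"),
}
dependencies = {
    "extract": set(),
    "features": {"extract"},
    "train": {"features"},
    "report": {"train"},
}

executed = run_workflow(tasks, dependencies)
```

Because the dependencies are declared as data rather than buried in call order, the same description can be versioned, tested, and visualized, which is the property the Airflow description emphasizes.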

Artemis is a collection of tools that make it easier to run experiments in Python. These include:

  • An easy-to-use system for making live plots, to monitor variables in a running experiment.
  • A browser-based plotter for displaying live plots.
  • A framework for defining experiments and logging their results (text output and figures) so that they can be reviewed later and replicated easily.
  • A system for downloading/caching files, to a local directory, so the same code can work on different machines.

Running 100 experiments in parallel on different versions of your code/data? Don’t remember how you got that result from 6 months ago? CodaLab allows you to run your jobs on a cluster, document and share your experiments, all while keeping track of full provenance, so you can be a more efficient researcher.

  • Tracks your code, experiments and results
  • Team collaboration built in
  • Works with most machine learning libraries
  • Free for individuals

DVC is an open source tool for data science projects. DVC makes your data science projects reproducible by automatically building a data dependency graph (DAG). Your code and the dependencies could be easily shared by Git, and data – through cloud storage (AWS S3, GCP) in a single DVC environment.

Datmo is a simple CLI workflow tool and web platform to bring standardization, complete reproducibility, and collaboration to quantitative workflows.

The NVIDIA Deep Learning GPU Training System (DIGITS) puts the power of deep learning into the hands of engineers and data scientists. DIGITS can be used to rapidly train highly accurate deep neural networks (DNNs) for image classification, segmentation and object detection tasks.
DIGITS simplifies common deep learning tasks such as managing data, designing and training neural networks on multi-GPU systems, monitoring performance in real time with advanced visualizations, and selecting the best performing model from the results browser for deployment. DIGITS is completely interactive so that data scientists can focus on designing and training networks rather than programming and debugging.

Domino accelerates the development and delivery of models with key capabilities of infrastructure automation, seamless collaboration, and automated reproducibility. This greatly increases the productivity of data scientists and removes bottlenecks in the data science lifecycle.

FGLab is a machine learning dashboard, designed to make prototyping experiments easier. Experiment details and results are sent to a database, which allows analytics to be performed after their completion.

Platform-as-a-Service for training and deploying your DL models in the cloud. Start running your first project in < 30 sec! FloydHub takes care of the grunt work so you can focus on the core of your problem.

Luigi is a Python (2.7, 3.3, 3.4, 3.5) package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more.
The purpose of Luigi is to address all the plumbing typically associated with long-running batch processes. You want to chain many tasks, automate them, and failures will happen. These tasks can be anything, but are typically long running things like Hadoop jobs, dumping data to/from databases, running machine learning algorithms, or anything else.
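One way to cope with "failures will happen" is to treat a task as complete once its output exists, so that a rerun skips finished work and resumes at the point of failure. The sketch below illustrates that idea with the standard library only. It is not Luigi's API: the `Task` class, the `build` function, and the file names are hypothetical, and the failure is simulated.

```python
import tempfile
from pathlib import Path


class Task:
    """A hypothetical batch task that is complete once its output file exists."""

    def __init__(self, name, output, requires=(), action=None):
        self.name = name
        self.output = Path(output)
        self.requires = tuple(requires)
        # By default a task simply writes its own name as its output.
        self.action = action or (lambda: self.name)

    def complete(self):
        return self.output.exists()


def build(task, ran):
    """Run upstream tasks first, then this task, skipping anything complete."""
    if task.complete():
        return
    for upstream in task.requires:
        build(upstream, ran)
    # Write to a temporary file and rename it, so a crash part-way through
    # never leaves a partial output that looks complete.
    tmp = task.output.with_suffix(".tmp")
    tmp.write_text(task.action())
    tmp.replace(task.output)
    ran.append(task.name)


attempts = {"train": 0}


def flaky_train():
    """Simulated training step that fails on its first attempt only."""
    attempts["train"] += 1
    if attempts["train"] == 1:
        raise RuntimeError("simulated failure")
    return "model"


workdir = Path(tempfile.mkdtemp())
dump = Task("dump", workdir / "dump.txt")
train = Task("train", workdir / "model.txt", requires=[dump], action=flaky_train)

first_run, second_run = [], []
try:
    build(train, first_run)   # "dump" succeeds, "train" fails
except RuntimeError:
    pass
build(train, second_run)      # "dump" is skipped, "train" is retried
```

On the second call only the failed step is executed again; the upstream dump is not repeated because its output file is already present.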

ModelDB is an end-to-end system to manage machine learning models. It ingests models and associated metadata as models are being trained, stores model data in a structured format, and surfaces it through a web-frontend for rich querying. ModelDB can be used with any ML environment via the ModelDB Light API. ModelDB native clients can be used for advanced support in spark.ml and scikit-learn.
The ModelDB frontend provides rich summaries and graphs showing model data. The frontend provides functionality to slice and dice this data along various attributes (e.g. operations like filter by hyperparameter, group by datasets) and to build custom charts showing model performance.

Neptune is a Machine Learning Lab provided as a service for data scientists to speed up the development and productionisation of machine learning models.

Pinball is a scalable workflow manager.

Reskit (researcher’s kit) is a library for creating and curating reproducible pipelines for scientific and industrial machine learning. The natural extension of the scikit-learn Pipelines to general classes of pipelines, Reskit allows for the efficient and transparent optimization of each pipeline step. Main features include data caching, compatibility with most of the scikit-learn objects, optimization constraints (e.g. forbidden combinations), and table generation for quality metrics. Reskit also allows for the injection of custom metrics into the underlying scikit frameworks. Reskit is intended for use by researchers who need pipelines amenable to versioning and reproducibility, yet who also have a large volume of experiments to run.

Sacred is a tool to help you configure, organize, log and reproduce experiments. It is designed to do all the tedious overhead work that you need to do around your actual experiment in order to:

  • keep track of all the parameters of your experiment
  • easily run your experiment for different settings
  • save configurations for individual runs in a database
  • reproduce your results
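The four goals listed above can be sketched in a few lines of standard-library Python. This is not Sacred's API: the `DEFAULTS` dictionary, the `run` and `reproduce` functions, and the JSON file layout are assumptions made for illustration, and the "experiment" is a seeded toy computation standing in for real training.

```python
import json
import random
import tempfile
from pathlib import Path

# Every parameter of the experiment lives in one place.
DEFAULTS = {"learning_rate": 0.1, "steps": 5, "seed": 0}


def experiment(config):
    """Stand-in for a training run: a seeded, deterministic toy computation."""
    rng = random.Random(config["seed"])
    loss = 1.0
    for _ in range(config["steps"]):
        loss -= config["learning_rate"] * loss * rng.random()
    return loss


def run(run_dir, **overrides):
    """Run with the given overrides and save the full config and result."""
    config = {**DEFAULTS, **overrides}
    result = experiment(config)
    run_dir = Path(run_dir)
    run_id = len(list(run_dir.glob("run_*.json")))
    path = run_dir / f"run_{run_id}.json"
    path.write_text(json.dumps({"config": config, "result": result}, indent=2))
    return path


def reproduce(path):
    """Re-run from a saved record and report whether the result matches."""
    record = json.loads(Path(path).read_text())
    return experiment(record["config"]) == record["result"]


runs = Path(tempfile.mkdtemp())
baseline = run(runs)
variant = run(runs, learning_rate=0.5, seed=1)
```

Storing the complete configuration, including the random seed, alongside each result is what makes `reproduce` possible; a real tool would typically also record the code version and environment.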

Distributed platform for rapid Deep learning application development.

Visdom aims to facilitate visualization of (remote) data with an emphasis on supporting scientific experimentation. Broadcast visualizations of plots, images, and text for yourself and your collaborators. Organize your visualization space programmatically or through the UI to create dashboards for live data, inspect results of experiments, or debug experimental code.

Where possible, we will star newly uncovered tools on our GitHub account.

Which platforms are you having successes with? Which ones have you tried but subsequently rejected? Did you find any particularly well-suited to integration with your studio’s existing infrastructure? Are you writing your own platform? Or are you simply comfortable using lower level tools? Let us know in the comments section. We will surely be revisiting some of these tools again in the future.
