You’ve carefully formulated and specified a target problem, collected and processed training data, picked a familiar framework (or learned a new one), and, through a process of research, intuition, and luck, chosen an initial network architecture to represent your problem. You’ve methodically run experiments and tuned hyperparameters, getting side-tracked along the way to write tools to help visualize and understand your results. Finally, you have a model that works. You’re done!
Or are you?
Production deployment of your model can still be a minefield. Your options may be limited by the framework you originally chose to implement your model, or you may find that custom work is necessary to add support for an exotic feature of your architecture to a generic exporter. Each of the many options brings its own decisions. Do you adopt something newer and specifically designed for production deployment, such as ONNX, or do you roll your own REST APIs and Docker containers to host your models and allow evaluations to scale across a cluster? Do you target the cloud from day one, if your studio’s security policies even allow it, or opt to keep things simple and hopefully more manageable? What about tight integration with third-party applications such as Houdini, Maya, and Nuke?
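To make the ONNX route a little more concrete, here is a minimal sketch of exporting a trained PyTorch model so that a framework-agnostic runtime can serve it. The DenoiserNet class, input shape, and file names are hypothetical placeholders rather than any particular studio’s pipeline:

```python
import torch
import torch.nn as nn

class DenoiserNet(nn.Module):
    """Hypothetical stand-in for your trained model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 3, kernel_size=3, padding=1))

    def forward(self, x):
        return self.net(x)

model = DenoiserNet().eval()
dummy_input = torch.randn(1, 3, 256, 256)  # example batch of one RGB image

# Export can fail if the architecture uses an op the exporter does not
# understand -- exactly the "custom exporter work" mentioned above.
torch.onnx.export(
    model, dummy_input, "denoiser.onnx",
    input_names=["image"], output_names=["denoised"],
    dynamic_axes={"image": {0: "batch"}, "denoised": {0: "batch"}})
```

In principle, the resulting file could then be evaluated on the farm or in a host application with a runtime such as onnxruntime, without requiring the original training framework to be installed there.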
Once it’s out there and being relied upon by production, it may still be necessary to apply tweaks for one set of users but not others. The model will likely need to be registered against a particular set of compatible digital assets in an in-house asset management system, tracking the chain of training data and model dependencies as work flows through the pipeline. You will also need to collect feedback from production in order to improve subsequent model iterations. Or perhaps you want to employ continuous integration methodologies to keep your datasets, models, and associated tools current. It may be tempting to treat trained models as if they were any other software package in order to leverage in-house version control tools and support channels, but perhaps not every artist has the GPU hardware on their workstation needed to run inference optimally.
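As a purely illustrative sketch of the kind of lineage tracking described above, a studio might record a small metadata document alongside each trained model for its asset management system to ingest. The field names, asset identifiers, and file paths here are hypothetical, not a real schema or API:

```python
import hashlib
import json
from pathlib import Path

def file_sha256(path: Path) -> str:
    """Hash the model artifact so downstream tools can verify exactly which weights they use."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

artifact = Path("denoiser.onnx")  # hypothetical artifact from the export step above

model_record = {
    "name": "denoiser",
    "version": "1.3.0",
    "artifact": artifact.name,
    "artifact_sha256": file_sha256(artifact) if artifact.exists() else None,
    "training_datasets": ["showA/plates_v012", "showB/renders_v007"],  # hypothetical dataset asset IDs
    "compatible_assets": ["lookdev/charA_v003"],                       # hypothetical downstream dependencies
    "requires_gpu": True,  # lets the pipeline warn artists whose workstations lack suitable hardware
}

# Write the record next to the artifact; an in-house asset system could ingest this file
# to track the chain of training data and model dependencies through the pipeline.
Path("denoiser.model.json").write_text(json.dumps(model_record, indent=2))
```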
Many of the technical problems associated with deep learning production deployment do not yet have complete off-the-shelf solutions which are compatible with studios’ existing workflows and infrastructure. But the coming years are likely to bring some clarity and indications of the best ways that studios can efficiently integrate deep learning into their pipelines.
We look forward to hearing about the industry’s experiences in the future, and would welcome comments or even guest posts on the topic.
We are having a discussion at Siggraph 2018 on Tuesday, Aug 14th that is somewhat related to this subject. We will discuss both production and software development scheduling challenges, including deployment: http://www.jdamato.com/siggraph
It is not meant as a highly technical discussion, and we expect both production and technology people to be present in this open forum. If there are unique aspects of AI/ML that alter the deployment strategy (or, more importantly, the ADOPTION of a specific tool/workflow), then we would love to hear from you. Feel free to drop into the open discussion, or send us what you want to talk about in case we can wedge it into our scheduled agenda.
Thank you for sharing, Joe and Johanna! It certainly would be interesting to hear if any studios have experience in this area.