Machine learning models are trained by learning a mapping from a set of input features to an output target. Typically, this mapping is learned by optimizing a cost function to minimize prediction error. Once the best model is found, it is released with the goal of generating accurate predictions on future, unseen data. Ideally, the model predicts these future instances as accurately as it predicted the data used during training.
Upon deployment, the assumption is that future data will be similar to previously observed data and that the distributions of the features and targets will remain fairly constant. In practice, this assumption rarely holds. A machine learning model’s predictive performance is expected to start declining as soon as the model is deployed to production.
Data source changes
Since changes are expected over time, model deployment should be treated as a continuous process. Take house price prediction as an example: house prices do not stay the same, so a model trained on price data from some time ago will not offer great predictions today. Another example is a computer vision model trained in the lab to detect crowds in summer, which may not function optimally in winter because the setup and conditions differ. Hence, the objective is to continually ensure that the quality of your model in production stays up to date.
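To make the idea of a changing data source concrete, here is a minimal sketch of one common way to check whether a feature has drifted: comparing the training-time values with recent production values using the two-sample Kolmogorov-Smirnov statistic. The function names, the sample prices, and the 0.3 threshold are assumptions chosen for illustration; they are not part of NATIX reTrain.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The largest gap is always reached at one of the observed values.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def has_drifted(reference, current, threshold=0.3):
    """Flag drift when the statistic exceeds a threshold (0.3 is an assumed value)."""
    return ks_statistic(reference, current) > threshold


# Hypothetical house prices (in thousands) at training time and today.
training_prices = [200, 210, 220, 230, 240]
todays_prices = [300, 310, 320, 330, 340]
print(has_drifted(training_prices, todays_prices))
```

In practice the threshold would be tuned per feature, and a check like this would run on a schedule so that a drop in data similarity is noticed before prediction quality suffers.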
NATIX reTrain
NATIX reTrain is one of the newest features of our platform, released to address such issues for computer vision models deployed on the edge, especially in smart city projects. NATIX reTrain is an automatic feedback loop between training and detection nodes for continuous improvement of the deployed AI. This means that the NATIX system adapts itself to different contexts and environments. For instance, a person detection model created from general datasets in our lab can be iteratively adapted to the streets of Den Haag to deliver optimal detection for that environment. The user has full control over the process, and the training is done privately on the client’s trusted nodes.
With reTrain, users can benefit from
How does it work?
Periodically and with the user’s consent, the NATIX middleware (Vision Deploy) collects low-confidence detection data (i.e. images and metadata) to create a new training dataset. This data is stored on the user’s trusted data nodes for review and the next steps.
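The collection step above can be sketched roughly as follows. This is an illustration only: the `Detection` fields, the `select_low_confidence` function, and the 0.5 threshold are hypothetical and do not describe the actual Vision Deploy interface.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    image_id: str      # hypothetical reference to the stored image
    label: str         # predicted class, e.g. "person"
    confidence: float  # model score in [0, 1]


def select_low_confidence(detections, threshold=0.5, user_consent=False):
    """Return detections scoring below the threshold, but only with consent."""
    if not user_consent:
        return []  # nothing is collected without the user's consent
    return [d for d in detections if d.confidence < threshold]


detections = [
    Detection("img_001", "person", 0.92),
    Detection("img_002", "person", 0.41),
    Detection("img_003", "person", 0.18),
]
candidates = select_low_confidence(detections, threshold=0.5, user_consent=True)
print([d.image_id for d in candidates])
```

Low-confidence detections are the interesting ones because they mark the situations where the current model is least sure of itself, which is where new training data is most likely to help.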
After the data collection process, a second detection model checks whether the labels (annotations) on the images are correct and prepares a list of data points that need to be reviewed by a human.
The authorized user can access this data and review the dataset. After the review is done, a final confirmation from the user is needed. This ensures that the user is fully aware of what data is being used to update the previous model.
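One plausible way to implement such a cross-check is to compare each annotated box with the boxes proposed by the second model and flag those without a sufficiently overlapping match. The sketch below uses intersection over union (IoU) for the overlap; the function names and the 0.5 cut-off are assumptions, not the method NATIX reTrain necessarily uses.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def needs_human_review(primary, secondary, min_iou=0.5):
    """primary is (label, box); secondary is a list of (label, box) from the checker.

    The annotation is accepted only if the checker found the same label
    in roughly the same place; otherwise it goes on the review list.
    """
    label, box = primary
    return not any(
        other_label == label and iou(box, other_box) >= min_iou
        for other_label, other_box in secondary
    )


annotation = ("person", (10, 10, 50, 90))
checker_output = [("person", (12, 11, 52, 88)), ("bicycle", (60, 40, 120, 90))]
print(needs_human_review(annotation, checker_output))
```

Only the annotations on which the two models disagree reach the human reviewer, which keeps the manual workload small.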
NATIX reTrain uses various techniques to expand the new dataset. This is required because the amount of data used to train the previous model is much larger than the new dataset. To see a noticeable impact of the newly collected data on the existing model, we need to amplify its presence with various methods. Subsequently, this dataset is fed to the original model and multiple training runs are started simultaneously. At the end, the new output models are evaluated on the test set, and the one with the highest performance is offered to the user for deployment to the production environment.
Overall, NATIX reTrain provides an iterative learning process that enables client-specific models. This allows computer vision models to be deployed in production without concerns about detection inaccuracy caused by environmental parameters. A model can be transferred in this way from the NATIX lab to the client’s environment, or from the beach boulevard in Scheveningen (City of The Hague) to the urban streets of Amsterdam. Furthermore, it preserves privacy and eliminates unwanted data sharing. In addition, the new feature reduces the costs of staffing, updates, and deployment.
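As a rough illustration of this step, the sketch below amplifies a small new dataset by random oversampling (duplicating samples) and then picks the best of several candidate models by their test-set score. Oversampling is only one possible expansion technique and is used here as an assumption; the function names, file names, and scores are hypothetical.

```python
import random


def oversample(new_samples, target_size, seed=0):
    """Grow a small dataset to target_size by re-drawing samples with replacement."""
    if not new_samples:
        return []
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    missing = max(0, target_size - len(new_samples))
    return list(new_samples) + [rng.choice(new_samples) for _ in range(missing)]


def pick_best(candidates, evaluate):
    """Return the candidate with the highest score on the test set."""
    return max(candidates, key=evaluate)


new_images = ["winter_01.jpg", "winter_02.jpg", "winter_03.jpg"]
expanded = oversample(new_images, target_size=12)

# Hypothetical test-set scores for three training runs started in parallel.
test_scores = {"run_a": 0.81, "run_b": 0.87, "run_c": 0.84}
best_run = pick_best(test_scores, evaluate=test_scores.get)
print(len(expanded), best_run)
```

In a real pipeline, image augmentation such as flips, crops, or brightness changes would usually accompany plain duplication, and the evaluation function would run each trained model on the held-out test set rather than look up a stored score.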
We believe that NATIX reTrain has the potential to bring about more collaboration among different stakeholders such as city departments, municipalities, and public/private entities.
Curious to know how we can support your business? Drop us an email at hello@natix.io.
DISCLAIMER: This post only reflects the author’s personal opinion, not any other organization’s. This is not official advice. The author is not responsible for any decisions that readers choose to make.