Here is a very abstract question: what does a data science model look like? We all use data science models in our day-to-day lives. Most people who aren’t data scientists have experienced a data science model but have never seen one. So, let me reveal the secret. It may look scary. Here is what a data science model looks like:
It is a mathematical formula encoded as alphanumeric characters. But make no mistake, this strange-looking thing is the secret sauce for making your enterprise successful and blowing away the competition. It can help you run your business operations with cutting-edge advanced analytics. Diverse business cases such as product recommendations to increase revenue, fraud detection to prevent revenue loss, and asset failure prediction to safeguard your asset value all have predictive models behind them.
Because models are so crucial in creating business value, we need to handle them with care. Let us look at different ways these models can be handled.
The worst care possible – the model left on a laptop
The worst type of care is leaving a model on a laptop, usually the one where it was originally created. It is like treating your enterprise secret sauce as a person abandoned on an island.
Unfortunately, this happens a lot. Models created by data scientists using analytic tools on a laptop or PC remain there. A large amount of effort and brain power was used to create them and they contain elements critical to your enterprise success. However, as they remain on local machines and are never operationalised, this is the worst thing that can happen to such beautiful pieces of data science work.
Getting better – putting models in containers
A better approach is to put models in Docker containers. This takes you one step closer to treating the model the way it deserves. A model in a container is secured and isolated, and it is also easier to operationalise.
Though the model is safe in its container, it is still isolated. This means that if you want to use the model, you need to send data to the Docker container and use an API to get the results back. Data movement therefore increases, which may not be desirable for all business operations.
Strategic approach – treating models like data
In recent times data has become a valuable asset for any company. Many advances in technology have been about managing data as a valuable asset. For example, data warehousing and big data storage platforms all revolve around keeping the data safe and managed, and making it easily available to benefit the business.
So, if we start thinking of models as data, we can leverage all the benefits of data management and apply them to models. By treating models like data, we ensure that models become as strategic to business operations as the data is.
Here are some points on why treating models as data is an interesting proposition:
Models are made from data
Models are not created out of thin air or with a magic wand. They are created by applying an algorithm to data. You can think of a model as a mathematical projection of the data. So, it makes sense to consider models as part of the data.
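To make the idea of a model as a projection of data concrete, here is a minimal sketch in Python. It fits a straight line to a handful of made-up points using ordinary least squares; the data values and the names `fit_line` and `predict` are illustrative assumptions, not anything from a real system. The point is only that the resulting "model" is a small set of numbers derived from the data.

```python
# Made-up training data: (hours of operation, observed wear) pairs.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The whole "model" is just these two numbers.
    return {"slope": slope, "intercept": intercept}


def predict(model, x):
    """Apply the stored model to a new input."""
    return model["slope"] * x + model["intercept"]


model = fit_line(data)
print(model)
print(predict(model, 5.0))
```

Because the fitted model is just a small dictionary of numbers, it can be stored, versioned and queried much like any other record.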
Model results need data to make sense of them
Say your model alerts you to a critical asset failure in the coming days. Before any action can be taken, you need to know more about this asset, such as its location and its value. You will also need to assess whether it makes sense to carry out an urgent repair or to take the risk of waiting until the next scheduled maintenance.
As you will have realized by now, the output of the model is just an alert trigger. Real action still needs to follow, and converting model output into something tangible requires data about the asset in question. If your model and its output are stored in the same system as your other data in tables, you can easily integrate the model output with that data. This helps make sense of the model output and makes it more actionable.
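As a rough illustration of the asset example, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a data platform. The table names (`assets`, `failure_scores`), the columns, the sample rows and the 0.8 alert threshold are all hypothetical. It shows how a model score stored next to the asset data can be turned into an actionable list with a single join.

```python
import sqlite3

# An in-memory database standing in for the data platform (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assets (
    asset_id TEXT PRIMARY KEY,
    location TEXT,
    value REAL,
    next_maintenance TEXT
);
CREATE TABLE failure_scores (
    asset_id TEXT,
    failure_probability REAL
);
-- Made-up sample rows for illustration only.
INSERT INTO assets VALUES
    ('pump-01', 'Plant A', 120000.0, '2030-03-01'),
    ('pump-02', 'Plant B', 8000.0, '2030-01-15');
INSERT INTO failure_scores VALUES
    ('pump-01', 0.91),
    ('pump-02', 0.12);
""")

ALERT_THRESHOLD = 0.8  # hypothetical cut-off for raising an alert

# Join the model output with the asset details needed to act on it.
alerts = conn.execute(
    """
    SELECT a.asset_id, a.location, a.value, a.next_maintenance,
           s.failure_probability
    FROM failure_scores AS s
    JOIN assets AS a ON a.asset_id = s.asset_id
    WHERE s.failure_probability >= ?
    ORDER BY a.value DESC
    """,
    (ALERT_THRESHOLD,),
).fetchall()

for row in alerts:
    print(row)
```

With the scores and the asset table in the same place, the decision about an urgent repair can draw on location, value and the maintenance schedule without moving data between systems.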
Managing millions of models
In the book “Prediction Machines,” the authors write that AI predictions are becoming cheaper, which means we will use more of them. That in turn means there will be more and more models.
Use cases that require millions of models are not science fiction. Accurate retail stock forecasting requires a model for each product in each store. Fraud detection requires modelling normal customer behaviour in order to detect any deviation from it. As normal behaviour for customer X may differ from normal behaviour for customer Y, you will need as many models as customers.
With enterprises managing millions of products and millions of customers, suddenly the need to have millions of models becomes inevitable.
In such a scenario, it is better to treat models like data and apply big data management principles to the models as well.
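The one-model-per-customer idea can be sketched in a few lines of Python. Here each "model" is deliberately trivial: the mean and standard deviation of a customer's past transaction amounts, stored as one row per customer in a table. The customer names, amounts, table layout and the three-standard-deviations rule are assumptions chosen for illustration; a real fraud model would be far richer.

```python
import sqlite3
import statistics

# Made-up transaction histories for two customers.
transactions = {
    "customer_x": [20.0, 25.0, 22.0, 18.0, 21.0],
    "customer_y": [900.0, 1100.0, 950.0, 1050.0, 1000.0],
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_models ("
    "customer_id TEXT PRIMARY KEY, mean REAL, stdev REAL)"
)

# "Train" one tiny model per customer and store each one as a row.
conn.executemany(
    "INSERT INTO customer_models VALUES (?, ?, ?)",
    [
        (cid, statistics.mean(amounts), statistics.stdev(amounts))
        for cid, amounts in transactions.items()
    ],
)


def is_unusual(customer_id, amount, z_threshold=3.0):
    """Flag an amount that is far from this customer's normal behaviour."""
    mean, stdev = conn.execute(
        "SELECT mean, stdev FROM customer_models WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return abs(amount - mean) > z_threshold * stdev


# The same amount is unusual for one customer and normal for the other.
print(is_unusual("customer_x", 950.0))
print(is_unusual("customer_y", 950.0))
```

Scaling this from two rows to millions is then a data management problem (indexing, partitioning, access control) rather than a matter of tracking millions of separate model files.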
Models are the Intellectual Property of your enterprise – keep them safe
Models are made from data, and they encode how your enterprise works. For example, a fraud detection model encodes how you intend to detect fraud. It is intellectual property for your company and therefore should be managed and kept safe.
Imagine the fraud detection model is stolen and decrypted or, even worse, the decrypted model is put on the Internet for everyone to know how you detect fraud. Suddenly, you will be left vulnerable to fraud attacks.
Managing models like data and applying all security principles of data to models will help make your intellectual property safer.
Managing the economics of your model
There is a cost to develop a model and there is a cost to manage your models and keep them operational. If you invest in specialised systems to manage the models, you increase the cost of the model. So, you need to think carefully about the total costs involved in creating and managing a model.
As all good models come from clean and integrated data, if you have good models, you already have a data management platform. So, if you use that same platform to manage your models, you keep the overall cost of the models low. In the long run, this helps keep your models economical and profitable.
Now that you have seen why it makes sense to manage models like data, let me briefly describe what goes into it. These are some of the building blocks you need if you would like to treat models as data:
- Model Repository – This is the place where your models are stored as data. Generally, it is a table with specialized fields to hold the model’s encrypted definition.
- Model Metadata – Models are strange-looking and hard for humans to read, so you need metadata that describes what each model is about. Model metadata holds information such as the purpose of the model, the kind of algorithm it uses, and its accuracy.
- Model Lineage – As with data, you need to know how the model was built and how it is used. You need to capture information about the data that went into building the model. This is very useful for traceability and in audit situations.
- Design Patterns – Models are like data: most of them originate outside the data management platform. If you want to manage models like data, you need to bring them inside the database. This requires design patterns that describe the different ways in which an external model can be brought inside the database.
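The building blocks above can be sketched as three small tables. The example below again uses Python's built-in `sqlite3` module as a stand-in for a data platform. All table names, column names and helper functions (`register_model`, `load_model`) are hypothetical, and the model definition is stored as plain JSON rather than in an encrypted form to keep the sketch short.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Model repository: the model definition itself, stored as data.
CREATE TABLE model_repository (
    model_id TEXT PRIMARY KEY,
    definition BLOB NOT NULL
);
-- Model metadata: human-readable description of each model.
CREATE TABLE model_metadata (
    model_id TEXT PRIMARY KEY REFERENCES model_repository(model_id),
    purpose TEXT,
    algorithm TEXT,
    accuracy REAL
);
-- Model lineage: which data went into building each model.
CREATE TABLE model_lineage (
    model_id TEXT REFERENCES model_repository(model_id),
    source_table TEXT,
    trained_on TEXT
);
""")


def register_model(model_id, definition, purpose, algorithm, accuracy,
                   sources, trained_on):
    """Bring an externally built model into the database (one possible pattern)."""
    payload = json.dumps(definition).encode("utf-8")
    with conn:  # commit all three inserts together, or roll back on error
        conn.execute("INSERT INTO model_repository VALUES (?, ?)",
                     (model_id, payload))
        conn.execute("INSERT INTO model_metadata VALUES (?, ?, ?, ?)",
                     (model_id, purpose, algorithm, accuracy))
        conn.executemany("INSERT INTO model_lineage VALUES (?, ?, ?)",
                         [(model_id, s, trained_on) for s in sources])


def load_model(model_id):
    """Read a model definition back out of the repository."""
    row = conn.execute(
        "SELECT definition FROM model_repository WHERE model_id = ?",
        (model_id,),
    ).fetchone()
    return json.loads(row[0].decode("utf-8"))


# Made-up model and values, for illustration only.
register_model(
    "asset-failure-v1",
    {"slope": 1.94, "intercept": 0.15},
    purpose="asset failure prediction",
    algorithm="linear regression",
    accuracy=0.87,
    sources=["sensor_readings", "maintenance_log"],
    trained_on="2030-01-01",
)
print(load_model("asset-failure-v1"))
```

Keeping the definition, metadata and lineage in ordinary tables means a question such as "which models were trained on this source table?" becomes a plain query.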
With Teradata Vantage, you can also use the data warehouse to manage your models and treat them like data. Teradata Vantage will ensure that the models are managed like the valuable assets that they are.