Kirk Borne is the Principal Data Scientist at Booz Allen and a top Big Data, Data Science and AI influencer. He recently shared this reminder on Twitter: “The Risks Associated with Predictive Models – What We All Need to Remember: We Can’t Predict Everything!”
He referenced an article from Data Science Central, “Predictability of Life Outcomes – Guess We Can’t Predict Everything.”
Summary from the above article: Whether trying to predict the life outcomes of disadvantaged kids or to model where ventilators will be most needed, a little humility is in order. As this study shows, the best data and the broadest teams failed at critical predictions. Getting the model wrong, or more importantly using it in the wrong way can hurt all of us.
The same applies to industrial process chemistry. We have to be very careful about the so-called black-box predictive models that various vendors promote for controlling process chemistry. Sometimes these proprietary systems are promoted by suppliers who want to use them to tie your industrial site to their chemicals or equipment.
If you are considering the installation of such a system, PCA can help you to evaluate it on your process.
For now, we reference an article from Dan Steinberg’s blog, “Black Boxes and Data Mining Systems”. Dan Steinberg, President and Founder of Salford Systems, is a well-respected member of the statistics and econometrics communities. Here is a quote:
“In the world of data mining and predictive modelling, the concept of the black box often comes up in the context of proprietary prediction systems in which the vendor does not disclose details of the algorithm by which the predictions are being made….
In the 1990’s, many financial institutions paid hefty fees to use a proprietary system for predicting interest rates; the vendor was successful in persuading banks that the predictions were accurate enough to warrant subscribing to the service even though the banks did not know how the predictions were generated.
Today, in the field of data mining and predictive modeling software, there are new black box vendors who prefer to offer the most minimal descriptions of their algorithms. Instead of describing their own algorithms in detail, they offer general discussions of data mining principles and pepper their white papers with formulas for well-known procedures such as logistic regression and ROC calculation.
The topic addressed in this blog is: Should you seriously consider such a black box system? In general, we think not for the following reasons:….”