Discussion of bias in artificial intelligence (AI) continues to focus largely on its social implications, and rightly so. Yet the very same processes that cause machine-learning models to reinforce stereotypes in society can also undermine efforts to use these technologies in artificial intelligence of things (AIoT) applications in industrial environments, healthcare, and many other domains. Without careful attention to the data used to build these models, such applications stand to fall short of expectations even as companies and individuals rely more heavily on their results.
Encouraged by the ready availability of tools and hardware for building AI-based applications, organizations in nearly every market segment are rushing to take advantage of a growing pool of data sources enabled by the rapid emergence of smart sensors in industrial, medical, and consumer devices, among others. Indeed, market researchers predict that the global AI market will grow at a compound annual growth rate (CAGR) of nearly 35 percent through 2023. For AI within the manufacturing sector alone, researchers forecast a CAGR of over 55 percent through 2025. Here’s a look at how bias can be managed in these models.
The potential of using AI methods to deploy intelligent applications rapidly is compelling. AIoT applications able to predict failures of industrial machinery can offer much-needed relief from the unplanned shutdowns that follow equipment breakdowns. Instead of relying on periodic manual checks of machine health, accelerometers, microphones, and temperature sensors can feed continuous data streams to predictive maintenance applications built with machine-learning models trained to identify impending failure modes well before they occur (Figure 1).
Figure 1: By analyzing data from multiple sensor types, machine-learning models in predictive maintenance applications can warn of impending machine failures well before they lead to equipment breakdowns. (Source: STMicroelectronics)
For predictive maintenance of industrial motors, developers would gather combinations of vibration, audio, and temperature measurements associated with specific failure modes such as motor imbalance, misalignment, loose coupling, degraded bearings, and others. Using these data sets, they would then apply supervised learning algorithms such as decision trees or neural networks to train models that can later predict those failure modes from fresh sensor data collected from motors in service.
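The supervised-learning workflow described above can be sketched in a few lines. This is a minimal illustration only: the feature columns, failure labels, and synthetic data are assumptions made for the example, not taken from any real motor dataset, and it uses scikit-learn's `DecisionTreeClassifier` as one representative algorithm.

```python
# Sketch: training a failure-mode classifier for an industrial motor.
# Features and labels here are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Each row: [vibration_rms, audio_peak, temperature]
# Labels: 0 = healthy, 1 = imbalance, 2 = bearing wear (illustrative)
offsets = np.repeat([[0, 0, 0], [2, 0, 0], [0, 3, 1]], 100, axis=0)
X = rng.normal(size=(300, 3)) + offsets
y = np.repeat([0, 1, 2], 100)

# Fit a shallow decision tree on the labeled sensor measurements
model = DecisionTreeClassifier(max_depth=4, random_state=0)
model.fit(X, y)

# Later, classify a fresh sensor reading from a motor in service
fresh = np.array([[2.1, 0.2, -0.1]])
print(model.predict(fresh))
```

In a real deployment the rows of `X` would come from logged sensor streams labeled with observed failure outcomes, and the model would be validated on held-out data before being trusted.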
This approach carries both a virtue and a limitation. It allows the deployment of sophisticated applications able to detect those states without the need to write complex pattern-recognition software. Unfortunately, it also allows deployment of applications that are blind to states not represented in the training data and that are likely to favor the states that dominate it. If the training data set is heavily weighted toward states associated with high-frequency vibrations and sound, for example, the corresponding model predictions will be weighted toward the kind of bearing-related failures associated with that combination of sensor measurements. Ideally, the chance of the model predicting a particular type of failure should depend on the evidence in the sensor data for each failure type, rather than simply on how much data corresponding to each failure type was collected and used in training.
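A simple first guard against this kind of sampling bias is to inspect the label distribution before training. The sketch below is a hypothetical helper, with an assumed tolerance parameter, that flags failure modes badly under-represented relative to the largest class:

```python
from collections import Counter

def check_balance(labels, tolerance=0.5):
    """Return failure modes whose sample count falls below
    `tolerance` times the count of the best-represented mode."""
    counts = Counter(labels)
    largest = max(counts.values())
    return {mode: n for mode, n in counts.items() if n < tolerance * largest}

# Illustrative counts: bearing failures dominate the collected data
labels = ["bearing"] * 500 + ["misalignment"] * 40 + ["imbalance"] * 300
print(check_balance(labels))  # -> {'misalignment': 40}
```

A check like this does not fix bias by itself, but it makes the imbalance visible before it is baked into a trained model.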
Model bias appears when the model is more likely to predict one type of outcome over another. It can arise from the kind of sampling bias described above or from attempts to clean the data by removing data that seems to the human observer to be outliers. A more subtle form arises from the cognitive bias inherent in human observers, who naturally bring their own perceptions and understanding of the world into determining what is and what is not relevant—all of which can subtly impact the efficacy of machine-learning models.
Machine-learning researchers and experienced developers use various methods to counter the effects of bias. Approaches such as random forest algorithms or collections of neural networks use ensemble methods designed to help avoid these limitations. Other approaches focus on training methods, using generative adversarial networks (GANs) to generate new data sets to enhance the target neural network’s training. Still, other methods focus on the data itself with approaches that oversample data for poorly represented states to provide balanced training data. Methods for managing bias in AI remain a very active topic of research.
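Of the approaches above, oversampling under-represented states is the simplest to illustrate. The sketch below shows random oversampling with replacement, assuming training samples grouped by failure mode; the function name and data layout are assumptions for the example:

```python
import random

def oversample(samples_by_mode, seed=0):
    """Randomly duplicate samples of under-represented failure modes so
    every mode contributes the same number of training examples."""
    rng = random.Random(seed)
    target = max(len(samples) for samples in samples_by_mode.values())
    balanced = {}
    for mode, samples in samples_by_mode.items():
        extra = [rng.choice(samples) for _ in range(target - len(samples))]
        balanced[mode] = samples + extra
    return balanced

# Illustrative: 500 bearing samples but only 40 misalignment samples
data = {"bearing": list(range(500)), "misalignment": list(range(40))}
balanced = oversample(data)
print({mode: len(samples) for mode, samples in balanced.items()})
# -> {'bearing': 500, 'misalignment': 500}
```

Duplicating samples does not add new information, so in practice oversampling is often combined with the other techniques mentioned, such as synthetic data generation, but it removes the crude count imbalance that would otherwise skew training.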
Even without these more advanced methods, AIoT application developers can take the first step in managing bias in their models simply by reminding themselves of the critical importance of collecting training data that is truly representative of their application. Although development platforms and supervised-learning software tools have greatly simplified the deployment of machine-learning models, the most critical component of machine-learning-based application development remains unbiased training data.
Stephen Evanczuk has more than 20 years of experience writing for and about the electronics industry on a wide range of topics including hardware, software, systems, and applications including the IoT. He received his Ph.D. in neuroscience on neuronal networks and worked in the aerospace industry on massively distributed secure systems and algorithm acceleration methods. Currently, when he's not writing articles on technology and engineering, he's working on applications of deep learning to recognition and recommendation systems.
Copyright ©2022 Mouser Electronics, Inc.