
Bench Talk for Design Engineers | The Official Blog of Mouser Electronics


Edge Impulse Fundamentals 6: Improving Machine-Learning Model Performance

By Mike Parks

(Source: eireenz - stock.adobe.com)

Welcome back to our series on machine learning and the capabilities that Edge Impulse offers embedded systems developers to add AI to their products. In our previous entry, we discussed the issues that can arise when developing a neural network and some of the tools that Edge Impulse offers to assess the quality of your models. Let’s jump back into that topic and introduce the data explorer view. Perhaps one of the most visually striking views, the data explorer gives developers an easily digestible overview of how well the neural network has classified the training dataset, data point by data point (Figure 1). For example, let’s say we have a classifier making predictions based on data from a 3-axis accelerometer (X, Y, and Z). Edge Impulse will plot each data point in a three-dimensional grid (a 2D grid if there are only two variables) and color code each according to whether or not the model correctly predicted its classification. Ideally, all data points of a similar classification will be grouped together. Furthermore, each data point can be selected individually, allowing the developer to gather detailed insights that can be especially useful in diagnosing why a training or test point was classified incorrectly.
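The idea behind the data explorer can be sketched in plain Python. This is not Edge Impulse's implementation, just a hypothetical illustration (the `tag_points` helper and the toy samples are invented for this example) of pairing each accelerometer sample with whether the model classified it correctly, which is exactly what the view's color coding conveys:

```python
# Hypothetical sketch: tag accelerometer samples by prediction correctness,
# mimicking the idea behind Edge Impulse's data explorer view.

def tag_points(points, true_labels, predicted_labels):
    """Pair each (x, y, z) sample with its true label and whether the
    model classified it correctly (used to color-code a scatter plot)."""
    tagged = []
    for p, truth, pred in zip(points, true_labels, predicted_labels):
        tagged.append({"xyz": p, "label": truth, "correct": truth == pred})
    return tagged

# Toy 3-axis accelerometer samples with true vs. predicted classes.
points = [(0.1, 0.0, 9.8), (5.2, 0.3, 7.1), (0.2, 0.1, 9.7)]
truth = ["idle", "wave", "idle"]
pred = ["idle", "idle", "idle"]  # the second sample is mis-classified

tagged = tag_points(points, truth, pred)
misclassified = [t for t in tagged if not t["correct"]]
print(len(misclassified))  # 1
```

Plotting each tagged point in 3D and coloring by the `correct` flag would reproduce the visual grouping the data explorer provides.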

Figure 1: Edge Impulse's data explorer can provide a visual mechanism to evaluate how a machine-learning model is classifying training data. (Source: Green Shoe Garage)

One final bit of performance analysis is tied to the embedded platform you selected. Because every microcontroller has different hardware features and specifications, a model's performance can vary significantly; parameters such as operating frequency and the amount and speed of memory (flash and RAM) are among the chief factors. Edge Impulse can estimate your model's real-time performance for numerous leading embedded platforms, including the inference time, RAM usage, and flash memory usage (Figure 2). Inference time is the time elapsed between the neural network being presented with new input data and the model providing an output in the form of a predicted classification of the input signal; typically, this will be measured in milliseconds. RAM and flash usage are presented as the peak memory required to perform the inference, which will likely be measured in kilobytes for embedded systems.
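You can also measure inference time empirically once a model is running. The sketch below uses an invented stand-in classifier (`toy_model` is not an Edge Impulse API) just to show the basic timing approach of averaging over repeated calls:

```python
import time

def measure_inference_ms(model_fn, sample, runs=100):
    """Roughly estimate average inference time in milliseconds by
    timing repeated calls to a model's predict function."""
    start = time.perf_counter()
    for _ in range(runs):
        model_fn(sample)
    elapsed = time.perf_counter() - start
    return (elapsed / runs) * 1000.0

# Stand-in "model": a trivial threshold classifier on accelerometer magnitude.
def toy_model(sample):
    x, y, z = sample
    return "moving" if (x * x + y * y + z * z) ** 0.5 > 10.5 else "idle"

avg_ms = measure_inference_ms(toy_model, (0.1, 0.0, 9.8))
print(f"~{avg_ms:.4f} ms per inference")
```

On a real target, the same pattern (a hardware timer around the inference call) yields a number you can compare against Edge Impulse's per-platform estimates.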

Figure 2: Edge Impulse can estimate the inference time, RAM usage, and flash memory usage of a machine-learning model. (Source: Green Shoe Garage)

Improving Model Performance

Now that we have a better understanding of the tools Edge Impulse provides to probe the performance of machine-learning models, let’s look at how we can act on that knowledge to improve a model’s functionality and performance. If you recall from the previous installment, two major overarching categories of issues can affect model performance. The first, overfitting, occurs when a model performs exceedingly well on the training data but fails to generalize to new, unseen data; the model essentially "memorizes" the training examples. The second, underfitting, results when a model is too simplistic to capture the underlying patterns in the training data; the model fails to learn from the data and struggles to make accurate predictions on both the training set and new data.

Addressing Overfitting

To address overfitting, various techniques can be applied:

  • Increase the training data and variety to represent the problem space better.
  • Simplify the model architecture by reducing the number of parameters or using techniques like feature selection or dimensionality reduction. For example, if using a 3-axis accelerometer but only caring about motion in two planes, the model should ignore the unneeded axis.
  • Apply techniques like dropout or early stopping during training to prevent the model from relying too heavily on specific features or iterating over the training data too many times.
  • Use regularization techniques such as L1 or L2 regularization to add penalties for overly complex models. The math behind L1 (aka Lasso regularization) and L2 (aka Ridge regularization) is beyond the scope of what we can cover here. For now, understand that if you suspect your model is demonstrating overfit behavior, it is possible to compensate by applying a so-called regularization term to the model's loss function during training. L1 regularization promotes sparsity and can be helpful for feature selection; L2 regularization encourages small but non-zero weights and can help prevent overfitting. More on both in a follow-up article.
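As a small numerical illustration of that last bullet (this is not library code; the weight values and lambda below are arbitrary), the penalty terms themselves are simple to compute and are just added to the training loss:

```python
def l1_penalty(weights, lam):
    """L1 (Lasso) term: lam * sum(|w|) -- pushes weights toward exact zero,
    promoting sparsity."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 (Ridge) term: lam * sum(w^2) -- encourages small but non-zero
    weights, discouraging overly complex models."""
    return lam * sum(w * w for w in weights)

weights = [0.5, -1.2, 0.0, 2.0]
base_loss = 0.42  # hypothetical unregularized training loss
loss_l1 = base_loss + l1_penalty(weights, lam=0.01)
loss_l2 = base_loss + l2_penalty(weights, lam=0.01)
print(loss_l1, loss_l2)
```

Because the penalty grows with the magnitude of the weights, the optimizer is nudged toward simpler models as it minimizes the combined loss.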

Addressing Underfitting

To address underfitting, you can consider the following techniques:

  • Increase the model's complexity by adding more parameters, layers, or features. In other words, look for ways to identify differences between your selected classifications.
  • Train the model for more epochs or increase the capacity of the model.
  • Consider using a more powerful or flexible model architecture.
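The first bullet can be demonstrated with a deliberately simple toy example (the data and both "models" here are invented for illustration): a constant predictor has too little capacity to capture an obvious linear trend, while adding one parameter (a slope) fixes it.

```python
# Toy illustration of underfitting: a constant ("too simple") model vs. a
# linear model on data with a clear linear trend.

xs = [0, 1, 2, 3, 4]
ys = [0.1, 1.9, 4.1, 6.0, 8.1]  # roughly y = 2x

def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

# Model 1: always predict the mean (no capacity to capture the trend).
mean_y = sum(ys) / len(ys)
err_constant = mse([mean_y] * len(ys), ys)

# Model 2: least-squares line through the data (added capacity).
mean_x = sum(xs) / len(xs)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x
err_linear = mse([slope * x + intercept for x in xs], ys)

print(err_constant > err_linear)  # the richer model fits far better
```

The same intuition scales up: if training error stays high no matter how long you train, the model likely lacks the capacity (parameters, layers, or features) to represent the pattern.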

As we discussed previously, the struggle in fine-tuning a model is the bias-variance tradeoff. Recall that bias refers to the error introduced by approximating a real-world problem with a simplified model. A high-bias model typically exhibits underfitting. Variance, on the other hand, refers to the model's sensitivity to fluctuations in the training data. A high variance model is overly complex and captures noise in the training data, leading to overfitting. In addition to the regularization techniques discussed above, cross-validation, hyperparameter tuning, and collecting more diverse and representative training data can help find the optimal balance between bias and variance.
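Cross-validation, mentioned above, is easy to sketch. The helper below (a minimal hypothetical implementation, not an Edge Impulse or scikit-learn API) splits a dataset into k folds so each sample serves as validation data exactly once, giving a more robust view of how the model generalizes:

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, val_indices) pairs for simple k-fold
    cross-validation: each fold takes a turn as the validation set."""
    fold_size = n_samples // k
    indices = list(range(n_samples))
    for i in range(k):
        val = indices[i * fold_size:(i + 1) * fold_size]
        train = indices[:i * fold_size] + indices[(i + 1) * fold_size:]
        yield train, val

splits = list(k_fold_splits(n_samples=10, k=5))
print(len(splits))        # 5 folds
print(len(splits[0][1]))  # 2 validation samples per fold
```

Training and evaluating the model once per fold, then averaging the validation scores, helps expose high-variance (overfit) behavior that a single train/test split might miss.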

Conclusion

By using the model analysis tools provided by Edge Impulse, along with a knowledge of underfitting and overfitting and techniques to reduce the impacts of both, you can be assured that the models you generate will be accurate and perform well on your embedded platform of choice. In the following entries of the Edge Impulse blog series, we will look at how you can perform real-time inferencing, versioning, and secure deployment of your machine-learning model to actual devices in the laboratory and the field.





Michael Parks, P.E. is the co-founder of Green Shoe Garage, a custom electronics design studio and embedded security research firm located in Western Maryland. He produces the Gears of Resistance Podcast to help raise public awareness of technical and scientific matters. Michael is also a licensed Professional Engineer in the state of Maryland and holds a Master’s degree in systems engineering from Johns Hopkins University.

