Artificial Intelligence Problems and Limitations – Data Overfitting

by Sanjeev Kapoor 12 Aug 2019

Artificial Intelligence (AI) is nowadays one of the most trending IT topics, as it empowers novel applications that exhibit human-like capabilities. At the same time, there is a heated debate about the pros and cons of AI when compared to Human Intelligence (HI). To date, AI is very efficient when dealing with domain-specific problems like chess and Go, where no human can currently beat the top AI-based programs. However, AI performs poorly in problems where intelligence needs to be transferred and applied in different contexts. Humans are much more efficient than computers at generalizing their knowledge and applying it in different settings.

AI and HI can both lead to erroneous decisions, especially when these decisions are biased. Humans exhibit different forms of bias when making decisions. As a prominent example, humans tend to remember their choices as better than they were, which is known as choice-supportive bias. Moreover, they are sometimes overly optimistic, which leads them to overestimate the likelihood of pleasing outcomes as part of wishful thinking, the so-called optimism bias. Furthermore, humans have a tendency to judge decisions based on their eventual outcomes instead of the quality of the decisions at the time they were made, which is aptly called outcome bias. These are only a few examples; in fact, human decisions are subject to many more types of bias.


Bias in AI and the Overfitting Problem

Similar to HI systems, AI systems can be biased as well. The most common types of bias for AI systems include:

However, the best-known form of bias in AI systems is the so-called overfitting bias, which occurs when a model is built to fit the available datasets very well but is weak at reliably fitting additional data or predicting future observations. This usually happens when the training dataset (or parts of it) is not representative of the real-world context of the problem at hand. In such cases, the machine learning model is trained to identify patterns that exist in the training dataset but are not valid for additional data and future observations beyond that dataset. In general, AI models that are very complex or have very high variance (i.e., high flexibility with respect to the data points of a dataset) are likely to be overfitted.
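The effect can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the data is synthetic (noisy samples of y = 2x + 1), and the two models are a least-squares line and a highly flexible "memorizer" that simply returns the target of the nearest training point. The helper names (make_data, fit_line, fit_memorizer, mse) are hypothetical and do not come from any library.

```python
import random

def make_data(n, rng, noise=1.0):
    """Sample n noisy points from the assumed true relation y = 2x + 1."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (a simple, low-variance model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def fit_memorizer(xs, ys):
    """1-nearest-neighbour lookup: reproduces every training point exactly."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)                      # fixed seed for repeatability
train_x, train_y = make_data(50, rng)       # data the models are fitted on
test_x, test_y = make_data(500, rng)        # unseen data from the same source

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

results = {
    "line": (mse(line, train_x, train_y), mse(line, test_x, test_y)),
    "memorizer": (mse(memorizer, train_x, train_y),
                  mse(memorizer, test_x, test_y)),
}
for name, (train_err, test_err) in results.items():
    print(f"{name:10s} train MSE = {train_err:.3f}   test MSE = {test_err:.3f}")
```

The memorizer reaches zero error on the training data because it has learned the noise along with the signal, and that is why its error on unseen data is so much larger than its training error. The gap between training and test error is the practical symptom of overfitting.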


Overcoming the Overfitting Problem

Overfitting leads to AI models that perform poorly, for example with low accuracy in the case of predictive analytics. Therefore, data scientists strive to avoid overfitting bias by applying one or more of the following measures:
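As an illustration of one widely used safeguard of this kind, k-fold cross-validation estimates how a model behaves on data it was not fitted on, so that model complexity can be chosen on held-out error rather than training error. The following is a minimal sketch under stated assumptions: the data is synthetic (noisy samples of y = 2x + 1), the model is a simple k-nearest-neighbour regressor, and the helper names (knn_predict, held_out_mse, cross_val_mse) are hypothetical.

```python
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def held_out_mse(train, held_out, k):
    """Mean squared error on points that were not used for fitting."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in held_out]
    return sum(errors) / len(errors)

def cross_val_mse(data, k, n_folds=5):
    """Mean held-out MSE over n_folds disjoint folds of the data."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, held_out in enumerate(folds):
        # Fit on every fold except the one currently held out.
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append(held_out_mse(train, held_out, k))
    return sum(scores) / n_folds

rng = random.Random(0)  # fixed seed for repeatability

# Hypothetical data: noisy samples of y = 2x + 1 (an assumption of this sketch).
data = []
for _ in range(100):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)))

# Smaller k means a more flexible model; k = 1 memorizes the training points.
candidates = [1, 3, 5, 9, 15]
cv_errors = {k: cross_val_mse(data, k) for k in candidates}
best_k = min(cv_errors, key=cv_errors.get)

for k in candidates:
    print(f"k = {k:2d}   cross-validated MSE = {cv_errors[k]:.3f}")
print("selected k:", best_k)
```

Selecting the candidate with the lowest cross-validated error favours models that generalize and penalizes those that merely reproduce their training data. In a real project, the synthetic data and the toy regressor would be replaced by the actual dataset and model.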


Overall, there are known and tested methods for alleviating overfitting bias. In practice, applying these methods is challenging, as data scientists have to deal with other related problems, such as a lack of appropriate datasets, poor data sampling and data collection processes, difficulties in understanding business processes and the social context of the problems at hand, a shortage of domain experts, and the much-discussed talent gap in data scientists and AI engineers. Therefore, despite the above ways of overcoming overfitting, there are still AI systems that suffer from this problem. Nevertheless, this should not be seen as a setback to building, deploying and using AI. In the years to come, more data will gradually become available, along with more computing cycles that will allow it to be processed faster. More data is expected to lead to more credible and accurate AI systems that suffer less from overfitting and other forms of AI-related bias. In the meantime, AI experts should be prepared to identify and confront bias issues in a timely manner.
