While machine learning and artificial intelligence are solving many real-world problems, several fundamental issues in these domains remain unresolved. Developers dig deep into a variety of machine learning subfields and produce small, incremental gains, but obstacles still stand in the way of these fields’ continued growth.
Several AI/ML developers were recently invited to a Reddit discussion to identify some of these “critical” unsolved problems that, if resolved, would probably lead to significant advances in these fields.
Prediction with uncertainty
Possibly the most crucial step in developing a machine learning model is collecting data from reliable and plentiful sources. Working with faulty or partial data, which is inherent to the field, is tough for machine learning newcomers who trained as computer scientists.
According to Andyk Maulana in his book series “Adaptive Computation and Machine Learning,” “it can be surprising that machine learning makes considerable use of probability theory given that many computer scientists and software developers work in a relatively clean and certain environment.”
There are three main areas where machine learning is uncertain:
Data noise: In machine learning, observations are referred to as “samples” or “instances”; they frequently include variability and randomness that ultimately affect the output.
Insufficient domain coverage: Models are trained on observations that are inherently incomplete because they represent only a “sample” of a larger population that is impractical to collect in full.
Models with flaws: As George Box put it, “all models are wrong, but some are useful.” Every model contains some degree of error.
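The effect of data noise on a prediction can be quantified rather than ignored. Below is a minimal, illustrative sketch (the function names are our own, not from the article) that uses bootstrap resampling to attach an uncertainty estimate to a simple estimator computed from noisy observations:

```python
import random
import statistics

def bootstrap_uncertainty(sample, estimator, n_resamples=1000, seed=0):
    """Estimate the spread of `estimator` by resampling with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Draw a resample of the same size, with replacement.
        resample = [rng.choice(sample) for _ in sample]
        estimates.append(estimator(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)

# Noisy observations of a quantity whose true value is 5.0.
rng = random.Random(42)
sample = [5.0 + rng.gauss(0, 1) for _ in range(200)]

mean, std_err = bootstrap_uncertainty(sample, statistics.mean)
print(f"estimate = {mean:.2f} +/- {std_err:.2f}")
```

Reporting the spread alongside the point estimate makes the model’s uncertainty explicit instead of hiding it in the noise.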
Learning systems with low resource convergence
Optimising the training and inference processes takes a lot of resources. Reducing neural network convergence times and building low-resource systems are goals that pull against each other. Developers may be able to create technology with ground-breaking applications, but such technology often demands significant hardware, memory, storage, and electricity.
For instance, language models need huge volumes of data. Massive training runs are necessary for the models to eventually reach human-level interactivity, which means longer convergence periods and more expensive training resources.
Scaling the amount of input data is important for the development of machine learning algorithms because it generally improves model accuracy. However, the recent success of deep learning models demonstrates the need for ever more powerful processors and resources, forcing developers to juggle the two demands constantly.
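The tension between model size and resources can be made concrete with a common back-of-the-envelope rule from the scaling-law literature (an assumption we bring in, not a claim from this article): training compute is roughly 6 FLOPs per parameter per token. A small sketch:

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training-compute estimate: ~6 FLOPs per parameter per token
    (forward + backward pass), a common scaling-law heuristic."""
    return 6.0 * n_params * n_tokens

def gpu_days(flops: float, gpu_flops_per_sec: float = 1e14) -> float:
    """Convert total FLOPs into days on a hypothetical accelerator
    sustaining 1e14 FLOP/s (an assumed figure for illustration)."""
    return flops / gpu_flops_per_sec / 86_400

# Scaling model size 10x with the same data scales the compute bill 10x.
small = training_flops(1e9, 1e11)   # 1B params, 100B tokens
large = training_flops(1e10, 1e11)  # 10B params, same data
print(f"{gpu_days(small):.0f} vs {gpu_days(large):.0f} GPU-days")
```

Linear growth in either parameters or tokens translates directly into linear growth in hardware time and electricity, which is exactly the convergence-versus-resources conflict described above.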
Overfitting of training data
Recent text-to-image generators such as DALL-E and Midjourney offer examples of what can happen when a model overfits its training data.
A learning model overfits when it interprets random fluctuations in the training data as genuine concepts, which leads to errors on new data and reduces the model’s generalisability. Overfitting is also a byproduct of data noise.
To address this issue, most non-parametric and non-linear models offer parameters and techniques that limit the model’s learning scope. Even so, fitting a real-world dataset flawlessly is rarely possible. The following are two methods to reduce overfitting:
Evaluating the model with resampling methods: The most common strategy, k-fold cross-validation, lets model creators train and test their models several times on different subsets of the training data.
Withholding a validation dataset: After fine-tuning the machine learning algorithm on the initial dataset, developers feed in a held-back validation dataset to test how the model performs on never-before-seen data.
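The k-fold idea above can be sketched in plain Python without any library (the helper name is our own). Each observation lands in exactly one test fold, so every sample is used for both training and validation across the k runs:

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle indices and yield (train, test) index lists for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    fold_size = n_samples // k
    for i in range(k):
        test = idx[i * fold_size:(i + 1) * fold_size]
        train = idx[:i * fold_size] + idx[(i + 1) * fold_size:]
        yield train, test

# With 20 samples and 5 folds, each fold holds out 4 distinct samples.
folds = list(k_fold_indices(20, k=5))
for train, test in folds:
    assert set(train).isdisjoint(test)  # no leakage between train and test
```

Averaging the model’s score over the k held-out folds gives a far less optimistic accuracy estimate than scoring on the training data itself.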
Estimating cause and effect rather than correlations
Humans naturally draw causal conclusions. Deep neural networks and other machine learning algorithms excel at finding patterns in massive datasets but often struggle to identify causes. In areas like computer vision, robotics, and self-driving cars, this is because models, while capable of recognising patterns, do not understand the physical properties of the objects around them. As a result, they extrapolate from observed patterns rather than reasoning about novel situations directly.
In a study titled “Towards Causal Representation Learning,” researchers from the Max Planck Institute for Intelligent Systems and Google Research discuss the difficulties machine learning algorithms face as a result of the absence of causal representation. The researchers argue that, in an effort to address this lack of causality, developers simply enlarge the training datasets, without recognising that this ultimately produces models that recognise patterns rather than “think” independently.
One approach to programming causality into machines is to introduce an “inductive bias” into models. However, it might be argued that doing so could be detrimental to the development of bias-free AI.
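The gap between correlation and causation is easy to demonstrate with a toy simulation (ours, not from the paper): a hidden confounder drives both x and y, so observational data shows a strong correlation even though x does not cause y. Intervening on x directly, a crude stand-in for the do-operator, makes the correlation vanish:

```python
import random
import statistics

def corr(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(0)
n = 5000

# Observational data: hidden confounder z drives both x and y.
z = [rng.gauss(0, 1) for _ in range(n)]
x_obs = [zi + rng.gauss(0, 0.5) for zi in z]
y_obs = [zi + rng.gauss(0, 0.5) for zi in z]

# Interventional data: we set x ourselves, severing its link to z;
# y is generated exactly as before.
x_do = [rng.gauss(0, 1) for _ in range(n)]
y_do = [zi + rng.gauss(0, 0.5) for zi in z]

print(f"observational corr:  {corr(x_obs, y_obs):.2f}")
print(f"interventional corr: {corr(x_do, y_do):.2f}")
```

A pattern-matching model trained on the observational data would happily predict y from x, and would fail the moment x is manipulated, which is precisely the failure mode the causal-representation researchers describe.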
Reproducibility
Since AI/ML is among the most promising technologies in practically every field, many newcomers jump in headfirst without fully understanding its nuances. Reproducibility, or replication, suffers from a combination of the issues above and still presents significant difficulties for newly created models.
Many algorithms fail when evaluated and used by other experienced researchers, owing to a lack of resources and a reluctance to run large experiments. Large corporations that sell high-tech solutions frequently withhold their source code from the public, forcing new researchers to run their own experiments and answer complex problems without thorough testing, which undermines trustworthiness.
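One modest, concrete step toward reproducibility is controlling every source of randomness in an experiment. The sketch below (an illustrative stand-in for a real training run) routes all randomness through one explicitly seeded generator, making runs bit-for-bit repeatable:

```python
import random

def run_experiment(seed):
    """Stand-in for a stochastic training run: all randomness flows
    through one explicitly seeded generator, never the global state."""
    rng = random.Random(seed)
    weights = [rng.gauss(0, 1) for _ in range(10)]  # mock "initialisation"
    noise = rng.random()                            # mock "data shuffling"
    return sum(w * w for w in weights) + noise

# Identical seeds give identical results; a different seed does not.
assert run_experiment(123) == run_experiment(123)
assert run_experiment(123) != run_experiment(124)
```

Publishing the seed, the code, and the environment alongside the results lets other researchers replicate a claimed number exactly rather than approximately.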