Gary Marcus offers a systematic critique of deep learning

While practical applications of AI technology are gradually finding their footing, pioneers in artificial intelligence have already turned their attention to the future. In early 2018, Gary Marcus, a professor at New York University and former director of Uber AI Labs, published a critical article examining the current state and limitations of deep learning, arguing that we must look beyond deep learning to achieve truly general AI. Although the concept of deep learning dates back decades, the term only became popular about five years ago, largely thanks to influential papers such as the one by Krizhevsky, Sutskever, and Hinton, whose work on ImageNet marked a turning point and revived interest in neural networks. What has happened in the five years since? Even in areas such as speech recognition, image processing, and game playing, where deep learning has made significant progress, Marcus raises ten concerns and argues that other techniques must be used alongside it if we are to reach artificial general intelligence.

"For most problems where deep learning has enabled transformationally better solutions, we entered an era of diminishing returns in 2016-2017." - François Chollet, Google, author of Keras, 2017.12.18

"Science progresses one funeral at a time. The future depends on some graduate student who is deeply suspicious of everything I have said." - Geoffrey Hinton, "godfather" of deep learning, leader of Google Brain, 2017.9.15

**1. Is deep learning hitting a wall?**

Although the roots of deep learning can be traced back decades (Schmidhuber, 2015), its popularity only took off five years ago. In 2012, Krizhevsky, Sutskever, and Hinton published a groundbreaking paper, "ImageNet Classification with Deep Convolutional Neural Networks," achieving top results in the ImageNet competition. This sparked a revolution that made deep learning the most prominent technique in AI. The idea of training multi-layer neural networks was not new, but increases in computing power and data availability made it practical for the first time. Since then, deep learning has achieved remarkable success in areas such as speech recognition, image recognition, and language translation, and it plays a crucial role in many current AI applications. Major companies have invested heavily in hiring deep learning talent. One of its leading advocates, Andrew Ng, has even suggested that if a person can perform a mental task in less than a second, AI will soon be able to automate it. A recent article in the New York Times Sunday Magazine implied that deep learning is poised to "reinvent computing itself."

Yet deep learning may now be approaching a wall. Many of the challenges were anticipated at the start of deep learning's resurgence (Marcus, 2012) and have been hinted at by figures such as Hinton (Sabour, Frosst, & Hinton, 2017) and Chollet (2017). What exactly is deep learning? What does it reveal about intelligence? What can we expect from it, and where will it fail? How close are we to general AI? How flexible can machines be when dealing with unfamiliar problems? The purpose of this article is both to temper irrational optimism and to consider the direction we need to take. It is written for researchers in the field as well as AI consumers without technical backgrounds who wish to understand it. In the second part, I briefly explain what deep learning systems can do and why they work. In the third part, I discuss the weaknesses of deep learning, and in the fourth part, I examine misunderstandings about its capabilities. Finally, I outline possible directions forward. Deep learning is unlikely to die, nor should it.
However, five years after its rise to prominence, it seems time to reflect critically on both its strengths and its shortcomings.

**2. What is deep learning? What can deep learning do?**

Deep learning is essentially a statistical technique for classifying patterns, based on sample data, using multilayer neural networks. These networks consist of input units, multiple hidden layers, and output units. In a typical application, such as recognizing handwritten digits, a network is trained on large sets of images and their corresponding labels. An algorithm called back-propagation adjusts the connection weights between units through a gradient-descent process, so that the system gradually learns to produce the correct output for a given input. In general, we can think of the relationship a neural network learns as a mapping, and neural networks, especially those with multiple hidden layers, are particularly good at learning complex input-output mappings.

These systems are called neural networks because their structure loosely resembles biological neurons, with the connections between nodes playing a role analogous to synapses. Most deep learning networks make heavy use of convolution (LeCun, 1989), which constrains the connections so that the network captures translational invariance: an object keeps its identity as it moves across an image. Deep learning networks can also form intermediate representations, with internal units that respond to more complex elements such as horizontal lines or larger structures.

In principle, given infinite data, a deep learning system can represent any finite deterministic mapping between inputs and outputs. In practice, whether the system can learn such a mapping depends on many factors. One common concern is getting trapped in a local minimum, where the system settles on a suboptimal solution; practitioners use a variety of techniques to avoid these traps and get better results. In speech recognition, for example, a neural network learns the mapping between speech signals and labels (such as words or phonemes). In object recognition, it learns the mapping between images and tags. In DeepMind's Atari game system, neural networks learn the mapping between pixels and joystick positions. Deep learning is most often used as a classification system: its goal is to determine which category a given input belongs to. With enough ingenuity, the power of classification is vast; the outputs can represent almost anything, such as words or positions on a game board. In a world with unlimited data and computational resources, we might need little else.
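To make the training loop just described concrete, here is a minimal sketch. It is not from the original article; the toy XOR task, the layer sizes, the learning rate, and the iteration count are illustrative choices of mine. A tiny fully connected network has its connection weights adjusted by back-propagation and gradient descent until it produces the desired output for each input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: each 2-bit input paired with a one-hot label (XOR as a toy task).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[1, 0], [0, 1], [0, 1], [1, 0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def with_bias(a):
    # Append a constant 1 so each layer also learns a bias weight.
    return np.hstack([a, np.ones((a.shape[0], 1))])

W1 = rng.normal(scale=0.5, size=(3, 8))   # input (+bias) -> 8 hidden units
W2 = rng.normal(scale=0.5, size=(9, 2))   # hidden (+bias) -> 2 output units

lr = 0.5
for step in range(10000):
    # Forward pass: input -> hidden -> output.
    h = sigmoid(with_bias(X) @ W1)
    out = sigmoid(with_bias(h) @ W2)

    # Backward pass (back-propagation): push the output error back through
    # the layers and move every connection weight downhill on squared error.
    d_out = (out - Y) * out * (1 - out)
    d_hid = (d_out @ W2[:-1].T) * h * (1 - h)
    W2 -= lr * with_bias(h).T @ d_out
    W1 -= lr * with_bias(X).T @ d_hid

print(np.round(out, 2))  # each row should approach its one-hot target in Y
```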
**3. The Limitations of Deep Learning**

Deep learning's main limitation is that the world we live in offers only finite data. Systems that rely on deep learning must frequently generalize beyond the specific data they have seen, and their ability to guarantee high-quality performance there is limited. Generalization can be thought of either as interpolation between known examples or as extrapolation beyond the training set. For a neural network to generalize well, there usually must be a large amount of data, and the test data must be similar enough to the training data that new answers are interpolated among old ones. In Krizhevsky et al.'s paper, a nine-layer convolutional neural network with 60 million parameters and 650,000 nodes was trained on roughly one million distinct examples drawn from about 1,000 categories.

This approach works well in limited worlds like ImageNet, into which all external stimuli can be sorted into a comparatively small set of categories, and in stable domains; in speech recognition, for instance, data can be mapped onto a limited set of speech categories in a fairly regular way. But for many reasons, deep learning is not a universal solvent for AI. Here are ten challenges facing current deep learning systems.

**3.1 Current Deep Learning Requires Large Amounts of Data**

Humans can learn abstract relationships from just a few examples. If I tell you that a "schmister" is a sister between the ages of 10 and 21, you need perhaps one example to decide whether you have a schmister, whether your best friend has one, and whether your children or parents do. You do not need hundreds or millions of training examples; you only need the abstract relationship between a few variables to define "schmister" precisely. Humans can learn such abstractions either through explicit definitions or by more implicit means. Even a seven-month-old infant can learn abstract linguistic rules from a small number of unlabeled examples within two minutes, and a subsequent study by Gervain and colleagues suggests that newborns can do something similar. Deep learning currently lacks a mechanism for learning abstractions through explicit definition; it performs best when there are millions or even billions of training examples, as in DeepMind's work on board games and Atari. As Brenden Lake and his colleagues have recently emphasized in a series of papers, humans are far more efficient than deep learning systems at learning complex rules.

Geoff Hinton has also expressed concern about deep learning's reliance on large numbers of labeled examples. This worry is reflected in his recent work on capsule networks, which points out that convolutional neural networks may face "exponential inefficiencies" that could lead to their demise. One problem is that convolutional networks struggle to generalize to novel viewpoints. The ability to handle translation is built into the network, but for other, equally common kinds of transformation invariance we must choose between replicating feature detectors across a grid, at a computational cost that grows exponentially, or increasing the size of the labeled training set, which likewise grows exponentially. For problems where large amounts of data are not available, deep learning is usually not the ideal solution.
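As a purely illustrative sketch (the Person class and its fields below are hypothetical, not from the article), the point about explicit definitions can be made in a few lines: a rule stated once over abstract variables applies immediately to any individual, with no training examples at all.

```python
from dataclasses import dataclass

@dataclass
class Person:
    age: int
    is_sister_of_someone: bool

def is_schmister(p: Person) -> bool:
    # "A schmister is a sister between the ages of 10 and 21."
    return p.is_sister_of_someone and 10 <= p.age <= 21

print(is_schmister(Person(age=14, is_sister_of_someone=True)))  # True
print(is_schmister(Person(age=35, is_sister_of_someone=True)))  # False
```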
**3.2 Deep Learning So Far Is Shallow and Has Limited Capacity for Transfer**

It is important to realize that "deep" in deep learning refers to a technical, architectural property (the use of many hidden layers in modern networks), not to anything conceptual (the representations such networks acquire do not naturally apply to abstract concepts like "justice," "democracy," or "intervention"). Even concrete concepts like "ball" or "opponent" are hard for deep learning to capture. Consider DeepMind's research on Atari games using deep reinforcement learning, which combines deep learning with reinforcement learning. The results look impressive: using a single set of "hyperparameters" (which govern properties of the network such as the learning rate), and without any built-in knowledge of the specific rules, the system matched or beat human experts on a large sample of games. But it is easy to over-interpret these results.

For example, a widely circulated video about the system learning to play Breakout claims that "after 240 minutes of training, the system realized that tunneling through the wall was the most effective way to earn a high score." In fact, the system realized no such thing: it does not know what a tunnel is or what a wall is; it has only learned specific contingencies for specific scenarios. Transfer tests, in which a deep reinforcement learning system must cope with scenes slightly different from those it was trained on, show that its solutions are often extremely superficial. For example, a team at Vicarious showed that DeepMind's more advanced Atari technique, the "Asynchronous Advantage Actor-Critic" (A3C), failed on small variants of Breakout, such as shifting the Y coordinate of the paddle or adding a wall in the middle of the screen. These counterexamples demonstrate that deep reinforcement learning does not learn generalizable concepts such as walls or paddles; rather, comments to that effect reflect the kind of over-attribution that comparative psychology warns against. The Atari system never acquired a robust concept of a wall; it merely learned to break through walls superficially, within a narrow set of heavily trained scenarios. I found similar results in a skiing game studied by a research team at my startup Geometric Intelligence (later acquired by Uber). In 2017, a team from Berkeley and OpenAI found it easy to construct adversarial examples in several games that defeated DQN (the original DeepMind algorithm), A3C, and other related techniques.

Recent experiments by Robin Jia and Percy Liang (2017) make a similar point in a different domain, language. They trained various neural networks on a question-answering task known as SQuAD (the Stanford Question Answering Dataset), in which the goal is to highlight the words in a passage that answer a given question. In one example, a trained system correctly identified John Elway as the winner of Super Bowl XXXIII on the basis of a short passage. But Jia and Liang showed that merely inserting a distractor sentence (for example, one claiming that Google's Jeff Dean had won another bowl game) caused accuracy to plummet: across sixteen models, average accuracy dropped from 75% to 36%. In general, the patterns extracted by deep learning are more superficial than they first appear.

**3.3 Deep Learning So Far Has No Natural Way to Deal with Hierarchical Structure**

To a linguist like Noam Chomsky, the troubles documented by Jia and Liang would come as no surprise. Fundamentally, most current deep learning language models represent sentences as flat sequences of words, whereas Chomsky has long argued that language has a hierarchical structure, in which smaller components are recursively assembled into larger structures. (In the sentence "the teenager who previously crossed the Atlantic set a record for flying around the world," the main clause is "the teenager set a record for flying around the world," and "who previously crossed the Atlantic" is an embedded clause specifying which teenager.) In the 1980s, Fodor and Pylyshyn (1988) expressed the same concern about an earlier generation of neural networks.
In my 2001 book I likewise speculated that simple recurrent networks (SRNs; Elman, 1990), the predecessors of today's more sophisticated recurrent-network-based (RNN) deep learning models, would have trouble systematically representing and extending recursive structure to unfamiliar sentences (see that work for the specific types of sentences at issue). Earlier in 2017, Brenden Lake and Marco Baroni tested whether such pessimism was still warranted. As the title of their article put it, contemporary neural networks are "still not systematic after all these years": RNNs could generalize well when the differences between training and test items were small, but when generalization required systematically combining skills, they failed dramatically. Similar problems are likely to surface in other domains, such as planning and motor control, that require complex hierarchical structure, especially in novel environments; we can see this indirectly in the Atari difficulties mentioned above, and more generally in robotics, where systems typically fail to generalize abstract plans to new environments.

At a minimum, the core problem is that the feature sets deep learning acquires are flat, or non-hierarchical, like a simple unstructured list in which every feature is on an equal footing. Hierarchical structure (for example, the syntactic trees that distinguish main clauses from embedded clauses in a sentence) is neither inherent in nor directly represented by such systems. As a result, deep learning systems are forced to rely on proxies that are ultimately inadequate, such as the sequential position of a word in a sentence. Systems like Word2Vec (Mikolov, Chen, Corrado, & Dean, 2013) represent individual words as vectors with reasonable success, and a number of systems use clever tricks to try to represent whole sentences in vector spaces compatible with deep learning (Socher, Huval, Manning, & Ng, 2012). But, as Lake and Baroni's experiments show, the capacity of recurrent networks remains limited; they cannot accurately and reliably represent and generalize rich structural information.

**3.4 Deep Learning So Far Has Struggled with Open-Ended Inference**

If you cannot grasp the difference between "John promised Mary to leave" and "John promised to leave Mary," you cannot tell who is leaving whom or what is likely to happen next. Current machine reading systems have achieved some success on tasks like SQuAD, in which the answer to a question is explicitly contained in the text, but far less success when inference requires combining multiple sentences (so-called multi-hop inference) or combining explicit sentences with background knowledge that is not stated in the text. Humans routinely make wide-ranging inferences as they read, forming new and implicit conclusions; we can work out a character's intentions from dialogue alone, for example. Although Bowman and colleagues (Bowman, Angeli, Potts, & Manning, 2015; Williams, Nangia, & Bowman, 2017) have taken some important steps in this direction, for now no deep learning system can perform open-ended reasoning based on real-world knowledge with anything approaching human-level accuracy.

**3.5 Deep Learning So Far Is Not Sufficiently Transparent**

The "black box" character of neural networks has been a focus of discussion over the past few years (Samek, Wiegand, & Müller, 2017; Ribeiro, Singh, & Guestrin, 2016). In its current typical form, a deep learning system has millions or even billions of parameters.
Its knowledge is identifiable to its developers not in terms of conventional, human-readable labels (such as "last_character_typed") but only in terms of geography within a complex network (for example, the activity value of the ith node at layer j in network module k). Although visualization tools let us inspect the contribution of individual nodes in complex networks (Nguyen, Clune, Bengio, Dosovitskiy, & Yosinski, 2016), most observers agree that a neural network as a whole remains a black box. How much this matters in the long run is still unclear (Lipton, 2016). If a system is robust and self-contained, it may not matter; if the neural network is an important part of a larger system, its debuggability is crucial. The transparency problem could be fatal to deep learning's prospects in domains such as finance or medical diagnosis, where humans must understand how a system reaches its decisions. As Cathy O'Neil (2016) has pointed out, such opacity can also lead to serious problems of bias.

**3.6 So Far, Deep Learning Has Not Been Well Integrated with Prior Knowledge**

A striking feature of deep learning is how hermetic it is, isolated from other potentially useful bodies of knowledge. Standard practice is to find a training set of inputs paired with their outputs and then to learn the relationship between them using whatever sophisticated architectures, variations, and data cleaning and/or augmentation techniques are available. With only a few exceptions, such as LeCun's convolutional constraint on network connections (LeCun, 1989), prior knowledge is deliberately minimized. Thus, for example, the system of Lerer et al. (2016) learns the physics of falling towers of objects with no prior knowledge of physics beyond what is implicit in convolution. Newton's laws are not encoded; the system (to some limited degree) approximates them by learning regularities from raw pixel-level data. As I note in a forthcoming paper, deep learning researchers seem to hold a strong bias against including prior knowledge even when (as in the case of physics) that knowledge is well established.

It is also not straightforward to integrate prior knowledge into deep learning systems, partly because the knowledge represented inside them consists mainly of (largely opaque) correlations between features rather than abstract, quantified statements, such as universally quantified claims (all mortals eventually die; see the discussion of one-to-one mappings in Marcus, 2001) or generics, claims that hold for the most part but admit exceptions (dogs have four legs; mosquitoes carry West Nile virus; Gelman, Leslie, Was, & Koch, 2015). A related problem is rooted in machine learning culture, which prizes systems that are self-contained and compete well with little or no prior knowledge. The Kaggle machine learning competition platform illustrates the point: participants strive for the best results on a given task with a given dataset, and all the information needed for a given problem is neatly packaged into the relevant input and output files. Great progress has been made within this paradigm, mainly in image recognition and speech recognition. The problem, of course, is that life is not a Kaggle competition; children do not come with all their data neatly packed into a single directory.
In the real world we must learn from far more fragmented data, and problems are not so tidily encapsulated. Deep learning works well on heavily labeled problems such as speech recognition, but hardly anyone knows how to apply it to more open-ended problems. How do I free a rope that has become tangled in my bicycle chain? Should I major in mathematics or neuroscience? No training set will tell us. The further a problem is from classification, and the closer it is to common sense, the less deep learning can address it. In recent work on common sense begun with Ernest Davis (2015), we started with a set of inferences that are easy for people to draw: Who is taller, Prince William or his baby son Prince George? Can you make a salad out of a polyester shirt? If you stick a pin into a carrot, does it make a hole in the carrot or in the pin? As far as I know, no deep learning system can answer such questions. Questions that are trivially easy for humans require integrating knowledge from many different sources, and so lie far from the sweet spot of deep-learning-style classification. If anything, they suggest that if we want human-level cognitive flexibility, we will need tools entirely different from deep learning as well.

**3.7 So Far, Deep Learning Has Not Distinguished Causation from Correlation**

If it is a truism that causation is not the same as correlation, the distinction between the two is nonetheless a serious problem for deep learning. Roughly speaking, deep learning learns complex correlations between input and output features, but it has no inherent representation of causality. A deep learning system can easily learn that, across a population, height and vocabulary size are correlated, but it has a much harder time representing how that correlation arises from growth and development (children learn more words as they get bigger, but that does not mean growing taller causes them to learn more words, nor that learning more words causes them to grow taller). Causality has been central to some other approaches to AI (Pearl, 2000), but, perhaps because deep learning was not aimed at such problems, the deep learning literature has traditionally paid it little attention.

**3.8 Deep Learning Presumes a Largely Stable World, in Ways That May Be Problematic**

The logic of deep learning works best in highly stable worlds, such as Go, whose rules never change; it works less well in domains like politics and economics, which are in constant flux. If deep learning is applied to tasks such as stock prediction, it may well meet the fate of Google Flu Trends, which predicted epidemiological data from search trends quite well for a while but completely missed the peak of the 2013 flu season (Lazer, Kennedy, King, & Vespignani, 2014).

**3.9 So Far, Deep Learning Works Only as an Approximation; Its Answers Cannot Be Fully Trusted**

This problem partly follows from the others discussed in this section. Deep learning systems work quite well much of the time in a given domain, yet they remain easy to fool.
A growing body of papers documents this fragility, from the linguistic examples of Jia and Liang mentioned above to a long list of cases in vision, such as deep learning image recognition systems that mistake yellow-and-black striped patterns for school buses (Nguyen, Yosinski, & Clune, 2014) and captioning systems that describe a sticker-covered parking sign as a refrigerator filled with food and drinks (Vinyals, Toshev, Bengio, & Erhan, 2014), even while other outputs from the same systems look excellent. There have also been real-world stop signs that, after minor modification, were mistaken for speed limit signs (Evtimov et al., 2017), and 3D-printed turtles mistaken for rifles (Athalye, Engstrom, Ilyas, & Kwok, 2017). A recent news item reported that a system used by British police has trouble distinguishing nudity from sand dunes. The first paper to point out the "spoofability" of deep learning systems was probably Szegedy et al. (2013); four years later, despite a great deal of active research, no robust solution has been found.

**3.10 So Far, Deep Learning Is Difficult to Engineer With**

Compounding all of the problems above is the fact that deep learning is still hard to use in engineering practice. As a Google research team put it in the title of an important 2014 paper (Sculley, Phillips, Ebner, Chaudhary, & Young, 2014), machine learning is "the high-interest credit card of technical debt": it is relatively easy (the short-term gain) to build systems that work in some limited set of circumstances, but quite difficult (the long-term debt) to guarantee that they will also work in other circumstances, with novel data that may differ from the training data, particularly when the system becomes one element of a larger system. In an important ICML talk, Léon Bottou (2015) compared machine learning with the development of aircraft engines, noting that while aircraft design relies on building complex systems out of simpler components whose behavior can be guaranteed, machine learning lacks the ability to provide comparable guarantees. As Google's Peter Norvig noted in 2016, machine learning still lacks the incrementality, transparency, and debuggability of classical programming; achieving robustness with deep learning requires trading away a degree of simplicity. Henderson and colleagues have recently extended these points with respect to deep learning, noting that the field faces serious problems of robustness and reproducibility (Henderson et al., 2017). Although there has been some progress in automating the development of machine learning systems (Zoph, Vasudevan, Shlens, & Le, 2017), there is still a long way to go.

**3.11 Discussion**

Of course, deep learning is, by itself, just mathematics; none of the problems above arises because the underlying mathematics is somehow flawed. In general, deep learning is a perfectly fine way to optimize a complex system for representing a mapping between inputs and outputs, given a sufficiently large dataset. The real problem lies in misunderstanding what deep learning is, and is not, good at. The technique excels at closed-ended classification problems: mapping a large set of potential signals onto a limited number of categories, given enough data and a test set that resembles the training set.
Deviating from these assumptions invites trouble: deep learning is only a statistical technique, and all statistical techniques suffer when their assumptions are violated. Deep learning systems perform far less well when training data are limited, when the test set differs substantially from the training set, or when the space of examples is broad and full of novelty. And under real-world constraints, some problems cannot sensibly be treated as classification at all. Open-ended natural language understanding, for example, should not be viewed as a mapping between one large but finite set of sentences and another, but rather as a mapping between a potentially infinite range of input sentences and an equally vast array of meanings, many of them never encountered before. Using deep learning on such problems is like forcing a square peg into a round hole: it can yield at best a rough approximation, and the real solution must lie elsewhere.

A good deal of intuition about where current approaches go wrong comes from a series of experiments I ran back in 1997, when I tested some simple aspects of language development on a class of neural networks that was then popular in cognitive science. Those networks were much simpler than today's models: they had no more than three layers (an input layer, a hidden layer, and an output layer), used no convolution, and were trained by back-propagation. In language, the issue at stake is generalization. Having heard a sentence like "John pilked a football to Mary," I can grammatically infer "John pilked Mary the football," and if I can work out what "pilk" means, I can understand a brand-new sentence such as "Eliza pilked the ball to Alec" on first hearing. I distilled a large class of such language problems into a simple example that I believe still captures the core concern.

I ran a series of experiments training three-layer perceptrons (fully connected, with no convolution) on the identity function, f(x) = x. Training examples were represented as binary digits across the input nodes (with corresponding output nodes); the number 7, for instance, was represented by turning on the input nodes for 4, 2, and 1. To test generalization, I trained the networks on a variety of even numbers and tested them on both even and odd inputs. Across a wide range of parameter settings the outcome was the same: the network applied the identity function correctly to the even numbers it had been trained on (unless it got stuck in a local minimum), and to some other even numbers, but it failed on every odd number, computing, for example, f(15) = 14. In general, the neural networks I tested could learn their training examples and generalize to points near those examples in the n-dimensional space of training examples (the training space), but they could not extrapolate beyond that space. Odd numbers lay outside the training space, and the networks could not generalize the identity function to them; adding more hidden units or more hidden layers did not help. Simple multilayer perceptrons simply could not generalize outside the training space (Marcus, 1998a; Marcus, 1998b; Marcus, 2001). Twenty years on, this remains the generalization challenge at the heart of current deep learning networks.
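The experiment lends itself to a rough reconstruction. The sketch below is mine, and the bit width, hidden-layer size, learning rate, and the particular even numbers used for training are illustrative assumptions rather than the original 1997 settings. It trains a fully connected three-layer perceptron on the identity function over binary-coded even numbers, then probes it with an odd number that falls outside the training space.

```python
import numpy as np

rng = np.random.default_rng(1)
BITS = 5  # numbers 0..31 encoded as 5-bit vectors

def to_bits(n):
    return np.array([(n >> i) & 1 for i in range(BITS)], dtype=float)

def from_bits(bits):
    return int(sum(int(round(b)) << i for i, b in enumerate(bits)))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def with_bias(a):
    return np.hstack([a, np.ones((a.shape[0], 1))])

# Train only on even numbers: the lowest bit is 0 in every training example.
train_nums = [n for n in range(32) if n % 2 == 0]
X = np.stack([to_bits(n) for n in train_nums])
Y = X.copy()  # identity function: the target output equals the input

W1 = rng.normal(scale=0.5, size=(BITS + 1, 16))  # input (+bias) -> hidden
W2 = rng.normal(scale=0.5, size=(17, BITS))      # hidden (+bias) -> output

for step in range(20000):
    h = sigmoid(with_bias(X) @ W1)
    out = sigmoid(with_bias(h) @ W2)
    d_out = (out - Y) * out * (1 - out)
    d_hid = (d_out @ W2[:-1].T) * h * (1 - h)
    W2 -= 0.3 * with_bias(h).T @ d_out
    W1 -= 0.3 * with_bias(X).T @ d_hid

def predict(n):
    x = to_bits(n)[None, :]
    return from_bits(sigmoid(with_bias(sigmoid(with_bias(x) @ W1)) @ W2)[0])

print(predict(10))  # an even number seen in training: typically reproduced exactly
print(predict(15))  # an odd number: the lowest bit was never 1 in any target,
                    # so the net tends to answer 14, echoing the f(15) = 14 result
```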
Many of the problems discussed in this article, including data hunger, vulnerability to fooling, and difficulties with open-ended inference and transfer, can be seen as extensions of this basic issue. Modern neural networks generalize well on data close to their core training data, but their generalization starts to break down on data that differ substantially from the training examples. The now-ubiquitous convolution guarantees a solution to one particular class of such problems, closely related to my identity problem: so-called translational invariance, in which an object retains its identity as it shifts position. But that solution does not work for every problem, as Lake's recent demonstrations show. (Data augmentation, which broadens the space of training examples, offers another way of coping with deep learning's trouble with extrapolation, but such techniques are more useful for two-dimensional vision problems than for language.) There is currently no general solution to the generalization problem in deep learning, and for this reason, if we want to achieve artificial general intelligence, we will need to look to different solutions as well.

**4. The Potential Risks of Excessive Hype**

One of the biggest risks of today's over-hyping of AI is another AI winter, like the one of the 1970s. Although AI applications are far more numerous now than they were then, hype remains a serious concern. When a high-profile figure like Andrew Ng writes in the Harvard Business Review that large-scale automation is imminent, on a timescale quite different from reality, inflated expectations create risk. Machines still cannot do many things humans do in a second, from understanding the world to understanding sentences, and no healthy human would mistake a turtle for a rifle or a parking sign for a refrigerator. Many investors in AI may end up disappointed, particularly in natural language processing. Some large projects have already been abandoned; Facebook's M project, launched in August 2015 with the aim of building a general-purpose personal virtual assistant, was later scaled back to helping users with a handful of predefined tasks, such as calendar entries. It is fair to say that chatbots in general have not lived up to the hype. If, for example, autonomous vehicles prove unsafe after large-scale deployment, or merely fall short of the promise of full autonomy and disappoint relative to the early hype, the entire field could suffer, in both popularity and funding. We may already be seeing the start of this: Wired recently ran an article titled "After peak hype, self-driving cars enter the trough of disillusionment" (https://).

There are other serious concerns as well, and not only the apocalyptic scenarios, which for now seem like science fiction. My own biggest worry is that the AI field could get trapped in a local minimum of its own, dwelling too heavily on the wrong part of the space of intelligence, overly focused on exploring readily usable but limited models, eager to pick the low-hanging fruit while neglecting riskier excursions that might ultimately lead to a more robust path forward. I am reminded of Peter Thiel's famous remark: "We wanted flying cars, instead we got 140 characters." I still dream of Rosie the Robot, a full-service domestic robot; but for now, six decades into the history of AI, our robots do little more than play music, sweep floors, and bid on ads. If no further progress is made, it would be a shame. AI carries risks, but also enormous potential.
I believe that AI's greatest contributions to society will ultimately come in areas such as automated scientific discovery, but to get there we must first make sure the field does not get stuck in a local minimum.

**5. What Would Be Better?**

Despite all the problems I have sketched, I do not think we need to abandon deep learning. Rather, we need to reconceptualize it: not as a universal solvent, but as one tool among many, a power screwdriver in a world where we also need hammers, wrenches, and pliers, not to mention chisels, drills, voltmeters, logic probes, and oscilloscopes. In perceptual classification, where vast amounts of data are available, deep learning is a valuable tool; in other, richer cognitive domains, it is often far less suitable. The question, then, is where we should look instead. Here are four possibilities.

**5.1 Unsupervised Learning**

Recently, the deep learning pioneers Geoffrey Hinton and Yann LeCun have both pointed to unsupervised learning as a key way to move beyond supervised, data-hungry deep learning. To be clear, deep learning and unsupervised learning are not in logical opposition: deep learning has mostly been used in supervised settings with labeled data, but there are ways of using it without labels as well. Still, many fields have good reason to move away from the large quantities of labeled data that supervised deep learning demands. "Unsupervised learning" is a common term for systems that do not require labeled data. One familiar kind clusters inputs that share attributes, even when no one has labeled them as belonging to a class; Google's cat detector model (Le et al., 2012) is perhaps the most prominent example. Another approach, advocated by Yann LeCun among others (Luc, Neverova, Couprie, Verbeek, & LeCun, 2017), and not mutually exclusive with the first, is to replace labeled datasets with data that change over time. Intuitively, a system trained on video can treat each pair of consecutive frames as a training signal and learn to predict the next frame; predicting frame t+1 from frame t requires no human-supplied labels at all (a minimal sketch of this idea appears at the end of this subsection). My view is that both approaches are useful (and there are others I will not discuss here), but neither by itself solves the problems raised in Section 3: these systems still lack explicit variables, and I have seen little sign of open-ended reasoning, interpretability, or debuggability in them.

That said, there is a different notion of unsupervised learning, rarely discussed, that I find deeply interesting: the kind children do. Children often set themselves new tasks, such as building a tower of Legos or climbing through a window. Often, such exploratory problem solving involves (or at least appears to involve) a great deal of setting one's own goals (what should I do?), high-level problem solving (how do I get my arm through the chair, now that the rest of my body is through?), and the integration of abstract knowledge (how bodies work, which objects have openings and whether one can climb through them). If we could build systems that set their own goals and reason and solve problems at this more abstract level, the field of AI would make major progress.
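Here is the promised sketch. It is my own toy construction, not LeCun's actual setup: a synthetic one-dimensional "video" of a moving dot supplies (frame t, frame t+1) pairs, and a linear predictor is trained to predict the next frame, with time itself providing the teaching signal and no human labels anywhere.

```python
import numpy as np

rng = np.random.default_rng(2)
WIDTH = 12

def frame(pos):
    # One "video frame": a single bright pixel on a 1-D strip.
    f = np.zeros(WIDTH)
    f[pos % WIDTH] = 1.0
    return f

# The "video": the dot moves one pixel to the right per time step.
frames = np.stack([frame(t) for t in range(200)])
X, Y = frames[:-1], frames[1:]  # frame t is the input, frame t+1 is the target

W = rng.normal(scale=0.1, size=(WIDTH, WIDTH))
for step in range(2000):
    pred = X @ W
    W -= 0.05 * X.T @ (pred - Y) / len(X)  # gradient descent on squared error

# After training, the predictor has absorbed the "shift right" dynamics.
print(np.round(frame(3) @ W, 1))  # the peak should appear at position 4
```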
**5.2 The Need for Symbol Processing and Hybrid Models**

Another area deserving of attention is classical symbolic AI, sometimes called GOFAI (Good Old-Fashioned AI). The name reflects the idea that abstractions can be represented directly by symbols, a core idea in mathematics, logic, and computer science. An equation like f = ma lets us compute outputs for a very wide range of inputs, regardless of whether we have previously observed any of the particular values involved; computer programs do the same sort of thing (if the value of variable x is greater than the value of variable y, perform operation a). Symbolic systems have often proved brittle in practice, but they were largely developed in an era with far less data and far less computational power than we have today. The right move now may be to integrate deep learning, which excels at perceptual classification, with symbolic systems, which excel at reasoning and abstraction. One might think of this potential merger by analogy with the brain: perceptual input systems such as primary sensory cortex seem to do something like what deep learning does, but other areas, such as Broca's area and the prefrontal cortex, appear to operate at a much higher level of abstraction. The power and flexibility of the brain come in part from its capacity to dynamically integrate many different kinds of computation; the process of scene perception, for instance, seamlessly combines direct perceptual information with complex abstract information about objects, their properties, light sources, and so forth. Some preliminary work has begun to explore how existing approaches might be integrated, including neuro-symbolic modeling (Besold et al., 2017) and recent work on differentiable neural computers (Graves et al., 2016), programming with differentiable interpreters (Bošnjak, Rocktäschel, Naradowsky, & Riedel, 2016), and neural programming with discrete operations (Neelakantan, Le, Abadi, McCallum, & Amodei, 2016). Although none of this work has yet scaled up to anything like full general AI, I have long argued (Marcus, 2001) that building more microprocessor-like operations into neural networks could be extremely valuable. On this view, the brain might be seen as consisting of "a broad array of reusable computational primitives, elementary units of processing akin to sets of basic instructions in a microprocessor."
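As a small, purely illustrative contrast (the functions below are hypothetical examples of mine, not anything proposed in the article), symbolic operations over variables apply to whatever values are bound to those variables, including values never encountered before; nothing has to be learned from examples.

```python
def force(mass: float, acceleration: float) -> float:
    # Newton's second law, f = m * a, stated once and valid everywhere.
    return mass * acceleration

def choose(x: float, y: float) -> str:
    # "If the value of variable x is greater than the value of y, do operation a."
    return "operation_a" if x > y else "operation_b"

# Works identically for familiar and entirely novel inputs.
print(force(2.0, 9.8))          # 19.6
print(force(123456.0, 0.0042))  # 518.5152, never "seen" before, still exact
print(choose(7.0, 3.0))         # operation_a
```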
