This course is an introduction to symbolic techniques in Artificial Intelligence (AI). It relies on a single language, Prolog. Prolog is used here not only for its power in dealing with logic and knowledge representation, but also because it is the purest instantiation of notions such as unification and backtracking. The course is also an opportunity to discover important concepts involved in artificial reasoning, symbolic machine learning, knowledge representation, and natural language processing.
Why Prolog?
The Prolog programming language is unique in several respects. It differs radically from imperative languages (Python, C++, Java) and from functional languages (Caml, Lisp). Prolog is a declarative language. The programmer ideally provides knowledge to the machine and nothing more. And the machine does the job! Prolog is an ideal opportunity to discover and master fundamental notions of computer science such as the following (a small example after the list illustrates them at work):
recursion
declarativity
unification
backtracking
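To see these notions at work, here is a minimal sketch, assuming a standard Prolog system (the family facts are invented for illustration): a small knowledge base and a recursive rule.

    % Facts: invented family relations, stated as knowledge.
    parent(tom, bob).
    parent(bob, ann).
    parent(bob, pat).

    % Recursive rule: X is an ancestor of Z if X is a parent of Z,
    % or a parent of some Y who is in turn an ancestor of Z.
    ancestor(X, Z) :- parent(X, Z).
    ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).

The query ?- ancestor(tom, Who). is answered by unification (matching the query against rule heads and facts) and backtracking (exploring the remaining alternatives), yielding Who = bob, then Who = ann, then Who = pat. Nothing in the program says how to search.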
Prolog was designed to deal with symbolic AI problems: knowledge processing, natural language processing, reasoning. Its working principles are commonly used in ontologies, in unification grammars and in constraint planning systems. Prolog is also an excellent opportunity to train oneself in a way of thinking that goes beyond the domain of computer science. It consists in addressing a problem in terms of constraints rather than in terms of procedure, leaving to others (to the machine, in the case of Prolog) the burden of finding a solution that complies with the constraints.
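As a small illustration of this way of thinking (a generate-and-test sketch with a puzzle invented here, not a genuine constraint solver; between/3 is assumed, as in SWI-Prolog): to find a two-digit number whose digits sum to 10 and multiply to 21, we merely state the constraints.

    % puzzle(N): N is a two-digit number whose digits
    % sum to 10 and multiply to 21 (invented example).
    puzzle(N) :-
        between(10, 99, N),    % generate candidate numbers
        D1 is N // 10,         % tens digit
        D2 is N mod 10,        % units digit
        D1 + D2 =:= 10,        % constraint: digits sum to 10
        D1 * D2 =:= 21.        % constraint: digits multiply to 21

The query ?- puzzle(N). yields N = 37, then N = 73 on backtracking. We expressed what a solution is; the machine took care of finding it.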
Symbolic Artificial Intelligence
In a famous paper, Alan Turing proposed a behaviouristic definition of artificial intelligence: a machine should be declared intelligent if it can fool human judges in a written conversation into believing that they are talking with another human being.
Improvements in statistical machine learning (data mining, deep learning) led some famous people to predict such a surge in machine intellectual capabilities that it could bring about the end of humankind (see Stephen Hawking’s or Bill Gates’s warnings). These concerns probably motivated other important figures to sign a call for a better use of AI techniques. We should not underestimate the danger of giving robots too much decision power. For instance, in high-frequency trading, virtually all decisions are made by computer programs, and financial crashes may follow, as in the 2010 Flash Crash.
However, we can question the claim that machines will outsmart people and be dangerous for that reason. Current NLP techniques such as transformers are impressive; play with ChatGPT to get convinced. Yet it is quite unlikely that the mere extension of statistical techniques will ever constitute a threat to humankind: statistical machines such as transformers are inherently limited to interpolating and extrapolating from known data.
Read also this comparison of ChatGPT with Wolfram|Alpha by Stephen Wolfram himself.
Let’s take an example. Consider this question:
How many phalanges are there in a human body?
Make a genuine attempt to answer it, and write down your answer.
Then you may ask your favourite search engine. It certainly knows the correct answer!
Phalanges
How does your reasoning differ from the procedure used in the search engine?
Yet another example. Anyone who is able to count will find an obvious way of continuing the following series:
1,2,2,3,3,3,4,4,4,4
Would you be able to write a program that can solve such riddles? Mmm... think twice before considering the task trivial.
Your program would need to implement a principle of simplicity (or minimum description length): the ‘best’ solution, here probably 5,5,5,5,5, should keep the description of the series as short as possible. You can learn about this topic in this course. Some Web sites such as this one specialize in performing induction on numerical series. Verify that the ‘correct’ solution is proposed.
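To make the underlying structure explicit, here is a minimal sketch of the ‘n repeated n times’ hypothesis (predicate names invented for illustration; numlist/3, maplist/2-3 and append/2 are SWI-Prolog built-ins). A genuine riddle solver would have to search among many such structural descriptions and prefer the shortest one.

    % repeat_n(N, Block): Block is the number N repeated N times.
    repeat_n(N, Block) :-
        length(Block, N),
        maplist(=(N), Block).

    % series(Max, Series): concatenation of the blocks for 1..Max.
    series(Max, Series) :-
        numlist(1, Max, Ns),
        maplist(repeat_n, Ns, Blocks),
        append(Blocks, Series).

The query ?- series(4, S). yields S = [1,2,2,3,3,3,4,4,4,4], and series(5, S) produces the expected continuation 5,5,5,5,5.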
Increasing Series
Indicate an alternative solution proposed by the site (which is not n repeated n times).
Explain why you think that solution is not as good.
Importantly, solutions to such induction problems are found using computations that may have no statistical basis whatsoever (one-shot learning). These computations are sensitive to structure. In the previous example, anyone can perceive repetitions and incrementations (and the repetition of incrementation). This structural analysis is absent in statistical methods (though in favourable cases, deep learning can be said to extract relevant features implicitly).

This course will deal with symbolic aspects of AI. This means that we will use structures, structural matching and structure-sensitive procedures. This will allow us to deal with problems (such as aspects of Natural Language Processing and problem solving) that are intractable using brute statistics. It will allow us to produce systematic behaviour and to deal with exceptions and anomalies. And we will see how background knowledge can dramatically improve machine learning, making one-shot learning possible.
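As a hint of what ‘sensitive to structure’ means in practice, here is a minimal sketch (predicate names invented for illustration) that perceives the repetitions in such a series by rewriting it as Value-Count pairs, a structural description rather than a statistical one.

    % runs(List, Description): describe List as Value-Count pairs.
    runs([], []).
    runs([X|Xs], [X-N|Rest]) :-
        take_run(X, Xs, Tail, 1, N),
        runs(Tail, Rest).

    % take_run(X, List, Tail, Acc, N): count the leading copies
    % of X; Tail is what remains after the run.
    take_run(X, [X|Xs], Tail, Acc, N) :-
        !,
        Acc1 is Acc + 1,
        take_run(X, Xs, Tail, Acc1, N).
    take_run(_, Tail, Tail, N, N).

The query ?- runs([1,2,2,3,3,3,4,4,4,4], D). answers D = [1-1, 2-2, 3-3, 4-4], in which the ‘n repeated n times’ regularity becomes plainly visible.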
Topics
The topics addressed in this course result from a compromise between the importance of the notions and the coherence of the course (but have a look at other symbolic AI courses).

♦ Symbolic models are characterized by the use of structures. They rely on discrete time (e.g. a loop index in a sequential algorithm). Procedures are sensitive to structure; the simplest such procedure consists in matching structures (see the sketch after this list).

♦ Sub-symbolic (or dynamical) models, by contrast, are much closer to physics. They involve metrics and parallel processes, and ideally rely on continuous time (though they are generally simulated on discrete sequential machines). Most current sub-symbolic techniques are based on statistical computations.

♦ Neural models are a special type of sub-symbolic model that relies on large amounts of computation to approximate functions. They are related to distributed representations: concepts are not represented using discrete symbols, but rather as vectors in high-dimensional spaces. Neural models are extremely powerful when one does not know enough about the structure of the function being approximated.
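Here is what the simplest structure-sensitive procedure, matching, looks like in Prolog, where it is performed by unification (a minimal sketch; the sentence structure below is invented for illustration). Two terms match when their shapes agree, and matching binds the variables as a side effect:

    % '=' is unification: a pattern with variables (Det, N, V)
    % is matched against a concrete structure.
    ?- s(np(Det, N), vp(V)) = s(np(the, dog), vp(barks)).
    Det = the,
    N = dog,
    V = barks.

No comparison procedure had to be written: the match succeeds or fails according to the structure of the two terms.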