E. Cabrio

Natural Language Processing

In Artificial Intelligence (AI), Natural Language Processing (NLP) is a discipline that aims to model language, whether written or spoken, by computational means. NLP technologies are increasingly present in consumer systems (e.g. Google, IBM Watson, Facebook, Apple Siri).

S2 3 ECTS 24h OPT E. Cabrio

In Artificial Intelligence (AI), Natural Language Processing (NLP) is a discipline at the intersection of Computer Science and Linguistics whose goal is to enable computers to understand human language and communicate in it. NLP technologies are increasingly present in consumer systems (e.g. Google, IBM Watson, Facebook, Apple Siri).

The aim of this course is to present the main symbolic and machine/deep learning methods for analysing and generating documents in natural language.

Natural Language Processing (NLP) is a discipline at the frontier between computer science and linguistics, and is part of the field of artificial intelligence. It encompasses all research and development aimed at using machines to model and reproduce the human ability to produce and understand linguistic statements for the purpose of communication. The adjective "natural" signals that the object of study is human language, not formal languages. Why take an interest in the automation of natural language processing? As with most fields of AI, there are two main motivations: on the one hand, the desire to model language in order to test hypotheses about the mechanisms of human communication; and on the other, the need for applications capable of efficiently processing the information contained in written or audio sources that are now available in electronic form (HTML pages, hypermedia documents, social media, etc.).

What are NLP skills good for? The language industries offer many different careers. Opportunities exist in companies that specialize, or have divisions that specialize, in the development of NLP tools (Google, Yahoo, IBM Watson, Microsoft, Facebook, Apple Siri, Amazon Alexa, Lucene, Orange, France Telecom, etc.), whether for the design and maintenance of software, for the services they offer, or for their own needs. The aim of this course is to present the issues involved in automatic text processing and the main symbolic and machine/deep learning methods for analysing and generating natural language. We will limit ourselves almost exclusively to the processing of language in written form.

This course covers:

Foundations of NLP, i.e. the different levels of processing required to achieve a complete understanding of an utterance in natural language (morphological analysis, syntactic analysis, semantic analysis, and analysis of pragmatics, discourse and dialogue). From an engineer’s point of view, these levels correspond to the modules that need to be developed and made to work together as part of a complete language processing application.
Applications: information retrieval and extraction, automatic summarisation, text mining, sentiment detection, topic extraction, question answering systems, dialogue systems, etc.
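
To illustrate the idea that processing levels correspond to modules that are chained together in an application, here is a minimal sketch in Python. It is not course material: all names (Analysis, tokenize, strip_suffixes, score_sentiment, run_pipeline) and the tiny word lists are hypothetical, the morphological step is a deliberately crude suffix stripper, and the syntactic, semantic and pragmatic levels are left out entirely. The last stage stands in for a simple lexicon-based sentiment detector.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Analysis:
    """Shared record that each module reads from and writes to."""
    text: str
    tokens: list = field(default_factory=list)
    stems: list = field(default_factory=list)
    sentiment: int = 0


def tokenize(a):
    # Lowercase the text and keep runs of ASCII letters only (toy tokenizer).
    a.tokens = re.findall(r"[a-z]+", a.text.lower())
    return a


def strip_suffixes(a):
    # Crude stand-in for morphological analysis: remove one common English
    # suffix, provided at least three characters remain.
    stems = []
    for tok in a.tokens:
        for suffix in ("ing", "ed", "s"):
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    a.stems = stems
    return a


# Hypothetical toy lexicons, chosen only for this illustration.
POSITIVE = {"enjoy", "great", "good"}
NEGATIVE = {"bad", "dull"}


def score_sentiment(a):
    # Lexicon-based score: +1 per positive stem, -1 per negative stem.
    a.sentiment = sum((s in POSITIVE) - (s in NEGATIVE) for s in a.stems)
    return a


# The application is the ordered composition of its modules.
PIPELINE = [tokenize, strip_suffixes, score_sentiment]


def run_pipeline(text):
    a = Analysis(text)
    for stage in PIPELINE:
        a = stage(a)
    return a


if __name__ == "__main__":
    result = run_pipeline("She enjoyed the lectures.")
    print(result.stems, result.sentiment)
```

Because every module takes and returns the same record, a stage can be replaced (for example, swapping the suffix stripper for a real morphological analyser) without touching the others, which is the engineering point made in the list above.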