This project is an ANR-funded 4-year chair for the democratization of AI, starting in January 2021.
With the current rapid growth of AI research and applications come both unprecedented opportunities and legitimate worries about potential misuses. In this context, we are committed to helping make AI easier to access and use by a large segment of the population. Making AI more accessible to all should both be an important factor of economic growth and help strengthen democracy.
This research aims at reducing the need for human expertise in the implementation of pattern recognition and modeling algorithms, including Deep Learning, in various fields of application (medicine, engineering, social sciences, physics), using multiple modalities (images, videos, text, time series, questionnaires). To that end, we will organize scientific competitions (or challenges) in Automated Machine Learning (AutoML). Our designs will expose the community to progressively harder and more diverse settings, ever reducing the need for human intervention in the modeling process. By involving the scientific community at large in challenge-solving, we will effectively multiply our government funding by a large factor to solve such hard AutoML problems. All winners' code will be open-sourced. This effort will culminate in an AutoRL (Automated Reinforcement Learning) challenge in which participants will have to submit code that will be blind-tested on new RL tasks they have never seen before.
Recognizing that there is no good data-driven AI without good data, we also want to dedicate part of our time to educating the public on proper data collection and preparation. Our objective is to instill good practices that reduce problems resulting from bias in data or irreproducible results due to insufficient data. We will also encourage the protection of data confidentiality and privacy by supplying software that allows data donors to replace real data with realistic synthetic data. This will facilitate broadening access to confidential or private data having commercial value or the potential to harm individuals.
A first thesis on the methodology of challenges has already started (Adrien Pavao).
A second one will start soon on the subject below (HaoZhe Sun will work on it):
DEEP MODULAR LEARNING (PhD subject)
The current trend in Artificial Intelligence (AI) is to rely heavily on systems capable of learning from examples, such as Deep Learning (DL) models, a modern embodiment of artificial neural networks. While numerous applications have made it to market in recent years (including self-driving cars; automated assistants, booking services, and chatbots; improvements in search engines, recommendations, and advertising; and health-care applications, to name a few), DL models are still notoriously hard to deploy in new applications. In particular, they require massive numbers of training examples, hours of GPU training, and highly qualified engineers to hand-tune their architectures. This thesis will contribute to lowering the barrier to entry in using DL models for new applications, a step towards "democratizing AI".
The angle taken will be to develop new Transfer Learning (TL) approaches, based on modular DL architectures. Transfer learning encompasses all techniques to speed up learning by capitalizing on exposure to previous similar tasks. For instance, using pre-trained networks is a key TL tactic used by winners of the recent AutoDL challenge. The doctoral candidate will push forward the notion of reusability of pre-trained networks in whole or in part (modularity).
There are several important questions raised in this context.
From a technical standpoint, the current limitations of pre-training include the following: (T1) in many domains, no pre-trained networks are available, due to the lack of massive datasets in related domains; (T2) novel network architectures such as Graph Neural Networks (GNNs) do not easily lend themselves to pre-training; (T3) beyond merely retraining the last layer and fine-tuning inner layers, means of re-using pre-trained networks in new contexts are under-developed. These three issues offer challenging research opportunities: to efficiently use prior knowledge, data simulators, and/or data augmentation, and to develop novel algorithms and architectures that learn in a modular, reusable way.
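As a toy illustration of the re-use idea in (T3), the sketch below stands a frozen random projection in for the inner layers of a pre-trained network and retrains only the last linear layer on a new task. All names and figures here are illustrative assumptions, not part of the proposal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "pre-trained" inner layers: a frozen nonlinear random
# projection standing in for features learned on a source task.
W_frozen = rng.standard_normal((10, 16))

def features(x):
    # Frozen: these weights receive no updates on the new task.
    return np.tanh(x @ W_frozen / np.sqrt(10))

# New target task: only the last linear layer is (re)trained.
X = rng.standard_normal((100, 10))
y = (X[:, 0] > 0).astype(float)  # toy binary labels

Phi = np.hstack([features(X), np.ones((100, 1))])  # features + bias term
# Closed-form least-squares fit of the new last layer (small ridge term
# for numerical stability).
w = np.linalg.solve(Phi.T @ Phi + 1e-3 * np.eye(17), Phi.T @ y)

accuracy = ((Phi @ w > 0.5) == (y > 0.5)).mean()
```

Only the 17 parameters of the last layer are fitted here; the point is that adaptation to the new task is cheap once the frozen module supplies usable features.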
From the fundamental research point of view, modularity and inheritance of pre-trained learning modules in biologically-inspired learning systems is a burning topic in AI. Unanswered questions include: (F1) Does modularity of the brain increase its effectiveness, or is it a legacy of evolution that plays no particular role? (F2) Likewise, in which contexts and how does modularity help in artificial systems (e.g., to implement invariances, to help transfer learning, etc.)? (F3) Does module specialization hinder or help generalization to new data modalities (e.g., new sensor data), and if so, how?
In this context, the doctoral student will investigate a novel approach to transfer learning that we call "Deep Modular Learning". The candidate will tackle the problem of training large artificial neural networks whose architectures are modular and whose modules are potentially reusable. One possible method will be to use multi-level optimization algorithms, optimizing the overall system (achieving a higher-level objective) under the constraint that the modules achieve a lower-level objective (reusability). One scientific aim will be to test the hypothesis that modularity is essential for learning systems, in that it accelerates learning by making possible an effective form of transfer learning, a central functionality in AI.
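The multi-level optimization idea can be sketched on a minimal bilevel problem: a lower level that fits model parameters for a fixed hyperparameter, and an upper level that selects the hyperparameter against a held-out objective. This is only an illustrative stand-in (ridge regression with a grid search); the thesis work would replace both levels with module-reusability objectives:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: linear ground truth with noise, split into train / validation.
n, d = 60, 8
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.5 * rng.standard_normal(n)
Xtr, ytr, Xva, yva = X[:40], y[:40], X[40:], y[40:]

def inner_solve(lam):
    """Lower level: ridge fit on the training split for a fixed lambda."""
    return np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(d), Xtr.T @ ytr)

def outer_objective(lam):
    """Upper level: validation error of the lower-level solution."""
    w = inner_solve(lam)
    return np.mean((Xva @ w - yva) ** 2)

# Crude upper-level search over lambda; gradient-based bilevel methods
# would differentiate through the lower-level solution instead.
lams = np.logspace(-3, 3, 25)
best_lam = min(lams, key=outer_objective)
```

The nesting is the essential point: the upper level never touches the model weights directly, only the constraint under which the lower level produces them.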
Several principles/conjectures/hypotheses may guide this research, including: (P1) The principle of parsimony, or "Ockham's razor", embodied in modern learning theory as "regularization", which in layman's terms states that "of two theories equally powerful in reproducing observations, one should prefer the simpler one"; indeed, modular architectures sharing identical sub-modules have fewer adjustable parameters and can therefore be considered less complex than, e.g., fully connected networks. (P2) The innateness hypothesis: task-solving capabilities are a combination of innate and acquired skills. Is it a characteristic of intelligent systems to rely more on learned skills, such as language, rather than inherit them? Can language really be learned entirely "from scratch"? (P3) Induction, deduction, conceptualisation, and causality: do intelligent learning systems rest upon modularity for conceptualisation, language acquisition, and causal inference?
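The parameter-sharing argument in (P1) can be made concrete by simple counting. The layer sizes below are illustrative assumptions chosen for the arithmetic, not figures from the proposal:

```python
# Parameter counting: a modular architecture that reuses one sub-module
# across several positions has fewer adjustable parameters than a fully
# connected layer of the same input/output width.

def dense_params(n_in, n_out):
    # Weights plus biases of one fully connected (dense) layer.
    return n_in * n_out + n_out

# Fully connected: 64 inputs -> 64 units, no sharing.
fc = dense_params(64, 64)        # 64*64 + 64 = 4160 parameters

# Modular: the 64 inputs are split into 8 blocks of 8, each processed by
# the SAME 8->8 sub-module (weight sharing, as in convolutions).
shared = dense_params(8, 8)      # 8*8 + 8 = 72 parameters, reused 8 times

ratio = fc / shared              # the shared design is ~58x smaller
```

By Ockham's razor as stated in (P1), the smaller hypothesis class of the shared design is the one to prefer when both fit the observations equally well.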
To put this framework into practice, the student will choose practical applications from domains including biomedicine (e.g., molecular toxicity or efficacy), ecology, econometrics, speech recognition, natural language processing, and image or video processing.