Finding hidden patterns in data using penalised splines
by Viani Djeundje Biatat
In practice, relationships between data items often take varied and complex shapes, causing standard parametric modelling attempts to fail. Smoothing methods provide an attractive solution in such situations because the shape of the functional relationship is not predetermined but is driven by the data; that is, it can adjust to capture unusual or “unexpected” features in the data. In this talk, I will describe how smoothing via penalised splines makes it possible to capture patterns in data in the credit risk context.
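As a rough illustration of the idea (not the speaker's own method), a penalised spline can be sketched with a truncated-line basis and a ridge penalty on the knot coefficients. The function name `penalised_spline_fit`, the knot placement, the value of `lam` and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def penalised_spline_fit(x, y, n_knots=20, lam=1.0):
    """Fit a penalised linear spline and return a prediction function.

    Basis: [1, x, (x - k_1)_+, ..., (x - k_K)_+]. Only the knot
    coefficients are penalised, with ridge weight `lam`.
    """
    # Interior knots at equally spaced quantiles of x (an illustrative choice).
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])

    def basis(t):
        t = np.asarray(t, dtype=float)
        truncated = np.maximum(t[:, None] - knots[None, :], 0.0)
        return np.column_stack([np.ones_like(t), t, truncated])

    B = basis(x)
    # Penalty matrix: zero for intercept and slope, one for each knot term.
    D = np.diag(np.r_[0.0, 0.0, np.ones(n_knots)])
    coef = np.linalg.solve(B.T @ B + lam * D, B.T @ y)
    return lambda t: basis(t) @ coef


# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

smooth = penalised_spline_fit(x, y, n_knots=20, lam=0.1)
fitted = smooth(x)
```

Larger values of `lam` shrink the knot coefficients harder and pull the fit towards a straight line; smaller values let the curve follow the data more closely.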
Big Data Analytics with Python and application to credit risk scoring
by Olaf Kouamo
Big Data and AI are two of the most popular and useful technologies today. Artificial intelligence has existed for more than a decade, while Big Data emerged only a few years ago. Computers can store millions of records, but the power to analyze this data is provided by Big Data. Together, Big Data and AI can be used to address a wide range of data-related problems. Many organizations believe that AI will revolutionize the way they use their data. Machine learning is considered an advanced form of AI through which machines can send or receive data and learn new concepts by analyzing it. Big Data helps organizations analyze their existing data and draw meaningful insights from it. However, the use of Big Data and AI to support decision making is not yet fully realized, even less so on the African continent. This is due to internet access, which is not spreading as fast as in the rest of the world, but also to a shortage of skills. During this workshop, I will explain and show students the basic concepts of these two fields, their genesis and how they are linked, and, last but not least, apply these technologies to real data using Python (the most famous tool used to model both Big Data and AI). The application to real data will focus on default payments of credit card clients. We will use Python to build and implement machine learning tools that allow us to detect potentially risky clients and support decision making.
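To give a flavour of the kind of model involved, here is a minimal sketch of a default classifier: a logistic regression trained by gradient descent. It does not use the workshop's dataset; the data are synthetic, and the feature names `utilisation` and `late_payments`, the coefficients used to simulate defaults and the 0.5 decision threshold are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for credit card client records (NOT real data).
n = 1000
utilisation = rng.uniform(0, 1, n)    # hypothetical: share of credit limit used
late_payments = rng.poisson(1.0, n)   # hypothetical: number of past late payments
true_logit = -3.0 + 2.5 * utilisation + 0.8 * late_payments
y = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n), utilisation, late_payments])
train, test = slice(0, 800), slice(800, None)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Logistic regression fitted by plain gradient descent on the log-loss.
w = np.zeros(X.shape[1])
for _ in range(5000):
    gradient = X[train].T @ (sigmoid(X[train] @ w) - y[train]) / 800
    w -= 0.1 * gradient

# Score held-out clients and flag those with a predicted default risk above 0.5.
scores = sigmoid(X[test] @ w)
flagged = scores > 0.5
accuracy = np.mean(flagged == y[test].astype(bool))
```

In practice the threshold would be chosen from the relative costs of missed defaults and wrongly refused clients rather than fixed at 0.5.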
by Naila Murray
Convolutional models and sub-modules are integral components of many deep learning-based approaches to a wide array of problems. These models therefore find application in domains from computer vision to social network modeling. In this lecture, I will introduce the basics of convolutional operators and the related concept of cross-correlation, before discussing how they are used to construct a hierarchical network. I will then describe approaches for interpreting the weights learned by such networks, before concluding with an overview of the main applications of such networks in different domains.
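The relationship between the two operators can be sketched in a few lines: convolution is cross-correlation with the kernel flipped along both axes. The function names, the "valid" (no padding) boundary handling and the toy image and kernel below are choices made for this illustration.

```python
import numpy as np


def cross_correlate2d(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel as-is over the image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def convolve2d(image, kernel):
    """'Valid' 2D convolution: cross-correlation with the kernel flipped on both axes."""
    return cross_correlate2d(image, kernel[::-1, ::-1])


image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

corr = cross_correlate2d(image, kernel)  # image[i, j] - image[i+1, j+1]
conv = convolve2d(image, kernel)         # image[i+1, j+1] - image[i, j]
```

For a symmetric kernel the two results coincide. Deep learning libraries typically compute cross-correlation in their "convolution" layers, which makes no practical difference there because the kernel weights are learned.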
Selected Topics in Data Science
by Bubacarr Bah
Data Science is an emerging discipline at the intersection of mathematics, statistics and computer science. Moreover, Data Science applications are ubiquitous, making the field highly interdisciplinary; domain knowledge is therefore considered another component of Data Science. There will be three lectures focusing on selected topics: i) Introduction to Data Science, ii) Mathematics of Data Science, and iii) Ethics and security in Data Science.
Data Science: The Role of Statistics
by Atinuke Adebanji
Data science originated in statistics and has evolved over the last two decades as an amalgamation of disciplines, which has resulted in different definitions depending on the field of study. The generally accepted components of data science are statistics, informatics, computing, communication and sociology, conditioned on available data, environment and intuitive thinking. Statistics provides essential tools and methods that enable the identification of structure in data for better and deeper insight. It also makes possible the quantification of uncertainty, model calibration and validation, and the performance evaluation of classification algorithms. In this talk, the Statistical Assessment of PCA/SVD and FFT-PCA/SVD on Variable Facial Expressions is presented as an illustration of the application of statistics in data science.
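For readers unfamiliar with the PCA/SVD connection mentioned above, the following is a minimal sketch of principal component analysis computed through the singular value decomposition. It is not the procedure from the talk: the function name `pca_svd` is hypothetical, and the data are synthetic "images" of 64 pixels generated from three latent factors rather than facial expression data.

```python
import numpy as np


def pca_svd(X, n_components):
    """PCA via SVD of the mean-centred data matrix (rows are observations)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]         # principal directions (orthonormal rows)
    scores = Xc @ components.T             # coordinates of each observation
    explained = s ** 2 / np.sum(s ** 2)    # share of variance per component
    return mean, components, scores, explained[:n_components]


# Synthetic data: 100 observations of 64 variables, 3 latent factors plus noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(100, 3))
mixing = rng.normal(size=(3, 64))
X = latent @ mixing + 0.05 * rng.normal(size=(100, 64))

mean, components, scores, explained = pca_svd(X, n_components=3)
reconstruction = mean + scores @ components
```

The squared singular values are proportional to the variance captured by each component, which is what allows the quality of a low-dimensional representation to be assessed statistically.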
Practical Data Science with application in healthcare
by Habiboulaye Amadou-Boubacar
This lecture will focus on applied Data Science techniques for real-world applications, especially in healthcare. The sessions will be balanced between theoretical framework and practical implementation. The principles of the most popular Machine Learning techniques will be revisited, including Support Vector Machines, Decision Trees, Ensemble Learning and Neural Networks. Hands-on tutorials will be offered to give a better grasp of the implementation of these algorithms. Over the past few years, Advanced Data Analytics has become a critical part of the healthcare revolution, with promising outcomes for patients’ quality of life and cost savings for public health systems. We will study some practical use cases, ranging from tackling infectious diseases to the optimisation of hospital resources and patient risk detection using machine learning. In spite of the great strides in the application of data science to healthcare problems, it is worth noting that the potential for value creation has not yet been fully realized. This is due to many hurdles related to access to patients’ medical records, the validation of predictive models and acceptance by healthcare professionals.
Complex Network Analytics and applications
by Prof. Franck Kalala Mutombo
This course will provide a broad range of approaches and techniques drawn from social network analysis, graph theory, and network science for analyzing real-world network data. During the course, theoretical material will be presented together with data and code using NetworkX in Python in the Jupyter Notebook environment (or Spyder). Specific topics include, but are not limited to, the following:
The basic conceptual and mathematical formulation of networks; basic metrics of networks (e.g. paths, components, degree distributions); centrality measures; general properties of real-world networks; models of networks; dynamics of, and on, networks; community detection (if time allows); social network analysis.
Lecture 1: Networks and Graphs (introduction)
Lecture 2: Network measures/Analytics
Lecture 3: Models of networks
Lecture 4: Centrality measures
Lecture 5: Properties of large networks (case study of a social network)
Lecture 6: Dynamics on/of Networks
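As a small taste of the basic metrics and centrality measures listed for this course, the sketch below uses NetworkX on Zachary's karate club graph, a small social network bundled with the library. The choice of graph and of measures is an assumption made for illustration and is not taken from the course material.

```python
import networkx as nx

# Zachary's karate club: a classic small social network shipped with NetworkX.
G = nx.karate_club_graph()

# Basic metrics: size, connected components, paths, degree distribution.
n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()
n_components = nx.number_connected_components(G)
avg_path_length = nx.average_shortest_path_length(G)
degree_histogram = nx.degree_histogram(G)  # entry k = number of nodes of degree k

# Centrality measures: who is most connected, and who sits on most shortest paths.
degree_centrality = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
top_by_degree = max(degree_centrality, key=degree_centrality.get)
top_by_betweenness = max(betweenness, key=betweenness.get)

print(f"{n_nodes} nodes, {n_edges} edges, {n_components} component(s)")
print(f"average shortest path length: {avg_path_length:.2f}")
print(f"most central by degree: {top_by_degree}, by betweenness: {top_by_betweenness}")
```

The same calls work unchanged on a graph built from real data, for example with `nx.from_pandas_edgelist` or by adding edges one at a time with `G.add_edge`.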