Over the last few years we have seen tremendous progress in deep-learning-based generative models, which can now sometimes convincingly model high-dimensional data such as images or sound. Broadly speaking, three distinct approaches have been at the center of attention: autoregressive models, generative adversarial networks (GANs), and latent variable models with amortized variational inference (e.g. VAEs). Here I want to focus on the last of these: models where we train not only a neural-network-based generative model that maps latent variables to observed ones, but also, in parallel, an approximate inference network that predicts the latent variables given observations. Despite recent advances, many foundational aspects of these models, especially of models with hierarchies of latent variables, are relatively unexplored. These include theoretical properties and effective algorithms for inference and learning.
Machine learning, broadly defined as data-driven technology to enhance human decision making, is already in widespread use and will soon be ubiquitous and indispensable in all areas of human endeavour. Data is collected routinely in all areas of significant societal relevance including law, policy, national security, education and healthcare, and machine learning informs decision making by detecting patterns in the data. Achieving transparency, robustness and trustworthiness of these machine learning applications is hence of paramount importance, and evaluation procedures and metrics play a key role in this.
In this talk I will review current issues in theory and practice of evaluating predictive machine learning models. Many issues arise from a limited appreciation of the importance of the scale on which metrics are expressed. I will discuss why it is OK to use the arithmetic average for aggregating accuracies achieved over different test sets but not for aggregating F-scores. I will also discuss why it is OK to use logistic scaling to calibrate the scores of a support vector machine but not to calibrate naive Bayes.
More generally, I will argue that it is naive to assume that all metrics of interest are directly observable in experiments, and discuss the need for a dedicated measurement theory for machine learning. I will outline our first steps in that direction, using ideas from item-response theory as employed in psychometrics in order to estimate latent skills and capabilities from observable traits.
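The remark above about averaging accuracies but not F-scores can be illustrated with a small numeric sketch. This is only one way to see the issue and not necessarily the argument made in the talk. The confusion-matrix counts are hypothetical, and the two test sets are assumed to be of equal size, so that the plain arithmetic mean is the relevant comparison.

```python
# Hypothetical confusion-matrix counts (tp, fp, fn, tn) for two
# equal-sized test sets; the numbers are made up for illustration.
test_sets = [(40, 10, 10, 40), (5, 5, 30, 60)]


def accuracy(tp, fp, fn, tn):
    """Fraction of correctly classified examples."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1(tp, fp, fn, tn):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)


def pooled(counts):
    """Add up the confusion matrices as if the test sets were merged."""
    return tuple(sum(column) for column in zip(*counts))


# Arithmetic mean of the per-test-set scores.
mean_acc = sum(accuracy(*c) for c in test_sets) / len(test_sets)
mean_f1 = sum(f1(*c) for c in test_sets) / len(test_sets)

# Scores computed on the merged test set.
pooled_acc = accuracy(*pooled(test_sets))
pooled_f1 = f1(*pooled(test_sets))

print("accuracy: mean", round(mean_acc, 3), "pooled", round(pooled_acc, 3))
print("F1:       mean", round(mean_f1, 3), "pooled", round(pooled_f1, 3))
```

With these counts the mean accuracy coincides with the accuracy on the merged test set, whereas the mean F1 (about 0.51) differs from the F1 on the merged set (about 0.62). For test sets of unequal size the same agreement holds for accuracy if the mean is weighted by test-set size.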
George Karypis is a Distinguished McKnight University Professor and an ADC Chair of Digital Technology at the Department of Computer Science & Engineering at the University of Minnesota, Twin Cities. His research interests span the areas of data mining, high performance computing, information retrieval, collaborative filtering, bioinformatics, cheminformatics, and scientific computing. His research has resulted in the development of software libraries for serial and parallel graph partitioning (METIS and ParMETIS), hypergraph partitioning (hMETIS), parallel Cholesky factorization (PSPASES), collaborative filtering-based recommendation algorithms (SUGGEST), clustering high dimensional datasets (CLUTO), finding frequent patterns in diverse datasets (PAFI), and protein secondary structure prediction (YASSPP). He has coauthored over 280 papers on these topics and two books: “Introduction to Protein Structure Prediction: Methods and Algorithms” (Wiley, 2010) and “Introduction to Parallel Computing” (Addison Wesley, 2003, 2nd edition). In addition, he is serving on the program committees of many conferences and workshops on these topics, and on the editorial boards of the IEEE Transactions on Knowledge and Data Engineering, ACM Transactions on Knowledge Discovery from Data, Data Mining and Knowledge Discovery, Social Network Analysis and Data Mining Journal, International Journal of Data Mining and Bioinformatics, the journal on Current Proteomics, Advances in Bioinformatics, and Biomedicine and Biotechnology.
He is a Fellow of the IEEE.
Recommender systems are designed to identify the items that a user will like or find useful based on the user’s prior preferences and activities. These systems have become ubiquitous and are an essential tool for information filtering and (e-)commerce. Over the years, collaborative filtering, which derives these recommendations by leveraging past activities of groups of users, has emerged as the most prominent approach for solving this problem. This talk will present some of our recent work towards improving the performance of collaborative filtering-based recommender systems and understanding some of their fundamental limitations and characteristics. It will start by analyzing how the ratings that users provide to a set of items relate to their ratings of the set’s individual items and, using these insights, will present rating prediction approaches that utilize distant supervision. It will then discuss extensions to approaches based on sparse linear and latent factor models. These extensions postulate that users’ preferences are a combination of global and local preferences, and they are shown to lead to better user modeling and, as such, improved prediction performance. Finally, the talk will conclude by discussing what can be accurately predicted by latent factor approaches, and by analyzing the estimation error of sparse linear and latent factor models and how its characteristics impact the performance of top-N recommendation algorithms.
Financial markets, banks, currency exchanges and other institutions can be modeled and analyzed as networks whose nodes are agents such as companies, shareholders, currencies, or countries. The edges, which can be weighted, directed, and so on, represent relations between agents, for example ownership, friendship, collaboration, influence, dependence, or correlation. We are going to discuss network and data science techniques for studying the dynamics of financial markets and other problems in economics.
Director of MIPT-School of Applied Mathematics and Computer Science, Head of Discrete Mathematics Department at MIPT, Head of the Laboratory of Advanced Combinatorics and Network Applications at MIPT, Head of the Laboratory of Applied Research MIPT-Sberbank at MIPT, Chief coordinator of research and educational projects between MIPT and Yandex internet company,
Full Professor at Lomonosov Moscow State University, Mechanics and Mathematics Faculty, Department of Mathematical Statistics and Random Processes,
Invited Lecturer at New School of Economics/Higher School of Economics.
In my talk, I will give an overview of some classical optimization problems arising from graph theory. Starting from the pure mathematical beauty of these questions and discussing recent progress on them, I will show how they are being applied in actual industrial projects running at the Moscow Institute of Physics and Technology.
Past Keynote Speakers
The Keynote Speakers of the previous editions:
- Nello Cristianini, University of Bristol, UK
- Yi-Ke Guo, Imperial College London, UK
- Vipin Kumar, University of Minnesota, USA
- George Michailidis, University of Florida, USA
- Stephen Muggleton, Imperial College London, UK
- Panos Pardalos, University of Florida, USA
- Jun Pei, Hefei University of Technology, China
- Tomaso Poggio, MIT, USA
- Ruslan Salakhutdinov, Carnegie Mellon University, USA, and AI Research at Apple
- Vincenzo Sciacca, IBM, Italy
- My Thai, University of Florida, USA