Oct 19, 2006: Extreme Learning: Mining Needles in a Haystack
Filed in: Colloquium
Dr. Nitesh Chawla, University of Notre Dame
Abstract
Data mining has emerged as a vital science for a variety of high-profile applications, each with its own idiosyncrasies. However, fundamental challenges remain, including, but not limited to, massive datasets, high imbalance in class distributions, cost-sensitive classification, the cost of procuring data, and unlabeled data. My talk will focus on tackling these fundamental challenges and on directions in machine learning and data mining research. Specifically, I'll present our work in parallel and distributed learning for massive datasets, which has been ported to ASCI and Cray supercomputers; learning from highly imbalanced datasets in a cost-sensitive environment; active learning when there is a cost to data acquisition and/or learning from the entire dataset is not feasible; and learning from labeled and unlabeled data, particularly in the extreme case of sample selection bias. Cross-fertilization of these concepts with applications can catalyze new research initiatives, and I will also give an overview of some of these initiatives.
Bio
Dr. Nitesh Chawla is currently a Research Assistant Professor in the Department of Computer Science and Engineering at the University of Notre Dame. Nitesh's core research in machine learning and data mining focuses on cost- and distribution-sensitive learning, massively parallel and distributed data mining, semi-supervised learning, and learning in networks. His work has also included various applications to systems, bioinformatics, biometrics, and finance. His recent collaborative research funding is from DOD and DOJ, for research in wireless sensor networks and biometrics, respectively. He has also received various awards for his research and teaching. Notably, his Ph.D. dissertation received the Outstanding Dissertation Award. He won the challenge on a classification problem for evaluating predictive uncertainty organized at NIPS 2004. More recently, his work with students has resulted in best student paper awards at conferences. He is also the recipient of the FIE New Faculty Fellowship for his education paper on teaching data mining. He has served on various program and organizing committees for conferences and workshops, on panels, and on editorial boards. He is currently an Associate Editor for the IEEE Transactions on SMC-B, and he has served as a Guest Editor for SIGKDD Explorations. Prior to joining Notre Dame, Nitesh was a Senior Risk Modeling Manager at one of the largest Canadian banks (CIBC), where he was recognized for his contributions to retail portfolio analytics.