Publications
Identifying how hard it is to achieve good classification performance on a given dataset can be useful in data analysis, model selection, and meta-learning. We hypothesize that the dataset clustering indices which …
Tags:
Classification, Dataset complexity, Classification complexity
Publications
Data from Global Positioning Systems (GPS) and fare-meters in For-Hire Vehicles (FHVs) have been used for various applications – in both research and organizational decision-making. The utility of such exercises …
Tags:
PU learning, Classification
Publications
Building an effective classifier for imbalanced data is a challenging task, as most classifiers work on the assumption of balanced data. Therefore, several sampling methods have been devised to bridge this gap by …
Tags:
Machine Learning, Classification
Publications
A classification tree is grown by repeated partitioning of the dataset based on a predefined split criterion. The node split in the growth process depends only on the class ratio of the data chunk that gets split in …
Tags:
Semi-supervised learning, Trees, Classification
Publications
We present a consistent algorithm for constrained classification problems where the objective (e.g. F-measure, G-mean) and the constraints (e.g. demographic parity fairness, coverage) are defined by general functions of …
Tags:
Machine Learning, Classification