scikit-feature: Open-Source Feature Selection Repository in Python

scikit-feature is an open-source feature selection repository in Python, with around 40 popular algorithms from feature selection research. It is developed by the Data Mining and Machine Learning Lab at Arizona State University.



By Jundong Li, ASU.

scikit-feature is an open-source feature selection repository in Python developed at Arizona State University. It is built on the widely used machine learning package scikit-learn and two scientific computing packages, NumPy and SciPy. scikit-feature contains around 40 popular feature selection algorithms, including traditional feature selection algorithms as well as some structural and streaming feature selection algorithms.


It serves as a platform for feature selection application, research, and comparative study. It is designed to share widely used algorithms from feature selection research, and to make it convenient for researchers and practitioners to run empirical evaluations when developing new feature selection algorithms.

Currently, scikit-feature consists of both supervised and unsupervised feature selection algorithms in the following categories:

  • Similarity based feature selection
  • Information theoretical based feature selection
  • Sparse learning based feature selection
  • Statistical based feature selection
  • Wrapper based feature selection
  • Structural feature selection
  • Streaming feature selection

In addition, scikit-feature provides many benchmark feature selection datasets, along with examples showing how to evaluate feature selection algorithms via classification or clustering tasks.
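To make the "select, then evaluate via classification" workflow concrete, here is a minimal sketch that uses only NumPy. It does not use scikit-feature's own API. The function names (fisher_score, nearest_centroid_accuracy), the synthetic dataset, and the nearest-centroid classifier are all assumptions chosen for illustration. The Fisher score stands in for a similarity based method, and test accuracy on the selected features stands in for the evaluation step.

```python
import numpy as np


def fisher_score(X, y):
    """Return one Fisher score per feature (higher means more discriminative).

    Illustrative helper, not scikit-feature's implementation.
    """
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        # Between-class scatter: class size times squared mean offset.
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        # Within-class scatter: class size times class variance.
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a nearest-centroid classifier (a stand-in for any classifier)."""
    labels = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in labels])
    # Squared Euclidean distance from every test sample to every centroid.
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predictions = labels[dists.argmin(axis=1)]
    return float((predictions == y_test).mean())


# Synthetic data (an assumption): 2 informative features among 20.
rng = np.random.default_rng(0)
n_samples = 200
y = rng.integers(0, 2, size=n_samples)
X = rng.normal(size=(n_samples, 20))
X[:, 0] += 4.0 * y
X[:, 1] -= 4.0 * y

train, test = np.arange(0, 150), np.arange(150, 200)

# Rank features on the training split only, then keep the top two.
scores = fisher_score(X[train], y[train])
top_k = np.argsort(scores)[::-1][:2]

acc_selected = nearest_centroid_accuracy(
    X[train][:, top_k], y[train], X[test][:, top_k], y[test]
)
acc_all = nearest_centroid_accuracy(X[train], y[train], X[test], y[test])

print("selected features:", sorted(top_k.tolist()))
print("accuracy with selected features:", acc_selected)
print("accuracy with all features:", acc_all)
```

Ranking the features on the training split alone keeps the test split untouched, so the reported accuracy is not inflated by the selection step. In practice, a benchmark dataset would replace the synthetic data and a standard classifier would replace the nearest-centroid stand-in.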

To download scikit-feature and find additional information, please visit its website: http://featureselection.asu.edu/

If you need further information, please contact Jundong Li at Arizona State University (firstname.lastname@asu.edu).

Bio: Jundong Li is a PhD student in computer science and engineering at Arizona State University. His research interests include data mining, machine learning, and their applications in social media.
