Gini support vector machine: Quadratic entropy based robust multi-class probability regression

Shantanu Chakrabartty, Gert Cauwenberghs

Research output: Contribution to journal › Article › peer-review

24 Scopus citations


Many classification tasks require estimation of output class probabilities for use as confidence scores or for inference integrated with other models. Probability estimates derived from large-margin classifiers such as support vector machines (SVMs) are often unreliable. We extend SVM large-margin classification to GiniSVM maximum-entropy multi-class probability regression. GiniSVM combines a quadratic (Gini-Simpson) entropy-based agnostic model with a kernel-based similarity model. A form of Huber loss in the GiniSVM primal formulation elucidates a connection to robust estimation, further corroborated by the impulsive-noise filtering property of the reverse water-filling procedure used to arrive at normalized classification margins. The GiniSVM normalized classification margins directly provide estimates of class-conditional probabilities, approximating kernel logistic regression (KLR) at reduced computational cost. As with other SVMs, GiniSVM produces a sparse kernel expansion and is trained by solving a quadratic program under linear constraints. GiniSVM training is efficiently implemented by sequential minimum optimization or by growth transformation on probability functions. Results on synthetic and benchmark data, including speaker verification and face detection data, show improved classification performance and increased tolerance to imprecision over soft-margin SVM and KLR.
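The reverse water-filling step mentioned in the abstract can be viewed as a Euclidean projection of the raw margin scores onto the probability simplex: scores are uniformly shifted down by a water level and clipped at zero, so that the surviving masses sum to one. The sketch below is an illustrative implementation of that projection under this interpretation (the function name and interface are assumptions, not the paper's code); it shows the impulsive-noise filtering behavior, since a single dominant score absorbs all probability mass and noisy low scores are clipped to exactly zero.

```python
import numpy as np

def reverse_waterfill(scores):
    """Map raw per-class margin scores to normalized class probabilities
    via Euclidean projection onto the probability simplex.

    Finds a water level theta such that sum_i max(scores_i - theta, 0) = 1,
    then returns the clipped, shifted scores as probabilities.
    """
    scores = np.asarray(scores, dtype=float)
    s = np.sort(scores)[::-1]                  # scores in descending order
    cssv = np.cumsum(s) - 1.0                  # cumulative sums minus target mass
    # Largest index rho whose score stays above the candidate water level
    rho = np.nonzero(s * np.arange(1, len(s) + 1) > cssv)[0][-1]
    theta = cssv[rho] / (rho + 1.0)            # water level over the support
    return np.maximum(scores - theta, 0.0)     # clip: sparse probability vector

# A dominant (impulsive) score takes all the mass; small scores are zeroed.
p = reverse_waterfill([10.0, 0.1, 0.2])
print(p)            # a valid probability vector summing to 1
```

Note how the clipping at zero yields sparse probability vectors, mirroring the sparse kernel expansions that GiniSVM shares with other SVMs.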

Original language: English
Pages (from-to): 813-839
Number of pages: 27
Journal: Journal of Machine Learning Research
State: Published - Apr 2007


  • Gini index
  • Growth transformation
  • Kernel regression
  • Large margin classifiers
  • Probabilistic models
  • Quadratic entropy
  • Support vector machines


