Enhancing Rare Class Performance in HOI Detection with Re-Splitting and a Fair Test Dataset

  • Gyubin Park
  • Afaque Manzoor Soomro

Research output: Contribution to journal › Article › peer-review

Abstract

In Human–Object Interaction (HOI) detection, class imbalance severely limits model performance on infrequent interaction categories. To address this problem, a Re-Splitting algorithm has been developed. The algorithm applies DreamSim-based clustering and k-means-based partitioning to restructure the train–test splits. In doing so, it balances rare and frequent interaction classes, thereby increasing robustness. A Real-World test dataset has also been introduced. This dataset is comparable to a truly independent benchmark and is designed to address the class distribution bias commonly present in traditional test sets. As shown in the Experiment and Evaluation subsection, a high level of performance can be achieved in the general case using different few-shot and rare-class training instances. Models trained solely on the re-split dataset show significant improvements in rare-class mAP, particularly one-stage models. Evaluation on the Real-World test dataset further highlights previously overlooked aspects of model performance and supports fair dataset structuring. The methods are validated with extensive experiments using five one-stage and two two-stage models. Our analysis shows that reshaping dataset distributions improves rare-class detection by as much as 8.0 mAP. This study paves the way for balanced training and evaluation, leading toward a general framework for scalable, fair, and generalizable HOI detection.
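
The general idea of cluster-aware re-splitting can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that an embedding per sample has already been computed (the abstract names DreamSim; here small hand-written vectors stand in for them), uses a plain standard-library k-means, and assumes that samples are grouped by interaction class and cluster before being divided. The names `kmeans`, `resplit`, `test_fraction`, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns one cluster id per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def resplit(embeddings, hoi_classes, k=3, test_fraction=0.25, seed=0):
    """Split sample indices per (class, cluster) group (an assumed grouping)."""
    rng = random.Random(seed)
    clusters = kmeans(embeddings, k, seed=seed)
    groups = defaultdict(list)
    for idx, (cls, cluster) in enumerate(zip(hoi_classes, clusters)):
        groups[(cls, cluster)].append(idx)

    train, test = [], []
    for key in sorted(groups):
        members = groups[key]
        rng.shuffle(members)
        if len(members) > 1:
            # Keep at least one sample on each side of the split.
            n_test = int(round(len(members) * test_fraction))
            n_test = max(1, min(n_test, len(members) - 1))
        else:
            # A singleton group goes to training so no class is test-only.
            n_test = 0
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


# Toy stand-ins for precomputed image embeddings and HOI class labels.
embeddings = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0], [0.2, 0.1], [5.2, 5.0]]
classes = ["ride bicycle"] * 4 + ["wash cat"] * 2
train_idx, test_idx = resplit(embeddings, classes, k=2)
```

In this sketch every interaction class keeps at least one training sample, which is one simple way to avoid rare classes that appear only at test time. Choosing `k` (for example with the Elbow Method listed in the keywords) is left out.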

Original language: English
Article number: 474
Journal: Information (Switzerland)
Volume: 16
Issue number: 6
State: Published - Jun 2025

Keywords

  • DreamSim
  • Elbow Method
  • HOI
  • dataset Re-splitting
