Contrastive learning: Big Data Foundations and Applications

Sandhya Tripathi, Christopher Ryan King

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


Contrastive learning (CL) has exploded in popularity due to its ability to learn effective representations from vast quantities of unlabelled data across multiple domains. CL underlies some of the most impressive applications of generative AI for the general public. We will review the fundamentals of, and applied work on, contrastive learning representations, focusing on three main topics: 1) CL in supervised, unsupervised, and self-supervised setups and its revival in AI research as an instance discriminator. In this part, we will focus on the nuts and bolts, such as different augmentation techniques, loss functions, performance evaluation metrics, and some theoretical understanding of the contrastive loss. We will also present the methods supporting DALL·E 2, a popular generative AI. 2) Learning contrastive representations across vision, text, time series, tabular data, and knowledge graph modalities. Specifically, we will present literature representative of solution approaches involving new augmentation techniques, modifications to the loss function, and the use of additional information. The first two parts will also include a small hands-on session on the applications shown and some of the methods learned. 3) Discussing the various theoretical and empirical claims for CL's success, including the role of negative examples. We will also present some work that challenges the shared information assumption of CL and propose alternative explanations. Finally, we will conclude with some future directions and applications for CL.

Original language: English
Title of host publication: CODS-COMAD 2024 - Proceedings of 7th Joint International Conference on Data Science and Management of Data
Publisher: Association for Computing Machinery
Number of pages: 5
ISBN (Electronic): 9798400716348
State: Published - Jan 4 2024
Event: 7th Joint International Conference on Data Science and Management of Data, CODS-COMAD 2024 - Bangalore, India
Duration: Jan 4 2024 to Jan 7 2024

Publication series

Name: ACM International Conference Proceeding Series




  • augmentations
  • clustering
  • contrastive learning
  • distillation
  • graphs
  • multi-modal
  • multi-view
  • noise estimation loss
  • tabular datasets
  • time-series

