Multimodal Contrastive Learning
Training method that aligns representations across modalities, such as images and text, by pulling the embeddings of paired examples together and pushing the embeddings of unpaired examples apart.
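The following is a minimal sketch of a symmetric contrastive objective of the kind used in CLIP-style training, written in plain Python for illustration only. It assumes the image and text encoders have already produced non-zero embedding vectors, and that row k of each batch belongs to the same pair. The function names and the temperature value of 0.07 are illustrative choices, not taken from this page.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    # Assumes v is not the zero vector.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    # Numerically stable log-softmax over one row of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    # image_embs[k] and text_embs[k] are assumed to form a matched pair.
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)

    # logits[i][j]: scaled cosine similarity between image i and text j.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]

    # Image-to-text direction: each image should select its own text.
    loss_i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n

    # Text-to-image direction: each text should select its own image.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n

    # Average the two directions.
    return (loss_i2t + loss_t2i) / 2

# Toy batch: two pairs whose embeddings already agree.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
# Same batch with the texts swapped, so every pair is mismatched.
swapped = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when paired embeddings agree and large when the pairs are swapped, which is the signal that pushes the two encoders toward a shared embedding space.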
Core metadata
- ID: multimodal_contrastive_learning
- Era: Modern
- First known date: 2021 (year)
- Region: United States / OpenAI CLIP
- Review status: source_checked
- Maturity: established
Prerequisites
- Deep Learning Neural Networks (deep_learning_neural_networks)
- Large Language Models (large_language_models)
Dependents
- None.
Fields
Field lanes
- Artificial Intelligence & Machine Learning: Foundation Models
Node sources
- CLIP: Connecting Text and Images (OpenAI, 2021, generic_overview) • Supports: node, maturity
- Learning Transferable Visual Models From Natural Language Supervision (arXiv, 2021, primary_paper) • Supports: node
Prerequisite edge evidence
Edge/source evidence summary:
- Prerequisite edges: 2
- Average edge confidence: 68%
- Prerequisite sources: 2
- expert_inference: 2
| Prerequisite | Type | Confidence | Evidence level | Note | Sources |
|---|---|---|---|---|---|
| Deep Learning Neural Networks (deep_learning_neural_networks) | enabling | 68% | expert_inference | Deep Learning Neural Networks provides a capability that enables this technology without being the only possible path. | |
| Large Language Models (large_language_models) | enabling | 68% | expert_inference | Large Language Models provides a capability that enables this technology without being the only possible path. | |