AI & Analytics

How Vision Language Models Are Trained from “Scratch”

Towards Data Science (Medium)

Summary

Modern vision language models are effectively trained from scratch on large collections of paired image and text data, reshaping how AI systems process and interpret images.

Training of Vision Language Models

A recent article explains how vision language models such as CLIP and DALL-E are trained on extensive datasets of images paired with descriptive text. This methodology enables developers to build models that can not only generate images but also genuinely understand what they depict: contrastive models like CLIP learn to align image and text representations, while generative models like DALL-E learn to produce images from text prompts. Training from scratch at this scale requires innovative approaches to ensure that the models accurately capture the relationship between images and text.
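The contrastive alignment idea behind CLIP-style training can be sketched with a toy loss function. The snippet below is a minimal illustration, not the article's implementation: it computes a symmetric InfoNCE-style loss over a batch of image/text embeddings, where matching pairs sit on the diagonal of a cosine-similarity matrix. The function name, batch size, and temperature value are assumptions chosen for illustration.

```python
import numpy as np

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of image/text embedding
    pairs, in the spirit of CLIP. Each image's matching caption is the
    embedding at the same row index, i.e. the diagonal of the
    similarity matrix."""
    # L2-normalize so the dot product is cosine similarity
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (batch, batch) similarity scores

    def cross_entropy_diag(logits):
        # Row-wise softmax cross-entropy with the diagonal as the target
        shifted = logits - logits.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the image-to-text and text-to-image directions
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))

# Toy batch: text embeddings that nearly match their images give a low
# loss; unrelated random text embeddings give a higher one.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(4, 8))
matched_loss = clip_style_loss(image_emb, image_emb + 0.01 * rng.normal(size=(4, 8)))
random_loss = clip_style_loss(image_emb, rng.normal(size=(4, 8)))
```

Minimizing this loss pushes each image embedding toward its own caption and away from every other caption in the batch, which is what gives CLIP-style models their joint image-text understanding.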

Implications for the BI Market

Developments in vision language models are crucial for BI professionals, especially in sectors where visual data analysis is becoming increasingly important. Major vendors such as Google and Microsoft are also working on technologies that integrate visual and textual data for advanced analytics. This aligns with the broader trend of AI integration into business intelligence toolsets, enabling companies to gain insights from their data more rapidly and efficiently.

What BI Professionals Should Do

BI professionals need to prepare for the integration of vision language models in their workflows. This entails exploring how these models can be applied in data analysis and reporting, as well as being ready to embrace new tools and technologies emerging from these advancements.

Read the full article