Since the unveiling of ChatGPT in 2022, executives have flocked to add GenAI to their portfolios, seeking novel ways to revolutionize their most complex and labor-intensive processes using the troves of proprietary data their organizations have collected. Why? GenAI has shown tremendous potential to improve productivity and reduce costs, and it has demonstrated an ability to excel at diverse applications such as translation, classification, creative writing, and code generation: capabilities that previously demanded specialized, task-specific models developed by expert engineers using domain-specific data.1
According to a research paper published by scientists and engineers from OpenAI and the University of Pennsylvania, knowledge workers who perform routine and repetitive tasks are at higher risk of technology-driven displacement, a phenomenon known as routine-biased technological change.1 The paper found that automation technologies have increased wage inequality in the US, driven by relative wage declines for workers who handle routine tasks.1
So, how does one upskill in these new automation technologies, and specifically in GenAI?
This blog series intends to lower the threshold for acquiring foundational GenAI skills for a diverse audience, including engineers, architects, data scientists, and business leaders across industry verticals. The primary goal of the series is to encourage sensible, tangible, practical experimentation with GenAI.
This blog series will help teams adapt their platforms to new AI use cases. It will encourage business leaders to explore the realm of machine learning and to consider how GenAI can be used to understand, analyze, and solve critical business problems and streamline processes. It will also help engineers identify which skills to learn to stay competitive in a new market.
What is GenAI? Let us look at a few definitions first.
Generative AI (GenAI)
Generative AI is a subfield of AI that is trained on a vast array of heterogeneous data. It learns the underlying patterns and distributions within that data and uses them to answer questions (usually in response to user prompts) and to generate content in the form of text, visual imagery, or video. GenAI can be considered a subdomain of deep learning (DL), itself a subfield of machine learning (ML), building on the foundation of both and combining principles from each (Fig 1). ChatGPT (Chat Generative Pre-trained Transformer), released by OpenAI in November 2022, is a web application that exposes GenAI in its simplest form.
Figure 1: Relationship between AI, Deep Learning, Machine Learning, and GenAI
GenAI enables machines to discern user intent, identify patterns within data (pattern recognition), replicate existing patterns (pattern assembly) based on the user’s instructions, and generate results that complete the task. The results span a broad spectrum, from creating new knowledge and generating original creative content to producing more complex outputs such as design patterns and code, as mentioned in the previous edition of this blog series.2
As organizations turn their AI ambitions into reality and shift their GenAI applications into production, they must focus on building the technology stack to support them. The first step will be determining which base model or models to build on. This assumes that organizations already have robust data lakehouse architecture solutions or that a unified analytics platform (UAP) has been implemented to enable the seamless querying of existing internal proprietary data.
A research study conducted by Redis in 2024 found that 67% of the organizations it surveyed have begun building generative AI applications with third-party closed-source models such as OpenAI’s. And many organizations have aspirations to incorporate other types of models as well.3
Large Foundation Models (LFMs)
The term was coined by researchers at Stanford University in 2021. An LFM (sometimes known as a base model) is a type of GenAI model that has been pre-trained on a broad spectrum of diverse, unlabeled data, enabling it to grasp general patterns and relationships within the data and to perform a medley of general tasks, such as understanding language; generating text, images, audio, and video; and conversing in natural language. LFMs offer immense versatility because they can be adapted to a variety of specific tasks and a wide range of specialized downstream applications.
A considerable number of prominent foundation models have been released since 2018. ChatGPT is one example; others include BERT (language understanding), Midjourney and Stable Diffusion (generating images), Claude 3 Haiku (moderating content, optimizing inventory management), Synthesia (generating video from text), and Cresta (providing real-time coaching of call center agents). Makers of existing products, such as Adobe, Shopify, Canva, and Autodesk, are also embedding generative AI features in their offerings.
LFMs are poised to change the analytics product life cycle (APLC™) significantly.4 Rather than training unique ML models from scratch, DS/ML engineers can use pre-trained LFMs as a jumping-off point for new AI applications and customize them for their specific needs. Thanks to their sample efficiency, LFMs lower the threshold to prototyping, letting DS/ML engineers build AI applications more quickly and cost-effectively.
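As a toy illustration of this jumping-off-point pattern (hypothetical code, not a real LFM), the sketch below freezes a stand-in “pre-trained” feature extractor and trains only a small task-specific head on top, which is the same shape of workflow used when adapting a base model to a new task:

```python
# Toy sketch (hypothetical): reuse a frozen "pre-trained" feature extractor
# and train only a small task-specific head on top, instead of training
# the whole model from scratch.

def pretrained_features(x):
    """Stand-in for a frozen base model: maps raw input to features."""
    return [x, x * x]  # fixed; never updated during adaptation

def train_head(data, lr=0.05, epochs=200):
    """Fit a linear head w over the frozen features via simple SGD."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            f = pretrained_features(x)
            pred = w[0] * f[0] + w[1] * f[1]
            err = pred - y
            w = [w[i] - lr * err * f[i] for i in range(2)]
    return w

# Downstream task: y = 2x + 3x^2, representable with the frozen features.
data = [(x / 4, 2 * (x / 4) + 3 * (x / 4) ** 2) for x in range(-4, 5)]
w = train_head(data)  # converges toward [2, 3]
```

Because the base features are reused unchanged, only two parameters are learned here; this is, in miniature, why adapting a pre-trained model needs far less data and compute than training from scratch.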
From a technological point of view, foundation models are not new. They are based on deep neural networks and self-supervised learning, both of which have existed for decades. LFMs use self-supervised learning in the pre-training step to create labels from the input data itself.5 Self-supervised learning goes hand in hand with transfer learning, in which a model takes the “knowledge” learned from one task (e.g., object recognition in images) and applies it to another (e.g., activity recognition in videos). Self-supervised learning derives implicit labels from unstructured data, exploiting the meaningful underlying structure and representations of unlabeled data. No supervisory signals in the form of labeled data or annotations are provided during training; instead, the models are trained on a surrogate task (often merely as a means to an end) and are then adapted to the downstream task of interest via fine-tuning.6 Their general-purpose nature separates LFMs from traditional ML architectures, which perform specific functions, such as analyzing text for sentiment, classifying images, or forecasting trends using supervised or unsupervised learning.7
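To make the idea of implicit labels concrete, here is a minimal, hypothetical Python sketch (not drawn from the references) that turns raw unlabeled text into (context, next-word) training pairs for a surrogate next-word-prediction task, with no human annotation involved:

```python
# Toy sketch: deriving implicit (surrogate) labels from unlabeled text.
# In self-supervised pre-training, the "label" at each position is simply
# the next token in the raw data, so no human annotation is needed.

def next_word_pairs(text: str, context_size: int = 3):
    """Turn unlabeled text into (context, target) training pairs."""
    tokens = text.split()
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]
        target = tokens[i]  # the implicit label, taken from the data itself
        pairs.append((context, target))
    return pairs

corpus = "foundation models learn patterns from large unlabeled corpora"
pairs = next_word_pairs(corpus)
# first pair: (['foundation', 'models', 'learn'], 'patterns')
```

Real LFM pre-training works on billions of such automatically derived examples; the surrogate prediction task is a means to an end, and the resulting model is then fine-tuned for the downstream task of interest.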
One notable characteristic of LFMs is their scale: their architectures contain millions or even billions of parameters. This extensive scale enables them to capture complex patterns and relationships within the data, contributing to their impressive performance across various tasks (Fig 2). We will dive deeper into how this works in future blog posts.
Figure 2: Large Foundation Models (LFMs) centralizing information from data of various modalities
Another distinctive feature of foundation models is their adaptability: based on input prompts, a single model can perform many disparate tasks with high accuracy, including natural language processing (NLP), question answering, and image classification.
In this edition, we described LFMs, their ability to centralize data from various modalities, and how a single model can be adapted to numerous downstream tasks. In future editions of this blog series, we will look at tech-stack options and the off-the-shelf large language models (LLMs) we can leverage. Subsequently, we will cover transformers, RAG, and other critical technology concepts.
Embracing GenAI in business means being open to radical change, questioning existing business processes without fear of disrupting the status quo, and being dauntless in throwing out the rulebook and starting anew to achieve better business outcomes. Trailblazers, innovators, and those who are curious and on the lookout for technological developments that lie around the corner will reap the greatest benefit from GenAI. AI will not replace the role of humans in critical functions, but those incapable of embracing AI technologies will find themselves at a disadvantage, unable to partner and collaborate with AI practitioners within their organizations and beyond.
References:
1. https://arxiv.org/pdf/2303.10130
2. https://dataworksai.com/embracing-genai-for-business-success-part-i/
3. https://wp.technologyreview.com/wp-content/uploads/2024/11/MITTR_Redis_final_26nov24.pdf
4. Hema Seshadri, Ph.D., Analytics for Business Success: A Guide to Analytics Fitness™. https://a.co/d/e8haiUR
5. Transformers to Improve Memory, a Paradigm Shift in AI? SINGULARITY 2030. https://singularity2030.ch/transformers-to-improve-memory-a-paradigm-shift-in-ai/
6. AI Business Models. FourWeekMBA. https://fourweekmba.com/ai-business-models/
7. What Are Foundation Models? AWS. https://aws.amazon.com/what-is/foundation-models/