Embracing GenAI for Business Success- Part VIII

Hema Seshadri, Ph.D.

The generative AI (GenAI) revolution is here. Organizations of all sizes and across industries are eagerly embracing this transformative technology and tailoring it to their needs. GenAI raises the bar on how businesses think, plan, and operate. It can boost operational efficiency and fundamentally reinvent business processes across industries, functions, and jobs, delivering novel customer and employee experiences and generating new revenue streams. 

Adopting GenAI in enterprises requires a deep understanding of a technology evolving at an unprecedented, dizzying pace. While the onus of responsibility for developing expertise and keeping abreast of GenAI technology falls on the knowledge workers and technologists dedicated to AI initiatives in an organization, business leaders and CEOs must be familiar with terms associated with this nascent technology to have productive conversations with their organization’s technology leaders. This blog series is intended to give you the vocabulary to do just that.

Business leaders grappling with how to put GenAI to work in their organizations are on the lookout for projects with low barriers to entry.1 Due to their limited technology requirements, search and retrieval projects provide an accessible platform for organizations to experiment and get off the ground. Companies can buy commercial, off-the-shelf, no-code AI solutions that can quickly be put to work by non-technical staff (citizen developers) using commands in everyday language instead of code. Using a “buy” approach enables organizations to use AI to quickly improve efficiency and productivity in routine business workflows and demonstrate the value of the technology, without the need for technologists with specialized engineering skills. 

Organizations with a technology-savvy, hands-on-keyboard software engineering employee population can supplement the “buy” approach with a build-and-boost pattern of GenAI implementation. With a build-and-boost approach, which involves rolling out infrastructure as a service (IaaS), organizations can fill the gaps that commercial tools cannot cover, develop GenAI solutions tailored to specific business needs, and simultaneously enable their employees to take the next step in their GenAI evolution. 

Below, we summarize the technology steps involved in leveraging the “build-and-boost” implementation pattern.

Using a “build-and-boost” implementation pattern with Large Foundational Model (LFM)-powered search and retrieval can fundamentally change how companies interact with their data. It can unlock insights, spark new ideas, and inform better decision-making, enhancing an organization’s operational efficiency and strategic planning.

We defined LFMs in earlier editions of this blog. To recap, LFMs are the engines that allow GenAI applications to extend their capabilities beyond those of pre-programmed machines. They are trained on large volumes of data.

LFMs can understand the semantic context of data and generate content in various formats, including audio, video, and text. Users can generate multi-modal outputs by inputting human text-based commands. LFM-based GenAI tools are invaluable in multiple applications, from content creation and design to coding and data analysis.

LFMs are designed with transfer learning capabilities (applying the knowledge acquired during pre-training to new, related tasks) and large architectures containing millions to billions of parameters. This knowledge transfer, combined with distributed computing and extensive scale, enables them to capture complex patterns and relationships within the data and enhances their adaptability: they can master new tasks quickly (i.e., they generalize) with relatively little additional data and time (Fig. 1).
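To build intuition for transfer learning, here is a deliberately tiny, hypothetical sketch (not a real LFM): a “pretrained” feature extractor is kept frozen, and only a small new head is trained on a handful of task-specific examples. All names and data below are invented for illustration.

```python
# Conceptual sketch of transfer learning (illustrative only): the heavy
# lifting -- learning useful features -- happened during "pretraining",
# so adapting to a new task needs only a few labeled examples.

PRETRAINED_FEATURES = {  # stand-in for knowledge learned at scale
    "refund": [1.0, 0.0], "broken": [1.0, 0.0],
    "thanks": [0.0, 1.0], "perfect": [0.0, 1.0],
}

def extract(text):
    """Frozen pretrained component: map text to a feature vector."""
    feats = [0.0, 0.0]
    for word in text.lower().split():
        for i, v in enumerate(PRETRAINED_FEATURES.get(word, [0.0, 0.0])):
            feats[i] += v
    return feats

def train_head(examples, epochs=10, lr=0.5):
    """Train only the small task head (a perceptron) on few examples."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:  # label: 1 = complaint, 0 = praise
            x = extract(text)
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = label - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

examples = [("i want a refund", 1), ("arrived broken", 1),
            ("thanks works perfect", 0)]
w, b = train_head(examples)
x = extract("this is broken refund please")
print(1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0)  # -> 1 (complaint)
```

The new task is mastered from just three examples because the features were already learned; only the thin task-specific layer had to adapt.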

Figure 1: Evolution of AI from task-specific to generalized models2

Large language models (LLMs), a subset of LFMs, are statistical models used to predict words in a sequence of natural language.2,3 Fully understanding the workings of LLMs requires grasping the underlying mathematical principles that power these systems. We can, however, simplify the core elements to provide an intuitive understanding of how these models operate. Ensuring the accuracy and reliability of LLMs is paramount, particularly within a business context. 
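As a toy illustration of the “statistical model that predicts the next word” idea, the following sketch trains a bigram model, which predicts the next word purely from counts of adjacent word pairs. Real LLMs use neural networks over far longer contexts, but the core objective is the same; the corpus and names here are invented.

```python
# A toy next-word predictor (not a real LLM): count which words follow
# each word in the training text, then predict the most frequent one.
from collections import Counter, defaultdict

def train_bigram(corpus):
    """For each word, count the words observed immediately after it."""
    words = corpus.lower().split()
    following = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        following[prev][nxt] += 1
    return following

def predict_next(model, word):
    """Return the most frequent word seen after `word` in training."""
    return model[word.lower()].most_common(1)[0][0]

corpus = ("the model predicts the next word . "
          "the model learns patterns from data . "
          "the model predicts the next token .")
model = train_bigram(corpus)
print(predict_next(model, "the"))  # -> "model" (its most common follower)
```

An LLM does the same thing in spirit, but with learned weights instead of raw counts, probabilities over an entire vocabulary, and conditioning on thousands of preceding tokens rather than one.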

A significant part of achieving this accuracy and reliability lies in the LLM model-building process, which comprises two core steps (Fig. 2): 

  • Training (pre-training)
  • Fine-tuning

In the training step, also referred to as pre-training, the LLM learns from the vast, curated trove of data provided and acquires a broad understanding of language. 

Figure 2: LLM model-building process

Below, we explore how the ML model-building process differs from LLM model-building and offer some simple definitions of commonly used terms in model-building.

In contrast to the two-step process for building an LLM model, machine learning (ML) model-building involves a single, task-specific training step (Fig. 3).

Figure 3: Traditional ML model-building process
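To make the contrast concrete, here is a minimal, hypothetical example of the traditional ML pattern: a nearest-centroid sentiment classifier trained in a single, task-specific step on a handful of labeled examples. Unlike an LLM, it serves exactly one task.

```python
# Sketch of single-step, task-specific ML training: compute one
# centroid (average feature vector) per label, then classify new text
# by its nearest centroid. Vocabulary and data are invented.

def featurize(text, vocab):
    """Represent text as counts of the vocabulary words it contains."""
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def train(examples, vocab):
    """The single training step: one centroid per label."""
    centroids = {}
    for label in {lbl for _, lbl in examples}:
        vecs = [featurize(t, vocab) for t, l in examples if l == label]
        centroids[label] = [sum(col) / len(vecs) for col in zip(*vecs)]
    return centroids

def predict(centroids, text, vocab):
    """Assign the label whose centroid is closest to the text."""
    v = featurize(text, vocab)
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(v, c))
    return min(centroids, key=lambda lbl: sq_dist(centroids[lbl]))

vocab = ["great", "love", "terrible", "awful"]
examples = [("great product love it", "pos"),
            ("love this great service", "pos"),
            ("terrible and awful", "neg"),
            ("awful experience terrible", "neg")]
centroids = train(examples, vocab)
print(predict(centroids, "what a great experience", vocab))  # -> pos
```

This model was trained once, for sentiment only; asking it to translate or summarize is meaningless, which is exactly the versatility gap the next section describes.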

LLMs also differ from ML algorithms in two other ways:

  • They are trained on much larger amounts of data, which means that training an LLM from scratch is very costly. Companies such as OpenAI and Hugging Face offer pre-trained models, also called base models, from which to choose (e.g., GPT-3.5).
  • They are more versatile. The same LLM text generation model can be used for summarization, translation, classification, and so forth, whereas ML models are usually trained and used for a specific task.

These two differences combine to shift the job of the data scientist/software/ML/AI/analytics engineer: they spend more time figuring out how to make an existing LLM work for their use case than building a model from scratch. 

Now that we have a basic understanding of the contrasting training steps in ML and LLM, let’s explore the pretraining step in LLM model building.

For LLMs, pre-training involves training a model for text completion. Pre-training is often by far the most resource-intensive part of the model-building process; for the InstructGPT model, it accounts for up to 98% of the overall compute and data resources. Pre-training also takes a long time, and a small mistake during this step can incur significant financial loss and set a project back considerably. 

Fine-tuning uses a pre-trained model (base model), such as OpenAI’s GPT series, as a foundation. The process involves further training on a smaller, domain-specific dataset. This approach builds on the knowledge the model acquired during pre-training, adapting it for specific tasks and honing its capabilities to analyze, recognize, generate, predict, create, and provide accurate and reliable outputs for specialized applications based on given inputs (Fig. 2). The fine-tuning step requires less data and fewer computational resources than the pre-training step. Fine-tuning transfers the pre-trained model’s learned patterns and features to new tasks, improving performance and reducing training data needs. It has become popular in NLP for tasks like text classification, sentiment analysis, and question-answering.7

Fine-tuning requires updating the model weights generated in the pre-training step: you adapt the model by changing the model itself. Fine-tuning techniques are more complicated and require more data than simply prompting a base model, but they can significantly improve your model’s quality and reduce its latency and cost. Many things aren’t possible without changing model weights, such as adapting the model to a new task it wasn’t exposed to during training.

For example, when you ask ChatGPT a question, the steps it takes to produce a response are called inference; the model’s weights are read but not changed. Fine-tuning, by contrast, means continuing to train a previously trained model, starting from the weights obtained in the earlier training process. Fine-tuning is crucial in ensuring that the LLM is optimized for the specific tasks it needs to perform, thereby enhancing the accuracy and reliability of its outputs. 
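The distinction can be sketched with a toy linear model (an illustrative stand-in, not a real LLM): inference runs the model forward and leaves its weights untouched, while fine-tuning continues training from the pre-trained weights, so the weights change.

```python
# Inference vs. fine-tuning on a toy one-weight "model". Real LLMs have
# billions of weights, but the distinction is identical: inference only
# reads the weights; fine-tuning updates them.

weights = {"w": 2.0, "b": 1.0}  # stand-in for pre-trained weights

def inference(model, x):
    """Forward pass only -- the model is read, never modified."""
    return model["w"] * x + model["b"]

def fine_tune(model, data, lr=0.1, epochs=200):
    """Continue training from the pre-trained weights on new data."""
    for _ in range(epochs):
        for x, y in data:
            pred = model["w"] * x + model["b"]
            err = pred - y
            model["w"] -= lr * err * x  # the weights change here
            model["b"] -= lr * err
    return model

print(inference(weights, 3.0))          # pre-trained answer: 7.0
domain_data = [(1.0, 5.0), (2.0, 8.0)]  # new domain follows y = 3x + 2
fine_tune(weights, domain_data)
print(round(inference(weights, 3.0)))   # after fine-tuning: 11
```

The first call to `inference` leaves the model exactly as it was; only `fine_tune` changes the weights, which is why it is the costlier, riskier operation.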

Choosing the right LLM dramatically improves the capabilities of your generative AI applications. There is no one-size-fits-all approach to selecting an LLM. Balancing price and performance requires collaboration between business leaders and technical leads. In future posts in this blog series, we will cover the accuracy, cost, and performance implications of LLMs.

In this post, we explored the model-building components of the “build-and-boost” approach and the different steps involved in model training in traditional ML vs. LLM. We will examine how a user interacts with the LLM model in future posts in this blog series.

Embracing GenAI in business means being open to radical change, questioning existing business processes without fear of disrupting the status quo, and being dauntless in throwing out the rulebook and starting anew to achieve better business outcomes. Trailblazers, innovators, and those who are curious and on the lookout for technological developments that lie around the corner will reap the greatest benefit from GenAI. AI will not replace the role of humans in critical functions, but those incapable of embracing AI technologies will find themselves at a disadvantage, unable to partner and collaborate with AI practitioners within their organizations and beyond.

References:

  1. https://hbr.org/2024/09/embracing-gen-ai-at-work
  2. https://learning.oreilly.com/library/view/building-llm-powered/9781835462317/Text/Chapter_01.xhtml#_idParaDest-16
  3. https://learning.oreilly.com/library/view/generative-ai-with/9781835083468/Text/Chapter_1.xhtml#_idParaDest-18
  4. Generative AI with LangChain. https://www.packtpub.com/product/generative-ai-with-langchain/9781835083468
  5. Analytics for Business Success, A Guide To Analytics FitnessTM, Hema Seshadri, Ph.D. https://a.co/d/e8haiUR
  6. https://learning.oreilly.com/library/view/learning-langchain/9781098167271/preface02.html
  7. https://arxiv.org/html/2408.13296v1#Ch1.S1
