Overview
Generative AI is a kind of artificial intelligence technology that relies on deep learning models trained on large data sets to create new content. Generative AI models, which are used to generate new data, stand in contrast to discriminative AI models, which are used to classify data by distinguishing between categories. People today are using generative AI applications to produce writing, pictures, code, and more. Common use cases for generative AI include chatbots, image creation and editing, software code assistance, and scientific research.
People are putting generative AI to use in professional settings to quickly visualize creative ideas and efficiently handle tedious, time-consuming tasks. In emerging areas such as medical research and product design, generative AI holds the promise of helping professionals do their jobs better and significantly improving lives. AI also introduces new risks that users should understand and work to mitigate.
Some of the well-known generative AI apps to emerge in recent years include ChatGPT and DALL-E from OpenAI, GitHub Copilot, Microsoft’s Bing Chat, Google’s Bard, Midjourney, Stable Diffusion, and Adobe Firefly. Red Hat partnered with IBM to create Red Hat® Ansible® Lightspeed with IBM watsonx Code Assistant—a generative AI service that helps developers create Ansible content more efficiently. Many other organizations are experimenting with their own generative AI systems to automate routine tasks and improve efficiency.
How does generative AI work?
If you’ve enjoyed a surprisingly coherent conversation with ChatGPT, or watched Midjourney render a realistic picture from a description you just made up, you know generative AI can feel like magic. What makes this sorcery possible?
Beneath the AI apps you use, deep learning models are recreating patterns they’ve learned from a vast amount of training data. Then they work within human-constructed parameters to make something new based on what they’ve learned.
Deep learning models do not store a copy of their training data, but rather an encoded version of it, with similar data points arranged close together. This representation can then be decoded to construct new, original data with similar characteristics.
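To make the idea of an encoded representation more concrete, here is a minimal sketch in Python. The vectors below are invented for illustration rather than learned by any model, but they show how similar items sit close together in an encoded space while unrelated items sit farther apart:

```python
# Illustrative only: toy, hand-made embedding vectors. A real model learns
# these vectors from training data; the numbers here are invented.
import numpy as np

embeddings = {
    "cat":    np.array([0.90, 0.80, 0.10]),
    "kitten": np.array([0.85, 0.75, 0.20]),
    "truck":  np.array([0.10, 0.20, 0.95]),
}

def cosine_similarity(a, b):
    # Higher values mean the two items sit closer together in the encoded space.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["cat"], embeddings["kitten"]))  # close to 1.0
print(cosine_similarity(embeddings["cat"], embeddings["truck"]))   # much lower
```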
Building a custom generative AI app requires a model, as well as adjustments such as human-supervised fine-tuning or a layer of data specific to a use case.
Most of today’s popular generative AI apps respond to user prompts. Describe what you want in natural language and the app returns whatever you asked for—like magic.
What are some use cases for generative AI?
Generative AI’s breakthroughs in writing and images have captured news headlines and people’s imaginations. Here are a few of the early use cases for this rapidly advancing technology.
Writing. Even before ChatGPT captured headlines (and began writing its own), generative AI systems were good at mimicking human writing. Language translation tools were among the first use cases for generative AI models. Current generative AI tools can respond to prompts for high-quality content creation on practically any topic. These tools can also adapt their output to different lengths and writing styles.
Image generation. Generative AI image tools can synthesize high-quality pictures in response to prompts for countless subjects and styles. Some AI tools, such as Generative Fill in Adobe Photoshop, can add new elements to existing works.
Speech and music generation. Using written text and sample audio of a person’s voice, AI vocal tools can create narration or singing that mimics the sound of a real human voice. Other tools can create artificial music from prompts or samples.
Video generation. New services are experimenting with various generative AI techniques to create motion graphics. For example, some can match audio to a still image and animate the subject’s mouth and facial expressions so the person appears to speak.
Code generation and completion. Some generative AI tools can take a written prompt and generate computer code to assist software developers.
Data augmentation. Generative AI can create large amounts of synthetic data when using real data is impractical or undesirable. For example, synthetic data can be useful if you want to train a model to understand healthcare data without including any personally identifiable information. It can also be used to stretch a small or incomplete data set into a larger set of synthetic data for training or testing purposes, as in the sketch below.
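As a simplified illustration of the data augmentation idea, the following Python sketch estimates the statistics of a tiny, made-up "real" data set and samples new synthetic records from it. Real synthetic data pipelines use far more sophisticated generative models; the column meanings and numbers here are hypothetical:

```python
# A deliberately simple sketch of synthetic data generation: estimate the
# distribution of a small "real" data set, then sample new, artificial
# records that share its statistical properties but contain no real people.
import numpy as np

rng = np.random.default_rng(seed=0)

# A small "real" data set: (age, resting heart rate) for five patients.
real = np.array([[34, 62], [51, 70], [45, 66], [29, 58], [60, 74]], dtype=float)

mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

# Draw 100 synthetic patients with similar statistics and no
# personally identifiable information.
synthetic = rng.multivariate_normal(mean, cov, size=100)
print(synthetic[:3])
```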
What is deep learning?
Deep learning, which makes generative AI possible, is a machine learning technique for analyzing and interpreting large amounts of data. Also known as deep neural learning or deep neural networking, this process teaches computers to learn through observation, imitating the way humans gain knowledge. Deep learning is a critical concept in applying computers to the problem of understanding human language, or natural language processing (NLP).
It may help to think of deep learning as a type of flow chart, starting with an input layer and ending with an output layer. Sandwiched between these two layers are the “hidden layers,” which process information at different levels, adjusting and adapting their behavior as they continuously receive new data. Deep learning models can have hundreds of hidden layers, each of which plays a part in discovering relationships and patterns within the data set.
Data enters the model at the input layer, which is composed of several nodes, and is categorized accordingly before moving forward to the next layer. The path the data takes through each layer depends on the calculations set in place for each node. Eventually, the data moves through every layer, picking up observations along the way that ultimately create the output, or final analysis, of the data.
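The following Python sketch shows this flow in miniature: data enters at an input layer, passes through two small hidden layers whose nodes each apply a simple calculation, and emerges from an output layer. The weights here are random placeholders; in a real model they are learned during training:

```python
# A minimal sketch of data flowing through a deep network. Layer sizes and
# weights are illustrative placeholders, not a trained model.
import numpy as np

rng = np.random.default_rng(seed=1)

def relu(x):
    # A common node calculation: keep positive signals, zero out the rest.
    return np.maximum(0, x)

# Layer sizes: 4 input features -> 8 hidden nodes -> 6 hidden nodes -> 1 output.
layer_sizes = [4, 8, 6, 1]
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    # Each hidden layer combines its inputs with weights, applies a simple
    # calculation, and passes the result on to the next layer.
    for w, b in zip(weights[:-1], biases[:-1]):
        x = relu(x @ w + b)
    return x @ weights[-1] + biases[-1]  # output layer: the final analysis

print(forward(np.array([0.2, -1.3, 0.7, 0.0])))
```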
One technology that has sped the advancement of deep learning is the GPU, or graphics processing unit. GPUs were originally architected to accelerate the rendering of video game graphics. But as an efficient way to perform calculations in parallel, GPUs have proven to be well suited for deep learning workloads.
Advances in the size and speed of deep learning models led directly to the current wave of breakthrough generative AI apps.
What is a neural network?
A neural network is a way of processing information that mimics biological neural systems like the connections in our own brains. It’s how AI can forge connections among seemingly unrelated sets of information. The concept of a neural network is closely related to deep learning.
How does a deep learning model use the neural network concept to connect data points? Start with how the human brain works. Our brains contain many interconnected neurons, which act as information messengers when the brain is processing incoming data. These neurons use electrical impulses and chemical signals to communicate with one another and transmit information between different areas of the brain.
An artificial neural network (ANN) is based on this biological phenomenon, but formed by artificial neurons that are made from software modules called nodes. These nodes use mathematical calculations (instead of chemical signals as in the brain) to communicate and transmit information. This simulated neural network (SNN) processes data by clustering data points and making predictions.
Different neural network techniques are suited to different kinds of data. A recurrent neural network (RNN) is a model that works on sequential data, such as learning words in order as a way to process language.
Building on the idea of the RNN, transformers are a specific kind of neural network architecture that can process language faster. Transformers learn the relationships among the words in a sentence all at once, a more efficient process than an RNN’s approach of ingesting each word in sequential order.
A large language model (LLM) is a deep learning model trained by applying transformers to a massive set of generalized data. LLMs power many of the popular AI chat and text tools.
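As an illustration, the short Python sketch below calls a small, publicly available language model through the Hugging Face transformers library. It assumes the library is installed and the model can be downloaded, and GPT-2 is used only because it is small; it is far less capable than the large models behind modern chat tools:

```python
# A hedged sketch of text generation with a small open model via the
# Hugging Face transformers library (requires `pip install transformers`).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Generative AI is", max_new_tokens=30)
print(result[0]["generated_text"])
```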
Another deep learning technique, the diffusion model, has proven to be a good fit for image generation. Diffusion models learn the process of turning a natural image into blurry visual noise. Then generative image tools take the process and reverse it—starting with a random noise pattern and refining it until it resembles a realistic picture.
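The Python sketch below illustrates only the forward half of this idea, repeatedly blending a stand-in "image" with random noise. A trained diffusion model learns to run the process in the opposite direction, removing a little noise at each step until a picture emerges:

```python
# A conceptual sketch of the diffusion idea. The "image" is just a tiny
# random array standing in for real pixels; no model is trained here.
import numpy as np

rng = np.random.default_rng(seed=2)
image = rng.uniform(0.0, 1.0, size=(8, 8))  # stand-in for a natural image

noised = image
for step in range(10):
    noise = rng.normal(0.0, 1.0, size=image.shape)
    noised = 0.9 * noised + 0.1 * noise  # each step blends in a bit more noise

# A generative diffusion model learns the reverse mapping: start from pure
# noise and iteratively denoise it until a realistic picture emerges.
```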
Deep learning models can be described by their number of parameters. A simple credit prediction model trained on 10 inputs from a loan application form would have 10 parameters. By contrast, an LLM can have billions of parameters. OpenAI’s Generative Pre-trained Transformer 4 (GPT-4), one of the foundation models that powers ChatGPT, is reported to have 1 trillion parameters.
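To make the term concrete, the following sketch counts the parameters of a small, hypothetical network: every adjustable weight and bias counts as one parameter. The layer sizes are arbitrary examples, not taken from any real credit model or LLM:

```python
# A rough sketch of what "parameters" means: each weight and bias the model
# can adjust during training is one parameter. Sizes below are hypothetical.
layer_sizes = [10, 16, 1]   # 10 inputs -> 16 hidden nodes -> 1 output

params = 0
for m, n in zip(layer_sizes[:-1], layer_sizes[1:]):
    params += m * n   # weights connecting one layer to the next
    params += n       # one bias per node in the receiving layer

print(params)  # 193 parameters here; a large LLM has billions
```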
What is a foundation model?
A foundation model is a deep learning model trained on a huge amount of generic data. Once trained, foundation models can be refined for specialized use cases. As the name suggests, these models can form the foundation for many different applications.
Creating a new foundation model today is a substantial project. The process requires enormous amounts of training data, typically collected from scrapes of the internet, digital libraries of books, databases of scholarly articles, stock image collections, or other large data sets. Training a model on this much data takes immense infrastructure, including building or leasing a cloud of GPUs. The largest foundation models to date are reported to have cost hundreds of millions of dollars to build.
Because of the high effort required to train a foundation model from scratch, it’s common to rely on models trained by third parties, then apply customization. There are multiple techniques for customizing a foundation model. These can include fine-tuning, prompt-tuning, and adding customer-specific or domain-specific data.
What is fine-tuning?
Fine-tuning is the process of refining a foundation model to create a new model better suited to a specific task or domain. An organization can add training data specific to its desired use case, instead of relying on an all-purpose model.
Fine-tuning typically requires significantly less data and time than the initial training. While a foundation model can take weeks or months to train, the fine-tuning process might take a few hours.
How does fine-tuning help the user? If you are using an all-purpose model, you may have to enter specific examples and instructions each time you prompt the AI application to get what you want. With fine-tuning, that work of anticipating the kind of output you want has already been done. Your prompts can be simpler, saving time and reducing resource usage.
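As a rough illustration of the mechanics, the PyTorch sketch below freezes most of a stand-in "pretrained" model and continues training only its final layer on a small, task-specific data set. The model and data are toys; a real fine-tuning job would start from a genuine foundation model and use curated domain data:

```python
# A minimal, hypothetical fine-tuning sketch in PyTorch: keep earlier layers
# frozen and update only the final layer on new, task-specific examples.
import torch
from torch import nn

pretrained = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))

# Freeze the "pretrained" layers; only the final layer will be updated.
for p in pretrained[:2].parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(pretrained[2].parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(64, 16)  # stand-in for domain-specific inputs
y = torch.randn(64, 1)   # stand-in for desired outputs

for _ in range(100):     # fine-tuning needs far fewer steps and data
    optimizer.zero_grad()
    loss = loss_fn(pretrained(x), y)
    loss.backward()
    optimizer.step()
```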
What is retrieval-augmented generation?
Retrieval-augmented generation (RAG) is a method for getting better answers from a generative AI application by linking an LLM to an external resource.
Implementing a RAG architecture in an LLM-based question-answering system (like a chatbot) provides a line of communication between the LLM and your chosen additional knowledge sources. This allows the LLM to cross-reference and supplement its internal knowledge, providing a more reliable and accurate output for the user making a query.
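The following Python sketch shows the RAG pattern in its simplest form: retrieve the passages most relevant to a question, then fold them into the prompt sent to the LLM. The embed() and call_llm() pieces are placeholders for illustration, not a real API:

```python
# A simplified, hypothetical RAG sketch: retrieve relevant passages from a
# small knowledge base, then build an augmented prompt for an LLM.
import numpy as np

knowledge_base = [
    "Our support portal is available 24 hours a day.",
    "Product X reaches end of maintenance in June 2026.",
    "Refunds are processed within 14 business days.",
]

def embed(text):
    # Placeholder: a real system would use a trained embedding model.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=64)

def retrieve(question, k=2):
    # Score each passage against the question and keep the top k.
    q = embed(question)
    scores = [float(np.dot(q, embed(doc))) for doc in knowledge_base]
    top = np.argsort(scores)[::-1][:k]
    return [knowledge_base[i] for i in top]

question = "When does Product X stop being maintained?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# call_llm(prompt)  # placeholder for a request to your chosen LLM
```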
What are some of the risks of generative AI?
Having come a long way in a short time, generative AI technology has attracted more than its share of attention, both positive and negative. The benefits and downsides of this technology are still emerging. Here we provide a brief look at some prominent concerns about generative AI.
Enabling harm. There are immediate and obvious risks of bad actors using generative AI tools for malicious goals, such as large-scale disinformation campaigns on social media, or nonconsensual deepfake images that target real people.
Reinforcing harmful societal bias. Generative AI tools have been shown to regurgitate the human biases that are present in training data, including harmful stereotypes and hate speech.
Supplying wrong information. Generative AI tools can produce made-up and plainly wrong information and scenes, sometimes called “hallucinations.” Some generated content mistakes are harmless, such as a nonsense response to a chat question, or an image of a human hand with too many fingers. But there have been serious cases of AI gone wrong, such as a chatbot that gave harmful advice to people with questions about eating disorders.
Security and legal risks. Generative AI systems can pose security risks, including from users entering sensitive information into apps that were not designed to be secure. Generative AI responses may introduce legal risks by reproducing copyrighted content or appropriating a real person’s voice or identity without their consent. Additionally, some generative AI tools may have usage restrictions.
How Red Hat can help
Red Hat provides the common foundations for your teams to build and deploy AI applications and machine learning (ML) models with transparency and control.
Red Hat® OpenShift® AI is a platform that can train, prompt-tune, fine-tune, and serve AI models for your unique use case and with your own data.
For large AI deployments, Red Hat OpenShift offers a scalable application platform suitable for AI workloads, complete with access to popular hardware accelerators.
Red Hat is also using our own Red Hat OpenShift AI tools to improve the utility of other open source software, starting with Red Hat Ansible® Lightspeed with IBM watsonx Code Assistant. This service helps automation teams learn, create, and maintain Ansible content more efficiently. It accepts prompts entered by a user and then interacts with IBM watsonx foundation models to produce code recommendations, which are used to create Ansible Playbooks.
Additionally, Red Hat’s partner integrations open the doors to an ecosystem of trusted AI tools built to work with open source platforms.