Generative AI: A Primer
What it is, how it was created, its applications, debate and criticisms
This first primer is on generative artificial intelligence (AI). Let’s dive in.
The basics of AI: A simple industry definition from IBM: “Any system capable of simulating human intelligence and thought processes”. The EU Artificial Intelligence Act defines it more broadly as “software that is developed with one or more of the techniques that can, for a given set of human-defined objectives, generate outputs such as content, predictions, recommendations, or decisions influencing the environments they interact with”.
AI systems predict outcomes, make decisions, and automate tasks by identifying patterns in large datasets. The explosion in AI capabilities in recent years stems from three factors: the rapid expansion of available training data, significant improvements in neural networks¹, and growth in computational power. Uses include:
Extracting information from images (computer vision)
Transcribing spoken words or understanding their meaning (speech-to-text, natural language processing)
Identifying insights and patterns in written text (natural language understanding)
Converting written text into speech (text-to-speech)
Navigating spaces autonomously based on sensory data (robotics)
What is generative AI specifically? Generative AI creates new content such as text, images, video and audio. Large Language Models (LLMs) are one form: they use complex algorithms and statistical models to learn the patterns in data – such as digital pixels in photographs, waveforms in audio, or text – and produce original outputs.
ChatGPT from OpenAI broke into public consciousness on its release in November 2022. It is a version of an LLM tuned for natural conversation with users; the latest version of the underlying model is GPT-4. Given a short prompt, it can respond coherently, perform tasks, answer questions, and converse on various topics. LLMs can be trained to produce varied outputs such as computer code, audio, video and images. Programmes such as Midjourney and DALL-E produce increasingly realistic images, while Murf’s generated audio is difficult to distinguish from original compositions. Adobe Photoshop has recently released capabilities that allow real photographs to be merged with generated content.
ChatGPT reached one million users in just five days, a sign of how disruptive the technology may prove – and of its relevance to MPs navigating its social impact.
How were LLMs created? Generative models such as ChatGPT are created in two steps. The first is ‘deep learning’: feeding a vast amount of data – such as text from books, social media and web articles, digital images and audio – into a deep neural network. Training adjusts the connections between neurons in the network so that it recognises patterns; from this, a model based on statistical probability is built which predicts what comes next.
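The ‘predict what comes next’ idea can be illustrated with a deliberately simplified sketch. This is not a neural network – it is a word-frequency model over a toy corpus, with all names and data invented for illustration – but it shows the same principle: learn from examples which item most often follows the current one, then predict accordingly.

```python
from collections import Counter, defaultdict

# Toy training corpus standing in for the web-scale text used to train real LLMs.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count, for each word, which word follows it and how often.
follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequent next word observed in training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" more often than "mat" or "fish"
```

Real LLMs do the same kind of next-item prediction, but over fragments of words rather than whole words, with billions of learned weights in place of simple counts.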
In the second step, models are fine-tuned through supervised learning on curated datasets, such as human-written questions and answers. Human trainers rank the model’s responses, setting ‘guardrails’ (for example, refusing requests for instructions on how to build a bomb) and, in the case of ChatGPT, encouraging natural conversational responses.
Models are trained with billions of adjustable parameters. This complexity means that we do not fully understand their inner workings: they are so-called ‘black box’ models, viewed in terms of their inputs and outputs only. They often reach conclusions, or develop capabilities, that were unexpected. This makes transparency and accountability very complex.
What are foundation models? Most AI models are trained to perform specific tasks, and their training data is labelled (often laboriously by humans) to help the model learn. Foundation models are built using vast amounts of unlabelled data. Their advantages include remarkable performance and productivity gains: they can outperform task-specific models and translate between formats such as text, code, images and video. This allows users to input a prompt and have the system generate an output in a different format, even if it was not specifically trained that way. Building on top of foundation models allows for the development of specialised models tailored to specific uses or needs.
What is the technology’s potential? Chatbots such as ChatGPT have provided people with answers to various questions, and produced material such as college essays, fictional stories and job applications. They can assist with information gathering, summarising research, and aiding non-fluent speakers in writing. Potential beneficial applications include pro-social chatbots that can challenge misinformation on social media, mediate conflicts by rephrasing comments, and produce outputs tailored at reducing political divisiveness.
Despite LLMs’ capabilities in producing realistic and useful outputs, they have no insight into a person’s communicative intent and generate replies based on statistical guesses. They repackage existing information rather than creating anything genuinely new. However, the novel ways in which LLMs make associations between data have been shown to give the models strategy-level knowledge, opening up many interesting applications.
LLMs like GPT-4 have the inherent capability to re-organise knowledge, connecting disparate material to form novel insights and compound ideas. This has resulted in unexpected effects, such as producing logically-cohesive arguments from unconnected statements. In various fields, generative AI can help to amplify human abilities, acting as a ‘co-pilot’ generating ideas and insights.
The economic benefit of generative AI to the global economy is estimated at up to US$4.4 trillion annually. However, while applications hold promise, there are unresolved risks of misuse, technical challenges and concerns about factual accuracy, explored further below.
What are some emerging capabilities? Generative AI technologies are producing increasingly realistic outputs and moving towards new abilities like cloning a voice from a small sample and generating audio and video of people, real or synthetic. LLMs have also been shown to have the ability to autonomously write code, broadening potential applications.
OpenAI allows ChatGPT to be integrated into products that browse and interact with the internet. Experimental projects such as AutoGPT connect chatbots to web browsers or word processors, allowing the model to divide complex tasks into sub-tasks and issue its own prompts to complete them independently. In future, this may give LLMs the potential, from a single text prompt, to act autonomously in the real world and strategise to achieve complex goals.
If guardrails are not in place, or humans are not in the loop, the sub-goals a model devises may have unforeseen and damaging consequences. Sub-goals also raise the risk of the model manipulating people: in one recent example, a chatbot lied to a human in order to get help solving a CAPTCHA puzzle. Autonomous systems currently have limitations, but capabilities are likely to improve – one illustration of why ethical guidelines and ongoing oversight of AI development are critical.
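The decompose-then-execute loop behind projects such as AutoGPT, and where a guardrail fits into it, can be sketched as follows. Everything here is illustrative: the function names, the canned sub-tasks and the blocklist are invented for the sketch and correspond to no real AutoGPT or OpenAI API.

```python
def decompose(goal):
    # Stand-in for an LLM call that splits a goal into sub-tasks.
    return [f"research: {goal}", f"draft plan for: {goal}", f"summarise: {goal}"]

def is_allowed(task, blocklist=("weapon", "malware")):
    # Minimal guardrail: refuse sub-tasks that touch blocked topics.
    return not any(term in task.lower() for term in blocklist)

def run_agent(goal):
    results = []
    for task in decompose(goal):
        if not is_allowed(task):
            results.append((task, "REFUSED by guardrail"))
            continue
        # Stand-in for the model executing the sub-task (e.g. browsing, drafting).
        results.append((task, "done"))
    return results

for task, status in run_agent("organise a community fundraiser"):
    print(f"{status:8} <- {task}")
```

The risk the primer describes arises precisely when the `is_allowed` check is missing, too crude, or applied only to the user’s original prompt rather than to every sub-task the model generates for itself.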
Criticisms of generative AI
1. Transparency
Training and data secrecy: Some companies that developed LLMs have not yet released much information about their datasets, code, costs or environmental impact, attracting strong criticism. There have been calls to step back and focus on understanding more about how and why a system produces certain unexpected results. This lack of transparency limits the ability to assess potential impacts on society.
Legality and ethics: A lack of transparency in the training of foundation models, which use massive amounts of data scraped from the web, also raises questions of copyright, consent and privacy. LLMs have been criticised for using copyright material and ‘sealing it off inside proprietary products’.
2. Accountability
Biased outputs. As LLMs are built on vast banks of digital information, it is impossible to vet every data point used to train the model, including biased or discriminatory data. This can lead to models producing racist, sexist, threatening and otherwise objectionable content. Studies have also shown that LLMs can nudge users towards certain viewpoints, often reinforcing cultural views present in the training data.
Hallucinations. LLMs currently have a tendency to produce ‘hallucinations’ – false statements, factual errors and nonsensical responses. Meta’s Galactica, a model trained on 48 million science articles, was taken down after less than three days as it misconstrued scientific facts. Other examples include models falsely accusing people of crimes. As LLMs present statistical guesses with the same confidence as facts, this raises issues of widespread public trust in their outputs. Although companies including OpenAI are working hard to reduce the hallucination problem, we will need to ensure public understanding of the accuracy of generative AI.
Debate over open-source models. Organisations such as OpenAI and Anthropic have not published details of their models, to prevent misuse. Others, such as Meta, have released open-source models to the public. The argument is that open source fosters innovation, drives competition and provides greater economic opportunity, counterbalancing the monopoly power of big tech. However, it also potentially places damaging capabilities in the hands of bad actors, who can fine-tune models to perform tasks such as “spam, fraud, malware, privacy violations, harassment, and other wrongdoing”. Oversight and accountability for potentially damaging uses of open-source models will be essential.
This primer has pointed the way to potential beneficial uses and risks of LLMs - to be explored further.
Useful links:
An introduction to AI and its uses from the House of Lords library
AI 101: Explanation of key terms, background to industry actors and curated news on AI developments.
An accessible video explainer on generative AI from Google
And on generative AI models from IBM
A technical, but jargon-free, explanation of how LLMs work
A summary of the development and introduction of ChatGPT and debates around its use
¹ Commonly said to mimic the structure and function of the human brain, a neural network is made up of interconnected nodes that process and transmit information, organised in layers. Each node can perform a simple mathematical operation and the nodes are connected by weights, which determine how much influence one node has on another. By adjusting the weights, the neural network can learn from data and perform various tasks, such as classification, regression, clustering, etc.
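The node-and-weight picture in this footnote can be made concrete with a minimal sketch: two inputs feed two hidden nodes, which feed one output node. The weights below are arbitrary placeholders; in a real network, training would adjust them so the output fits the data.

```python
import math

def node(inputs, weights, bias):
    # One node: a weighted sum of its inputs, squashed to the range (0, 1).
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid activation

def tiny_network(x1, x2):
    # Hidden layer: two nodes, each seeing both inputs through its own weights.
    h1 = node([x1, x2], weights=[0.5, -0.6], bias=0.1)
    h2 = node([x1, x2], weights=[-0.3, 0.8], bias=0.0)
    # Output layer: one node combining the hidden nodes' outputs.
    return node([h1, h2], weights=[1.2, -1.1], bias=0.2)

print(tiny_network(1.0, 0.0))
```

An LLM is this same construction scaled up enormously: billions of weights arranged in many layers, adjusted during training so that the network's output is a good prediction of what comes next.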