This document provides a detailed breakdown of key terms and technologies in the broader field of artificial intelligence (AI).
Research into Artificial Intelligence (AI) started as far back as the 1950s, and since then it has branched out into a wide range of different fields and techniques. However, for most of the general public it only really registered as something particularly significant in 2022, with the first widely available Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs), most notably ChatGPT, soon followed by Gemini and others.
Over the three years to 2025, the use and impact of GPTs has expanded massively, to the point that, of all the many different fields of AI, GPTs now absorb the overwhelming majority of public attention. So much so that people directly equate the terms AI and GPT: to most people's thinking, AI simply is ChatGPT, Gemini, Copilot and the like, whereas technically GPTs are just one part of one branch, several branches down the tree, as shown in the accompanying diagram.
The purpose of this document is to briefly explain the wider field of AI, of which GPTs are just one part (albeit now a massively important part). It is also worth noting that the gravitational pull of this focus on GPTs has drawn other branches of AI technology into the understanding and development of GPTs, so the branches described below are no longer as well defined and separate as they once were.
AI in General
This category includes broad, interdisciplinary fields and foundational concepts that form the basis of AI but are not confined to a single subfield like traditional machine learning or neural networks.
Robotics and Control Systems:
- Simultaneous Localization and Mapping (SLAM): Algorithms that allow a robot to build a map of its environment while simultaneously keeping track of its own location within it.
- Sensor Fusion: Techniques to combine data from multiple sensors (e.g., cameras, LiDAR, GPS) to get a more accurate and reliable understanding of the environment.
- Path Planning: Algorithms (e.g., A*, Dijkstra) that find the most efficient route for a robot to travel.
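As a concrete illustration of path planning, here is a minimal sketch of Dijkstra's algorithm on a small occupancy grid; the grid, start and goal cells are arbitrary examples, and A* would simply add a distance heuristic to the same loop:

```python
import heapq

def dijkstra(grid, start, goal):
    """Find the lowest-cost route on a 2D grid where 1 = obstacle, 0 = free cell."""
    rows, cols = len(grid), len(grid[0])
    frontier = [(0, start)]                 # priority queue of (cost so far, cell)
    came_from = {start: None}
    cost_so_far = {start: 0}
    while frontier:
        cost, current = heapq.heappop(frontier)
        if current == goal:
            break
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1          # uniform step cost
                if new_cost < cost_so_far.get((nr, nc), float("inf")):
                    cost_so_far[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    heapq.heappush(frontier, (new_cost, (nr, nc)))
    # Reconstruct the route by walking back from the goal to the start
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = came_from[node]
    return list(reversed(path))

grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(dijkstra(grid, (0, 0), (2, 0)))        # a route around the obstacles
```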
Other Notable Technologies:
- Expert Systems: Early AI systems that used human-defined rules and knowledge bases to solve complex problems in specific domains, like medical diagnosis.
- Evolutionary Algorithms: Optimization techniques inspired by natural evolution, such as Genetic Algorithms, which use concepts like mutation and selection to find optimal solutions to problems (a short sketch follows this list).
- Fuzzy Logic: A form of logic that deals with degrees of truth rather than a binary “true or false.” It’s used in control systems and decision-making to handle uncertainty.
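To make the evolutionary-algorithm idea concrete, here is a toy genetic algorithm that evolves a population of candidate numbers toward the maximum of a simple function; the population size, mutation rate and fitness function are all arbitrary illustrative choices:

```python
import random

def fitness(x):
    # Toy objective: maximize f(x) = -(x - 3)^2, which peaks at x = 3
    return -(x - 3) ** 2

def evolve(generations=50, pop_size=20, mutation_rate=0.3):
    population = [random.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fittest half of the population as parents
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Crossover (average two parents) plus occasional mutation (random noise)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = (a + b) / 2
            if random.random() < mutation_rate:
                child += random.gauss(0, 1)
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

print(evolve())   # should converge close to x = 3
```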
- And Machine Learning >>>
Machine Learning
This broad category covers foundational and classical approaches to AI. These are algorithms that learn from data to make predictions or decisions without being explicitly programmed for every scenario.
Supervised Learning:
- Linear/Logistic Regression: Used for predicting a numerical value (regression) or classifying data into two categories (e.g., spam vs. not spam); a short scikit-learn sketch follows this list.
- Support Vector Machines (SVMs): Finds the optimal hyperplane to separate data points into different classes.
- Decision Trees: Creates a tree-like model of decisions and their possible consequences to solve classification or regression problems.
- Random Forests: An ensemble method that builds multiple decision trees and combines their outputs to improve accuracy.
- Naïve Bayes: A classification algorithm based on Bayes’ theorem, often used for text classification and sentiment analysis.
- K-Nearest Neighbors (K-NN): Classifies a data point based on the majority class of its “k” nearest neighbors.
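All of the classifiers above follow the same fit-then-predict pattern. Here is a minimal sketch, assuming scikit-learn is available and using its built-in iris dataset purely as a stand-in for real labelled data:

```python
# A minimal supervised-learning sketch, assuming scikit-learn is installed.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)                     # labelled examples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Any of the classifiers listed above can be swapped in here
for model in (LogisticRegression(max_iter=1000), KNeighborsClassifier(n_neighbors=5)):
    model.fit(X_train, y_train)                       # learn from the labelled data
    print(type(model).__name__, model.score(X_test, y_test))  # accuracy on held-out data
```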
Unsupervised Learning:
- K-Means Clustering: An algorithm that groups unlabeled data into a specified number of clusters based on similarity.
- Principal Component Analysis (PCA): A dimensionality reduction technique used to simplify complex datasets.
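A minimal sketch of both techniques, again assuming scikit-learn, run on synthetic unlabelled data:

```python
# A minimal unsupervised-learning sketch, assuming scikit-learn is installed.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic, unlabelled data: two well-separated blobs in 5 dimensions
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(5, 1, (50, 5))])

# PCA: compress 5 dimensions down to 2 while keeping most of the variance
X_2d = PCA(n_components=2).fit_transform(X)

# K-Means: group the points into 2 clusters based on similarity
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X_2d)
print(labels[:5], labels[-5:])   # points from the two blobs land in different clusters
```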
Reinforcement Learning:
- Q-Learning: A classic RL algorithm that learns a table of “Q-values” to determine the best action to take in any given state (a toy example follows this list).
- Policy Gradient Methods: A family of algorithms (e.g., A2C, PPO) that directly optimize a policy, which is the agent’s strategy for taking actions.
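Here is a toy tabular Q-learning sketch on a five-state corridor; the learning rate, discount factor and exploration rate are illustrative choices:

```python
import random

# Tabular Q-learning on a tiny 1-D corridor: states 0..4, reward only at state 4.
n_states, actions = 5, (-1, +1)          # move left or move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.3    # learning rate, discount, exploration rate

for _ in range(300):                     # episodes
    s = 0
    while s != n_states - 1:
        # Epsilon-greedy action selection: mostly exploit, sometimes explore
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda a: Q[(s, a)])
        s_next = min(max(s + a, 0), n_states - 1)
        reward = 1.0 if s_next == n_states - 1 else 0.0
        # The Q-learning update rule
        best_next = max(Q[(s_next, a2)] for a2 in actions)
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s_next

print({s: max(actions, key=lambda a: Q[(s, a)]) for s in range(n_states - 1)})
# The learned policy should be "move right" (+1) in every state.
```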
Classical NLP Techniques:
- Word2Vec/GloVe: Models that create numerical representations (embeddings) of words, capturing their meaning and relationships.
- Rule-Based Systems: Older NLP systems that rely on manually created rules and grammar to understand and process language.
- Bag-of-Words/TF-IDF: Statistical models that represent text as a collection of words, often used for simple text classification and information retrieval.
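A minimal sketch of the Bag-of-Words and TF-IDF representations, assuming scikit-learn is available; the three example sentences are arbitrary:

```python
# A minimal bag-of-words / TF-IDF sketch, assuming scikit-learn is installed.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat",
        "the dog chased the cat",
        "dogs and cats are pets"]

# Bag-of-words: each document becomes a vector of raw word counts
bow = CountVectorizer().fit_transform(docs)
print(bow.toarray())

# TF-IDF: counts are re-weighted so words common to every document count for less
tfidf = TfidfVectorizer().fit_transform(docs)
print(tfidf.toarray().round(2))
```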
- And Neural Networks >>>
Neural Networks
This section focuses on deep learning architectures that use interconnected nodes (neurons) to process data. These models are particularly powerful for pattern recognition in complex datasets, such as images, audio, and text.
- Recurrent Neural Networks (RNNs) and LSTMs: Architectures designed to process sequential data like text. They maintain an internal “memory” and were the dominant deep learning approach for language tasks before transformers.
- Convolutional Neural Networks (CNNs): A deep learning architecture that has been the dominant force in computer vision. It is adept at processing pixel data through convolutional layers (see the sketch after this list).
- U-Net: A CNN architecture commonly used in medical imaging for precise segmentation of objects.
- You Only Look Once (YOLO): A very fast and popular object detection model that locates and classifies objects in a single pass of one CNN.
- Region-based Convolutional Neural Networks (R-CNN): A family of models that use a two-step process to first identify regions of interest and then classify objects within them using a CNN.
- Deep Q-Network (DQN): An extension of Q-Learning that uses a neural network to handle more complex environments and states.
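To make the CNN idea concrete, here is a minimal sketch assuming PyTorch is installed (any deep learning framework would do); the layer sizes are arbitrary and chosen for 28x28 grayscale images:

```python
# A minimal convolutional network sketch, assuming PyTorch is installed.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # learnable image filters
            nn.ReLU(),
            nn.MaxPool2d(2),                             # downsample 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)                 # extract spatial features
        return self.classifier(x.flatten(1)) # flatten and score each class

model = TinyCNN()
fake_batch = torch.randn(4, 1, 28, 28)       # e.g. four MNIST-sized grayscale images
print(model(fake_batch).shape)               # torch.Size([4, 10]) - one score per class
```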
- And Large Language Models >>>
Large Language Models
This category represents the modern state-of-the-art in Natural Language Processing (NLP). These models are primarily built on the Transformer architecture and are trained on vast amounts of text data, allowing them to understand and process human language at a highly sophisticated level.
- BERT (Bidirectional Encoder Representations from Transformers): A model developed by Google that focuses on understanding the context of words by looking at the words that come before and after them simultaneously. This makes it highly effective for tasks like question-answering and sentiment analysis (see the sketch after this list).
- RoBERTa (Robustly Optimized BERT Pre-training Approach): An optimization of BERT developed by Facebook. It uses a modified training approach, including a larger dataset and longer training time, to improve performance on various NLP tasks without changing the model’s architecture.
- T5 (Text-to-Text Transfer Transformer): A model by Google that reframes all NLP problems into a text-to-text format. This means that tasks like classification, summarization, and translation are all treated as inputting text and outputting text.
- ULMFiT (Universal Language Model Fine-tuning): An approach that enables a pre-trained language model to be fine-tuned for a specific task using a smaller dataset, making transfer learning more accessible.
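As a concrete illustration of BERT's bidirectional masked-word prediction, here is a short sketch assuming the Hugging Face transformers library is installed and the public bert-base-uncased checkpoint can be downloaded:

```python
# A small BERT sketch, assuming the Hugging Face `transformers` library is installed
# and the public `bert-base-uncased` checkpoint is available for download.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT uses the words on BOTH sides of the [MASK] token to predict what it should be
for prediction in fill_mask("The doctor prescribed some [MASK] for the infection."):
    print(prediction["token_str"], round(prediction["score"], 3))
```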
- And Generative Pre-trained Transformers >>>
Generative Pre-trained Transformers
This category describes a powerful class of large language models known for their exceptional ability to generate human-like text. They are built on the Transformer architecture and are characterized by a two-stage process: a massive pre-training phase followed by a fine-tuning phase for specific tasks. Their success is largely attributed to their decoder-only structure, which is optimized for sequential text generation.
- Pre-training and Fine-tuning: These models are first trained on an enormous dataset of text to learn the general rules of language, grammar, and a vast amount of world knowledge; this is the pre-training phase. Afterward, they can be further trained on smaller, task-specific datasets in a fine-tuning phase to improve their performance on tasks like question answering, summarization, or translation.
- Decoder-only Architecture: Unlike models like BERT which use a bi-directional encoder to understand context, GPT-style models use a unidirectional decoder. This architecture processes text sequentially, predicting the next word in a sequence based on all the previous words, making it ideal for creative and conversational generation.
- In-context Learning: A key feature of these models is their ability to perform tasks without explicit fine-tuning. By providing a few examples of a task in the prompt itself, the model can learn and follow the desired pattern. This capability, also known as few-shot learning, allows for immense flexibility and is a hallmark of modern LLMs.
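Here is a sketch of in-context (few-shot) learning, assuming the Hugging Face transformers library and the small open GPT-2 checkpoint are available. GPT-2 is only a stand-in for the far larger models discussed here, so it may follow the pattern imperfectly, but the mechanism is the same: examples of the task are placed in the prompt and the model continues the pattern without any fine-tuning.

```python
# A sketch of in-context (few-shot) prompting, assuming the Hugging Face
# `transformers` library and the small open GPT-2 checkpoint are available.
# GPT-2 is only a stand-in: modern GPTs follow the same decoder-only,
# next-token-prediction recipe at vastly larger scale.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# A few worked examples of the task go directly into the prompt;
# the model is then expected to continue the pattern for the final item.
prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "cheese => fromage\n"
    "bread =>"
)
output = generator(prompt, max_new_tokens=5, do_sample=False)
print(output[0]["generated_text"])
```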


