Artificial Intelligence (AI) in Learning and Discovery: Key Terminology

This guide explores artificial intelligence (AI) in academic research, the healthcare community, and education. It aims to strengthen AI literacy, support critical evaluation, and connect users with key tools and resources.

About

The internet contains a surplus of definitions for the terms listed below. The definitions provided were carefully selected to ensure clarity, accuracy, and relevance. Links lead to reputable external platforms that provide additional valuable insights. Click on each hyperlink to explore the source of each term. If the term does not have a hyperlink, it is because the definition was developed internally.

Note: Go to the Home tab, then to What is Artificial Intelligence?, to see how artificial intelligence, machine learning, deep learning, generative AI, large language models, and natural language processing relate to one another and how they collectively form the foundation of data science.

Terms and Definitions

Algorithm -- In machine learning, algorithms are computational techniques that find a way to connect a set of inputs to a desired set of outputs by learning from relevant data.
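
As a rough illustration of "connecting inputs to outputs by learning from data," the sketch below fits a straight line to a handful of made-up numbers and then uses it to predict a new value. The function name fit_line and the sample data (hours_studied, quiz_scores) are hypothetical and chosen only for this example; real machine learning algorithms are usually far more elaborate.

```python
def fit_line(inputs, outputs):
    """Learn the slope and intercept of the line that best maps inputs to outputs
    (ordinary least squares for a single input variable)."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example data: inputs and the outputs we want the algorithm to reproduce.
hours_studied = [1.0, 2.0, 3.0, 4.0]
quiz_scores = [52.0, 54.0, 56.0, 58.0]

slope, intercept = fit_line(hours_studied, quiz_scores)

# Use the learned relationship to predict an output for an input it has not seen.
predicted_score = slope * 5.0 + intercept
print(predicted_score)
```

Here the "learning" is nothing more than estimating two numbers from the examples; the same idea, scaled up to millions of adjustable numbers, underlies much larger models.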

Bias -- AI bias, also called machine learning bias or algorithm bias, refers to biased results caused by human biases that skew the original training data or the AI algorithm, leading to distorted outputs and potentially harmful outcomes.

Deepfake -- A video, image, or audio recording that has been altered or generated by AI to make a person look or sound like someone else, or appear to say or do something they did not.

Deep Learning -- Deep learning is a type of machine learning in which neural networks with multiple layers are trained on large data sets.

Generative AI (GenAI) -- Generative AI refers to deep-learning models that can take raw data and “learn” to generate statistically probable outputs when prompted.
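
To make "statistically probable outputs" more concrete, here is a toy sketch, not a deep-learning model: it counts which word follows which in a tiny made-up text and then generates new text by sampling likely next words. The names train_bigram_model and generate, and the sample text, are hypothetical and used only for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(model, prompt_word, length, seed=0):
    """Extend the prompt by repeatedly sampling a next word in proportion
    to how often it followed the previous word in the training text."""
    rng = random.Random(seed)
    output = [prompt_word]
    for _ in range(length):
        followers = model.get(output[-1])
        if not followers:
            break  # the last word never had a successor in the training text
        choices = list(followers.keys())
        weights = list(followers.values())
        output.append(rng.choices(choices, weights=weights)[0])
    return " ".join(output)


# Made-up training text for the example.
training_text = (
    "the library opens at nine the library closes at five "
    "the archive opens at ten"
)
model = train_bigram_model(training_text)
generated = generate(model, "the", length=4)
print(generated)
```

The output can be a sentence that never appeared in the training text, which is the sense in which such a model "generates" rather than retrieves. Because it only follows word statistics, it can also produce fluent statements that are false.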

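Generative Pretrained Transformer (GPT) -- A family of large language models developed by OpenAI. GPT models are built on the transformer architecture, pretrained on large amounts of text, and use deep learning techniques to generate natural language text.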

Hallucination -- A phenomenon wherein a large language model perceives patterns or objects that are nonexistent or imperceptible to human observers, creating outputs that are nonsensical or altogether inaccurate.

Input Data -- Input data is data provided to an artificial intelligence (AI) system to explain a problem, situation, or request. Input data may be cleaned, labeled, and organized, or it may be raw data. Often an AI is asked to create or synthesize new data (“output data”) based on the input data and the AI’s algorithm, which itself might have been trained or fine-tuned using a separate data set.

Large Language Model (LLM) -- A category of foundation models trained on immense amounts of data, which makes them capable of understanding and generating natural language.

Machine Learning (ML) -- The methods and algorithms that allow AI systems to identify patterns, make decisions, and improve their performance by learning from input data.

Natural Language Processing (NLP) -- NLP refers to a type of data science that helps computers understand and interpret human language. NLP relies on language models and computational linguistics to analyze grammar. It can also pick up on subtle shifts in the tone and intent of speech.

Output Data -- Output data is new data an artificial intelligence (AI) creates or synthesizes based on input data and the AI’s algorithm.

Prompt -- The query that a user inputs into a generative AI model, guiding it to generate specific content such as text or images based on the provided instructions or context (see Input Data).

Prompt Engineering -- The strategic process of designing and refining a set of inputs or queries so that a generative AI model yields the desired outputs.
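
One common way to approach prompt engineering is to state the task, the context, and the expected output format explicitly rather than typing a bare question. The sketch below only assembles such a prompt as text; it does not call any AI service, and the function name build_prompt and the example wording are hypothetical.

```python
def build_prompt(task, context, output_format):
    """Combine a task, background context, and output instructions
    into one clearly labeled prompt string."""
    return (
        f"Task: {task}\n"
        f"Context: {context}\n"
        f"Output format: {output_format}"
    )


# A vague prompt versus a more deliberately engineered one.
vague_prompt = "Tell me about diabetes."
engineered_prompt = build_prompt(
    task="Summarize current first-line treatments for type 2 diabetes.",
    context="The reader is a first-year nursing student.",
    output_format="Five bullet points in plain language.",
)
print(engineered_prompt)
```

The engineered version gives the model an audience and a format to aim for, which tends to narrow the range of responses. Any response should still be checked against reliable sources.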

Training Data -- The data used to train a machine learning (ML) model.