Calibration (Model)
The degree to which a model’s predicted probabilities reflect true likelihoods. A well‑calibrated classifier outputs probability estimates that match observed frequencies (e.g., events predicted at 70 % occur approximately 70 % of the time). Techniques like Platt scaling or isotonic regression improve calibration.
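A minimal scikit-learn sketch on a synthetic binary task; `method="sigmoid"` corresponds to Platt scaling, while `method="isotonic"` would fit isotonic regression instead:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, random_state=0)

# Wrap an uncalibrated margin classifier; method="sigmoid" is Platt scaling,
# method="isotonic" would fit an isotonic regression instead.
clf = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=5)
clf.fit(X, y)
probabilities = clf.predict_proba(X)[:, 1]  # calibrated probability estimates
```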
Capsule Network
A neural network architecture that uses groups of neurons (capsules) to model hierarchical relationships and preserve spatial information. Capsule networks aim to improve robustness to rotations and translations compared with standard convolutional networks by encoding pose parameters.
Categorical Cross‑Entropy
A loss function used for multi‑class classification. It measures the difference between the true one‑hot label distribution and the predicted probability distribution. Minimizing cross‑entropy encourages the model to assign high probability to the correct class.
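A small NumPy illustration of the computation, assuming the predictions are already valid probability distributions (the function name is illustrative):

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted class probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)                       # avoid log(0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

y_true = np.array([[0, 1, 0], [1, 0, 0]])                    # one-hot labels
y_pred = np.array([[0.1, 0.8, 0.1], [0.7, 0.2, 0.1]])        # predicted distributions
print(categorical_cross_entropy(y_true, y_pred))             # ≈ 0.29
```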
Chain‑of‑Thought Prompting
A prompting strategy for large language models that encourages the model to reason step‑by‑step before answering. By explicitly asking for intermediate reasoning, chain‑of‑thought prompts can improve accuracy on complex questions and reduce hallucination.
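A toy illustration of the idea; the question and the exact trigger phrase are assumptions, not part of any fixed API:

```python
# Plain prompt: the model is asked to answer directly.
direct_prompt = "Q: A train travels 60 km in 40 minutes. What is its speed in km/h?\nA:"

# Chain-of-thought prompt: the model is nudged to show intermediate reasoning
# (convert 40 minutes to 2/3 of an hour, then divide 60 by 2/3) before answering.
cot_prompt = (
    "Q: A train travels 60 km in 40 minutes. What is its speed in km/h?\n"
    "A: Let's think step by step."
)
```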
Chatbot
A computer program designed to simulate human conversation through text or voice. Chatbots rely on natural language processing to parse user inputs and generate responses. They power virtual assistants, customer‑support bots and domain‑specific conversational agents.
Chunking (Text or Data)
The process of dividing long texts or sequences into smaller, manageable segments (chunks) for processing. In retrieval‑augmented systems, documents are split into chunks that can be embedded, indexed and retrieved based on relevance to a query.
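A minimal character-based chunker, assuming fixed-size chunks with a small overlap so context is not lost at boundaries; the sizes and the function name are illustrative:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks that overlap slightly."""
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

chunks = chunk_text("a long document " * 200)
# Each chunk can then be embedded, indexed and retrieved independently.
```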
Class Imbalance
A situation where certain classes appear much less frequently than others. Class imbalance can degrade model performance because many algorithms assume balanced data. Remedies include resampling, weighted loss functions and specialized algorithms.
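A scikit-learn sketch of the weighted-loss remedy, assuming a synthetic dataset where one class makes up only about 5 % of the samples:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Roughly 95% of samples belong to class 0 and 5% to class 1.
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)

# class_weight="balanced" reweights the loss inversely to class frequency,
# so mistakes on the rare class cost more during training.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```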
Classification
The task of assigning input data to one of several discrete categories. Algorithms include logistic regression, decision trees, naïve Bayes, support vector machines and neural networks. Multi‑class and multi‑label classification extend this notion to multiple classes or simultaneous labels.
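A short scikit-learn example on the Iris dataset (a standard three-class benchmark); the choice of a decision tree is illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)                   # three flower species as classes
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
predictions = clf.predict(X_test)                   # discrete class labels
print(accuracy_score(y_test, predictions))
```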
Clustering
An unsupervised learning technique that groups data points based on similarity. Common methods include k‑means, hierarchical clustering and DBSCAN. Clustering is used for customer segmentation, anomaly detection and exploratory data analysis.
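A k-means sketch with scikit-learn, assuming synthetic data drawn from three well-separated blobs:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_              # cluster assignment for each data point
centroids = kmeans.cluster_centers_  # learned cluster centers
```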
CNN (Convolutional Neural Network)
A neural network architecture designed to process grid‑structured data such as images or audio spectrograms. CNNs use convolutional layers, pooling layers and nonlinear activations to learn spatial hierarchies of features. They dominate computer vision tasks like image classification and object detection.
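A minimal PyTorch sketch, assuming 28×28 grayscale inputs (MNIST-sized images); the layer sizes are illustrative:

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Two convolution/pooling stages followed by a linear classifier."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)                    # learn a spatial feature hierarchy
        return self.classifier(x.flatten(1))    # flatten and classify

logits = SmallCNN()(torch.randn(8, 1, 28, 28))  # a batch of 8 grayscale images
```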
Code Interpreter / Function Calling
An interface that allows language models to execute functions or code to augment their capabilities. By calling external tools or functions, models can perform operations like math, data retrieval or API requests, improving accuracy and reducing hallucination.
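A toy sketch of the dispatch idea only; the JSON shape, tool names and helper function are assumptions, and real function-calling APIs define their own schemas:

```python
import json
import math

# Hypothetical registry of tools the model is allowed to call.
TOOLS = {
    "sqrt": lambda args: math.sqrt(args["x"]),
    "add": lambda args: args["a"] + args["b"],
}

def handle_tool_call(message):
    """Parse a structured tool call emitted by the model and execute it."""
    call = json.loads(message)
    result = TOOLS[call["name"]](call["arguments"])
    return result  # the result is then fed back to the model as context

print(handle_tool_call('{"name": "sqrt", "arguments": {"x": 2}}'))  # 1.414...
```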
Cold Start
A challenge in recommendation systems where insufficient data are available about new users or items. Cold‑start methods leverage content features, demographic information or transfer learning to make initial recommendations.
Computational Graph
A directed graph representing the sequence of operations (nodes) and data dependencies (edges) in a computation. Frameworks like TensorFlow and PyTorch build computational graphs to automatically compute gradients via backpropagation.
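A PyTorch illustration: operations on tensors with `requires_grad=True` are recorded as a graph, and `backward()` traverses it in reverse to compute gradients:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

z = x * y + x ** 2   # each operation becomes a node in the graph
z.backward()         # reverse traversal computes gradients via backpropagation

print(x.grad)        # dz/dx = y + 2x = 7
print(y.grad)        # dz/dy = x = 2
```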
Confusion Matrix
A tabular summary of model performance for classification problems. Rows correspond to true classes and columns to predicted classes. From the matrix one can derive metrics such as accuracy, precision, recall, specificity and F1 score.
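A small scikit-learn example with made-up labels, showing how precision and recall follow from the matrix entries:

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))  # rows: true class, columns: predicted class
# [[3 1]
#  [1 3]]
print(precision_score(y_true, y_pred))   # 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))      # 3 / (3 + 1) = 0.75
```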
Context Window
In large language models, the maximum number of tokens the model can consider at once when generating a response. A longer context window allows the model to maintain “memory” across longer conversations or documents.
Contrastive Learning
A self‑supervised learning technique where the model learns to pull related examples closer in representation space while pushing unrelated examples apart. Methods like SimCLR and CLIP use contrastive objectives to learn high‑quality embeddings without labeled data.
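A compact InfoNCE-style loss in PyTorch, assuming two batches of embeddings where row i of each batch comes from the same underlying example:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """Matching rows of z1 and z2 are positives; all other pairings act as negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature       # pairwise cosine similarities
    targets = torch.arange(z1.size(0))       # the positive pair lies on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce_loss(torch.randn(32, 128), torch.randn(32, 128))
```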
Convergence
In optimization, the point at which further training or iterations produce negligible changes in the loss function. Convergence criteria guide stopping conditions and help diagnose issues like vanishing gradients or overfitting.
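A sketch of one possible stopping rule, assuming a hypothetical `step_fn` that performs a single optimization step and returns the current loss:

```python
def train_until_converged(step_fn, tol=1e-4, patience=3, max_steps=10_000):
    """Stop once the loss improves by less than `tol` for `patience` consecutive steps."""
    prev_loss, flat_steps = float("inf"), 0
    for step in range(max_steps):
        loss = step_fn()  # one optimization step (hypothetical), returns current loss
        flat_steps = flat_steps + 1 if prev_loss - loss < tol else 0
        if flat_steps >= patience:
            return step, loss  # converged: further steps change the loss negligibly
        prev_loss = loss
    return max_steps, prev_loss

losses = iter([1.0, 0.5, 0.3, 0.3, 0.3, 0.3])
print(train_until_converged(lambda: next(losses)))  # (5, 0.3)
```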
Cost Function
Another term for loss function: a measure of how well a model’s predictions match the ground truth. Optimization algorithms aim to minimize the cost function by adjusting model parameters.
Cross‑Entropy Loss
A loss function commonly used for classification that measures the difference between the true probability distribution and the predicted distribution. Binary cross‑entropy applies to two classes; categorical cross‑entropy applies to multi‑class problems.
Cross‑Validation
A technique for assessing how a model generalizes to unseen data. The dataset is split into multiple folds; the model is trained on some folds and validated on the remaining fold. Repeating across all folds reduces bias in performance estimates.
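A scikit-learn example of 5-fold cross-validation on the Iris dataset; the choice of estimator is illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# 5 folds: train on four folds, validate on the held-out fold, then rotate.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean(), scores.std())  # average accuracy and its variability across folds
```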
Curriculum Learning
A training strategy where models are presented with easier examples first, gradually introducing harder examples. This mimics human learning and can lead to faster convergence and improved performance, especially in reinforcement learning and language modelling.
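A toy sketch of the ordering idea, using text length as a crude difficulty proxy; the data and the training step are hypothetical:

```python
examples = [
    {"text": "a short easy sentence", "label": 0},
    {"text": "a considerably longer and more syntactically involved sentence with clauses", "label": 1},
    {"text": "a medium length example sentence", "label": 0},
]

# Order examples from easy to hard (here, shorter text = easier).
curriculum = sorted(examples, key=lambda ex: len(ex["text"]))

for example in curriculum:
    pass  # train_step(model, example)  # hypothetical training step, easy examples first
```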