Glossary

Term(s)	Description	Notation
Data object (point, observation, sample, example)	A unit of analysis. Typically, a data object is represented as a vector of features.	Typically denoted by a lowercase letter, often bold, e.g., \({\bf x}\), \({\bf x}_i\), or \({\bf x}^{(i)}\), where the subscript or superscript \(i\) indexes the object within a data set.
Data set	A collection of data objects.	Typically denoted by an uppercase letter, often bold, e.g., \({\bf X}\).
Vector	An ordered list (or array) of real values, each in \((-\infty,\infty)\).	Typically denoted by a bold lowercase letter, e.g., \({\bf x}\). \({\bf x} \in \mathbb{R}^d\) means that the vector represents a point in a \(d\)-dimensional vector space. An individual element of the vector is denoted \(x_i\).
Matrix	A 2-way array of real values, each in \((-\infty,\infty)\).	Typically denoted by a bold uppercase letter, e.g., \({\bf X}\). \({\bf X} \in \mathbb{R}^{m \times n}\) means that the matrix has \(m\) rows and \(n\) columns. A vector in \(\mathbb{R}^m\) is represented as an \(m \times 1\) matrix. An individual element of the matrix is denoted \(X_{ij}\).
Transpose	An operator that flips a matrix over its diagonal, i.e., \(X_{ij} = ({\bf X}^\top)_{ji}\). The transpose of a vector \({\bf x} \in \mathbb{R}^d\) is a \(1 \times d\) matrix.	\({\bf X}^\top\), \({\bf x}^\top\)
Matrix multiplication	Only valid if the number of columns in \({\bf Y}\) is equal to the number of rows in \({\bf Z}\). The \(ij^{th}\) entry of the matrix \({\bf X}\) is the dot product (see below) between the \(i^{th}\) row of \({\bf Y}\) and the \(j^{th}\) column of \({\bf Z}\), i.e., \(X_{ij} = Y_{i}^\top Z_{j} = \sum_{k} Y_{ik} Z_{kj}\), where \(Y_{i}\) denotes the \(i^{th}\) row of \({\bf Y}\) (written as a column vector) and \(Z_{j}\) is the \(j^{th}\) column of \({\bf Z}\).	\({\bf X} = {\bf YZ}\)
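Vector dot (inner) product	The sum of the element-wise products of two vectors of the same dimension. For a vector with itself, \({\bf x} \cdot {\bf x} = \sum_{i=1}^d x_i^2\). In matrix notation, the dot product is expressed as \({\bf x}^\top{\bf y}\).	\({\bf x} \cdot {\bf y} = \sum_{i=1}^d x_i y_i\)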
Data Matrix	If each data object in a data set can be represented as a vector in \(\mathbb{R}^d\), the data set of \(n\) such objects is typically arranged as an \(n \times d\) matrix, where the transpose of each row of the matrix corresponds to a data object.	\({\bf X} \in \mathbb{R}^{n \times d}\), with \({\bf x}_i = X_{i*}^\top\).
Random Variable	A variable whose possible values are outcomes of a random phenomenon (distribution).	Typically denoted by an uppercase letter, \(X\) (bold, \({\bf X}\), if multivariate).
Probability	A measure of how likely an event is to occur.	\(P(A)\) denotes the probability that an event \(A\) occurs. For a categorical random variable \(X\), \(P(X=x)\) denotes the probability that \(X\) takes the value \(x\). For a continuous random variable \(X\), \(p(x)\) denotes the probability density of the distribution at the value \(x\). We sometimes use \(p(x)\) even if the random variable is categorical, in which case \(p(\cdot)\) is the probability mass function and \(p(x) = P(X=x)\). When talking about more than one random variable, we use \(p_X(x)\) and \(p_Y(y)\) to differentiate between the two density (or mass) functions.
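
The linear algebra entries above (transpose, matrix multiplication, dot product, data matrix) can be illustrated with a small sketch. It is only an illustration, written in plain Python with nested lists. The helper names `transpose`, `dot`, and `matmul`, and the example matrix `X_data`, are assumptions made here and do not come from the glossary. Python indexes from 0, whereas the glossary's notation starts at 1.

```python
# Plain-Python sketch of the glossary's linear algebra notation (no libraries).
# All names below are hypothetical and chosen for illustration only.

def transpose(M):
    """Return M^T: entry (j, i) of the result equals entry (i, j) of M."""
    return [list(col) for col in zip(*M)]

def dot(x, y):
    """Dot product of two equal-length vectors: sum_i x_i * y_i."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))

def matmul(Y, Z):
    """Return X = YZ, where X[i][j] is the dot product of
    row i of Y and column j of Z."""
    if len(Y[0]) != len(Z):
        raise ValueError("columns of Y must equal rows of Z")
    Z_cols = transpose(Z)  # the rows of Z^T are the columns of Z
    return [[dot(row, col) for col in Z_cols] for row in Y]

# Hypothetical data matrix with n = 3 data objects and d = 2 features.
X_data = [[1.0, 2.0],
          [3.0, 4.0],
          [5.0, 6.0]]

x_2 = X_data[1]                           # second data object (row index 1)
gram = matmul(transpose(X_data), X_data)  # X^T X, a d x d matrix

print(transpose(X_data))  # [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]]
print(dot(x_2, x_2))      # 25.0
print(gram)               # [[35.0, 44.0], [44.0, 56.0]]
```

In practice an array library would normally replace these helpers. The nested-list version is used here only so that each definition maps line by line onto the notation in the table.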