Kernel class
- class codpy.kernel.Kernel(x=None, y=None, fx=None, max_pool: int = 1000, max_nystrom: int = 1000, reg: float = 1e-09, order: int | None = None, dim: int = 1, set_kernel: callable | None = None, **kwargs: dict)[source]
Bases:
object
- A class to manipulate data for various kernel-based operations, such as interpolation or extrapolation of functions, or mapping between distributions.
- Note:
This class is similar to libraries such as scikit-learn or XGBoost in that it follows a fit / predict pattern, with the following correspondences and differences:
Data are loaded into memory in the constructor __init__(), or via set().
For matching distributions, use map().
Prediction is done directly through __call__().
It implements the following methods:
- For function interpolation / extrapolation
\[f_{k,\theta}(\cdot) = K(\cdot, Y) \theta, \quad \theta = K(X, Y)^{-1} f(X),\]
where \(K(X, Y)\) is the Gram matrix, see Knm(), and \(K(X, Y)^{-1} = (K(Y, X)K(X, Y) + \epsilon R(Y,Y))^{-1}K(Y,X)\) is computed by a least-squares method with an optional regularization term, see get_knm_inv().
- For matching distributions
\[f_{k,\theta}(\cdot) = K(\cdot, Y) K(X, Y)^{-1} f(X\circ \sigma),\]
where \(\sigma\) is a permutation.
Fitting is done just-in-time (at the first prediction) and consists of computing the parameters \(\theta = K(X, Y)^{-1} f(X)\), together with \(\sigma\) for distributions. The function get_theta() performs these computations and corresponds to fit in other frameworks.
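The interpolation formulas above can be sketched in plain NumPy. This is an illustration only, not codpy's implementation: the Gaussian kernel, its bandwidth, the choice \(R(Y,Y) = I\), and the helper names `gaussian_kernel`, `fit_theta` and `predict` are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.3):
    """Gram matrix k(a^i, b^j) for a Gaussian kernel (illustrative choice, not codpy's default)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def fit_theta(x, y, fx, reg=1e-9):
    """theta = (K(Y,X) K(X,Y) + reg * R)^{-1} K(Y,X) f(X), with R = identity (assumption)."""
    k_xy = gaussian_kernel(x, y)                        # shape (N_x, N_y)
    normal_matrix = k_xy.T @ k_xy + reg * np.eye(y.shape[0])
    return np.linalg.solve(normal_matrix, k_xy.T @ fx)

def predict(z, y, theta):
    """f_{k,theta}(z) = K(z, Y) theta."""
    return gaussian_kernel(z, y) @ theta

x = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)           # training inputs X
fx = np.sin(np.pi * x)                                  # training values f(X)
y = np.linspace(-1.0, 1.0, 10).reshape(-1, 1)           # kernel centers Y

theta = fit_theta(x, y, fx)                             # the "fit" step
z = np.linspace(-0.9, 0.9, 20).reshape(-1, 1)
fz = predict(z, y, theta)                               # the "predict" step
max_error = float(np.max(np.abs(fz - np.sin(np.pi * z))))
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse explicitly; only its action on \(K(Y,X) f(X)\) is needed to obtain \(\theta\).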
- __init__(x=None, y=None, fx=None, max_pool: int = 1000, max_nystrom: int = 1000, reg: float = 1e-09, order: int | None = None, dim: int = 1, set_kernel: callable | None = None, **kwargs: dict) None[source]
Initializes the Kernel class with default or user-defined parameters.
- Parameters:
x – A two-dimensional numpy array.
fx – A two-dimensional numpy array. If x or fx is not None, set() is called.
max_pool (int, optional) – Maximum pool size for the kernel operations. Defaults to 1000.
max_nystrom (int, optional) – Maximum number of Nystrom samples. Defaults to 1000.
reg (float, optional) – Regularization parameter for kernel operations. Defaults to 1e-9.
order (int, optional) – Polynomial order for polynomial kernel functions. Defaults to None.
dim (int, optional) – Dimensionality of the input data. Defaults to 1.
set_kernel (callable, optional) – A custom kernel function initializer. If not provided, a default kernel is used.
kwargs (dict) – Additional keyword arguments for further customization.
- default_kernel_functor() callable[source]
Initialize and return a default kernel function.
This method provides a default kernel initialization. The default is a simple but robust kernel functor:
>>> core.kernel_setter("maternnorm", "standardmean", 0, 1e-9)
This defines the maternnorm kernel with the standardmean map, a polynomial order of 0, and a regularization value of 1e-9.
- Returns:
The initialized default kernel function using core.kernel_setter().
- Return type:
callable
Example
>>> default_kernel = kernel.default_kernel_functor()
- set_custom_kernel(kernel_name: str, map_name: str, poly_order: int = 0, reg: float | None = None, bandwidth: float = 1.0, **kwargs) None[source]
Provide access to the internal codpy kernels with flexible parameters.
- Parameters:
kernel_name (str) – Name of the kernel function to use (e.g., 'gaussian').
map_name (str) – Name of the mapping function (e.g., 'standardmin').
poly_order (int, optional) – The polynomial order if using a polynomial kernel. Defaults to 0.
reg (float, optional) – Regularization parameter. If not provided, uses the instance’s reg value.
bandwidth (float, optional) – Bandwidth for kernel functions that require it. Defaults to 1.0.
- Returns:
None
- get_order(**kwargs) int[source]
Retrieve the polynomial order for the kernel.
- Returns:
The polynomial order if available, otherwise None.
- Return type:
int or None
Example
>>> order = kernel.get_order()
- get_polynomial_values(**kwargs) ndarray[source]
Retrieve the predicted polynomial values based on the current input data.
This method returns the values obtained from the polynomial regression model. If the polynomial values are not yet computed, it calls _set_polynomial_regressor() to set up the polynomial regressor using the current input data x and function values fx.
- Parameters:
kwargs – Additional keyword arguments for flexibility (not used directly).
- Returns:
The predicted polynomial values, or None if the polynomial order is not set.
- Return type:
numpy.ndarray or None
Example
>>> poly_values = kernel.get_polynomial_values()
- get_polynomial_regressor(z: ndarray, x: ndarray | None = None, fx: ndarray | None = None, **kwargs) ndarray[source]
Set up the polynomial regressor based on the input data and the polynomial order.
- Parameters:
z (numpy.ndarray) – New input data points for the regressor.
x (numpy.ndarray, optional) – Input data points.
fx (numpy.ndarray, optional) – Function values corresponding to the input data.
- Returns:
The predicted polynomial values or None if unavailable.
- Return type:
numpy.ndarray or None
Example
>>> z_data = np.random.rand(100, 10)
>>> pred = kernel.get_polynomial_regressor(z_data)
- Knm(x: ndarray, y: ndarray, fy: ndarray = [], **kwargs) ndarray[source]
Compute the kernel matrix \(K(X, Y)=\big(k(x^i, y^j)\big)_{i,j}\), where the kernel function \(k\) is defined at class initialization, see self.set_kernel.
- Parameters:
x (numpy.ndarray) – Input data points of shape \((N, D)\), where \(N\) is the number of points and \(D\) is the dimensionality.
y (numpy.ndarray) – Secondary data points of shape \((M, D)\), where \(M\) is the number of points and \(D\) is the dimensionality.
fy (numpy.ndarray, optional) – Optional matrix of values, for optimization purposes. If provided, the product \(K(X, Y)f_y\) is computed and returned.
- Returns:
The computed kernel matrix \(K\) of size \((N, M)\).
- Return type:
numpy.ndarray
Example
>>> x_data = np.array([...])
>>> y_data = np.array([...])
>>> kernel_matrix = Kernel(x=x_data, y=y_data).Knm(x_data, y_data)
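For readers who want to see the shapes involved, the following NumPy sketch builds a Gram matrix \(K(X, Y)\) of size \((N, M)\) and the product \(K(X, Y) f_y\). The Gaussian kernel and the helper name `gram_matrix` are assumptions chosen for illustration; codpy's own kernels and maps are not reproduced here.

```python
import numpy as np

def gram_matrix(x, y, bandwidth=1.0):
    """K(X, Y) = (k(x^i, y^j))_{i,j} for a Gaussian kernel (illustrative choice)."""
    diff = x[:, None, :] - y[None, :, :]                # shape (N, M, D)
    return np.exp(-np.sum(diff**2, axis=-1) / (2.0 * bandwidth**2))

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 3))                             # N = 6 points in dimension D = 3
y = rng.normal(size=(4, 3))                             # M = 4 points in dimension D = 3
fy = rng.normal(size=(4, 2))                            # values attached to the points y

k_xy = gram_matrix(x, y)                                # shape (N, M) = (6, 4)
k_xy_fy = k_xy @ fy                                     # shape (N, 2), the product K(X, Y) f_y
```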
- get_knm_inv(epsilon: float | None = None, epsilon_delta: ndarray | None = None, **kwargs) ndarray[source]
Retrieve the inverse of the kernel matrix \(K(x, y)^{-1}\) using least-squares computations.
- Parameters:
epsilon (float, optional) – Regularization parameter for the inverse computation. Defaults to None.
epsilon_delta (numpy.ndarray, optional) – Delta values for adjusting regularization. Defaults to None.
- Returns:
The inverse kernel matrix or the product with function values if provided.
- Return type:
numpy.ndarray
Note
- If the regularization parameter (reg) is empty:
If fx is empty: returns a NumPy array of size \((N, M)\), representing the least-squares inverse of \(K(x, y)\).
If fx is provided: returns the product of \(K(x, y)^{-1}\) and \(f(x)\). This allows performance and memory optimizations.
- If the regularization parameter (reg) is provided:
If fx is empty: returns a NumPy array of size \((N, M)\), computed as \((K(y, x) K(x, y) + \epsilon)^{-1} K(y, x)\).
If fx is provided: returns the product of \(K(x, y)^{-1}\) and \(f(x)\).
Example
>>> x_data = np.random.rand(100, 10)
>>> y_data = np.random.rand(80, 10)
>>> fx_data = np.random.rand(100, 5)  # one row of function values per row of x_data
>>> inv_kernel = kernel.get_knm_inv(x=x_data, y=y_data, fx=fx_data)
- get_knm(**kwargs) ndarray[source]
Retrieve or compute the Gram matrix \(K(x, y)\) for the kernel.
- Returns:
The Gram matrix \(K(x,y)\).
- Return type:
numpy.ndarray
- get_x(**kwargs) ndarray[source]
Retrieve the input data x.
- Returns:
The input data, or None if not set.
- Return type:
numpy.ndarray or None
- set_x(x: ndarray, set_polynomial_regressor: bool = True, **kwargs) None[source]
Set the input data x for the kernel and update related internal states.
This method sets the input data and optionally recalculates the polynomial regressor and kernel matrices.
- Parameters:
x (numpy.ndarray) – Input data points to be set.
set_polynomial_regressor (bool, optional) – Whether to recalculate the polynomial regressor after setting the data. Defaults to True.
- set_y(y: ndarray | None = None, **kwargs) None[source]
Set the target data y for the kernel. If no target data is provided, y is set equal to x.
If interpolation/extrapolation is used, the following formula is applied:
\[ f_{k,\theta}(.) = K(., Y)\theta, \quad \theta = K(X, Y)^{-1} f(X). \]
- Parameters:
y (numpy.ndarray, optional) – Target data points. If None, y is set equal to x.
- get_y(**kwargs) ndarray[source]
Retrieve the target data y.
- Returns:
The target data, or the input data x if y is not set.
- Return type:
numpy.ndarray
- get_fx(**kwargs) ndarray[source]
Retrieve the function values fx for the input data.
- Returns:
The function values, or None if not set.
- Return type:
numpy.ndarray or None
- set_fx(fx: ndarray, set_polynomial_regressor: bool = True, **kwargs) None[source]
Set the function values fx for the input data.
- Parameters:
fx (numpy.ndarray) – Function values corresponding to the input data.
set_polynomial_regressor (bool, optional) – Whether to recalculate the polynomial regressor after setting the function values. Defaults to True.
- set_theta(theta: ndarray, **kwargs) None[source]
Set the coefficient theta for the kernel regression.
The coefficient is computed by the formula:
\[\theta = K(X, Y)^{-1} f(X)\]
- Parameters:
theta (numpy.ndarray) – Coefficients for kernel regression.
- get_theta(**kwargs) ndarray[source]
Retrieve the coefficient theta for kernel regression.
If fx is not defined, the polynomial regressor is used to adjust the function values.
- Returns:
The regression coefficient theta.
- Return type:
numpy.ndarray
- get_Delta() ndarray[source]
Compute and retrieve the discrete Laplace-Beltrami operator Delta.
- Returns:
The Laplace-Beltrami operator.
- Return type:
numpy.ndarray
- greedy_select(N, x=None, fx=None, all=False, norm_='frobenius', **kwargs)[source]
Select a subset of points using a greedy Nystrom approximation technique:
\[Y^{n+1} = Y^{n} \cup \arg \sup_{x \in X} d(Y^n,x),\]
to quickly approximate the clustering problem \(Y = \arg \inf_{Y \subset X} d(Y,X)\), where we assume the structure \(d(Y,X) = \sum_i d(Y,x^i)\).
The selection is typically based on norms such as the discrepancy error for distributions, or Frobenius or classifier-type distances.
- Parameters:
x (numpy.ndarray) – Input data points.
N (int) – The number of points to select.
fx (numpy.ndarray, optional) – Function values corresponding to x.
- If fx is None,
\[d(Y,X) = \frac{1}{N_X} \sum_{n=1}^{N_X} k(x^n,\cdot) - \frac{2}{N_Y} \sum_{m=1}^{N_Y} k(\cdot,y^m)\]
This choice corresponds to minimizing the discrepancy error, see core.op.discrepancy_error().
- If fx is not None, \(d(X,Y) = \|f(X)-f_{k,\theta}(X)\|\). This case is of interest for adaptive mesh or control variate techniques.
all (bool, optional) – If True, all points are selected. Defaults to False.
norm_ (str, optional) – A string identifying the norm used for selection. Can be “frobenius” or “classifier”.
If “frobenius”, \(d(X,Y) = \|f(X)-f_{k,\theta}(X)\|_{\ell_2}^2\).
If “classifier”, \(d(X,Y) = \|\text{softmax}(f(X))-\text{softmax}(f_{k,\theta}(X))\|_{\ell_2}^2\), to account for the probability representation.
User-defined functions are coming soon.
start_indices (list, optional) – A list of indices defining \(Y^0\); otherwise the first point is chosen randomly.
- Returns:
Indices of the selected points.
- Return type:
numpy.ndarray
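The greedy rule \(Y^{n+1} = Y^{n} \cup \arg \sup_{x \in X} d(Y^n,x)\) can be sketched as follows for the “frobenius” case with function values. This is a simplified illustration under stated assumptions, not codpy's algorithm: the Gaussian kernel, the refit on the current subset at every step, and the names `gaussian_kernel`, `greedy_select_sketch` and `start_index` are all hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.5):
    """Gaussian Gram matrix (illustrative kernel choice)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def greedy_select_sketch(x, fx, n_select, start_index=0, reg=1e-9):
    """Grow Y by repeatedly adding the point of X with the largest squared residual."""
    selected = [start_index]
    while len(selected) < n_select:
        y, fy = x[selected], fx[selected]
        gram = gaussian_kernel(y, y) + reg * np.eye(len(selected))
        theta = np.linalg.solve(gram, fy)               # fit on the current subset Y^n
        residual = fx - gaussian_kernel(x, y) @ theta   # f(X) - f_{k,theta}(X)
        errors = np.sum(residual**2, axis=1)            # d(Y^n, x^i) for each point x^i
        errors[selected] = -np.inf                      # never pick the same point twice
        selected.append(int(np.argmax(errors)))
    return np.array(selected)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(40, 2))
fx = np.sin(np.pi * x[:, :1]) * np.cos(np.pi * x[:, 1:])
indices = greedy_select_sketch(x, fx, n_select=8)
```

Each step costs one small linear solve on the selected subset, which is what makes the greedy scheme cheaper than searching over all subsets \(Y \subset X\).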
- set(x: ndarray | None = None, fx: ndarray | None = None, y: ndarray | None = None, **kwargs) None[source]
Set the input data x, function values fx, and target data y for the kernel.
- Parameters:
x (numpy.ndarray) – Input data points.
fx (numpy.ndarray, optional) – Function values corresponding to the input data x.
y (numpy.ndarray, optional) – Target data points. If None, y is set equal to x.
- map(x: ndarray, y: ndarray, distance: str | None = None, sub: bool = False) None[source]
Map the input data points x to the target data points y using the kernel and optimal transport techniques.
- Parameters:
x (numpy.ndarray) – Input data points of shape (\(N\), \(D_{source}\)).
y (numpy.ndarray) – Target data points of shape (\(M\), \(D_{target}\)).
distance (str, optional) – Distance metric to use in mapping. Defaults to None.
sub (bool, optional) – Whether to apply a sub-permutation. Defaults to False.
- Returns:
None
Example
>>> x_data = np.array([...])  # Input data with shape (N, D_source)
>>> y_data = np.array([...])  # Target data with shape (M, D_target)
>>> kernel.map(x_data, y_data)
Note
This method computes a permutation that maps \(x\) to \(y\) using the Linear Sum Assignment Problem (LSAP) or a descent method.
If the dimensionalities of \(x\) and \(y\) are the same (\(D_{source} = D_{target}\)), the classical LSAP algorithm is used.
If the dimensionalities differ (\(D_{source} \neq D_{target}\)), a descent-based method is used to encode the data into a lower-dimensional latent space before finding the optimal permutation, following principles of discrete optimal transport.
This permutation can be used to transform the input data \(x\) to approximate the target data \(y\).
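To make the same-dimension case concrete, the sketch below solves a tiny Linear Sum Assignment Problem by exhaustive search over permutations. It only illustrates what the permutation means: the squared Euclidean cost and the helper `lsap_brute_force` are assumptions, and exhaustive search is a stand-in that is only viable for a handful of points (a dedicated solver such as SciPy's `linear_sum_assignment` would be the practical choice).

```python
import itertools
import numpy as np

def lsap_brute_force(cost):
    """Return the permutation sigma minimizing sum_i cost[i, sigma[i]] (tiny inputs only)."""
    n = cost.shape[0]
    best_perm, best_cost = None, np.inf
    for perm in itertools.permutations(range(n)):
        total = cost[np.arange(n), list(perm)].sum()
        if total < best_cost:
            best_perm, best_cost = np.array(perm), total
    return best_perm, best_cost

x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]])
shuffle = np.array([2, 0, 4, 1, 3])
y = x[shuffle] + 0.05                                   # shuffled, slightly shifted copy of x

# cost[i, j] = squared Euclidean distance between x[i] and y[j] (assumed cost)
cost = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
sigma, total_cost = lsap_brute_force(cost)              # y[sigma[i]] is matched to x[i]
```

Here `y[sigma]` reorders the target points so that row `i` is the one matched to `x[i]`.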
- update_set(z: ndarray, fz: ndarray) Tuple[ndarray, ndarray][source]
Update the training set by limiting the data to a maximum pool size.
This method trims the input data z and corresponding function values fz to the size defined by the max_pool parameter.
- Parameters:
z (numpy.ndarray) – Input data points to update.
fz (numpy.ndarray) – Function values corresponding to the input data z.
- Returns:
The trimmed input data points and corresponding function values, limited by max_pool.
- Return type:
Tuple[numpy.ndarray, numpy.ndarray]
- update(z: ndarray, fz: ndarray, eps: float | None = None, **kwargs) None[source]
Fit the regressor to new data points (z, fz) while maintaining the existing kernel structure.
This method fits a kernel-based regressor that is originally defined on the set x but is updated to match new input values z and their corresponding function values fz.
The regression is defined by the formula:
\[ f_{k, \theta}(z) \approx K(z, X)\theta = f(z) \]
where the coefficient \(\theta\) is computed as:
\[ \theta = K(z, X)^{-1}f(z) \]
- Parameters:
z (numpy.ndarray) – New input data points to update the regressor.
fz (numpy.ndarray) – Function values corresponding to the new data points z.
eps (float, optional) – Regularization parameter used in the least-squares solution. Defaults to self.reg if not provided.
- Returns:
Updates the internal state of the regressor with new z and fz values.
- Return type:
None
- add(y: ndarray | None = None, fy: ndarray | None = None) None[source]
Augments the training set by adding new data points and their corresponding function values.
This method optimizes the computation for training set augmentation by efficiently updating the kernel matrix and applying a block-inversion algorithm, which reduces the overall complexity compared to recalculating the full kernel matrix.
- Parameters:
y (numpy.ndarray) – New data points to be added to the training set.
fy (numpy.ndarray) – Function values corresponding to the new data points y.
- Returns:
This method updates the internal state of the class, modifying the training set with the new data points and their function values.
- Return type:
None
Note
The kernel matrix \(K([X,Y], [X,Y])\) belongs to \(\mathbb{R}^{(N_X+N_Y) \times (N_X+N_Y)}\), and directly computing its inverse has a complexity of order \((N_X + N_Y)^3\).
By using the block-inversion method, the complexity can be reduced to order \(N_X^3 + N_Y^3\), significantly improving performance.
The function \(f_{k,\theta}(.)\) is then computed as:
\[ f_{k,\theta}(.) = K(., [X,Y])\theta, \quad \theta = K([X,Y], [X,Y])^{-1} \begin{bmatrix} f(X) \\ f(Y) \end{bmatrix} \]
Here, \([.]\) denotes standard matrix concatenation, where \(f(X)\) and \(f(Y)\) are the function values for the original and new data points, respectively.
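The block-inversion idea can be illustrated with the standard Schur-complement identity. The sketch below is not codpy's code: the Gaussian kernel, the diagonal regularization `reg`, and the helper `block_inverse` are assumptions. It reuses a known inverse of the \(X\) block and checks the result against a direct inversion of the full matrix.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian Gram matrix (illustrative kernel choice)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def block_inverse(a_inv, b, d):
    """Inverse of [[A, B], [B^T, D]] given A^{-1}, using S = D - B^T A^{-1} B."""
    a_inv_b = a_inv @ b
    schur_inv = np.linalg.inv(d - b.T @ a_inv_b)        # only an N_Y x N_Y inversion
    top_left = a_inv + a_inv_b @ schur_inv @ a_inv_b.T
    top_right = -a_inv_b @ schur_inv
    return np.block([[top_left, top_right], [top_right.T, schur_inv]])

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 3))                             # existing training points X
y = rng.normal(size=(3, 3))                             # new points Y
reg = 1e-2                                              # assumed diagonal regularization

a_inv = np.linalg.inv(gaussian_kernel(x, x) + reg * np.eye(6))   # already available
b = gaussian_kernel(x, y)
d = gaussian_kernel(y, y) + reg * np.eye(3)
augmented_inv = block_inverse(a_inv, b, d)

xy = np.vstack([x, y])
direct_inv = np.linalg.inv(gaussian_kernel(xy, xy) + reg * np.eye(9))
max_gap = float(np.max(np.abs(augmented_inv - direct_inv)))
```

The identity relies on \(A\) and the Schur complement being symmetric, which holds for a symmetric kernel evaluated on the same point set.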
- kernel_distance(z: ndarray) ndarray[source]
Compute an MMD-like (Maximum Mean Discrepancy) distance matrix between the input data x and the new data z.
The distance is computed as:
\[ D(X,Z) = \Big(d_k(x^i,z^j) \Big)_{i,j},\quad d_k(x,z)= k(x,x) + k(z,z)-2k(x,z) \]
- Parameters:
z (numpy.ndarray) – New input data points.
- Returns:
The computed MMD-based distance matrix.
- Return type:
numpy.ndarray
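A direct NumPy transcription of \(d_k(x,z)= k(x,x) + k(z,z)-2k(x,z)\) might look as follows. The Gaussian kernel and the function name `kernel_distance_sketch` are assumptions for the example; for clarity the diagonal terms are read off full Gram matrices, which is wasteful for large inputs.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian Gram matrix (illustrative kernel choice)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def kernel_distance_sketch(x, z, kernel=gaussian_kernel):
    """D[i, j] = k(x^i, x^i) + k(z^j, z^j) - 2 k(x^i, z^j)."""
    k_xx = np.diag(kernel(x, x))[:, None]               # column of k(x^i, x^i)
    k_zz = np.diag(kernel(z, z))[None, :]               # row of k(z^j, z^j)
    return k_xx + k_zz - 2.0 * kernel(x, z)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))
z = rng.normal(size=(4, 2))
dist = kernel_distance_sketch(x, z)                     # shape (5, 4)
self_dist = kernel_distance_sketch(x, x)                # zero on the diagonal
```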
- discrepancy(z: ndarray) float[source]
Compute the MMD (Maximum Mean Discrepancy) between the input data \(x\) and the new data \(z\).
- Parameters:
z (numpy.ndarray) – New input data points.
- Returns:
The computed MMD value.
- Return type:
float
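As a point of reference, the usual biased estimator of the squared MMD between two samples can be written in a few lines. Whether codpy returns the squared quantity, and with which kernel and normalization, is not stated here, so the Gaussian kernel and the helper `mmd_squared` below are assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian Gram matrix (illustrative kernel choice)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def mmd_squared(x, z, kernel=gaussian_kernel):
    """Biased estimator: mean K(x, x) + mean K(z, z) - 2 mean K(x, z)."""
    return float(kernel(x, x).mean() + kernel(z, z).mean() - 2.0 * kernel(x, z).mean())

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))
same = mmd_squared(x, x)                                # identical samples
shifted = mmd_squared(x, x + 3.0)                       # clearly different samples
```

Identical samples give a value of zero up to rounding, while a shifted copy gives a strictly positive value.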
- get_kernel() callable[source]
Retrieve the current kernel function for the input data.
This method retrieves a positive semi-definite (PSD) kernel function, represented as: \(k(S(x), S(y))\), where \(S\) is a predefined mapping.
- Returns:
The kernel function used by the current model.
- Return type:
callable
- set_kernel_ptr() None[source]
Set the Codpy interface to use the current kernel function.
This method updates the Codpy kernel interface with the current kernel function, sets the polynomial order to zero, and applies the regularization parameter defined in the object.
- rescale() None[source]
Rescale the input data using the current mapping.
This method rescales the input data by applying the map function associated with the current kernel. It also retrieves and updates the internal kernel function based on the rescaled data.
If x is set, the rescaling is applied to x with a maximum number of points defined by max_nystrom.
- class codpy.kernel.KernelClassifier(x=None, y=None, fx=None, max_pool: int = 1000, max_nystrom: int = 1000, reg: float = 1e-09, order: int | None = None, dim: int = 1, set_kernel: callable | None = None, **kwargs: dict)[source]
Bases:
Kernel
- A simple overload of the Kernel class for probability handling.
- Note:
It overloads the prediction method as follows:
\[\text{softmax}(\log(f)_{k,\theta})(\cdot)\]
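The prediction rule above, regressing on log-probabilities and mapping the result back through a softmax, can be sketched as follows. This is an illustration under assumptions rather than the class's implementation: the Gaussian kernel, the smoothed one-hot probabilities (0.99 / 0.01), and the names `softmax` and `predict_proba` are all hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.15):
    """Gaussian Gram matrix (illustrative kernel choice)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

x = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
labels = (x[:, 0] > 0).astype(int)                      # two classes split at zero
proba = np.full((20, 2), 0.01)
proba[np.arange(20), labels] = 0.99                     # smoothed one-hot probabilities

# Kernel regression on log-probabilities: theta = (K(X, X) + reg I)^{-1} log f(X)
theta = np.linalg.solve(gaussian_kernel(x, x) + 1e-8 * np.eye(20), np.log(proba))

def predict_proba(z):
    """softmax of the kernel regression of log f, evaluated at z."""
    return softmax(gaussian_kernel(z, x) @ theta)

pz = predict_proba(np.array([[-0.5], [0.05], [0.7]]))   # each row sums to one
```

Working in log space keeps the regression unconstrained, and the softmax guarantees that the output rows are valid probability vectors.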
- set_fx(fx: ndarray, set_polynomial_regressor: bool = True, **kwargs) None[source]
Set the function values fx for the input data.
- Parameters:
fx (numpy.ndarray) – Function values corresponding to the input data.
set_polynomial_regressor (bool, optional) – Whether to recalculate the polynomial regressor after setting the function values. Defaults to True.
- greedy_select(N, x=None, fx=None, all=False, norm_='classifier', **kwargs)[source]
Select a subset of points using a greedy Nystrom approximation technique:
\[Y^{n+1} = Y^{n} \cup \arg \sup_{x \in X} d(Y^n,x),\]
to quickly approximate the clustering problem \(Y = \arg \inf_{Y \subset X} d(Y,X)\), where we assume the structure \(d(Y,X) = \sum_i d(Y,x^i)\).
The selection is typically based on norms such as the discrepancy error for distributions, or Frobenius or classifier-type distances.
- Parameters:
x (numpy.ndarray) – Input data points.
N (int) – The number of points to select.
fx (numpy.ndarray, optional) – Function values corresponding to x.
- If fx is None,
\[d(Y,X) = \frac{1}{N_X} \sum_{n=1}^{N_X} k(x^n,\cdot) - \frac{2}{N_Y} \sum_{m=1}^{N_Y} k(\cdot,y^m)\]
This choice corresponds to minimizing the discrepancy error, see core.op.discrepancy_error().
- If fx is not None, \(d(X,Y) = \|f(X)-f_{k,\theta}(X)\|\). This case is of interest for adaptive mesh or control variate techniques.
all (bool, optional) – If True, all points are selected. Defaults to False.
norm_ (str, optional) – A string identifying the norm used for selection. Can be “frobenius” or “classifier”.
If “frobenius”, \(d(X,Y) = \|f(X)-f_{k,\theta}(X)\|_{\ell_2}^2\).
If “classifier”, \(d(X,Y) = \|\text{softmax}(f(X))-\text{softmax}(f_{k,\theta}(X))\|_{\ell_2}^2\), to account for the probability representation.
User-defined functions are coming soon.
start_indices (list, optional) – A list of indices defining \(Y^0\); otherwise the first point is chosen randomly.
- Returns:
Indices of the selected points.
- Return type:
numpy.ndarray
- Kernel.__call__(z: ndarray) ndarray[source]
Predict the output using the kernel for input data z.
- Parameters:
z (numpy.ndarray) – Input data points for prediction.
- Returns:
The predicted values based on the kernel and function values.
- Return type:
numpy.ndarray
Example
>>> z_data = np.array([...])
>>> prediction = kernel(z_data)
Note
This function is similar to predict in libraries like scikit-learn or XGBoost.
If fx is defined, the prediction is given by the formula \(f_{k, \theta}(z)\).
If fx is not defined, the function returns the projection operator:
\[P_{k,\theta}(z) = K(Z, X) K(X, X)^{-1}\]
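The projection operator can be read as the matrix that turns any set of training values \(f(X)\) into predictions at \(Z\). The sketch below assumes a Gaussian kernel, a small diagonal regularization, and \(Y = X\); the names `projection_operator` and `reg` are hypothetical and not part of codpy.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.3):
    """Gaussian Gram matrix (illustrative kernel choice)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def projection_operator(z, x, reg=1e-9):
    """P = K(Z, X) K(X, X)^{-1}, computed with a linear solve instead of an explicit inverse."""
    k_xx = gaussian_kernel(x, x) + reg * np.eye(x.shape[0])
    # K(X, X) is symmetric, so (K(X, X)^{-1} K(X, Z))^T equals K(Z, X) K(X, X)^{-1}.
    return np.linalg.solve(k_xx, gaussian_kernel(x, z)).T

x = np.linspace(-1.0, 1.0, 8).reshape(-1, 1)
z = np.linspace(-0.8, 0.8, 5).reshape(-1, 1)
fx = np.cos(np.pi * x)

p_zx = projection_operator(z, x)                        # shape (5, 8)
fz = p_zx @ fx                                          # predictions once f(X) is known
p_xx = projection_operator(x, x)                        # close to the identity matrix
```

Evaluated at the training points themselves, the operator reduces to (nearly) the identity, which is the interpolation property of the kernel regressor.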