FunGP

Function comparison using Gaussian Process and Hypothesis testing.

from dswe import FunGP
model = FunGP(Xlist, ylist, testset)
mu1, mu2 = model.mu1, model.mu2
class dswe.funGP.FunGP(Xlist, ylist, testset, conf_level=0.95, limit_memory=True, opt_method='L-BFGS-B', sample_size={'band_size': 5000, 'optim_size': 500}, rng_seed=1)
Parameters
  • Xlist (list) – A list of the input data sets to compare. Each data set can be a matrix in which each column corresponds to one input variable.

  • ylist (list) – A list of the target values for the data sets to compare. Each element is an array holding the target values of the corresponding data set in Xlist.

  • testset (np.array) – Test points at which the functions will be compared.

  • conf_level (float) – A single value giving the confidence level used to construct the band. Default value is 0.95.

  • limit_memory (bool) – A boolean (True/False) indicating whether to limit memory use. Default is True. If set to True, data points are randomly sampled from each data set under comparison (5000 for inference by default; see sample_size).

  • opt_method (string) – A string specifying the optimization method used for hyperparameter estimation. The solvers that work best are ‘L-BFGS-B’ and ‘BFGS’. Default value is ‘L-BFGS-B’.

  • sample_size (dict) – A dictionary with two keys, optim_size and band_size, giving the per-data-set sample size used for hyperparameter optimization and for confidence band computation, respectively, when limit_memory = True. Default value is {'optim_size': 500, 'band_size': 5000}.

  • rng_seed (int) – Random number generator (rng) seed for sampling data when limit_memory = True. Default value is 1.
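
The input layout described above can be sketched as follows. This is a minimal illustration on synthetic data, not an example from the library: the helper make_dataset, the sample sizes, and the two-column layout are hypothetical, and it assumes that testset uses the same column order as the matrices in Xlist.

```python
import numpy as np

rng = np.random.default_rng(1)


def make_dataset(n, shift):
    """Build one synthetic data set (hypothetical helper, for illustration only)."""
    # One row per observation, one column per input variable.
    X = rng.uniform(0.0, 1.0, size=(n, 2))
    # Noisy target values, one per row of X.
    y = np.sin(2 * np.pi * X[:, 0]) + 0.5 * X[:, 1] + shift
    y = y + rng.normal(0.0, 0.1, size=n)
    return X, y


X1, y1 = make_dataset(200, shift=0.0)
X2, y2 = make_dataset(300, shift=0.2)

# Xlist and ylist pair up element by element.
Xlist = [X1, X2]
ylist = [y1, y2]

# Test points at which the two functions would be compared
# (assumption: same columns, in the same order, as the input matrices).
grid = np.linspace(0.0, 1.0, 10)
testset = np.column_stack([grid, np.full_like(grid, 0.5)])

# With dswe installed, these would be passed as FunGP(Xlist, ylist, testset).
```

The two data sets need not have the same number of rows; only the number of input columns has to agree.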

Returns

self with trained parameters.

  • mu1: An array of test predictions for the first data set.

  • mu2: An array of test predictions for the second data set.

  • mu_diff: An array of pointwise differences between the predictions from the two data sets (mu2-mu1).

  • band: An array of the allowed statistical difference between the functions at the test points in testset.

  • conf_level: A numeric value giving the confidence level used to construct the band.

  • estimated_params: A list of estimated hyperparameters for GP.
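
One plausible way to read mu_diff and band together is sketched below. It assumes that band is a nonnegative half-width and that the two functions are judged different at a test point when the absolute difference exceeds it. That reading is an assumption, not something stated in this documentation. The helper significant_points is hypothetical, and the arrays are made-up stand-ins for model.mu_diff and model.band.

```python
import numpy as np


def significant_points(mu_diff, band):
    """Flag test points where |mu_diff| exceeds band (hypothetical helper)."""
    mu_diff = np.asarray(mu_diff, dtype=float)
    band = np.asarray(band, dtype=float)
    # Assumption: band is a nonnegative half-width around zero difference.
    return np.abs(mu_diff) > band


# Made-up values standing in for model.mu_diff and model.band.
mu_diff = np.array([0.02, -0.30, 0.15, 0.00])
band = np.array([0.10, 0.10, 0.20, 0.05])

flags = significant_points(mu_diff, band)
print(flags)  # only the second test point is flagged
```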

Return type

FunGP

Reference

Prakash, Tuo, and Ding, 2022, “Gaussian process aided function comparison using noisy scattered data,” Technometrics, Vol. 64, pp. 92-102.