evalml.preprocessing.KMeansSMOTETVSplit.__init__

KMeansSMOTETVSplit.__init__(sampling_strategy='auto', k_neighbors=2, test_size=None, random_seed=0, **kwargs)

Create a training-validation (TV) or cross-validation (CV) data splitter instance.

Parameters
  • sampler (sampler instance) – The sampler instance to use for resampling the training data. Must have a fit_resample method. Defaults to None, which is equivalent to regular TV split.

  • test_size (float) – The fraction of data points to include in the validation set. Defaults to the complement of train_size if train_size is set, and 0.25 otherwise.

  • n_splits (int) – How many CV folds to use. Defaults to 3.

  • shuffle (bool) – Whether or not to shuffle the data. Defaults to True.

  • split_type (str) – Whether to use TV or CV split. Defaults to TV.

  • random_seed (int) – Random seed for the splitter. Defaults to 0.
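
The interplay of test_size, shuffle, random_seed, and a sampler with a fit_resample method can be illustrated with a minimal, library-free sketch. This is not evalml's implementation: the helper tv_split_indices and the class DuplicateMinoritySampler are hypothetical names introduced only for illustration, and the toy sampler simply duplicates minority rows rather than performing K-Means SMOTE. The point it shows is that resampling is applied to the training portion only, while the validation portion is left untouched.

```python
import random
from collections import Counter


def tv_split_indices(n_samples, test_size=0.25, shuffle=True, random_seed=0):
    """Hypothetical helper: return (train_idx, valid_idx) for one TV split."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be a fraction strictly between 0 and 1")
    indices = list(range(n_samples))
    if shuffle:
        # A seeded generator makes the split reproducible.
        random.Random(random_seed).shuffle(indices)
    n_valid = max(1, round(n_samples * test_size))
    return indices[n_valid:], indices[:n_valid]


class DuplicateMinoritySampler:
    """Toy stand-in for a sampler (assumption): only the fit_resample
    interface matches what the parameter description requires."""

    def fit_resample(self, X, y):
        counts = Counter(y)
        majority = max(counts.values())
        X_out, y_out = list(X), list(y)
        for label, count in counts.items():
            rows = [x for x, lab in zip(X, y) if lab == label]
            # Repeat existing rows until this class matches the majority size.
            for i in range(majority - count):
                X_out.append(rows[i % len(rows)])
                y_out.append(label)
        return X_out, y_out


# Imbalanced toy data: 4 positives, 16 negatives.
X = [[i] for i in range(20)]
y = [1 if i % 5 == 0 else 0 for i in range(20)]

train_idx, valid_idx = tv_split_indices(len(X), test_size=0.25, random_seed=0)
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
X_valid = [X[i] for i in valid_idx]
y_valid = [y[i] for i in valid_idx]

# Resample the training data only; validation data keeps its original balance.
X_res, y_res = DuplicateMinoritySampler().fit_resample(X_train, y_train)
```

Reusing the same random_seed reproduces the same split, which is why the seed is exposed as a constructor argument.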