evalml.preprocessing.SMOTETomekTVSplit.__init__

SMOTETomekTVSplit.__init__(sampling_strategy='auto', test_size=None, n_jobs=-1, random_seed=0)

Create a training-validation (TV) or cross-validation (CV) data splitter instance.

Parameters
  • sampler (sampler instance) – The sampler instance to use for resampling the training data. Must have a fit_resample method. Defaults to None, which is equivalent to a regular TV split.

  • test_size (float) – The fraction of data points to include in the validation set. Defaults to the complement of train_size if train_size is set, and 0.25 otherwise.

  • n_splits (int) – How many CV folds to use. Defaults to 3.

  • shuffle (bool) – Whether or not to shuffle the data. Defaults to True.

  • split_type (str) – Whether to use TV or CV split. Defaults to TV.

  • random_seed (int) – Random seed for the splitter. Defaults to 0.
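
The parameters above describe a split in which only the training portion is resampled. The following is a minimal, library-independent sketch of that idea and is not evalml's implementation. The names `tv_split` and `DuplicateMinoritySampler` are hypothetical, the stand-in sampler does simple duplicate oversampling rather than SMOTE-Tomek, and the split is not stratified. It only illustrates how `sampler`, `test_size`, `shuffle`, and `random_seed` might interact.

```python
import random
from collections import Counter


class DuplicateMinoritySampler:
    """Hypothetical stand-in sampler: duplicates rows of smaller classes
    until every class seen in the data has the same count."""

    def __init__(self, random_seed=0):
        self._rng = random.Random(random_seed)

    def fit_resample(self, X, y):
        counts = Counter(y)
        target = max(counts.values())
        X_out, y_out = list(X), list(y)
        for label, count in counts.items():
            rows = [row for row, lab in zip(X, y) if lab == label]
            for _ in range(target - count):
                X_out.append(self._rng.choice(rows))
                y_out.append(label)
        return X_out, y_out


def tv_split(X, y, sampler=None, test_size=0.25, shuffle=True, random_seed=0):
    """Hypothetical TV split: hold out a validation set, then resample
    the training portion only (assumption: validation data stays untouched)."""
    indices = list(range(len(X)))
    if shuffle:
        random.Random(random_seed).shuffle(indices)
    n_valid = max(1, round(len(indices) * test_size))
    valid_idx, train_idx = indices[:n_valid], indices[n_valid:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_valid = [X[i] for i in valid_idx]
    y_valid = [y[i] for i in valid_idx]
    if sampler is not None:
        # Any object exposing fit_resample(X, y) can be plugged in here.
        X_train, y_train = sampler.fit_resample(X_train, y_train)
    return (X_train, y_train), (X_valid, y_valid)


# Toy imbalanced data: 16 rows of class 0 and 4 rows of class 1.
X = [[i] for i in range(20)]
y = [0] * 16 + [1] * 4

(plain_X, plain_y), _ = tv_split(X, y)
(res_X, res_y), (val_X, val_y) = tv_split(X, y, sampler=DuplicateMinoritySampler())
```

Passing `sampler=None` gives a plain split, which matches the default described for `sampler` above. Resampling after the split keeps synthetic or duplicated rows out of the validation set.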