training_validation_split

Training Validation Split class.

Module Contents

Classes Summary

TrainingValidationSplit

Split the training data into training and validation sets.

class evalml.preprocessing.data_splitters.training_validation_split.TrainingValidationSplit(test_size=None, train_size=None, shuffle=False, stratify=None, random_seed=0)[source]

Split the training data into training and validation sets.

Parameters
  • test_size (float) – Proportion of data points to include in the validation set. Defaults to the complement of train_size if train_size is set, and 0.25 otherwise.

  • train_size (float) – Proportion of data points to include in the training set. Defaults to the complement of test_size.

  • shuffle (boolean) – Whether to shuffle the data before splitting. Defaults to False.

  • stratify (list) – Splits the data in a stratified fashion, using this argument as class labels. Defaults to None.

  • random_seed (int) – The seed to use for random sampling. Defaults to 0.
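
The defaulting rule for test_size and train_size described above can be sketched in plain Python. The helper name resolve_sizes is hypothetical and is not part of evalml; it only illustrates the documented defaults, assuming both values are given as fractions between 0 and 1.

```python
def resolve_sizes(test_size=None, train_size=None):
    """Return (train_fraction, test_fraction) following the documented defaults.

    Hypothetical helper for illustration only; not part of evalml.
    """
    # Neither size given: the validation set defaults to 25% of the data.
    if test_size is None and train_size is None:
        test_size = 0.25
    # Only train_size given: test_size is its complement.
    if test_size is None:
        test_size = 1.0 - train_size
    # Only test_size given (or just defaulted): train_size is its complement.
    if train_size is None:
        train_size = 1.0 - test_size
    return train_size, test_size


print(resolve_sizes())                 # both sizes left at their defaults
print(resolve_sizes(train_size=0.5))   # test_size derived from train_size
print(resolve_sizes(test_size=0.25))   # train_size derived from test_size
```

When both values are passed explicitly, this sketch returns them unchanged and does not check that they sum to at most 1.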

Methods

get_n_splits

Return the number of splits of this object.

split

Divide the data into training and testing sets.

static get_n_splits()[source]

Return the number of splits of this object.

Returns

Always returns 1.

Return type

int

split(self, X, y=None)[source]

Divide the data into training and testing sets.

Parameters
  • X (pd.DataFrame) – DataFrame of points to split.

  • y (pd.Series) – Series of points to split.

Returns

Indices to split the data into training and test sets.

Return type

list
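
To make the idea of a single train/validation split over indices concrete, here is a minimal standard-library sketch. The function single_split_indices is hypothetical and is not evalml's implementation. It assumes no shuffling, a validation count that is rounded up and taken from the end of the data, and a return value shaped as a list holding one (train, test) pair; the real method may differ on each of these points.

```python
import math


def single_split_indices(n_samples, test_size=0.25):
    """Return one (train_indices, test_indices) pair without shuffling.

    Hypothetical illustration only; not evalml's implementation.
    Assumes the validation count is rounded up and taken from the end.
    """
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    indices = list(range(n_samples))
    # A single split, consistent with get_n_splits() always returning 1.
    return [(indices[:n_train], indices[n_train:])]


splits = single_split_indices(8, test_size=0.25)
train_idx, test_idx = splits[0]
print(len(splits))   # one split
print(train_idx)     # positions assigned to the training set
print(test_idx)      # positions assigned to the validation set
```

The returned positions can then be used to select rows from X and y, for example with X.iloc[train_idx] on a pandas DataFrame.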