evalml.model_understanding.get_prediction_vs_actual_data(y_true, y_pred, outlier_threshold=None)

Combines y_true and y_pred into a single dataframe and adds a column for outliers. Used in graph_prediction_vs_actual().

Parameters

  • y_true (pd.Series, ww.DataColumn, or np.ndarray) – The real target values of the data.

  • y_pred (pd.Series, ww.DataColumn, or np.ndarray) – The predicted values output by the regression model.

  • outlier_threshold (int, float) – A positive threshold that determines which points count as outliers. It is compared to the absolute difference between each pair of values in y_true and y_pred. Points whose difference is within the threshold are colored blue; all others are colored yellow. Defaults to None.

Returns

pd.DataFrame with the following columns:

  • prediction: Predicted values from the regression model.

  • actual: Real target values.

  • outlier: Color for each point, indicating whether the absolute difference between its actual and predicted value is within outlier_threshold (blue) or exceeds it (yellow).

Return type

pd.DataFrame
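The behavior described above can be illustrated with a minimal sketch. This is not evalml's implementation: the function name sketch_prediction_vs_actual is hypothetical, the literal color strings "blue" and "yellow" are an assumption (the library may encode colors differently, for example as hex codes), and treating outlier_threshold=None as "flag nothing" is also an assumption.

```python
import numpy as np
import pandas as pd


def sketch_prediction_vs_actual(y_true, y_pred, outlier_threshold=None):
    """Hypothetical stand-in that mimics the documented output shape."""
    if outlier_threshold is not None and outlier_threshold <= 0:
        raise ValueError("outlier_threshold must be positive")

    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Combine both inputs into one frame with the documented column names.
    data = pd.DataFrame({"prediction": y_pred, "actual": y_true})

    if outlier_threshold is None:
        # Assumption: with no threshold, no point is flagged as an outlier.
        is_outlier = np.zeros(len(data), dtype=bool)
    else:
        # A point is an outlier when its absolute error exceeds the threshold.
        is_outlier = np.abs(y_true - y_pred) > outlier_threshold

    # Assumption: plain color names stand in for whatever encoding evalml uses.
    data["outlier"] = np.where(is_outlier, "yellow", "blue")
    return data


y_true = [3.0, 5.0, 7.5, 10.0]
y_pred = [2.8, 5.1, 9.5, 10.2]
df = sketch_prediction_vs_actual(y_true, y_pred, outlier_threshold=1.0)
print(df)
```

Only the third point has an absolute error (2.0) larger than the threshold of 1.0, so it is the only row marked yellow in this sketch.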