decision_boundary
======================================================

.. py:module:: evalml.model_understanding.decision_boundary

.. autoapi-nested-parse::

   Model Understanding for decision boundary on Binary Classification problems.


Module Contents
---------------

Functions
~~~~~~~~~

.. autoapisummary::
   :nosignatures:

   evalml.model_understanding.decision_boundary.find_confusion_matrix_per_thresholds


Contents
~~~~~~~~~~~~~~~~~~~

.. py:function:: find_confusion_matrix_per_thresholds(pipeline, X, y, n_bins=None, top_k=5, to_json=False)

   Gets the confusion matrix and histogram bins for each threshold, as well as the best threshold per objective. Only works with Binary Classification Pipelines.

   :param pipeline: A fitted Binary Classification Pipeline to get the confusion matrix with.
   :type pipeline: PipelineBase
   :param X: The input features.
   :type X: pd.DataFrame
   :param y: The input target.
   :type y: pd.Series
   :param n_bins: The number of bins to use to calculate the threshold values. Defaults to None, in which case the Freedman-Diaconis rule is used.
   :type n_bins: int
   :param top_k: The maximum number of row indices per bin to include as samples. -1 includes all row indices that fall between the bins. Defaults to 5.
   :type top_k: int
   :param to_json: Whether or not to return a JSON output. If False, returns the (DataFrame, dict) tuple; otherwise returns a JSON.
   :type to_json: bool

   :returns: The dataframe has the actual positive histogram, actual negative histogram, the confusion matrix, and a sample of rows that fall in the bin, all for each threshold value. The threshold value, represented through the dataframe index, represents the cutoff threshold at that value. The dictionary contains the ideal threshold and score per objective, keyed by objective name. If ``to_json`` is True, returns the info for both the dataframe and dictionary as a JSON output.
   :rtype: tuple(pd.DataFrame, dict) or json

   :raises ValueError: If the pipeline isn't a binary classification pipeline or isn't yet fitted on data.
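
The idea of a confusion matrix per threshold, plus a best threshold for an objective, can be illustrated with a minimal standard-library sketch. This is not evalml's implementation: the helper names ``confusion_matrix_per_threshold`` and ``best_threshold_by_accuracy`` are hypothetical, the thresholds are evenly spaced rather than derived from the Freedman-Diaconis rule, accuracy stands in for an objective, and treating a row as positive when its probability is strictly greater than the threshold is an assumption.

```python
from typing import Dict, Sequence, Tuple

Counts = Dict[str, int]


def confusion_matrix_per_threshold(
    y_true: Sequence[int],
    y_proba: Sequence[float],
    n_bins: int = 4,
) -> Dict[float, Counts]:
    """Count TP/TN/FP/FN at n_bins evenly spaced cutoffs in (0, 1]."""
    results: Dict[float, Counts] = {}
    for i in range(1, n_bins + 1):
        threshold = i / n_bins
        counts = {"true_pos": 0, "true_neg": 0, "false_pos": 0, "false_neg": 0}
        for actual, proba in zip(y_true, y_proba):
            # Assumption: predict positive only when proba exceeds the cutoff.
            predicted = 1 if proba > threshold else 0
            if predicted == 1 and actual == 1:
                counts["true_pos"] += 1
            elif predicted == 0 and actual == 0:
                counts["true_neg"] += 1
            elif predicted == 1 and actual == 0:
                counts["false_pos"] += 1
            else:
                counts["false_neg"] += 1
        results[threshold] = counts
    return results


def best_threshold_by_accuracy(matrices: Dict[float, Counts]) -> Tuple[float, float]:
    """Return the (threshold, accuracy) pair with the highest accuracy."""

    def accuracy(counts: Counts) -> float:
        return (counts["true_pos"] + counts["true_neg"]) / sum(counts.values())

    best = max(matrices, key=lambda t: accuracy(matrices[t]))
    return best, accuracy(matrices[best])


# Toy data: true labels and predicted positive-class probabilities.
y_true = [0, 0, 1, 1, 1]
y_proba = [0.1, 0.3, 0.2, 0.8, 0.6]

matrices = confusion_matrix_per_threshold(y_true, y_proba, n_bins=4)
best_threshold, best_score = best_threshold_by_accuracy(matrices)
print(matrices[0.5])               # confusion counts at the 0.5 cutoff
print(best_threshold, best_score)  # 0.5 0.8
```

With a fitted binary classification pipeline, the documented call would take the form ``df, objective_dict = find_confusion_matrix_per_thresholds(pipeline, X, y, n_bins=10)``, where ``pipeline``, ``X``, and ``y`` are supplied by the caller and the two-value unpacking applies when ``to_json`` is False.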