Changelog

Future Releases
  • Enhancements

  • Fixes

  • Changes

  • Documentation Changes

  • Testing Changes

v0.8.0 Apr. 1, 2020
  • Enhancements
    • Add normalization option and information to confusion matrix #484

    • Add util function to drop rows with NaN values #487

    • Renamed PipelineBase.name as PipelineBase.summary and redefined PipelineBase.name as class property #491

    • Added access to parameters in Pipelines with PipelineBase.parameters (previously the return value of PipelineBase.describe) #501

    • Added fill_value parameter for SimpleImputer #509

    • Added functionality to override component hyperparameters and made pipelines take hyperparameters from components #516

    • Allow numpy.random.RandomState for random_state parameters #556

  • Fixes

  • Changes
    • Undo version cap in XGBoost placed in #402 and allowed all releases of XGBoost #407

    • Support pandas 1.0.0 #486

    • Made all references to the logger static #503

    • Refactored model_type parameter for components and pipelines to model_family #507

    • Refactored problem_types for pipelines and components into supported_problem_types #515

    • Moved pipelines/utils.save_pipeline and pipelines/utils.load_pipeline to PipelineBase.save and PipelineBase.load #526

    • Limit number of categories encoded by OneHotEncoder #517

  • Documentation Changes
    • Updated API reference to remove PipelinePlot and add the moved PipelineBase plotting methods #483

    • Add code style and github issue guides #463 #512

    • Updated API reference to surface class variables for pipelines and components #537

  • Testing Changes
    • Added automated dependency check PR #482, #505

    • Updated automated dependency check comment #497

    • Have build_docs job use python executor, so that env vars are set properly #547

    • Run windows unit tests on PRs #557

Warning

Breaking Changes

  • AutoClassificationSearch and AutoRegressionSearch’s model_types parameter has been refactored into allowed_model_families

  • ModelTypes enum has been changed to ModelFamily

  • Components and Pipelines now have a model_family field instead of model_type

  • get_pipelines utility function now accepts model_families as an argument instead of model_types

  • PipelineBase.name no longer returns structure of pipeline and has been replaced by PipelineBase.summary

  • PipelineBase.problem_types and Estimator.problem_types have been renamed to supported_problem_types

  • pipelines/utils.save_pipeline and pipelines/utils.load_pipeline moved to PipelineBase.save and PipelineBase.load
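
The keyword renames listed above can be applied mechanically to existing call sites. The following is a minimal sketch of that idea, not part of EvalML: `RENAMES` and `migrate_kwargs` are hypothetical names introduced here, and the argument values are placeholders. Only the old and new keyword names are taken from the breaking changes above.

```python
# Hypothetical migration helper (not an EvalML API).
# Maps each affected callable to its old -> new keyword names,
# as listed in the v0.8.0 breaking changes.
RENAMES = {
    "AutoClassificationSearch": {"model_types": "allowed_model_families"},
    "AutoRegressionSearch": {"model_types": "allowed_model_families"},
    "get_pipelines": {"model_types": "model_families"},
}


def migrate_kwargs(callable_name, kwargs):
    """Return a copy of kwargs with pre-0.8.0 keyword names replaced."""
    renames = RENAMES.get(callable_name, {})
    return {renames.get(key, key): value for key, value in kwargs.items()}


# Placeholder values; only the keyword names matter here.
old_kwargs = {"model_types": ["placeholder_family"], "n_jobs": 1}
new_kwargs = migrate_kwargs("AutoClassificationSearch", old_kwargs)
```

Keywords that were not renamed pass through unchanged, so the helper can be applied to every call site without first checking whether it is affected.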

v0.7.0 Mar. 9, 2020
  • Enhancements
    • Added emacs buffers to .gitignore #350

    • Add CatBoost (gradient-boosted trees) classification and regression components and pipelines #247

    • Added Tuner abstract base class #351

    • Added n_jobs as parameter for AutoClassificationSearch and AutoRegressionSearch #403

    • Changed colors of confusion matrix to shades of blue and updated axis order to match scikit-learn’s #426

    • Added PipelineBase graph and feature_importance_graph methods, moved from previous location #423

    • Added support for python 3.8 #462

  • Fixes
    • Fixed ROC and confusion matrix plots not being calculated if user passed own additional_objectives #276

    • Fixed ReadtheDocs FileNotFoundError exception for fraud dataset #439

  • Changes
    • Added n_estimators as a tunable parameter for XGBoost #307

    • Remove unused parameter ObjectiveBase.fit_needs_proba #320

    • Remove extraneous parameter component_type from all components #361

    • Remove unused rankings.csv file #397

    • Downloaded demo and test datasets so unit tests can run offline #408

    • Remove _needs_fitting attribute from Components #398

    • Changed plot.feature_importance to show only non-zero feature importances by default, added optional parameter to show all #413

    • Refactored PipelineBase to take in parameter dictionary and moved pipeline metadata to class attribute #421

    • Dropped support for Python 3.5 #438

    • Removed unused apply.py file #449

    • Clean up requirements.txt to remove unused deps #451

    • Support installation without all required dependencies #459

  • Documentation Changes
    • Update release.md with instructions to release to internal license key #354

  • Testing Changes
    • Added tests for utils (and moved current utils to gen_utils) #297

    • Moved XGBoost install into its own separate step on Windows using Conda #313

    • Rewind pandas version to before 1.0.0, to diagnose test failures for that version #325

    • Added dependency update checkin test #324

    • Rewind XGBoost version to before 1.0.0 to diagnose test failures for that version #402

    • Update dependency check to use a whitelist #417

    • Update unit test jobs to not install dev deps #455

Warning

Breaking Changes

  • Python 3.5 will not be actively supported.

v0.6.0 Dec. 16, 2019
  • Enhancements
    • Added ability to create a plot of feature importances #133

    • Add early stopping to AutoML using patience and tolerance parameters #241

    • Added ROC and confusion matrix metrics and plot for classification problems and introduce PipelineSearchPlots class #242

    • Enhanced AutoML results with search order #260

  • Fixes
    • Lower botocore requirement #235

    • Fixed decision_function calculation for FraudCost objective #254

    • Fixed return value of Recall metrics #264

    • Components return self on fit #289

  • Changes
    • Renamed automl classes to AutoRegressionSearch and AutoClassificationSearch #287

    • Updating demo datasets to retain column names #223

    • Moving pipeline visualization to PipelinePlots class #228

    • Standardizing inputs as pd.DataFrame / pd.Series #130

    • Enforcing that pipelines must have an estimator as last component #277

    • Added ipywidgets as a dependency in requirements.txt #278

    • Added Random and Grid Search Tuners #240

  • Documentation Changes
    • Adding class properties to API reference #244

    • Fix and filter FutureWarnings from scikit-learn #249, #257

    • Adding Linear Regression to API reference and cleaning up some Sphinx warnings #227

  • Testing Changes
    • Added support for testing on Windows with CircleCI #226

    • Added support for doctests #233

Warning

Breaking Changes

  • The fit() method for AutoClassifier and AutoRegressor has been renamed to search().

  • AutoClassifier has been renamed to AutoClassificationSearch

  • AutoRegressor has been renamed to AutoRegressionSearch

  • AutoClassificationSearch.results and AutoRegressionSearch.results are now dictionaries with pipeline_results and search_order keys. pipeline_results gives access to a dictionary identical to the old .results dictionary, whereas search_order returns a list of pipeline ids in the order they were searched.

  • Pipelines now require an estimator as the last component in component_list. Slicing pipelines now throws a NotImplementedError to avoid returning Pipelines without an estimator.
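
The new shape of `.results` described above can be illustrated with plain dictionaries. This is a sketch under stated assumptions rather than real search output: the data is invented, the per-pipeline fields (`id`, `score`) are placeholders, and it assumes `pipeline_results` is keyed by pipeline id. The helper `pipelines_in_search_order` is hypothetical and not part of EvalML.

```python
# Placeholder data shaped like the 0.6.0 results dictionary described above.
# Assumption: pipeline_results is keyed by pipeline id; "score" is invented.
results = {
    "pipeline_results": {
        0: {"id": 0, "score": 0.81},
        1: {"id": 1, "score": 0.86},
        2: {"id": 2, "score": 0.79},
    },
    "search_order": [1, 0, 2],
}


def pipelines_in_search_order(results):
    """Return the per-pipeline entries in the order the search evaluated them."""
    by_id = results["pipeline_results"]
    return [by_id[pipeline_id] for pipeline_id in results["search_order"]]


# Code written against the old .results now reads the nested dictionary.
old_style_results = results["pipeline_results"]
ordered_ids = [entry["id"] for entry in pipelines_in_search_order(results)]
```

Code that previously indexed `.results` directly only needs the extra `["pipeline_results"]` lookup; `search_order` is additional information.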

v0.5.2 Nov. 18, 2019
  • Enhancements
    • Adding basic pipeline structure visualization #211

  • Documentation Changes
    • Added notebooks to build process #212

v0.5.1 Nov. 15, 2019
  • Enhancements
    • Added basic outlier detection guardrail #151

    • Added basic ID column guardrail #135

    • Added support for unlimited pipelines with a max_time limit #70

    • Updated .readthedocs.yaml to successfully build #188

  • Fixes
    • Removed MSLE from default additional objectives #203

    • Fixed random_state passed in pipelines #204

    • Fixed slow down in RFRegressor #206

  • Changes
    • Pulled information for describe_pipeline from pipeline’s new describe method #190

    • Refactored pipelines #108

    • Removed guardrails from Auto(*) #202, #208

  • Documentation Changes
    • Updated documentation to show max_time enhancements #189

    • Updated release instructions for RTD #193

    • Added notebooks to build process #212

    • Added contributing instructions #213

    • Added new content #222

v0.5.0 Oct. 29, 2019
  • Enhancements
    • Added basic one hot encoding #73

    • Use enums for model_type #110

    • Support for splitting regression datasets #112

    • Auto-infer multiclass classification #99

    • Added support for other units in max_time #125

    • Detect highly null columns #121

    • Added additional regression objectives #100

    • Show an interactive iteration vs. score plot when using fit() #134

  • Fixes
    • Reordered describe_pipeline #94

    • Added type check for model_type #109

    • Fixed s units when setting string max_time #132

    • Fix objectives not appearing in API documentation #150

  • Changes
    • Reorganized tests #93

    • Moved logging to its own module #119

    • Show progress bar history #111

    • Using cloudpickle instead of pickle to allow unloading of custom objectives #113

    • Removed render.py #154

  • Documentation Changes
    • Update release instructions #140

    • Include additional_objectives parameter #124

    • Added Changelog #136

  • Testing Changes
    • Code coverage #90

    • Added CircleCI tests for other Python versions #104

    • Added doc notebooks as tests #139

    • Test metadata for CircleCI and 2 core parallelism #137

v0.4.1 Sep. 16, 2019
  • Enhancements
    • Added AutoML for classification and regression using Autobase and Skopt #7 #9

    • Implemented standard classification and regression metrics #7

    • Added logistic regression, random forest, and XGBoost pipelines #7

    • Implemented support for custom objectives #15

    • Feature importance for pipelines #18

    • Serialization for pipelines #19

    • Allow fitting on objectives for optimal threshold #27

    • Added detect label leakage #31

    • Implemented callbacks #42

    • Allow for multiclass classification #21

    • Added support for additional objectives #79

  • Fixes
    • Fixed feature selection in pipelines #13

    • Made random_seed usage consistent #45

  • Documentation Changes
    • Added docstrings #6

    • Created notebooks for docs #6

    • Initialized readthedocs EvalML #6

    • Added favicon #38

  • Testing Changes
    • Added testing for loading data #39

v0.2.0 Aug. 13, 2019
  • Enhancements
    • Created fraud detection objective #4

v0.1.0 July 31, 2019
  • First Release

  • Enhancements
    • Added lead scoring objective #1

    • Added basic classifier #1

  • Documentation Changes
    • Initialized Sphinx for docs #1