Changelog

Future Releases

Enhancements
Fixes
Changes
Documentation Changes
Testing Changes

v0.8.0 Apr. 1, 2020

Enhancements
- Add normalization option and information to confusion matrix #484
- Add util function to drop rows with NaN values #487
- Renamed PipelineBase.name as PipelineBase.summary and redefined PipelineBase.name as class property #491
- Added access to parameters in Pipelines with PipelineBase.parameters (used to be return of PipelineBase.describe) #501
- Added fill_value parameter for SimpleImputer #509
- Added functionality to override component hyperparameters and made pipelines take hyperparameters from components #516
- Allow numpy.random.RandomState for random_state parameters #556
Fixes
Changes
- Undo version cap in XGBoost placed in #402 and allowed all releases of XGBoost #407
- Support pandas 1.0.0 #486
- Made all references to the logger static #503
- Refactored model_type parameter for components and pipelines to model_family #507
- Refactored problem_types for pipelines and components into supported_problem_types #515
- Moved pipelines/utils.save_pipeline and pipelines/utils.load_pipeline to PipelineBase.save and PipelineBase.load #526
- Limit number of categories encoded by OneHotEncoder #517
Documentation Changes
- Updated API reference to remove PipelinePlot and add the moved PipelineBase plotting methods #483
- Add code style and github issue guides #463 #512
- Updated API reference to surface class variables for pipelines and components #537
Testing Changes
- Added automated dependency check PR #482, #505
- Updated automated dependency check comment #497
- Have build_docs job use python executor, so that env vars are set properly #547
- Run windows unit tests on PRs #557

Warning

Breaking Changes

AutoClassificationSearch and AutoRegressionSearch’s model_types parameter has been refactored into allowed_model_families
ModelTypes enum has been changed to ModelFamily
Components and Pipelines now have a model_family field instead of model_type
get_pipelines utility function now accepts model_families as an argument instead of model_types
PipelineBase.name no longer returns structure of pipeline and has been replaced by PipelineBase.summary
PipelineBase.problem_types and Estimator.problem_types have been renamed to supported_problem_types
pipelines/utils.save_pipeline and pipelines/utils.load_pipeline moved to PipelineBase.save and PipelineBase.load
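
As a rough illustration of the keyword renames listed above, the sketch below rewrites pre-0.8.0 keyword arguments into their new names. The helper `migrate_keywords`, the `RENAMES` table, and the example values are hypothetical and are not part of the library; only the old and new parameter names come from this changelog.

```python
# Hypothetical migration helper; nothing here is library API.
# The old -> new names are copied from the breaking changes listed above.
RENAMES = {
    # AutoClassificationSearch / AutoRegressionSearch constructors
    "search": {"model_types": "allowed_model_families"},
    # get_pipelines utility function
    "get_pipelines": {"model_types": "model_families"},
}


def migrate_keywords(kwargs, target="search"):
    """Return a copy of kwargs with pre-0.8.0 keyword names replaced."""
    renames = RENAMES[target]
    migrated = {}
    for key, value in kwargs.items():
        new_key = renames.get(key, key)
        if new_key in migrated:
            # Both the old and the new spelling were supplied.
            raise ValueError(f"duplicate value for {new_key!r}")
        migrated[new_key] = value
    return migrated


# Placeholder values; "family_a" is not a real model family name.
old_call = {"model_types": ["family_a"], "n_jobs": 2}
new_call = migrate_keywords(old_call)
```

Keywords that were not renamed, such as `n_jobs`, pass through unchanged, and supplying both spellings of the same parameter raises an error rather than silently picking one.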

v0.7.0 Mar. 9, 2020

Enhancements
- Added emacs buffers to .gitignore #350
- Add CatBoost (gradient-boosted trees) classification and regression components and pipelines #247
- Added Tuner abstract base class #351
- Added n_jobs as parameter for AutoClassificationSearch and AutoRegressionSearch #403
- Changed colors of confusion matrix to shades of blue and updated axis order to match scikit-learn’s #426
- Added PipelineBase graph and feature_importance_graph methods, moved from previous location #423
- Added support for python 3.8 #462
Fixes
- Fixed ROC and confusion matrix plots not being calculated if user passed own additional_objectives #276
- Fixed ReadtheDocs FileNotFoundError exception for fraud dataset #439
Changes
- Added n_estimators as a tunable parameter for XGBoost #307
- Remove unused parameter ObjectiveBase.fit_needs_proba #320
- Remove extraneous parameter component_type from all components #361
- Remove unused rankings.csv file #397
- Downloaded demo and test datasets so unit tests can run offline #408
- Remove _needs_fitting attribute from Components #398
- Changed plot.feature_importance to show only non-zero feature importances by default, added optional parameter to show all #413
- Refactored PipelineBase to take in parameter dictionary and moved pipeline metadata to class attribute #421
- Dropped support for Python 3.5 #438
- Removed unused apply.py file #449
- Clean up requirements.txt to remove unused deps #451
- Support installation without all required dependencies #459
Documentation Changes
- Update release.md with instructions to release to internal license key #354
Testing Changes
- Added tests for utils (and moved current utils to gen_utils) #297
- Moved XGBoost install into its own separate step on Windows using Conda #313
- Rewind pandas version to before 1.0.0, to diagnose test failures for that version #325
- Added dependency update checkin test #324
- Rewind XGBoost version to before 1.0.0 to diagnose test failures for that version #402
- Update dependency check to use a whitelist #417
- Update unit test jobs to not install dev deps #455

Warning

Breaking Changes

Python 3.5 will not be actively supported.

v0.6.0 Dec. 16, 2019

Enhancements
- Added ability to create a plot of feature importances #133
- Add early stopping to AutoML using patience and tolerance parameters #241
- Added ROC and confusion matrix metrics and plot for classification problems and introduce PipelineSearchPlots class #242
- Enhanced AutoML results with search order #260
Fixes
- Lower botocore requirement #235
- Fixed decision_function calculation for FraudCost objective #254
- Fixed return value of Recall metrics #264
- Components return self on fit #289
Changes
- Renamed automl classes to AutoRegressionSearch and AutoClassificationSearch #287
- Updating demo datasets to retain column names #223
- Moving pipeline visualization to PipelinePlots class #228
- Standardizing inputs as pd.DataFrame / pd.Series #130
- Enforcing that pipelines must have an estimator as last component #277
- Added ipywidgets as a dependency in requirements.txt #278
- Added Random and Grid Search Tuners #240
Documentation Changes
- Adding class properties to API reference #244
- Fix and filter FutureWarnings from scikit-learn #249, #257
- Adding Linear Regression to API reference and cleaning up some Sphinx warnings #227
Testing Changes
- Added support for testing on Windows with CircleCI #226
- Added support for doctests #233
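
The early stopping enhancement in this release (#241) is controlled by patience and tolerance parameters. The sketch below shows one common way such a rule can work; it is not the library's implementation. It assumes that higher scores are better, that patience counts consecutive iterations without improvement, and that tolerance is an absolute minimum gain. The function name `should_stop` is hypothetical.

```python
def should_stop(scores, patience, tolerance=0.0):
    """Return True once `patience` consecutive scores fail to improve.

    Assumptions (not taken from the library): higher is better, and a
    score only counts as an improvement if it beats the best score seen
    so far by more than `tolerance`.
    """
    best = None
    stale = 0  # consecutive iterations without improvement
    for score in scores:
        if best is None or score - best > tolerance:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return True
    return False


# Three iterations in a row gain less than 0.01 over the best score.
stopped = should_stop([0.80, 0.85, 0.851, 0.849, 0.85], patience=3, tolerance=0.01)
# Only one stale iteration so far, so the search would continue.
continued = should_stop([0.80, 0.85, 0.851], patience=3, tolerance=0.01)
```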

Warning

Breaking Changes

The fit() method for AutoClassifier and AutoRegressor has been renamed to search().
AutoClassifier has been renamed to AutoClassificationSearch
AutoRegressor has been renamed to AutoRegressionSearch
AutoClassificationSearch.results and AutoRegressionSearch.results are now dictionaries with pipeline_results and search_order keys. pipeline_results gives access to a dictionary identical to the old .results dictionary, whereas search_order returns a list of pipeline ids in the order they were searched.
Pipelines now require an estimator as the last component in component_list. Slicing pipelines now throws a NotImplementedError to avoid returning Pipelines without an estimator.
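
To make the new shape of .results concrete, the sketch below builds a plain dictionary with the two keys described above and walks it in search order. The pipeline ids, the `score` field, and the helper `iter_in_search_order` are placeholders for illustration; the real per-pipeline entries contain whatever the old .results dictionary held.

```python
# Illustrative data only: ids and per-pipeline contents are placeholders.
old_results = {
    0: {"id": 0, "score": 0.91},
    1: {"id": 1, "score": 0.87},
}

# New layout: the old dictionary moves under "pipeline_results", and
# "search_order" lists pipeline ids in the order they were evaluated.
new_results = {
    "pipeline_results": old_results,
    "search_order": [1, 0],
}


def iter_in_search_order(results):
    """Yield (pipeline_id, entry) pairs following the search order."""
    for pipeline_id in results["search_order"]:
        yield pipeline_id, results["pipeline_results"][pipeline_id]


ordered_ids = [pipeline_id for pipeline_id, _ in iter_in_search_order(new_results)]
```

Code that previously indexed `.results[pipeline_id]` would, under this layout, index `.results["pipeline_results"][pipeline_id]` instead.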

v0.5.2 Nov. 18, 2019

Enhancements
- Adding basic pipeline structure visualization #211
Documentation Changes
- Added notebooks to build process #212

v0.5.1 Nov. 15, 2019

Enhancements
- Added basic outlier detection guardrail #151
- Added basic ID column guardrail #135
- Added support for unlimited pipelines with a max_time limit #70
- Updated .readthedocs.yaml to successfully build #188
Fixes
- Removed MSLE from default additional objectives #203
- Fixed random_state passed in pipelines #204
- Fixed slow down in RFRegressor #206
Changes
- Pulled information for describe_pipeline from pipeline’s new describe method #190
- Refactored pipelines #108
- Removed guardrails from Auto(*) #202, #208
Documentation Changes
- Updated documentation to show max_time enhancements #189
- Updated release instructions for RTD #193
- Added notebooks to build process #212
- Added contributing instructions #213
- Added new content #222

v0.5.0 Oct. 29, 2019

Enhancements
- Added basic one hot encoding #73
- Use enums for model_type #110
- Support for splitting regression datasets #112
- Auto-infer multiclass classification #99
- Added support for other units in max_time #125
- Detect highly null columns #121
- Added additional regression objectives #100
- Show an interactive iteration vs. score plot when using fit() #134
Fixes
- Reordered describe_pipeline #94
- Added type check for model_type #109
- Fixed s units when setting string max_time #132
- Fix objectives not appearing in API documentation #150
Changes
- Reorganized tests #93
- Moved logging to its own module #119
- Show progress bar history #111
- Using cloudpickle instead of pickle to allow unloading of custom objectives #113
- Removed render.py #154
Documentation Changes
- Update release instructions #140
- Include additional_objectives parameter #124
- Added Changelog #136
Testing Changes
- Code coverage #90
- Added CircleCI tests for other Python versions #104
- Added doc notebooks as tests #139
- Test metadata for CircleCI and 2 core parallelism #137

v0.4.1 Sep. 16, 2019

Enhancements
- Added AutoML for classification and regression using Autobase and Skopt #7 #9
- Implemented standard classification and regression metrics #7
- Added logistic regression, random forest, and XGBoost pipelines #7
- Implemented support for custom objectives #15
- Feature importance for pipelines #18
- Serialization for pipelines #19
- Allow fitting on objectives for optimal threshold #27
- Added detect label leakage #31
- Implemented callbacks #42
- Allow for multiclass classification #21
- Added support for additional objectives #79
Fixes
- Fixed feature selection in pipelines #13
- Made random_seed usage consistent #45
Documentation Changes
- Added docstrings #6
- Created notebooks for docs #6
- Initialized readthedocs EvalML #6
- Added favicon #38
Testing Changes
- Added testing for loading data #39

v0.2.0 Aug. 13, 2019

Enhancements
- Created fraud detection objective #4

v0.1.0 July 31, 2019

First Release
Enhancements
- Added lead scoring objective #1
- Added basic classifier #1
Documentation Changes
- Initialized Sphinx for docs #1