Changelog
- Future Releases
Enhancements
Fixes
Changes
- Documentation Changes
Add instructions to freeze master on release.md #726
Testing Changes
- v0.9.0 Apr. 27, 2020
- Enhancements
Added accuracy as a standard objective #624
Added verbose parameter to load_fraud #560
Added Balanced Accuracy metric for binary, multiclass #612 #661
Added XGBoost regressor and XGBoost regression pipeline #666
Added Accuracy metric for multiclass #672
Added objective name in AutoBase.describe_pipeline #686
- Fixes
Removed direct access to cls.component_graph #595
Add testing files to .gitignore #625
Remove circular dependencies from Makefile #637
Add error case for normalize_confusion_matrix() #640
Fixed XGBoostClassifier and XGBoostRegressor bug with feature names that contain [, ], or < #659
Update make_pipeline_graph to not accidentally create empty file when testing if path is valid #649
Fix pip installation warning about docutils version, from boto dependency #664
Removed zero division warning for F1/precision/recall metrics #671
Fixed summary for pipelines without estimators #707
- Changes
Updated default objective for binary/multiclass classification to log loss #613
Created classification and regression pipeline subclasses and removed objective as an attribute of pipeline classes #405
Changed the output of score to return one dictionary #429
Created binary and multiclass objective subclasses #504
Updated objectives API #445
Removed call to get_plot_data from AutoML #615
Set raise_error to default to True for AutoML classes #638
Remove unnecessary “u” prefixes on some unicode strings #641
Changed one-hot encoder to return uint8 dtypes instead of ints #653
Pipeline _name field changed to custom_name #650
Removed graphs.py and moved methods into PipelineBase #657, #665
Remove s3fs as a dev dependency #664
Changed requirements-parser to be a core dependency #673
Replace supported_problem_types field on pipelines with problem_type attribute on base classes #678
Changed AutoML to only show best results for a given pipeline template in rankings, added full_rankings property to show all #682
Update ModelFamily values: don’t list xgboost/catboost as classifiers now that we have regression pipelines for them #677
Changed AutoML’s describe_pipeline to get problem type from pipeline instead #685
Standardize import_or_raise error messages #683
Updated argument order of objectives to align with sklearn’s #698
Renamed pipeline.feature_importance_graph to pipeline.graph_feature_importances #700
Moved ROC and confusion matrix methods to evalml.pipelines.plot_utils #704
Renamed MultiClassificationObjective to MulticlassClassificationObjective, to align with pipeline naming scheme #715
- Documentation Changes
Fixed some sphinx warnings #593
Fixed docstring for AutoClassificationSearch with correct command #599
Limit readthedocs formats to pdf, not htmlzip and epub #594 #600
Clean up objectives API documentation #605
Fixed function on Exploring search results page #604
Update release process doc #567
AutoClassificationSearch and AutoRegressionSearch show inherited methods in API reference #651
Fixed improperly formatted code in breaking changes for changelog #655
Added configuration to treat Sphinx warnings as errors #660
Removed separate plotting section for pipelines in API reference #657, #665
Have leads example notebook load S3 files using https, so we can delete s3fs dev dependency #664
Categorized components in API reference and added descriptions for each category #663
Fixed Sphinx warnings about BalancedAccuracy objective #669
Updated API reference to include missing components and clean up pipeline docstrings #689
Reorganize API ref, and clarify pipeline sub-titles #688
Add and update preprocessing utils in API reference #687
Added inheritance diagrams to API reference #695
Documented which default objective AutoML optimizes for #699
Create separate install page #701
Include more utils in API ref, like import_or_raise #704
Add more color to pipeline documentation #705
- Testing Changes
Matched install commands of check_latest_dependencies test and its GitHub action #578
Added GitHub app to auto assign PR author as assignee #477
Removed unneeded conda installation of xgboost in windows checkin tests #618
Update graph tests to always use tmpfile dir #649
Changelog checkin test workaround for release PRs: If ‘future release’ section is empty of PR refs, pass check #658
Add changelog checkin test exception for dep-update branch #723
Warning
Breaking Changes
Pipelines no longer take an objective parameter during instantiation, and no longer have an objective attribute.
fit() and predict() now use an optional objective parameter, which is only used in binary classification pipelines to fit for a specific objective.
score() will now use a required objectives parameter that is used to determine all the objectives to score on. This differs from the previous behavior, where the pipeline’s objective was scored on regardless.
score() will now return one dictionary of all objective scores.
ROC and ConfusionMatrix plot methods via Auto(*).plot have been removed by #615 and are replaced by roc_curve and confusion_matrix in evalml.pipelines.plot_utils in #704
normalize_confusion_matrix has been moved to evalml.pipelines.plot_utils #704
Pipelines _name field changed to custom_name
Pipelines supported_problem_types field is removed because it is no longer necessary #678
Updated argument order of objectives’ objective_function to align with sklearn #698
pipeline.feature_importance_graph has been renamed to pipeline.graph_feature_importances in #700
Removed unsupported MSLE objective #704
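As a rough illustration of two of the v0.9.0 breaking changes (objective functions taking the true labels first, as sklearn metrics do, and scoring returning a single dictionary for a required set of objectives), here is a minimal sketch. The function names and the shape of the objectives argument are hypothetical stand-ins written in plain Python; they are not evalml classes or the library's actual signatures.

```python
# Hypothetical stand-ins for illustration only; none of these are evalml APIs.

def accuracy(y_true, y_predicted):
    """Fraction of predictions that match the true labels (y_true comes first)."""
    matches = sum(1 for t, p in zip(y_true, y_predicted) if t == p)
    return matches / len(y_true)


def balanced_accuracy(y_true, y_predicted):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        # Positions of the samples whose true class is `label`.
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in idx if y_predicted[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def score(y_true, y_predicted, objectives):
    """Score against every requested objective and return one dictionary."""
    if not objectives:
        raise ValueError("objectives is required and must not be empty")
    return {name: fn(y_true, y_predicted) for name, fn in objectives.items()}


y_true = [0, 0, 0, 1]
y_predicted = [0, 0, 1, 1]
scores = score(
    y_true,
    y_predicted,
    {"Accuracy": accuracy, "Balanced Accuracy": balanced_accuracy},
)
```

Here scores holds one entry per requested objective, rather than a single value for an objective fixed at construction time.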
- v0.8.0 Apr. 1, 2020
- Enhancements
Add normalization option and information to confusion matrix #484
Add util function to drop rows with NaN values #487
Renamed PipelineBase.name as PipelineBase.summary and redefined PipelineBase.name as class property #491
Added access to parameters in Pipelines with PipelineBase.parameters (used to be return of PipelineBase.describe) #501
Added fill_value parameter for SimpleImputer #509
Added functionality to override component hyperparameters and made pipelines take hyperparameters from components #516
Allow numpy.random.RandomState for random_state parameters #556
- Fixes
Removed unused dependency matplotlib, and move category_encoders to test reqs #572
- Changes
Undo version cap in XGBoost placed in #402 and allowed all releases of XGBoost #407
Support pandas 1.0.0 #486
Made all references to the logger static #503
Refactored model_type parameter for components and pipelines to model_family #507
Refactored problem_types for pipelines and components into supported_problem_types #515
Moved pipelines/utils.save_pipeline and pipelines/utils.load_pipeline to PipelineBase.save and PipelineBase.load #526
Limit number of categories encoded by OneHotEncoder #517
Warning
Breaking Changes
AutoClassificationSearch and AutoRegressionSearch’s model_types parameter has been refactored into allowed_model_families
ModelTypes enum has been changed to ModelFamily
Components and Pipelines now have a model_family field instead of model_type
get_pipelines utility function now accepts model_families as an argument instead of model_types
PipelineBase.name no longer returns structure of pipeline and has been replaced by PipelineBase.summary
PipelineBase.problem_types and Estimator.problem_types have been renamed to supported_problem_types
pipelines/utils.save_pipeline and pipelines/utils.load_pipeline moved to PipelineBase.save and PipelineBase.load
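The confusion matrix normalization option added in v0.8.0 (#484) can be pictured with a small sketch. It assumes row-wise normalization, where each row of true-label counts is divided by its total; the changelog does not say which convention the library uses, and normalize_rows is a hypothetical helper, not the library's function.

```python
def normalize_rows(matrix):
    """Divide each row by its sum so every row of true-label counts adds up to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            # A class with no samples cannot be normalized; fail loudly
            # instead of dividing by zero.
            raise ValueError("cannot normalize a row that sums to zero")
        normalized.append([count / total for count in row])
    return normalized


# Rows are true classes, columns are predicted classes (assumed layout).
counts = [[8, 2], [1, 9]]
normalized = normalize_rows(counts)
```

After normalization each row reads as the share of a true class assigned to each predicted class, which makes classes of different sizes comparable.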
- v0.7.0 Mar. 9, 2020
- Enhancements
Added emacs buffers to .gitignore #350
Add CatBoost (gradient-boosted trees) classification and regression components and pipelines #247
Added Tuner abstract base class #351
Added n_jobs as parameter for AutoClassificationSearch and AutoRegressionSearch #403
Changed colors of confusion matrix to shades of blue and updated axis order to match scikit-learn’s #426
Added PipelineBase graph and feature_importance_graph methods, moved from previous location #423
Added support for python 3.8 #462
- Changes
Added n_estimators as a tunable parameter for XGBoost #307
Remove unused parameter ObjectiveBase.fit_needs_proba #320
Remove extraneous parameter component_type from all components #361
Remove unused rankings.csv file #397
Downloaded demo and test datasets so unit tests can run offline #408
Remove _needs_fitting attribute from Components #398
Changed plot.feature_importance to show only non-zero feature importances by default, added optional parameter to show all #413
Refactored PipelineBase to take in parameter dictionary and moved pipeline metadata to class attribute #421
Dropped support for Python 3.5 #438
Removed unused apply.py file #449
Clean up requirements.txt to remove unused deps #451
Support installation without all required dependencies #459
- Documentation Changes
Update release.md with instructions to release to internal license key #354
- Testing Changes
Added tests for utils (and moved current utils to gen_utils) #297
Moved XGBoost install into its own separate step on Windows using Conda #313
Rewind pandas version to before 1.0.0, to diagnose test failures for that version #325
Added dependency update checkin test #324
Rewind XGBoost version to before 1.0.0 to diagnose test failures for that version #402
Update dependency check to use a whitelist #417
Update unit test jobs to not install dev deps #455
Warning
Breaking Changes
Python 3.5 will not be actively supported.
- v0.6.0 Dec. 16, 2019
- Enhancements
Added ability to create a plot of feature importances #133
Add early stopping to AutoML using patience and tolerance parameters #241
Added ROC and confusion matrix metrics and plot for classification problems and introduce PipelineSearchPlots class #242
Enhanced AutoML results with search order #260
Added utility function to show system and environment information #300
- Changes
Renamed automl classes to AutoRegressionSearch and AutoClassificationSearch #287
Updating demo datasets to retain column names #223
Moving pipeline visualization to PipelinePlots class #228
Standardizing inputs as pd.DataFrame / pd.Series #130
Enforcing that pipelines must have an estimator as last component #277
Added ipywidgets as a dependency in requirements.txt #278
Added Random and Grid Search Tuners #240
Warning
Breaking Changes
The fit() method for AutoClassifier and AutoRegressor has been renamed to search().
AutoClassifier has been renamed to AutoClassificationSearch
AutoRegressor has been renamed to AutoRegressionSearch
AutoClassificationSearch.results and AutoRegressionSearch.results are now dictionaries with pipeline_results and search_order keys. pipeline_results can be used to access a dictionary that is identical to the old .results dictionary, whereas search_order returns a list of the search order in terms of pipeline_id.
Pipelines now require an estimator as the last component in component_list. Slicing pipelines now throws a NotImplementedError to avoid returning pipelines without an estimator.
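The new shape of the results dictionary in v0.6.0 can be sketched as follows. Only the two top-level keys, pipeline_results and search_order, come from the notes above; the per-pipeline fields, the values, and the assumption that a higher score is better are made up for illustration.

```python
# Hypothetical data shaped like the v0.6.0 description; field names inside each
# pipeline entry are illustrative assumptions, not the library's actual fields.
results = {
    "pipeline_results": {
        0: {"pipeline_name": "Pipeline A", "score": 0.81},
        1: {"pipeline_name": "Pipeline B", "score": 0.87},
        2: {"pipeline_name": "Pipeline C", "score": 0.84},
    },
    "search_order": [0, 1, 2],
}

# Walk the pipelines in the order the search evaluated them.
in_search_order = [
    results["pipeline_results"][pipeline_id]["pipeline_name"]
    for pipeline_id in results["search_order"]
]

# Pick the best pipeline id, assuming a higher score is better.
best_id = max(
    results["search_order"],
    key=lambda pipeline_id: results["pipeline_results"][pipeline_id]["score"],
)
```

Code that previously read the old .results dictionary directly would index into pipeline_results instead.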
- v0.5.2 Nov. 18, 2019
- v0.5.1 Nov. 15, 2019
- v0.5.0 Oct. 29, 2019
- Enhancements
Added basic one hot encoding #73
Use enums for model_type #110
Support for splitting regression datasets #112
Auto-infer multiclass classification #99
Added support for other units in max_time #125
Detect highly null columns #121
Added additional regression objectives #100
Show an interactive iteration vs. score plot when using fit() #134
- v0.4.1 Sep. 16, 2019
- Enhancements
Added AutoML for classification and regression using Autobase and Skopt #7 #9
Implemented standard classification and regression metrics #7
Added logistic regression, random forest, and XGBoost pipelines #7
Implemented support for custom objectives #15
Feature importance for pipelines #18
Serialization for pipelines #19
Allow fitting on objectives for optimal threshold #27
Added detect label leakage #31
Implemented callbacks #42
Allow for multiclass classification #21
Added support for additional objectives #79
- Testing Changes
Added testing for loading data #39
- v0.2.0 Aug. 13, 2019
- Enhancements
Created fraud detection objective #4
- v0.1.0 July 31, 2019