Commit Graph

295 Commits

Author SHA1 Message Date
Shaokun dd9202bb01
Update OptunaSearch (#1106)
* update optuna

* update setup

* fix dependencies

* fix bugs in test

* fix bugs, web format

---------

Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-07-05 03:14:02 +00:00
Chi Wang 4f1dfe6676
doc update (#1089)
* doc update

* add link to mathchat notebook

* use_docker property

* function name

* version update

---------

Co-authored-by: kevin666aa <yrwu000627@gmail.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-07-04 20:29:32 +00:00
Shaokun 7a64148676
support string alg in tune (#1093)
* support string alg in tune

* add test, enforce string feasible, support lexico in set_search_priorities in CFO

* fix bug

* fix bug

* fix bug

* fix bug

* fix bugs

* fix yiran

---------

Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”>
2023-07-01 03:01:14 +00:00
Yiran Wu e3ca95bf8a
An agent implementation of MathChat (#1090)
* mathchat implementation

* code format

* update readme

* update openai.yml

* update openai.yml

* update openai.yml
2023-06-25 13:49:34 +00:00
EgorKraevTransferwise 5245efbd2c
Factor out time series-related functionality into a time series Task object (#989)
* Refactor into automl subpackage

Moved some of the packages into an automl subpackage to tidy before the
task-based refactor. This is in response to discussions with the group
and a comment on the first task-based PR.

Only changes here are moving subpackages and modules into the new
automl, fixing imports to work with this structure and fixing some
dependencies in setup.py.

* Fix doc building post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Remove vw from test deps as this is breaking the build

* Move default back to the top-level

I'd moved this to automl as that's where it's used internally, but had
missed that this is actually part of the public interface so makes sense
to live where it was.

* Re-add top level modules with deprecation warnings

flaml.data, flaml.ml and flaml.model are re-added to the top level,
being re-exported from flaml.automl for backwards compatibility. Adding
a deprecation warning so that we can have a planned removal later.

* Fix model.py line-endings

* WIP

* WIP - Notes below

Got to the point where the methods from AutoML are pulled to
GenericTask. Started removing private markers and removing the passing
of automl to these methods. Done with decide_split_type, started on
prepare_data. Need to do the others after

* Re-add generic_task

* Most of the merge done, test_forecast_automl fit succeeds, fails at predict()

* Remaining fixes - test_forecast.py passes

* Comment out holidays-related code as it's not currently used

* Further holidays cleanup

* Fix imports in a test

* tidy up validate_data in time series task

* Test fixes

* Fix tests: add Task.__str__

* Fix tests: test for ray.ObjectRef

* Hotwire TS_Sklearn wrapper to fix test fail

* Attempt at test fix

* Fix test where val_pred_y is a list

* Attempt to fix remaining tests

* Push to retrigger tests

* Push to retrigger tests

* Push to retrigger tests

* Push to retrigger tests

* Remove plots from automl/test_forecast

* Remove unused data size field from Task

* Fix import for CLASSIFICATION in notebook

* Monkey patch TFT to avoid plotting, to fix tests on MacOS

* Monkey patch TFT to avoid plotting v2, to fix tests on MacOS

* Monkey patch TFT to avoid plotting v2, to fix tests on MacOS

* Fix circular import

* remove redundant code in task.py post-merge

* Fix test: set svd_solver="full" in PCA

* Update flaml/automl/data.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Fix review comments

* Fix task -> str in custom learner constructor

* Remove unused CLASSIFICATION imports

* Hotwire TS_Sklearn wrapper to fix test fail by setting
optimizer_for_horizon == False

* Revert changes to the automl_classification and pin FLAML version

* Fix imports in reverted notebook

* Fix FLAML version in automl notebooks

* Fix ml.py line endings

* Fix CLASSIFICATION task import in automl_classification notebook

* Uncomment pip install in notebook and revert import

Not convinced this will work because of installing an older version of
the package into the environment in which we're running the tests, but
let's see.

* Revert c6a5dd1a0

* Fix get_classification_objective import in suggest.py

* Remove hcrystallball docs reference in TS_Sklearn

* Merge markharley:extract-task-class-from-automl into this

* Fix import, remove smooth.py

* Fix dependencies to fix TFT fail on Windows Python 3.8 and 3.9

* Add tensorboardX dependency to fix TFT fail on Windows Python 3.8 and 3.9

* Set pytorch-lightning==1.9.0 to fix TFT fail on Windows Python 3.8 and 3.9

* Set pytorch-lightning==1.9.0 to fix TFT fail on Windows Python 3.8 and 3.9

* Disable PCA reduction of lagged features for now, to fix svd convergence fail

* Merge flaml/main into time_series_task

* Attempt to fix formatting

* Attempt to fix formatting

* tentatively implement holt-winters-no covariates

* fix forecast method, clean class

* checking external regressors too

* update test forecast

* remove duplicated test file, re-add sarimax, search space cleanup

* Update flaml/automl/model.py

removed links. Most important one probably was: https://robjhyndman.com/hyndsight/ets-regressors/

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* prevent short series

* add docs

* First attempt at merging Holt-Winters

* Linter fix

* Add holt-winters to TimeSeriesTask.estimators

* Fix spark test fail

* Attempt to fix another spark test fail

* Attempt to fix another spark test fail

* Change Black max line length to 127

* Change Black max line length to 120

* Add logging for ARIMA params, clean up time series models inheritance

* Add more logging for missing ARIMA params

* Remove a meaningless test causing a fail, add stricter check on ARIMA params

* Fix a bug in HoltWinters

* A pointless change to hopefully trigger the on and off KeyError in ARIMA.fit()

* Fix formatting

* Attempt to fix formatting

* Attempt to fix formatting

* Attempt to fix formatting

* Attempt to fix formatting

* Add type annotations to _train_with_config() in state.py

* Add type annotations to prepare_sample_train_data() in state.py

* Add docstring for time_col argument of AutoML.fit()

* Address @sonichi's comments on PR

* Fix formatting

* Fix formatting

* Reduce test time budget

* Reduce test time budget

* Increase time budget for the test to pass

* Remove redundant imports

* Remove more redundant imports

* Minor fixes of points raised by Qingyun

* Try to fix pandas import fail

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Formatting fixes

* More formatting fixes

* Added test that loops over TS models to ensure coverage

* Fix formatting issues

* Fix more formatting issues

* Fix random fail in check

* Put back in tests for ARIMA predict without fit

* Put back in tests for lgbm

* Update test/test_model.py

cover dedup

* Match target length to X length in missing test

---------

Co-authored-by: Mark Harley <mark.harley@transferwise.com>
Co-authored-by: Mark Harley <mharley.code@gmail.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Andrea W <a.ruggerini@ammagamma.com>
Co-authored-by: Andrea Ruggerini <nescio.adv@gmail.com>
Co-authored-by: Egor Kraev <Egor.Kraev@tw.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-06-19 11:20:32 +00:00
Chi Wang 8760631349
string to array (#1086)
* string to array

* exclude aoai
2023-06-17 13:11:22 +00:00
Chi Wang e1da7f7d68
update openai model support (#1082)
* update openai model support

* new gpt3.5

* docstr

* function_call and content may co-exist

* test function call

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-06-16 00:58:44 +00:00
Qingyun Wu 0c7082c7bf
Documentation for agents (#1057)
* add agent notebook and documentation

* fix bug

* set flush to True when printing msg in agent

* add a math problem in agent notebook

* remove

* header

* improve notebook doc

* notebook update

* improve notebook example

* improve doc

* improve notebook doc

* improve print

* doc

* human_input_mode

* human_input_mode str

* indent

* indent

* Update flaml/autogen/agent/user_proxy_agent.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* add agent doc

* del old files

* remove chat

* agent doc

* remove chat_agent

* naming

* improve documentation

* wording

* improve agent doc

* wording

* general auto reply

* update agent doc

* human input mode

* add agent figure

* update agent figure

* update agent example figure

* update code example

* extensibility of UserProxyAgent

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-06-14 23:56:13 +00:00
Chi Wang c5dfb03f0e
encode timeout msg in bytes (#1078)
* encode timeout msg in bytes

* fix msg and test
2023-06-12 18:07:14 +00:00
Chi Wang 5387a0a607
Agent notebook example with human feedback; Support shell command and multiple code blocks; Improve the system message for assistant agent; Improve utility functions for config lists; reuse docker image (#1056)
* add agent notebook and documentation

* fix bug

* set flush to True when printing msg in agent

* add a math problem in agent notebook

* remove

* header

* improve notebook doc

* notebook update

* improve notebook example

* improve doc

* agent notebook example with user feedback

* log

* log

* improve notebook doc

* improve print

* doc

* human_input_mode

* human_input_mode str

* indent

* indent

* Update flaml/autogen/agent/user_proxy_agent.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* shell command and multiple code blocks

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* coding agent

* math notebook

* renaming and doc format

* typo

* infer lang

* sh

* docker

* docker

* reset consecutive autoreply counter

* fix explanation

* paper talk

* human feedback

* web info

* rename test

* config list explanation

* link to blogpost

* installation

* homepage features

* features

* features

* rename agent

* remove notebook

* notebook test

* docker command

* notebook update

* lang -> cmd

* notebook

* make it work for gpt-3.5

* return full log

* quote

* docker

* docker

* docker

* docker

* docker

* docker image list

* notebook

* notebook

* use_docker

* use_docker

* use_docker

* doc

* agent

* doc

* abs path

* pandas

* docker

* reuse docker image

* context window

* news

* print format

* pyspark version in py3.8

* pyspark in py3.8

* pyspark and ray

* quote

* pyspark

* pyspark

* pyspark

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-06-09 18:40:04 +00:00
Chi Wang b90e9ee283
doc and test update (#1053)
* doc and test update

* docker update
2023-05-26 20:24:30 +00:00
Chi Wang a0b318b12e
create an automl option to remove unnecessary dependency for autogen and tune (#1007)
* version update post release v1.2.2

* automl option

* import pandas

* remove automl.utils

* default

* test

* type hint and version update

* dependency update

* link to open in colab

* use packaging.version to close #725

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-05-24 23:55:04 +00:00
Chi Wang e463146cb8
response filter (#1039)
* response filter

* rewrite implement based on the filter

* multi responses

* abs path

* code handling

* option to not use docker

* context

* eval_only -> raise_error

* notebook

* utils

* utils

* separate tests

* test

* test

* test

* test

* test

* test

* test

* test

* **config in test()

* test

* test

* filename
2023-05-21 22:22:29 +00:00
Li Jiang 7de4eb347d
Fix PULL_REQUEST_TEMPLATE and improve test by removing unnecessary environment variable (#1043)
* Improve test by removing unnecessary environment variable

* Fix PULL_REQUEST_TEMPLATE

* Hide pre-commit check

* remove the checkbox for pre-commit

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-05-19 20:05:14 +00:00
Qingyun Wu 2e43509690
Human agent (#1025)
* add human agent and chat agent

* feedback msg

* clean print

* remove redundant import

* make coding agent work

* import check

* terminate condition

* rename

* add docstr

* exitcode to str

* print

* save and execute code

* add max_turn_num

* add max_turn_num in test_agent.py

* reduce max_turn_num in the test

* change max_turn_num to max_consecutive_auto_reply

* update human proxy agent

* remove execution agent and dated docstr

* clean doc

* add back work_dir

* add is_termination_msg when mode is NEVER

* revise stop condition

* remove work_dir in coding agent

* human_proxy_agent docstr

* auto_reply

* clean auto_reply

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-05-16 00:37:38 +00:00
Susan Xueqing Liu f01acb67f6
update model of text summarization (#1030)
2023-05-10 00:48:22 +00:00
Chi Wang 59e882e5cc
chat completion check (#1024)
* chat completion check

* add test

* doc

* timeout

* bump version to 1.2.4
2023-05-09 20:39:46 +00:00
Chi Wang b3fba9734e
Mark experimental classes; doc; multi-config trial (#1021)
* Mark experimental classes

* template

* multi model

* test

* multi-config doc

* doc

* doc

* test

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-05-05 02:48:31 +00:00
Li Jiang 8b2411b219
update spark session in spark tests (#1006)
* add mlflow and spark integration tests

* remove unused params

* remove mlflow tests
2023-05-03 09:59:29 +00:00
Li Jiang fd1f36597b
update max_spark_parallelism to fit in auto-scale spark cluster (#1008)
* update max_spark_parallelism to fit in auto-scale spark cluster

* update test
2023-05-03 09:16:32 +00:00
garar 31864d2d77
Add mlflow_logging param (#1015)
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-05-03 03:09:04 +00:00
Chi Wang 19aee67f55
coding agent; logging (#1011)
* coding agent

* tsp

* tsp

* aoai

* logging

* compact

* Handle Import Error

* cost function

* reset counter; doc

* reset_counter

* home page update

* use case

* catboost in linux

* catboost

* catboost

* catboost

* doc

* intro

* catboost
2023-05-02 20:38:23 +00:00
Chi Wang fa5ccea862
extract code from text; solve_problem; request_timeout in config; improve code (#999)
* extract code from text

* solve_problem; request_timeout in config

* improve

* move import statement

* improve code

* generate assertions

* constant

* configs for implement; voting

* doc

* execute code in docker

* success indicator of code execution in docker

* success indicator

* execute code

* strip n

* add cost in generate_code

* add docstr

* filename

* bytes

* check docker version

* print log

* python test

* remove api key address

* rename exit code

* success exit code

* datasets

* exit code

* recover openai tests

* cache and pattern match

* wait

* wait

* cache and test

* timeout test

* python image name and skip macos

* windows image

* docker images

* volume path and yaml

* win path -> posix

* extensions

* path

* path

* path

* path

* path

* path

* path

* path

* path

* path

* path

* skip windows

* path

* timeout in windows

* use_docker

* use_docker

* hot fix from #1000

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-04-23 11:50:29 +00:00
Chi Wang d4070e24c1
make context optional; improve error handling and doc (#997)
* make context optional

* better error handling and doc

* skip instantiation if no context

* skip test
2023-04-16 21:18:32 +00:00
Li Jiang c9fc622af1
fix tests failure caused by version incompatibility (#995)
2023-04-15 14:52:40 +00:00
Chi Wang c780d79004
Post release update (#985)
* news update

* doc update

* avoid KeyError

* bump version to 1.2.1

* handle empty responses

* typo

* eval function
2023-04-10 20:46:28 +00:00
Jirka Borovec a701cd82f8
set black with 120 line length (#975)
* set black with 120 line length

* apply pre-commit

* apply black
2023-04-10 19:50:40 +00:00
Susan Xueqing Liu ef5a17cd83
handling nlp divide by zero (#926)
* handling nlp divide by zero

* catching zerodivisionerror

* catching zerodivisionerror

* catching zerodivisionerror

* addressing comments

* addressing comments

* updating test case

* update

* add blank to last line

* update nlp notebook

* rerun

* rerun

* sync with main

* add model selection for nlg

* addressing keyerror

* add raise exception

* update

* fix bug

* revert

* updating automl_nlp

* Update flaml/automl/model.py

Co-authored-by: Zvi Baratz <z.baratz@gmail.com>

* address comments

* address comments

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Zvi Baratz <z.baratz@gmail.com>
2023-04-09 16:53:30 +00:00
Chi Wang 82f0a4309d
autogen subpackage (#968)
* math utils in autogen

* cleanup

* code utils

* remove check function from code response

* comment out test

* GPT-4

* increase request timeout

* name

* logging and error handling

* better doc

* doc

* codegen optimized

* GPT series

* text

* no demo example

* math

* import openai

* import openai

* azure model name

* azure model name

* openai version

* generate assertion if necessary

* condition to generate assertions

* init region key

* rename

* comments about budget

* prompt

---------

Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
2023-04-08 03:04:01 +00:00
Andrea Ruggerini 7f9402b8fd
Add Holt-Winters exponential smoothing (#962)
* tentatively implement holt-winters-no covariates

* fix forecast method, clean class

* checking external regressors too

* update test forecast

* remove duplicated test file, re-add sarimax, search space cleanup

* Update flaml/automl/model.py

removed links. Most important one probably was: https://robjhyndman.com/hyndsight/ets-regressors/

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* prevent short series

* add docs

---------

Co-authored-by: Andrea W <a.ruggerini@ammagamma.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-04-04 17:29:54 +00:00
Qingyun Wu 45641000c0
Adding a test function for OpenAI completion in flaml (#951)
* improve max_valid_n and doc

* Update README.md

Co-authored-by: Li Jiang <lijiang1@microsoft.com>

* add support for chatgpt

* notebook

* newline at end of file

* chatgpt notebook

* ChatGPT in Azure

* doc

* math

* warning, timeout, log file name

* handle import error

* doc update; default value

* paper

* doc

* docstr

* eval_func

* add a test func in completion

* update notebook

* update math notebook

* improve notebook

* lint and handle exception

* flake8

* exception in test

* add agg_method

* NameError

* refactor

* Update flaml/integrations/oai/completion.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update flaml/integrations/oai/completion.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* add example

* merge files from oai_eval_test

* Revert "merge files from oai_eval_test"
This reverts commit 1e6a550f913bb94df6e9680934ccb7175d00702e.

* merge

* save results to notebook_output

* update version and cache

* update doc

* save nb cell results to file

* fix typo in model name

* code improvements

* improve docstr

* docstr

* docstr on the Returns of test

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
2023-04-02 16:14:11 +00:00
levscaut 05c5f8f426
more tolerant time limit for test_overtime (#960)
* more tolerant time limit for test_overtime

* Cancel assertion because the GitHub VM is sometimes super slow

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2023-03-27 04:12:50 +00:00
Chi Wang 595f5a8025
gpt-4 support; openai workflow fix; model str; timeout; voting (#958)
* workflow; model str; timeout

* voting

* notebook

* pull request

* recover workflow

* voted answer

* aoai

* ignore None answer

* default config

* note

* gpt-4

* n=5

* cleanup

* config name

* introduction

* readme

* avoid None

* add output/ to gitignore

* openai version

* invalid var

* comment long running cells
2023-03-26 17:13:06 +00:00
Li Jiang 50334f2c52
Support spark dataframe as input dataset and spark models as estimators (#934)
* add basic support to Spark dataframe

add support to SynapseML LightGBM model

update to pyspark>=3.2.0 to leverage pandas_on_Spark API

* clean code, add TODOs

* add sample_train_data for pyspark.pandas dataframe, fix bugs

* improve some functions, fix bugs

* fix dict change size during iteration

* update model predict

* update LightGBM model, update test

* update SynapseML LightGBM params

* update synapseML and tests

* update TODOs

* Added support to roc_auc for spark models

* Added support to score of spark estimator

* Added test for automl score of spark estimator

* Added cv support to pyspark.pandas dataframe

* Update test, fix bugs

* Added tests

* Updated docs, tests, added a notebook

* Fix bugs in non-spark env

* Fix bugs and improve tests

* Fix uninstall pyspark

* Fix tests error

* Fix java.lang.OutOfMemoryError: Java heap space

* Fix test_performance

* Update test_sparkml to test_0sparkml to use the expected spark conf

* Remove unnecessary widgets in notebook

* Fix iloc java.lang.StackOverflowError

* fix pre-commit

* Added params check for spark dataframes

* Refactor code for train_test_split to a function

* Update train_test_split_pyspark

* Refactor if-else, remove unnecessary code

* Remove y from predict, remove mem control from n_iter compute

* Update workflow

* Improve _split_pyspark

* Fix test failure of too short training time

* Fix typos, improve docstrings

* Fix index errors of pandas_on_spark, add spark loss metric

* Fix typo of ndcgAtK

* Update NDCG metrics and tests

* Remove useless logger

* Use cache and count to ensure consistent indexes

* refactor for merge main

* fix errors of refactor

* Updated SparkLightGBMEstimator and cache

* Updated config2params

* Remove unused import

* Fix unknown parameters

* Update default_estimator_list

* Add unit tests for spark metrics
2023-03-25 19:59:46 +00:00
Mark Harley 27b2712016
Extract task class from automl (#857)
* Refactor into automl subpackage

Moved some of the packages into an automl subpackage to tidy before the
task-based refactor. This is in response to discussions with the group
and a comment on the first task-based PR.

Only changes here are moving subpackages and modules into the new
automl, fixing imports to work with this structure and fixing some
dependencies in setup.py.

* Fix doc building post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Remove vw from test deps as this is breaking the build

* Move default back to the top-level

I'd moved this to automl as that's where it's used internally, but had
missed that this is actually part of the public interface so makes sense
to live where it was.

* Re-add top level modules with deprecation warnings

flaml.data, flaml.ml and flaml.model are re-added to the top level,
being re-exported from flaml.automl for backwards compatibility. Adding
a deprecation warning so that we can have a planned removal later.

* Fix model.py line-endings

* WIP

* WIP - Notes below

Got to the point where the methods from AutoML are pulled to
GenericTask. Started removing private markers and removing the passing
of automl to these methods. Done with decide_split_type, started on
prepare_data. Need to do the others after

* Re-add generic_task

* Fix tests: add Task.__str__

* Fix tests: test for ray.ObjectRef

* Hotwire TS_Sklearn wrapper to fix test fail

* Remove unused data size field from Task

* Fix import for CLASSIFICATION in notebook

* Update flaml/automl/data.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Fix review comments

* Fix task -> str in custom learner constructor

* Remove unused CLASSIFICATION imports

* Hotwire TS_Sklearn wrapper to fix test fail by setting
optimizer_for_horizon == False

* Revert changes to the automl_classification and pin FLAML version

* Fix imports in reverted notebook

* Fix FLAML version in automl notebooks

* Fix ml.py line endings

* Fix CLASSIFICATION task import in automl_classification notebook

* Uncomment pip install in notebook and revert import

Not convinced this will work because of installing an older version of
the package into the environment in which we're running the tests, but
let's see.

* Revert c6a5dd1a0

* Revert "Revert c6a5dd1a0"

This reverts commit e55e35adea03993de87b23f092b14c6af623d487.

* Black format model.py

* Bump version to 1.1.2 in automl_xgboost

* Add docstrings to the Task ABC

* Fix import in custom_learner

* fix 'optimize_for_horizon' for ts_sklearn

* remove debugging print statements

* Check for is_forecast() before is_classification() in decide_split_type

* Attempt to fix formatting fail

* Another attempt to fix formatting fail

* And another attempt to fix formatting fail

* Add type annotations for task arg in signatures and docstrings

* Fix formatting

* Fix linting

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: EgorKraevTransferwise <egor.kraev@transferwise.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Kevin Chen <chenkevin.8787@gmail.com>
2023-03-11 02:39:08 +00:00
Chi Wang 169012f3e7
ChatGPT support (#942)
* improve max_valid_n and doc

* Update README.md

Co-authored-by: Li Jiang <lijiang1@microsoft.com>

* add support for chatgpt

* notebook

* newline at end of file

* chatgpt notebook

* ChatGPT in Azure

* doc

* math

* warning, timeout, log file name

* handle import error

* doc update; default value

* paper

* doc

* docstr

* eval_func

* prompt and messages

* remove confusing words

* notebook name

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
2023-03-10 19:35:36 +00:00
Jirka Borovec 2ff1035733
precommit: end-of-file-fixer (#929)
* precommit: end-of-file-fixer

* exclude .gitignore

* apply

---------

Co-authored-by: Shaokun <shaokunzhang529@gmail.com>
2023-02-28 16:27:14 +00:00
levscaut c6a2440348
add PySparkOvertimeMonitor to avoid exceeding time budget (#923)
* merging

* clean commit

* Delete mylearner.py

This file is not needed.

* fix py4j import error

* more tolerant cancelling time

* fix problems following suggestions

* Update flaml/tune/spark/utils.py

Co-authored-by: Li Jiang <bnujli@gmail.com>

* remove redundant model

* Update test/spark/custom_mylearner.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* add docstr

* reverse change in gitignore

* Update test/spark/custom_mylearner.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-02-24 08:07:00 +00:00
Jirka Borovec 6aa1d16ebc
pre-commit: update config (#925)
* update config

* apply precommit
2023-02-22 00:49:38 +00:00
Chi Wang 35ce9b79e8
azure oai (#920)
* azure oai

* price update in notebook

* text Davinci

* pytorch-lightning version

* trigger action in merge queue

* types

* doc check in merge group
2023-02-16 23:38:50 +00:00
Chi Wang 63d350d4c8
Openai (#905)
* add cost budget; move loc of make_dir

* support openai completion

* install pytest in workflow

* skip openai test

* test openai

* path for docs rebuild

* install datasets

* signal

* notebook

* notebook in workflow

* optional arguments and special params

* key -> k

* improve readability

* assumption

* optimize for model selection

* larger range of max_tokens

* notebook

* python package workflow

* skip on win
2023-02-05 20:13:08 -08:00
Chi Wang 3b6bfc2876
add cost budget; move loc of make_dir (#888)
* add cost budget; move loc of make_dir

* remove None in return

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-02-05 19:34:59 -05:00
Chi Wang fbea1d06dd
stratified group kfold splitter (#899)
* stratified group kfold splitter

* exclude catboost

---------

Co-authored-by: Shaokun <shaokunzhang529@gmail.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-02-05 18:26:14 -05:00
skzhang1 184251a2a7
update
2023-01-28 06:53:37 -08:00
Shaokun 60a3e85b98
Merge branch 'main' into support_percentages
2023-01-17 10:06:51 -05:00
skzhang1 3a68da8774
update
2023-01-17 06:49:59 -08:00
Chi Wang 75e3454120
notebook test; spark warning message; reproducibility bug; sequential tuning stop condition (#869)
* notebook test

* add ipykernel, remove except

* only create dir if not empty

* Stop sequential tuning when result is None

* fix reproducibility of global search

* save gs seed

* use get to avoid KeyError

* test
2023-01-07 18:39:29 -08:00
skzhang1 b7c0c24269
support percentage tolerance for lexicographic
2023-01-07 11:41:24 -08:00
Antoni Baum 5f67c0ab8a
Do not persist entire AutoMLState in Searcher (#870)
* Do not persist entire AutoMLState in Searcher

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>

* Fix tests

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
2023-01-05 18:00:05 -08:00
Chi Wang 90aea9c28b
create dir for log file name (#867)
2022-12-30 10:21:30 -08:00
Li Jiang da2cd7ca89
Add support for using Spark as the backend of parallel training (#846)
* Added spark support for parallel training.

* Added tests and fixed a bug

* Added more tests and updated docs

* Updated setup.py and docs

* Added customize_learner and tests

* Update spark tests and setup.py

* Update docs and verbose

* Update logging, fix issue in cloud notebook

* Update github workflow for spark tests

* Update github workflow

* Remove hack of handling _choice_

* Allow for failures

* Fix tests, update docs

* Update setup.py

* Update Dockerfile for Spark

* Update tests, remove some warnings

* Add test for notebooks, update utils

* Add performance test for Spark

* Fix lru_cache maxsize

* Fix test failures on some platforms

* Fix coverage report failure

* resolve PR comments

* resolve PR comments 2nd round

* resolve PR comments 3rd round

* fix lint and rename test class

* resolve PR comments 4th round

* refactor customize_learner to broadcast_code
2022-12-23 08:18:49 -08:00
Jing Dong 3a194d047b
fix checkpoint.value in the notebook and test
2022-12-19 09:22:16 -08:00
Chi Wang 232c356a4b
fix bug related to _choice_ (#848)
* fix bug related to _choice_

* remove py 3.6

* sanitize config

* optimize test
2022-12-13 15:48:32 -05:00
Mark Harley 44ddf9e104
Refactor into automl subpackage (#809)
* Refactor into automl subpackage

Moved some of the packages into an automl subpackage to tidy before the
task-based refactor. This is in response to discussions with the group
and a comment on the first task-based PR.

Only changes here are moving subpackages and modules into the new
automl, fixing imports to work with this structure and fixing some
dependencies in setup.py.

* Fix doc building post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Remove vw from test deps as this is breaking the build

* Move default back to the top-level

I'd moved this to automl as that's where it's used internally, but had
missed that this is actually part of the public interface so makes sense
to live where it was.

* Re-add top level modules with deprecation warnings

flaml.data, flaml.ml and flaml.model are re-added to the top level,
being re-exported from flaml.automl for backwards compatibility. Adding
a deprecation warning so that we can have a planned removal later.

* Fix model.py line-endings

* Pin pytorch-lightning to less than 1.8.0

We're seeing strange lightning related bugs from pytorch-forecasting
since the release of lightning 1.8.0. Going to try constraining this to
see if we have a fix.

* Fix the lightning version pin

Was optimistic with setting it in the 1.7.x range, but that isn't
compatible with python 3.6

* Remove lightning version pin

* Revert dependency version changes

* Minor change to retrigger the build

* Fix line endings in ml.py and model.py

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: EgorKraevTransferwise <egor.kraev@transferwise.com>
2022-12-06 15:46:08 -05:00
Chi Wang 92b79221b6
make performance test reproducible (#837)
* make performance test reproducible

* fix test error

* Doc update and disable logging

* document random_state and version

* remove hardcoded budget

* fix test error and dependency; close #777

* iloc
2022-12-06 10:13:39 -08:00
Shreyas 3b3b0bfa8e
roc_auc_weighted metric addition (#827)
* Pending changes exported from your codespace

* Update flaml/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update flaml/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update flaml/ml.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update flaml/ml.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update website/docs/Examples/Integrate - Scikit-learn Pipeline.md

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* added documentation for new metric

* Update flaml/ml.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* minor notebook changes

* Update Integrate - Scikit-learn Pipeline.md

* Update notebook/automl_classification.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update integrate_azureml.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-12-02 19:27:32 -08:00
Li Jiang 2501b86444
fix typo of output directory (#828)
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-11-30 17:04:29 -08:00
Chi Wang 70d86942f4
skip test in py 3.6 (#832) 2022-11-29 13:10:35 -08:00
Chi Wang 595af7a04f
install editable package in codespace (#826)
* install editable package in codespace

* fix test error in test_forecast

* fix test error in test_space

* openml version

* break tests; pre-commit

* skip on py10+win32

* install mlflow in test

* install mlflow in [test]

* skip test in windows

* import

* handle PermissionError

* skip test in windows

* skip test in windows

* skip test in windows

* skip test in windows

* remove ts_forecast_panel from doc
2022-11-27 14:22:54 -05:00
Anonymous-submission-repo 5eb9927642
Add performance test for LexiFlow (#812)
* add test

* fix

* change test name
2022-11-15 10:44:53 -05:00
Chi Wang 30e200985c
Fix issues related to zero-shot automl (#783)
* skip in-search-space check for small max iter

* resolve Pickle Transformer #730

* resolve default config unrecognized #784

* Change definition of init_config

* copy points_to_evaluate

* make test pass

* check learner selector
2022-11-13 12:47:59 -08:00
Anonymous-submission-repo 2daaa4c637 clean up 2022-10-15 03:53:08 +00:00
Anonymous-submission-repo 6df7782c5e
Update test/tune/test_lexiflow.py
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-10-14 22:52:07 -04:00
Anonymous-submission-repo a1d9e333fe update 2022-10-14 23:48:05 +00:00
Anonymous-submission-repo c3baf2d4ee delete automl 2022-10-14 23:30:24 +00:00
Anonymous-submission-repo 585bde1ce6 Merge branch 'LexiFlow' of https://github.com/Anonymous-submission-repo/FLAML into LexiFlow 2022-10-14 20:43:50 +00:00
Anonymous-submission-repo bf81912f09 update 2022-10-14 20:40:49 +00:00
Chi Wang cafb67123a
Merge branch 'main' into LexiFlow 2022-10-14 11:04:18 -07:00
Susan Xueqing Liu 2ebddd67ae
Remove NLP classification head (#756)
* rm classification head in nlp

* rm classification head in nlp

* rm classification head in nlp

* adding test cases for switch classification head

* adding test cases for switch classification head

* Update test/nlp/test_autohf_classificationhead.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* adding test cases for switch classification head

* run each test separately

* skip classification head test on windows

* disabling wandb reporting

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* fix test nlp custom metric

* Update website/docs/Examples/AutoML-NLP.md

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update website/docs/Examples/AutoML-NLP.md

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* fix test nlp custom metric

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-10-12 17:04:42 -07:00
Anonymous-submission-repo 2d18c49cdd update 2022-10-12 04:31:51 +00:00
Anonymous-submission-repo 4e37826417 update 2022-10-10 01:24:22 +00:00
Anonymous-submission-repo f7a9d42dc7 update 2022-10-10 01:15:17 +00:00
Anonymous-submission-repo 9bc32acafb first 2022-10-09 11:39:29 -04:00
Chi Wang 860cbc233e
move searcher and scheduler into tune (#746)
* move into tune

* correct path

* correct path

* import path
2022-10-04 16:03:22 -07:00
Xueqing Liu ceb3e300cd
Issue724 (#745)
* fixing issue724

* fixing issue724
2022-10-04 10:51:12 -04:00
Chi Wang b7a010e657
Move import location for Ray 2 (#721)
* ray version check when importing

* display learner_class when starting_points removed

* test ray 2
2022-09-13 19:13:06 -07:00
Xueqing Liu 2314cc5a7e
"intermediate_results" TypeError: argument of type 'NoneType' is not iterable (#695)
* fix mlflow bug

* bump version
2022-08-22 13:36:50 -04:00
Chi Wang dffa802b3e
use_best_model for catboost (#679)
* use_best_model for catboost

* bump version to 1.0.11
2022-08-20 18:38:56 -07:00
Xueqing Liu 3d1a28bfc0
Add preserve_checkpoint to preserve the checkpoint after del (#692)
* fix del bug
2022-08-20 18:17:10 -04:00
Qingyun Wu 8b3c6e4d7b
VW version requirement and documentation on config_constraints vs metric_constraints (#686)
* add vw version requirement

* vw version

* version range

* add documentation

* vw version range

* skip test on py3.10

* vw version

* rephrase

* don't install vw on py 3.10

* move import location

* remove inherit

* 3.10 in version

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-08-15 20:16:11 -07:00
Chi Wang d60d38b3e9
log_file_name in tune.run() (#681)
* log_file_name in tune.run()

* use_ray validates log_file_name

* assert no ray_args when not use_ray

* import os and use os.path
2022-08-15 06:15:31 -07:00
Chi Wang 5e1059ab82
check config constraints for the initial config (#685)
* check config constraints for the initial config

* default config value
2022-08-15 05:30:23 -07:00
jmrichardson e43485607a
Disable shuffle for custom CV (#659)
* Disable shuffle for custom CV

* Add custom fold shuffle test

* Update test_split.py

* Update test_split.py
2022-08-12 17:05:32 -07:00
Chi Wang ca9f9054e7
categorical choice can be ordered or unordered (#677)
* categorical choice can be ordered or unordered

* ordered -> order

* move choice into utils

* version comparison

* packaging -> setuptools

* import version

* version_parse

* test order for choice
2022-08-12 13:55:17 -07:00
Kevin Chen f718d18b5e
time series forecasting with panel datasets (#541)
* time series forecasting with panel datasets
- integrate Temporal Fusion Transformer as a learner based on pytorchforecasting

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py and test_forecast.py
- remove blank lines

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py to prevent errors

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py and data.py
- change forecast task name
- update documentation for fit() method

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py
- add performance test
- use 'fit_kwargs_by_estimator'

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* add time index function

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py performance test

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update data.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update data.py to prevent type error

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update for pytorch forecasting tft on panel datasets

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py documentations

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* - rename estimator
- add 'gpu_per_trial' for tft estimator

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* include ts panel forecasting as an example

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update documentations

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl_time_series_forecast.ipynb

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update documentations

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* "weights_summary" argument deprecated and removed for pl.Trainer()

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py tft estimator prediction method

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update `fit_kwargs` documentation

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-08-12 08:39:22 -07:00
jmrichardson 25ad397d55
Skip transform (#665)
* Skip transform

* Fix logic and docstring, add test

* Add period ending to skip_transform doc

* Add skip_transform to retrain_from_log method

* Update test/automl/test_classification.py

Co-authored-by: Xueqing Liu <liususan091219@users.noreply.github.com>

Co-authored-by: Xueqing Liu <liususan091219@users.noreply.github.com>
2022-08-11 19:41:23 -04:00
Rui Zhuang b6e8b9ccca
Add pipeline tuner component and dependencies. (#671)
* add pipeline tuner component and dependencies.

* clean code.

* do not need force rerun.

* replace the resources.

* update metrics retrieving.

* Update test/pipeline_tuning_example/requirements.txt

* Update test/pipeline_tuning_example/train/env.yaml

* Update test/pipeline_tuning_example/tuner/env.yaml

* Update test/pipeline_tuning_example/tuner/tuner_func.py

* Update test/pipeline_tuning_example/data_prep/env.yaml

* fix issues found by lint with flake8.

* add documentation

* add data.

* do not need AML resource for local run.

* AML -> AzureML

* clean code.

* Update website/docs/Examples/Tune-AzureML pipeline.md

* rename and add pip install.

* update figure name.

* align docs with code.

* remove extra line.
2022-08-10 20:20:21 -07:00
Chi Wang 816a82a115
make test result more stable (#646) 2022-08-05 10:17:41 -07:00
Xueqing Liu 21fa6c10ec
Fixing the issue that FLAML trial number is significantly smaller than Transformers.hyperparameter_search (#657)
* fix 636

* adding low cost config

* update padding; update tokenization output y type (series -> DF); update low cost init config

* updating todf; updating metric_loss_score
2022-08-03 00:11:29 -04:00
Xueqing Liu 5eb5d43d7f
Fix HPO evaluation bug (#645)
* fix eval automl metric bug on val_loss inconsistency

* updating starting point search space to continuous

* shortening notebook
2022-07-28 23:08:42 -04:00
Xueqing Liu 731afec9eb
This PR fixes the frequent NLP bugs in the other PRs (#647)
* fix nlp bug

* resetting model to electra small

* removing model_path from fit_kwargs_by_estimator
2022-07-25 17:46:33 -04:00
Chi Wang e14e909af9
Feature names and importances (#621)
* feature names and importances

* None check

* StackingClassifier has no feature_importances_

* StackingClassifier has no feature_names_in_
2022-07-10 12:25:59 -07:00
Qingyun Wu b7846048dc
Allow FLAML_sample_size in starting_points (#619)
* FLAML_sample_size

* clean up

* starting_points as a list

* catch AssertionError

* per estimator sample size

* import

* per estimator min_sample_size

* Update flaml/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update test/automl/test_warmstart.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* add warnings

* adding more tests

* fix a bug in validating starting points

* improve test

* revise test

* revise test

* documentation about custom_hp

* doc and efficiency

* update test

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-07-09 16:04:46 -04:00
Xueqing Liu 6108493e0b
fix ner bug; refactor post processing of TransformersEstimator prediction (#615)
* fix ner bug; refactor post processing

* fix too many values to unpack

* supporting id/token label for NER
2022-07-05 13:38:21 -04:00
Chi Wang 74cca60606
Allow custom GroupKFold object as split_type (#616)
* Allow custom GroupKFold object

* handle unpickle error for prophet 1.1

* eval_method in test_object()
2022-06-29 21:04:25 -07:00
Chi Wang cbb85e2aab
Py36 (#614)
* allow installation in py 3.6

* test py 3.6
2022-06-26 08:32:28 -07:00
Chi Wang 7d6822aa40 catch URLError 2022-06-24 08:07:26 -07:00
Chi Wang 4377d53a73
update got version (#607)
* update got version

* None check
2022-06-23 08:02:46 -07:00
Chi Wang c45741a67b
support latest xgboost version (#599)
* support latest xgboost version

* Update test_classification.py

* Update 

Problems exist when installing xgb1.6.1 in py3.6

* cleanup

* xgboost version

* remove time_budget_s in test

* remove redundancy

* stop support of python 3.6

Co-authored-by: zsk <shaokunzhang529@gmail.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2022-06-21 18:59:07 -07:00
Chi Wang 1b40b4b3a6
set_search_properties (#595)
* update the signature of set_search_properties
2022-06-16 16:30:50 -07:00