> Dask-ML works with {scikit-learn, xgboost, tensorflow, TPOT,}. ETL is your responsibility. Loading things into parquet format affords a lot of flexibility in terms of (non-SQL) datastores or just efficiently packed files on disk that need to be paged into/over in RAM. (Edit)
>> Featuretools is a framework to perform automated feature engineering. It excels at transforming temporal and relational datasets into feature matrices for machine learning
> Creating a feature matrix from a very large dataset can be problematic if the underlying pandas dataframes that make up the entities cannot easily fit in memory. To help get around this issue, Featuretools supports creating Entity and EntitySet objects from Dask dataframes. A Dask EntitySet can then be passed to featuretools.dfs or featuretools.calculate_feature_matrix to create a feature matrix, which will be returned as a Dask dataframe. In addition to working on larger than memory datasets, this approach also allows users to take advantage of the parallel and distributed processing capabilities offered by Dask
It is a pain to move in and out of scikit and write all these wrappers and converters. I would prefer to do more in one framework.
For example, doing hyperparameter optimization in pytorch using scikit can be a bit painful sometimes