Using CivisML for building great models quickly
CivisML is a machine learning service that allows data science teams to build and score models using the cloud computing resources available through Civis Platform. With CivisML, data scientists can shorten the time it takes to get models into production. CivisML is built on the open source Python package scikit-learn and lets data scientists build sophisticated modeling pipelines without writing many lines of redundant code. Because both model training and scoring can be scaled across as many servers as needed, iterating on and delivering models is much faster than on a laptop alone.
CivisML can be used in a few different ways:
- API clients: Civis Analytics maintains open source API clients in Python and R. These API clients are automatically available in Jupyter notebooks started in Civis Platform, or you can import the clients wherever you write your code.
- CivisML script templates: These templates (one for training and validation, and one for prediction) give access to all CivisML features without writing any code. Because the input formats can be complex, the templates are most convenient for making changes to an existing model that was initially set up another way, such as through an API client. Find these script templates by selecting “CODE” in the top navigation bar, clicking “More Script Templates...”, and then choosing the latest “Model Training” or “Model Prediction” template. For more details on the CivisML templates, click here.
There are several ways to get started with CivisML:
- For those who learn through example, there are two useful example Jupyter notebooks. You can download these from GitHub and upload them to Civis Platform to work through them step by step.
- The Machine Learning section of the Python API client documentation provides a useful overview of the CivisML Python interface.
- The R API client has a vignette on how to use CivisML in R, as well as complete documentation of the API client more generally.
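As a rough illustration of the Python interface, the sketch below wraps training and scoring in a single function. It assumes the `civis` package is installed and a `CIVIS_API_KEY` environment variable is set. The function name `train_and_score` and every table, database, and column name are hypothetical. `ModelPipeline`, `train`, and `predict` come from `civis.ml`, but check the Python client documentation for the arguments your client version accepts, in particular `civisml_version`.

```python
def train_and_score(train_table, score_table, output_table, database,
                    dependent_variable, primary_key, civisml_version=None):
    """Train a CivisML model on one table, score another, return metrics.

    Hypothetical helper for illustration; not part of the civis client.
    """
    # Imported here so this module still loads where civis is not installed.
    from civis.ml import ModelPipeline

    # Only pass civisml_version when the caller pins one; otherwise CivisML
    # picks its default version.
    extra = {}
    if civisml_version is not None:
        extra["civisml_version"] = civisml_version

    model = ModelPipeline(
        model="sparse_logistic",          # one of the pre-defined CivisML models
        dependent_variable=dependent_variable,
        primary_key=primary_key,
        model_name="example CivisML model",
        **extra,
    )

    # Training runs remotely on Civis Platform; result() blocks until it ends.
    train = model.train(table_name=train_table, database_name=database)
    train.result()

    # Score a second table and write the predictions to output_table.
    predict = model.predict(
        table_name=score_table,
        database_name=database,
        output_table=output_table,
    )
    predict.result()

    return train.metrics


# Example call (all names are placeholders):
# metrics = train_and_score("schema.training_data", "schema.to_score",
#                           "schema.scores", "my_database",
#                           dependent_variable="outcome", primary_key="id")
```

Keeping the `civis` import inside the function is a convenience for this sketch only; in a notebook started in Civis Platform the client is already available and a top-level import is more natural.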
By default, CivisML automatically uses the latest version in production.
If you would like to bring in a custom model, please note that the latest version of CivisML has the following libraries preinstalled:
- scikit-learn v0.22.1
- glmnet v2.1.1
- xgboost v0.81
- muffnn v2.3.0
- civisml-extensions v0.2.1
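Before bringing in a custom model, it can help to compare the package versions in the environment where the model was built against the list above. The following is a minimal sketch using only the Python standard library (`importlib.metadata`, Python 3.8+). The helper names `installed_versions` and `version_mismatches` are hypothetical, and the package names are assumed to match their PyPI distribution names.

```python
from importlib import metadata

# Versions listed above as preinstalled in the latest CivisML release.
CIVISML_PREINSTALLED = {
    "scikit-learn": "0.22.1",
    "glmnet": "2.1.1",
    "xgboost": "0.81",
    "muffnn": "2.3.0",  # the neural network package from the list above
    "civisml-extensions": "0.2.1",
}


def installed_versions(packages):
    """Return {package: version or None} for the current environment."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None  # package is not installed locally
    return found


def version_mismatches(local, expected=CIVISML_PREINSTALLED):
    """Return {package: (local_version, expected_version)} where they differ."""
    return {
        name: (local.get(name), want)
        for name, want in expected.items()
        if local.get(name) != want
    }


if __name__ == "__main__":
    local = installed_versions(CIVISML_PREINSTALLED)
    for name, (have, want) in version_mismatches(local).items():
        print(f"{name}: local {have or 'not installed'}, CivisML has {want}")
```

A missing package is only a problem if the custom model actually depends on it, so treat the output as a checklist and not as a pass/fail result.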
For reference, here are the released versions of CivisML so far. “v2.1”, “v2.0”, “v1.1”, “v1.0”, “v0.5” are no longer supported by the Civis Platform API and should not be used:
| Version | Release Month | Key Changes | Supported |
|---|---|---|---|
| v2.3 | March 2020 | scikit-learn upgraded to v0.22.1 | Yes |
| v2.2 | April 2018 | Implemented model registration, so that a user can bring in a pre-trained model of their own. | Deprecated (not supported from September 2020) |
| v2.1 | January 2018 | Hyperband added for stacking estimators. | No |
| v2.0 | October 2017 | Model estimator from training now returned as a pipeline with the ETL estimator at the head. | No |
| v1.1 | July 2017 | Allowed custom dependencies from private git repositories; added "is null" for all categorical expansions; shuffling enabled for cross-validation in meta-estimators. | No |
| v1.0 | May 2017 | Updated "ModelPipeline" and "ModelFuture" classes for parameter and attribute names. | No |
| v0.5 | April 2017 | First public release. | No |
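If your code pins a specific CivisML version, a small guard can catch an unsupported choice before a job is submitted. This is only a sketch built from the table above: the `CIVISML_SUPPORT` mapping and the `check_civisml_version` helper are hypothetical and are not part of any Civis client.

```python
import warnings

# Support status transcribed from the version table above.
CIVISML_SUPPORT = {
    "v2.3": "supported",
    "v2.2": "deprecated",   # not supported from September 2020
    "v2.1": "unsupported",
    "v2.0": "unsupported",
    "v1.1": "unsupported",
    "v1.0": "unsupported",
    "v0.5": "unsupported",
}


def check_civisml_version(version):
    """Return the version if it is usable; raise or warn otherwise."""
    status = CIVISML_SUPPORT.get(version)
    if status is None:
        raise ValueError(f"Unknown CivisML version: {version!r}")
    if status == "unsupported":
        raise ValueError(f"CivisML {version} is no longer supported")
    if status == "deprecated":
        warnings.warn(f"CivisML {version} is deprecated", DeprecationWarning)
    return version
```

The mapping has to be updated by hand whenever a new CivisML version is released, so relying on the default version avoids this bookkeeping altogether.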