Advanced Topics

Passing specific features into a task

Certain features of the same data type may need to be processed differently than others.

For example, suppose you are working on a problem whose dataset contains two text features: one lends itself well to word-gram preprocessing, while the other is better suited to char-grams.
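To make the word-gram versus char-gram distinction concrete, here is a minimal, library-free sketch. It is only an illustration of the two tokenization styles, not how DataRobot implements them. The helper names `word_ngrams` and `char_ngrams` and the sample strings are hypothetical.

```python
def word_ngrams(text, n):
    """Return contiguous n-word sequences from whitespace-separated text."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def char_ngrams(text, n):
    """Return contiguous n-character sequences (spaces and punctuation included)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Hypothetical sample values: free text suits word-grams,
# while a short code-like string suits char-grams.
note = "chest pain on exertion"
code = "RX-4471"

print(word_ngrams(note, 2))  # ['chest pain', 'pain on', 'on exertion']
print(char_ngrams(code, 3))  # ['RX-', 'X-4', '-44', '447', '471']
```

Word-grams keep whole tokens together, which helps with natural-language text. Char-grams capture sub-token patterns, which tends to help with identifiers, codes, or misspelled text.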

When using Composable ML in DataRobot, a user may pass one or more specific features to another task.

Whenever you use project-specific functionality, set the project first:

w.set_project(project_id="<project_id>")
# or
# w = Workshop(project_id="<project_id>")

Here we select only the “Age” feature, impute its missing values, and pass the result to the Keras neural network classifier. As with other parts of the Workshop, you can auto-complete the available feature names by typing w.Features.<tab>.

features = w.FeatureSelection(w.Features.Age)
pni = w.Tasks.PNI2(features)
keras = w.Tasks.KERASC(pni)
keras_blueprint = w.BlueprintGraph(keras)

You may link a blueprint to a specific project, if desired. The blueprint is then validated against the _linked_ project, for example by checking that the selected features exist in the dataset associated with that project.

# Make sure it is saved at least once, or pass `user_blueprint_id` to `link_to_project`
keras_blueprint.save()
keras_blueprint.link_to_project(project_id="<project_id>")

Features may also be excluded rather than selected, which is particularly useful when one feature should be processed one way and everything else another way.

without_insurance_type = w.FeatureSelection(w.Features.Insurance_Type, exclude=True)
only_insurance_type = w.FeatureSelection(w.Features.Insurance_Type)
one_hot = w.Tasks.PDM3(without_insurance_type)
ordinal = w.Tasks.ORDCAT2(only_insurance_type)
keras = w.Tasks.KERASC(one_hot, ordinal)
keras_blueprint = w.BlueprintGraph(keras)
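
The include/exclude pairing above splits the dataset's columns into two complementary groups. The following plain-Python sketch mirrors that behavior as described here. It does not call the Workshop API, and how the Workshop resolves selections internally is not shown in this guide. The helper `select_features` and the column names other than Insurance_Type and Age are hypothetical.

```python
def select_features(columns, names, exclude=False):
    """Hypothetical helper: keep `names`, or with exclude=True keep everything else."""
    wanted = set(names)
    # A column is kept when its membership in `wanted` differs from `exclude`.
    return [c for c in columns if (c in wanted) != exclude]


# Hypothetical column list for illustration only.
columns = ["Age", "Insurance_Type", "Region", "Plan"]

only_insurance_type = select_features(columns, ["Insurance_Type"])
without_insurance_type = select_features(columns, ["Insurance_Type"], exclude=True)

print(only_insurance_type)     # ['Insurance_Type']
print(without_insurance_type)  # ['Age', 'Region', 'Plan']
```

Because the two selections are disjoint and together cover every column, each feature reaches the final task through exactly one preprocessing branch.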