Train module

pyro_risks.pipeline.train.calibrate_pipeline(y_test: Union[pandas.core.series.Series, numpy.ndarray], y_scores: Union[pandas.core.series.Series, numpy.ndarray], ignore_prints: Optional[bool] = False) → numpy.float64

Calibrate a classification pipeline.

Parameters

y_test – Binary test target.
y_scores – Predicted probabilities from the test set.
ignore_prints – Whether to suppress printing of results. Defaults to False.

Returns

Threshold that maximizes the F1 score.
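
The idea of a threshold that maximizes the F1 score can be illustrated with a minimal, dependency-free sketch. This is not the implementation of calibrate_pipeline; the helper names f1_at_threshold and best_f1_threshold are hypothetical, and the sketch assumes that a score greater than or equal to the threshold is labelled positive and that only observed scores are tried as candidate thresholds.

```python
from typing import Sequence


def f1_at_threshold(y_true: Sequence[int], y_scores: Sequence[float], threshold: float) -> float:
    """F1 score when every score >= threshold is labelled positive (assumed convention)."""
    pairs = list(zip(y_true, y_scores))
    tp = sum(1 for t, s in pairs if s >= threshold and t == 1)
    fp = sum(1 for t, s in pairs if s >= threshold and t == 0)
    fn = sum(1 for t, s in pairs if s < threshold and t == 1)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true: Sequence[int], y_scores: Sequence[float]) -> float:
    """Hypothetical helper: try each distinct score as a threshold, keep the best F1."""
    candidates = sorted(set(y_scores))
    # On ties, max() keeps the first (lowest) candidate.
    return max(candidates, key=lambda thr: f1_at_threshold(y_true, y_scores, thr))


# Toy data, made up for illustration only.
y_test = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]
threshold = best_f1_threshold(y_test, y_scores)
print(threshold, f1_at_threshold(y_test, y_scores, threshold))
```

On the toy data, a threshold of 0.35 keeps both positives and admits one false positive, which gives the highest F1 among the four candidates.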

pyro_risks.pipeline.train.save_pipeline(pipeline: imblearn.pipeline.Pipeline, model: str, optimal_threshold: numpy.float64, destination: Optional[str] = None, ignore_html: Optional[bool] = False) → None

Serialize a pipeline.

Parameters

pipeline – imbalanced-learn preprocessing pipeline.
model – Model name.
optimal_threshold – Optimal threshold obtained from model calibration.
destination – Folder where the pipeline should be saved. Defaults to cfg.MODEL_REGISTRY.
ignore_html – Whether to skip persisting the pipeline's HTML description. Defaults to False.
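
As a rough illustration of what persisting a pipeline together with its calibrated threshold can look like, here is a standard-library sketch. It is not the code behind save_pipeline: the function save_artifacts, the DEFAULT_REGISTRY path, the file names, and the use of pickle and JSON are all assumptions made for the example, and a plain dict stands in for a fitted pipeline.

```python
import json
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional

# Hypothetical stand-in for cfg.MODEL_REGISTRY; the real value is not shown in this page.
DEFAULT_REGISTRY = Path(tempfile.gettempdir()) / "model_registry"


def save_artifacts(pipeline: Any, model: str, optimal_threshold: float,
                   destination: Optional[str] = None) -> Path:
    """Hypothetical helper: write the pipeline and its threshold side by side."""
    # Fall back to the default registry when no destination is given.
    folder = Path(destination) if destination is not None else DEFAULT_REGISTRY
    folder.mkdir(parents=True, exist_ok=True)
    # File names below are illustrative assumptions.
    with open(folder / f"{model}.pkl", "wb") as fh:
        pickle.dump(pipeline, fh)
    with open(folder / f"{model}_threshold.json", "w") as fh:
        json.dump({"threshold": float(optimal_threshold)}, fh)
    return folder


with tempfile.TemporaryDirectory() as tmp:
    fake_pipeline = {"steps": ["scaler", "classifier"]}  # placeholder object
    out = save_artifacts(fake_pipeline, "demo", 0.35, destination=tmp)
    restored = pickle.loads((out / "demo.pkl").read_bytes())
    stored_threshold = json.loads((out / "demo_threshold.json").read_text())["threshold"]
```

Storing the threshold next to the serialized pipeline keeps the two in sync, so a consumer can reload both and apply the same decision rule that was chosen during calibration.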

pyro_risks.pipeline.train.train_pipeline(X: pandas.core.frame.DataFrame, y: pandas.core.series.Series, model: str, pipeline: Optional[imblearn.pipeline.Pipeline] = None, destination: Optional[str] = None, ignore_prints: Optional[bool] = False, ignore_html: Optional[bool] = False) → None

Train a classification pipeline.

Parameters

X – Training dataset features, as a pd.DataFrame.
y – Training dataset target, as a pd.Series.
model – Model name.
pipeline – imbalanced-learn preprocessing pipeline. Defaults to None.
destination – Folder where the pipeline should be saved. Defaults to cfg.MODEL_REGISTRY.
ignore_prints – Whether to suppress printing of results. Defaults to False.
ignore_html – Whether to skip persisting the pipeline's HTML description. Defaults to False.