Train module

pyro_risks.pipeline.train.calibrate_pipeline(y_test: Union[pandas.core.series.Series, numpy.ndarray], y_scores: Union[pandas.core.series.Series, numpy.ndarray], ignore_prints: Optional[bool] = False) → numpy.float64

Calibrate Classification Pipeline.

Parameters
  • y_test – Binary test target.

  • y_scores – Predicted probabilities from the test set.

  • ignore_prints – Whether to suppress printed results. Defaults to False (results are printed).

Returns

Threshold maximizing the f1-score.
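To illustrate what "threshold maximizing the f1-score" means, here is a minimal NumPy-only sketch. It is not the library's implementation: the helper name `best_f1_threshold` is hypothetical, and it assumes `y_scores` is a one-dimensional array of positive-class probabilities. Each distinct score is tried as a cut-off, and the one with the highest F1 is kept.

```python
import numpy as np


def best_f1_threshold(y_true, y_scores):
    """Return (threshold, f1) for the cut-off with the highest F1-score.

    Hypothetical helper for illustration only; assumes 1-D positive-class scores.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_scores = np.asarray(y_scores, dtype=float)

    best_threshold, best_f1 = None, -1.0
    # Every distinct predicted probability is a candidate threshold.
    for threshold in np.unique(y_scores):
        y_pred = y_scores >= threshold
        tp = np.sum(y_pred & y_true)    # true positives
        fp = np.sum(y_pred & ~y_true)   # false positives
        fn = np.sum(~y_pred & y_true)   # false negatives
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1


y_test = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
threshold, f1 = best_f1_threshold(y_test, y_scores)
print(f"Best threshold={threshold}, F1={f1:.3f}")
```

On this toy data, a cut-off of 0.35 recovers both positives at the cost of one false positive, which gives the highest F1 among the four candidates.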

pyro_risks.pipeline.train.save_pipeline(pipeline: imblearn.pipeline.Pipeline, model: str, optimal_threshold: numpy.float64, destination: Optional[str] = None, ignore_html: Optional[bool] = False) → None

Serialize pipeline.

Parameters
  • pipeline – imbalanced-learn preprocessing pipeline.

  • model – model name.

  • optimal_threshold – model calibration optimal threshold.

  • destination – Folder where the pipeline should be saved. Defaults to None, in which case ‘cfg.MODEL_REGISTRY’ is used.

  • ignore_html – Whether to skip persisting the pipeline's HTML description. Defaults to False (the description is persisted).
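The idea of persisting a fitted pipeline together with its calibrated threshold can be sketched with the standard library alone. This is an assumption-laden illustration, not how `save_pipeline` is implemented: the helpers `save_with_threshold` and `load_with_threshold`, the `.pkl` file name, and the dictionary layout are all hypothetical, and a plain dict stands in for a fitted pipeline.

```python
import pickle
import tempfile
from pathlib import Path


def save_with_threshold(pipeline, model, optimal_threshold, destination):
    """Pickle an object together with its threshold (hypothetical helper)."""
    folder = Path(destination)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{model}.pkl"  # hypothetical file naming scheme
    with path.open("wb") as handle:
        pickle.dump({"pipeline": pipeline, "threshold": optimal_threshold}, handle)
    return path


def load_with_threshold(path):
    """Load the object and threshold written by save_with_threshold."""
    with Path(path).open("rb") as handle:
        payload = pickle.load(handle)
    return payload["pipeline"], payload["threshold"]


# A plain dict stands in for a fitted pipeline in this sketch.
stand_in_pipeline = {"steps": ["scaler", "classifier"]}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_with_threshold(stand_in_pipeline, "demo", 0.35, tmp)
    saved_name = saved_path.name
    restored_pipeline, restored_threshold = load_with_threshold(saved_path)
```

Storing the threshold next to the model keeps the decision rule reproducible at prediction time, since the cut-off no longer has to be recomputed or remembered separately.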

pyro_risks.pipeline.train.train_pipeline(X: pandas.core.frame.DataFrame, y: pandas.core.series.Series, model: str, pipeline: Optional[imblearn.pipeline.Pipeline] = None, destination: Optional[str] = None, ignore_prints: Optional[bool] = False, ignore_html: Optional[bool] = False) → None

Train a classification pipeline.

Parameters
  • X – Training dataset features, as a pd.DataFrame.

  • y – Training dataset target, as a pd.Series.

  • model – model name.

  • pipeline – imbalanced-learn preprocessing pipeline. Defaults to None.

  • destination – Folder where the pipeline should be saved. Defaults to None, in which case ‘cfg.MODEL_REGISTRY’ is used.

  • ignore_prints – Whether to suppress printed results. Defaults to False (results are printed).

  • ignore_html – Whether to skip persisting the pipeline's HTML description. Defaults to False (the description is persisted).