PipeGraph

The PipeGraph module extends Scikit-Learn’s Pipeline so that users can express their models as graphs of steps.

class pipegraph.PipeGraph(steps, fit_connections=None, predict_connections=None, log_level=None)[source]

The PipeGraph class holds the steps, connections and graphs needed to perform graph-like fits and predicts.

steps : list
List of (name, action) tuples that are chained. The last one is considered to be the output step.
connections: dictionary

A dictionary whose top-level keys must be the same as the names of the previously defined steps. The values associated with these keys define the variables from other steps that are going to be considered as inputs for the current step. They are dictionaries themselves, where:

  • The keys of the nested dictionary represent the input variables as named at the current step.

  • The values associated to these keys define the steps that hold the desired information, and the variables as named at that step. This information can be written as:

    • A tuple with the label of the step in position 0 followed by the name of the output variable in position 1.
    • A string representing a variable from a source external to the PipeGraph object, such as those provided by the user when invoking the fit, predict or fit_predict methods.
alternative_connections: dictionary
A dictionary as described for connections. This parameter makes it possible to specify a PipeGraph that uses a different connections dictionary during fit than during predict. The default value, None, means that it is equivalent to the connections value, and thus PipeGraph uses the same graph for both fit and predict.
log_level: int
Log level for traceability purposes. This feature is not yet implemented.
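The nesting described for connections may be easier to follow in a concrete literal. The following is a minimal sketch in plain Python and is not taken from the library: the step names (scaler, model), the variable names (X, y, predict) and the check_connection_keys helper are hypothetical, and the step actions are replaced by None placeholders so that the snippet runs without Scikit-Learn.

```python
# Hypothetical steps: in real use each action would be an estimator object.
steps = [("scaler", None), ("model", None)]

connections = {
    # "scaler" takes its X from the X supplied by the caller (a string).
    "scaler": {"X": "X"},
    # "model" takes X from the "predict" output of "scaler" (a tuple)
    # and y directly from the caller (a string).
    "model": {"X": ("scaler", "predict"), "y": "y"},
}


def check_connection_keys(steps, connections):
    """Return the connection keys that do not name a defined step."""
    step_names = {name for name, _ in steps}
    return sorted(set(connections) - step_names)


unknown = check_connection_keys(steps, connections)
print(unknown)  # an empty list means every key names a defined step
```

A tuple value points at another step's output, while a bare string points at data supplied from outside the graph.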
decision_function(X)[source]

Applies PipeGraph’s predict method and returns the decision_function output of the final estimator

Parameters:X (iterable object) – Data to predict on. Must fulfill input requirements of first step of the PipeGraph.
Returns:y_score
Return type:array-like, shape = [n_samples, n_classes]
fit(X, y=None, **kwargs)[source]

Fit the PipeGraph steps one after the other and following the topological order of the graph defined by the connections attribute.

Parameters:
  • X (iterable object) – Input fit data.
  • y (iterable, default=None) – Output fit data.
  • **kwargs (dict of string -> object) – Other input data
Returns:

self – This estimator

Return type:

PipeGraph
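To illustrate what "following the topological order of the graph" means, here is a sketch that derives one valid step order from a connections dictionary using the standard library's graphlib module (Python 3.9 or later). It is an illustration under assumptions, not the library's implementation: the step_order helper and the step names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical connections; tuple values name (step, output variable).
connections = {
    "scaler": {"X": "X"},
    "model": {"X": ("scaler", "predict"), "y": "y"},
    "report": {"X": ("model", "predict"), "scaled": ("scaler", "predict")},
}


def step_order(connections):
    """List step names so that each step follows the steps it reads from."""
    predecessors = {}
    for step, inputs in connections.items():
        # Tuple values refer to another step; plain strings are external inputs.
        predecessors[step] = {
            source[0] for source in inputs.values() if isinstance(source, tuple)
        }
    return list(TopologicalSorter(predecessors).static_order())


order = step_order(connections)
print(order)
```

Any order in which a step appears after all the steps it reads from is a valid topological order, so fitting the steps in that sequence guarantees that each step's inputs already exist.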

fit_predict(X, y=None, **fit_params)[source]

Applies predict of the PipeGraph steps to the data, one after the other and following the topological order of the graph, and then applies the fit_predict method of the final step in the PipeGraph. Valid only if the final step implements fit_predict.

Parameters:
  • X (iterable object) – Training data. Must fulfill input requirements of first step of the pipeline.
  • y (iterable, default=None) – Training targets. Must fulfill label requirements for all steps of the pipeline.
  • **fit_params (dict of string -> object) – Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.
Returns:

y_pred

Return type:

array-like
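The s__p key convention for fit_params can be sketched as follows. This is only an illustration of the naming scheme in plain Python: the group_fit_params helper and the parameter names in the example are hypothetical and are not part of PipeGraph.

```python
def group_fit_params(fit_params):
    """Group keys of the form '<step>__<param>' into one dict per step."""
    grouped = {}
    for key, value in fit_params.items():
        # Split at the first double underscore only.
        step, sep, param = key.partition("__")
        if not sep:
            raise ValueError(f"expected '<step>__<param>', got {key!r}")
        grouped.setdefault(step, {})[param] = value
    return grouped


grouped = group_fit_params(
    {"model__sample_weight": [1.0, 2.0], "selector__threshold": 0.5}
)
print(grouped)
```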

get_params(deep=True)[source]

Get parameters for this estimator.

Parameters:deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
inject(sink, sink_var, source='_External', source_var='predict', into='fit')[source]

Adds a connection to the graph.

Parameters:
  • sink – Destination.
  • sink_var – Name of the variable at destination that is going to hold the information.
  • source – Origin.
  • source_var – Name of the variable at origin holding the information.
  • into – Either ‘fit’ or ‘predict’, indicating which connections are described: those belonging to ‘fit_connections’ or those belonging to ‘predict_connections’. Default is ‘fit’.
Returns:

self – Returning self allows chaining operations.

Return type:

PipeGraph
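The chaining that returning self makes possible can be sketched with a small stand-in class. ConnectionRecorder below is hypothetical and is not PipeGraph: it only mirrors the documented inject signature, and the way it stores each connection as a (source, source_var) tuple is an assumption made for illustration.

```python
class ConnectionRecorder:
    """Hypothetical stand-in that only records connections; not PipeGraph."""

    def __init__(self):
        self.fit_connections = {}
        self.predict_connections = {}

    def inject(self, sink, sink_var, source="_External",
               source_var="predict", into="fit"):
        if into not in ("fit", "predict"):
            raise ValueError("into must be 'fit' or 'predict'")
        target = self.fit_connections if into == "fit" else self.predict_connections
        # Assumed storage format: sink -> sink_var -> (source, source_var).
        target.setdefault(sink, {})[sink_var] = (source, source_var)
        return self  # returning self is what allows the calls to be chained


recorder = (
    ConnectionRecorder()
    .inject(sink="scaler", sink_var="X", source_var="X")
    .inject(sink="model", sink_var="X", source="scaler")
    .inject(sink="model", sink_var="y", source_var="y")
)
print(recorder.fit_connections)
```

Because into defaults to ‘fit’, all three calls in this sketch land in the fit connections and the predict connections stay empty.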
named_steps

type: Returns

predict(X)[source]

Predict the PipeGraph steps one after the other, following the topological order defined by the alternative_connections attribute if it is not None, or by the connections attribute otherwise.

Parameters:X (iterable object) – Data to predict on. Must fulfill input requirements of first step of the PipeGraph.
Returns:y_pred
Return type:array-like
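The rule for choosing which graph predict follows can be written out in a few lines. This is a sketch of the documented rule only: the connections_for_predict helper and both example dictionaries are hypothetical.

```python
def connections_for_predict(connections, alternative_connections=None):
    """Pick the dictionary that predict follows, per the documented rule."""
    if alternative_connections is not None:
        return alternative_connections
    return connections


# Hypothetical graphs: the predict graph omits y, which is unknown at predict time.
fit_graph = {
    "scaler": {"X": "X"},
    "model": {"X": ("scaler", "predict"), "y": "y"},
}
predict_graph = {
    "scaler": {"X": "X"},
    "model": {"X": ("scaler", "predict")},
}

print(connections_for_predict(fit_graph) is fit_graph)
print(connections_for_predict(fit_graph, predict_graph) is predict_graph)
```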
predict_dict(X, **kwargs)[source]

Predict the PipeGraph steps one after the other, following the topological order defined by the alternative_connections attribute if it is not None, or by the connections attribute otherwise.

Parameters:
  • X – Input data
  • **kwargs (dict of string -> object) – Arbitrary number of keyword arguments.
Returns:

result_dict – Dictionary containing a single or multiple outputs

Return type:

dict

predict_log_proba(X)[source]

Applies PipeGraph’s predict method and returns the predict_log_proba output of the final estimator

Parameters:X (iterable object) – Data to predict on. Must fulfill input requirements of first step of the PipeGraph.
Returns:y_proba
Return type:array-like, shape = [n_samples, n_classes]
predict_proba(X)[source]

Applies PipeGraph’s predict method and returns the predict_proba output of the final estimator

Parameters:X (iterable object) – Data to predict on. Must fulfill input requirements of first step of the PipeGraph.
Returns:y_proba
Return type:array-like, shape = [n_samples, n_classes]
score(X, y=None, sample_weight=None)[source]

Applies PipeGraph’s predict method and returns the score output of the final estimator

Parameters:
  • X (iterable) – Data to predict on. Must fulfill input requirements of first step of the PipeGraph.
  • y (iterable, default=None) – Targets used for scoring. Must fulfill label requirements for all steps of the PipeGraph.
  • sample_weight (array-like, default=None) – If not None, this argument is passed as sample_weight keyword argument to the score method of the final estimator.
Returns:y_proba
Return type:array-like, shape = [n_samples, n_classes]
set_params(**kwargs)[source]

Set the parameters of this estimator. Valid parameter keys can be listed with get_params().

Returns:self