Sampling procedures for testing models (testing)

class Orange.evaluation.testing.Results(data=None, nmethods=0, *, learners=None, train_data=None, nrows=None, nclasses=None, store_data=False, store_models=False, domain=None, actual=None, row_indices=None, predicted=None, probabilities=None, preprocessor=None, callback=None, n_jobs=1)

Class for storing predictions in model testing.

Attributes:
data (Optional[Table]): Data used for testing. When data is stored,
this is typically not a copy but a reference.

models (Optional[List[Model]]): A list of induced models.

row_indices (np.ndarray): Indices of rows in data that were used in
testing, stored as a numpy vector of length nrows. Values of actual[i], predicted[i] and probabilities[i] refer to the target value of instance data[row_indices[i]].

nrows (int): The number of test instances (including duplicates).

actual (np.ndarray): Actual values of target variable;
a numpy vector of length nrows and of the same type as data (or np.float32 if the type of data cannot be determined).

predicted (np.ndarray): Predicted values of target variable;
a numpy array of shape (number-of-methods, nrows) and of the same type as data (or np.float32 if the type of data cannot be determined).

probabilities (Optional[np.ndarray]): Predicted probabilities
(for discrete target variables); a numpy array of shape (number-of-methods, nrows, number-of-classes) of type np.float32.

folds (List[Slice or List[int]]): A list of indices (or slice objects)
corresponding to rows of each fold.
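
The layout of these attributes can be illustrated with plain nested lists standing in for numpy arrays. This is only a sketch: the values, the two methods and the `accuracy` helper are invented for illustration and are not part of Orange.

```python
# Toy stand-in for the Results arrays, using lists instead of numpy arrays.
# All values below are invented for illustration.
data_targets = [0, 1, 1, 0, 1]   # target value of each row in `data`
row_indices = [3, 0, 4]          # rows of `data` that were used for testing
nrows = len(row_indices)

# actual[i] is the target value of data[row_indices[i]]
actual = [data_targets[r] for r in row_indices]

# predicted has shape (number-of-methods, nrows); two hypothetical methods
predicted = [
    [0, 0, 1],   # method 0
    [1, 0, 1],   # method 1
]

# probabilities has shape (number-of-methods, nrows, number-of-classes)
probabilities = [
    [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]],
    [[0.4, 0.6], [0.7, 0.3], [0.2, 0.8]],
]


def accuracy(method):
    """Fraction of test instances that `method` predicted correctly."""
    hits = sum(p == a for p, a in zip(predicted[method], actual))
    return hits / nrows
```

Indexing `predicted[m][i]` and `probabilities[m][i]` with the same `i` as `actual[i]` always refers to the same test instance.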
get_augmented_data(model_names, include_attrs=True, include_predictions=True, include_probabilities=True)

Return the data, augmented with predictions, probabilities (if the task is classification) and folds info. Predictions, probabilities and folds are inserted as meta attributes.

Args:
model_names (list): A list of strings containing learners’ names.
include_attrs (bool): Flag that tells whether to include original attributes.
include_predictions (bool): Flag that tells whether to include predictions.
include_probabilities (bool): Flag that tells whether to include probabilities.
Returns:
Orange.data.Table: Data augmented with predictions and, where applicable, probabilities and fold information.
fit(train_data, test_data=None)

Fits self.learners using folds sampled from the provided data.

Parameters:

train_data : Table

Table from which train folds are sampled.

test_data : Optional[Table]

Table from which test folds are sampled; if None, train_data is used.

prepare_arrays(test_data)

Initialize arrays that will be used by the fit method.

setup_indices(train_data, test_data)

Initializes self.indices with iterable objects with slices (or indices) for each fold.

Args:
train_data (Table): train table
test_data (Table): test table
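
Since a fold may be described either by a slice or by an explicit list of indices, code that consumes the folds has to handle both. A minimal sketch of such handling follows; the `select` helper and the toy rows are hypothetical and only illustrate the idea, not how Orange itself does it.

```python
def select(rows, fold):
    """Return the rows picked by `fold`, which is a slice or a list of indices."""
    if isinstance(fold, slice):
        return rows[fold]
    return [rows[i] for i in fold]


rows = list("abcdef")
folds = [slice(0, 2), [2, 5], slice(3, 5)]   # mixed slices and index lists
selected = [select(rows, fold) for fold in folds]
```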
split_by_model()

Split evaluation results by model.

class Orange.evaluation.testing.CrossValidation(data, learners, k=10, stratified=True, random_state=0, store_data=False, store_models=False, preprocessor=None, callback=None, warnings=None, n_jobs=1)

K-fold cross validation.

If the constructor is given the data and a list of learning algorithms, it runs cross validation and returns an instance of Results containing the predicted values and probabilities.

k

The number of folds.

random_state
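
The roles of k and random_state can be sketched in plain Python. The function below is a hypothetical illustration of (unstratified) k-fold index generation, not the code Orange uses: each row lands in the test set of exactly one fold, and a fixed seed makes the split reproducible.

```python
import random


def k_fold_indices(n_rows, k=10, random_state=0):
    """Return a list of (train, test) index lists, one pair per fold."""
    order = list(range(n_rows))
    random.Random(random_state).shuffle(order)   # seeded, hence reproducible
    folds = [order[i::k] for i in range(k)]      # k nearly equal parts
    pairs = []
    for i, test in enumerate(folds):
        # Training rows are all rows that are not in the i-th fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((train, test))
    return pairs
```

Stratification, which the stratified argument controls, is deliberately left out of this sketch.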
class Orange.evaluation.testing.CrossValidationFeature(data, learners, feature, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)

Cross validation with folds according to values of a feature.

feature

The feature defining the folds.
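
One way to picture folds defined by a feature is to group row indices by the feature's value and hold out one group at a time. The helper below is a hypothetical sketch of that idea in plain Python, assuming every distinct value forms one fold; it is not taken from Orange.

```python
from collections import defaultdict


def folds_by_feature(feature_values):
    """Return (train, test) index lists, one pair per distinct feature value."""
    groups = defaultdict(list)
    for row, value in enumerate(feature_values):
        groups[value].append(row)
    pairs = []
    for test in groups.values():
        held_out = set(test)
        # Train on every row whose feature value differs from the held-out one.
        train = [r for r in range(len(feature_values)) if r not in held_out]
        pairs.append((train, test))
    return pairs
```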

class Orange.evaluation.testing.LeaveOneOut(data, learners, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)

Leave-one-out testing.
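
In leave-one-out testing each instance is held out once while the model is trained on all the others. A minimal sketch of the resulting index pairs, using a hypothetical helper that is not part of Orange:

```python
def leave_one_out_indices(n_rows):
    """Return (train, test) index lists; each test set holds a single row."""
    return [
        ([j for j in range(n_rows) if j != i], [i])
        for i in range(n_rows)
    ]
```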

class Orange.evaluation.testing.TestOnTestData(train_data, test_data, learners, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)

Test on a separate test data set.

class Orange.evaluation.testing.TestOnTrainingData(data, learners, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)

Trains and tests on the same data.

Orange.evaluation.testing.sample(table, n=0.7, stratified=False, replace=False, random_state=None)

Samples data instances from a data table. Returns the sample and a data set containing the instances of the input table that are not in the sample. The implementation relies on several sampling functions from scikit-learn.

table : data table
A data table from which to sample.
n : float, int (default = 0.7)
If float, should be between 0.0 and 1.0 and represents the proportion of data instances in the resulting sample. If int, n is the number of data instances in the resulting sample.
stratified : bool, optional (default = False)
If true, sampling takes class values into account and tries to match their distribution in the train and test subsets.
replace : bool, optional (default = False)
If true, sample with replacement.
random_state : int or RandomState
Pseudo-random number generator state used for random sampling.
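
The meaning of n, replace and random_state can be sketched on row indices alone. The function below is a hypothetical, unstratified illustration written with the standard library; rounding the proportion to the nearest integer is an assumption, and the real function works on data tables and uses scikit-learn.

```python
import random


def sample_rows(n_rows, n=0.7, replace=False, random_state=None):
    """Return (sample, remainder) as lists of row indices."""
    rng = random.Random(random_state)
    # A float is read as a proportion, an int as an absolute count.
    size = int(round(n * n_rows)) if isinstance(n, float) else n
    if replace:
        # With replacement the same row may be drawn more than once.
        sample = [rng.randrange(n_rows) for _ in range(size)]
    else:
        sample = rng.sample(range(n_rows), size)
    chosen = set(sample)
    remainder = [r for r in range(n_rows) if r not in chosen]
    return sample, remainder
```

With replace=False the sample and the remainder together cover every row exactly once; with replace=True the remainder holds the rows that were never drawn.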