Sampling procedures for testing models
Results(data=None, nmethods=0, *, learners=None, train_data=None, nrows=None, nclasses=None, store_data=False, store_models=False, domain=None, actual=None, row_indices=None, predicted=None, probabilities=None, preprocessor=None, callback=None, n_jobs=1)
Class for storing predictions in model testing.
- data (Optional[Table]): Data used for testing. When data is stored, this is typically not a copy but a reference.
- models (Optional[List[Model]]): A list of induced models.
- row_indices (np.ndarray): Indices of rows in data that were used in testing, stored as a numpy vector of length nrows. Values of actual[i], predicted[:, i] and probabilities[:, i] refer to the instance data[row_indices[i]].
- nrows (int): The number of test instances (including duplicates).
- actual (np.ndarray): Actual values of the target variable; a numpy vector of length nrows and of the same type as data (or np.float32 if the type of data cannot be determined).
- predicted (np.ndarray): Predicted values of the target variable; a numpy array of shape (number-of-methods, nrows) and of the same type as data (or np.float32 if the type of data cannot be determined).
- probabilities (Optional[np.ndarray]): Predicted probabilities (for discrete target variables); a numpy array of shape (number-of-methods, nrows, number-of-classes) of type np.float32.
- folds (List[Slice or List[int]]): A list of indices (or slice objects) corresponding to rows of each fold.
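The relationship between row_indices, actual and predicted can be sketched in plain Python. This is only an illustration of the layout described above, using nested lists in place of numpy arrays; the target values, the two "methods" and their predictions are made up for the example and do not come from Orange.

```python
# Plain-Python mirror of the layout; the real attributes are numpy arrays.
data_targets = [0, 1, 1, 0, 1]   # hypothetical target column of `data`
row_indices = [3, 0, 4]          # rows of `data` that were used in testing
nrows = len(row_indices)

# actual[i] is the target value of data[row_indices[i]]
actual = [data_targets[r] for r in row_indices]

# predicted has shape (number-of-methods, nrows); two hypothetical methods
predicted = [
    [0, 0, 1],
    [1, 0, 1],
]

# Per-method accuracy, computed column by column against `actual`
accuracy = [
    sum(p == a for p, a in zip(method_preds, actual)) / nrows
    for method_preds in predicted
]
```

Indexing by method first and by test instance second is what lets a scorer process all methods against the single shared `actual` vector.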
get_augmented_data(model_names, include_attrs=True, include_predictions=True, include_probabilities=True)
Return the data, augmented with predictions, probabilities (if the task is classification) and folds info. Predictions, probabilities and folds are inserted as meta attributes.
- model_names (list): A list of strings containing learners’ names.
- include_attrs (bool): Flag that tells whether to include original attributes.
- include_predictions (bool): Flag that tells whether to include predictions.
- include_probabilities (bool): Flag that tells whether to include probabilities.
- Returns Orange.data.Table: Data augmented with predictions and, where applicable, probabilities and fold information.
Fits self.learners using folds sampled from the provided data.
- train_data (Table): Table from which to sample train folds.
- test_data (Optional[Table]): Table from which to sample test folds; if None, train_data is used.
Initializes arrays that will be used by the fit method.
Initializes self.indices with iterable objects with slices (or indices) for each fold.
- train_data (Table): Train table.
- test_data (Table): Test table.
Splits evaluation results by model.
CrossValidation(data, learners, k=10, stratified=True, random_state=0, store_data=False, store_models=False, preprocessor=None, callback=None, warnings=None, n_jobs=1)
K-fold cross validation.
If the constructor is given the data and a list of learning algorithms, it runs cross validation and returns an instance of Results containing the predicted values and probabilities.
- k (int): The number of folds.
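As a rough illustration of what splitting data into k folds means, the sketch below partitions row indices into k disjoint test folds using only the standard library. The helper name `kfold_indices` is hypothetical, the split is not stratified, and nothing here is taken from Orange's own implementation; it only shows the general technique.

```python
import random

def kfold_indices(n_rows, k=10, random_state=0):
    """Partition range(n_rows) into k disjoint test folds (hypothetical helper)."""
    indices = list(range(n_rows))
    random.Random(random_state).shuffle(indices)
    # Fold j takes every k-th shuffled index, so fold sizes differ by at most one.
    return [indices[j::k] for j in range(k)]

n_rows = 23
folds = kfold_indices(n_rows, k=5)
splits = []
for test_rows in folds:
    held_out = set(test_rows)
    train_rows = [i for i in range(n_rows) if i not in held_out]
    # A learner would be fit on train_rows and evaluated on test_rows here.
    splits.append((train_rows, test_rows))
```

Every row appears in exactly one test fold, so each instance is predicted once by a model that did not see it during training.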
LeaveOneOut(data, learners, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)
TestOnTestData(train_data, test_data, learners, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)
Test on a separate test data set.
TestOnTrainingData(data, learners, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)
Trains and tests on the same data.
sample(table, n=0.7, stratified=False, replace=False, random_state=None)
Samples data instances from a data table. Returns the sample and a data set containing the instances from the input table that are not in the sample. Uses several sampling functions from scikit-learn.
- table : data table. A data table from which to sample.
- n : float, int (default = 0.7). If float, should be between 0.0 and 1.0 and represents the proportion of data instances in the resulting sample. If int, n is the number of data instances in the resulting sample.
- stratified : bool, optional (default = False). If true, sampling will try to consider class values and match the distribution of class values in train and test subsets.
- replace : bool, optional (default = False). If true, sample with replacement.
- random_state : int or RandomState. Pseudo-random number generator state used for random sampling.
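The way n is interpreted, and the fact that two sets come back, can be sketched with the standard library alone. The function `sample_rows` below is a hypothetical stand-in that works on row indices, not on a data table; it omits stratification, and its rounding of a float n is an assumption, not documented behavior of sample.

```python
import random

def sample_rows(n_rows, n=0.7, replace=False, random_state=None):
    """Return (sample, remainder) row indices; a stand-in, not Orange's code."""
    rng = random.Random(random_state)
    # A float n is a proportion of the rows; an int n is an absolute count.
    size = int(round(n * n_rows)) if isinstance(n, float) else n
    rows = range(n_rows)
    if replace:
        # With replacement the same row may be drawn more than once.
        chosen = [rng.randrange(n_rows) for _ in range(size)]
    else:
        chosen = rng.sample(rows, size)
    picked = set(chosen)
    # The remainder holds every row that never made it into the sample.
    remainder = [i for i in rows if i not in picked]
    return chosen, remainder

train, test = sample_rows(10, n=0.7, random_state=1)     # 70% of the rows
few, rest = sample_rows(10, n=4, random_state=1)         # exactly 4 rows
boot, out_of_bag = sample_rows(10, n=5, replace=True, random_state=1)
```

Without replacement the two returned sets partition the rows; with replacement the remainder is whatever was never drawn, which is why it can be larger than the number of rows minus the sample size.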