Data Filters (filter)

Instances of classes derived from Filter are used for filtering the data.

When called with an individual data instance (Orange.data.Instance), they accept or reject the instance by returning either True or False.

When called with a data storage (e.g. an instance of Orange.data.Table), they check whether the storage's class provides a method that implements the particular filter. If so, that method is called, and its result should be of the same type as the storage; e.g., filter methods of Orange.data.Table return new instances of Orange.data.Table, and filter methods of SQL proxies return new SQL proxies.

If the class corresponding to the storage does not implement a particular filter, the fallback computes the indices of the rows to be selected and returns data[indices].
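The dispatch described above can be modelled in a few lines of plain Python. This is only a sketch under stated assumptions, not Orange's implementation: the classes ListStorage, FastStorage and IsPositive and the hook name _filter_is_positive are hypothetical names introduced for illustration.

```python
class ListStorage:
    """Hypothetical storage with no specialised filter methods."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, key):
        # A list of indices selects rows and yields a new storage of the same type.
        if isinstance(key, list):
            return type(self)(self.rows[i] for i in key)
        return self.rows[key]


class FastStorage(ListStorage):
    """Hypothetical storage that implements the filter itself."""

    def _filter_is_positive(self, negate):
        return type(self)(r for r in self.rows if (r > 0) != negate)


class IsPositive:
    """Hypothetical filter that accepts positive numbers."""

    def __init__(self, negate=False):
        self.negate = negate

    def __call__(self, data):
        # Individual instance: accept or reject with True or False.
        if not isinstance(data, ListStorage):
            return (data > 0) != self.negate
        # Storage that provides the method: delegate to it.
        method = getattr(data, "_filter_is_positive", None)
        if method is not None:
            return method(self.negate)
        # Fallback: compute the indices of the selected rows, return data[indices].
        indices = [i for i in range(len(data)) if self(data[i])]
        return data[indices]


slow = IsPositive()(ListStorage([3, -1, 0, 7]))
fast = IsPositive(negate=True)(FastStorage([3, -1, 0, 7]))
print(type(slow).__name__, slow.rows)
print(type(fast).__name__, fast.rows)
```

In both paths the result has the same type as the storage that was passed in; only the route taken to compute it differs.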

class Orange.data.filter.Filter(negate=False)

The base class for filters.

negate

Inverts the selection.

class Orange.data.filter.IsDefined(columns=None, negate=False)

Select the data instances with no undefined values. The check can be restricted to a subset of columns.

The filter’s behaviour may depend upon the storage implementation.

In particular, a Table with a sparse matrix representation will select all data instances whose values are defined, even if they are zero. However, if individual columns are checked, it will select all rows with non-zero entries in these columns, regardless of whether zero entries are stored explicitly or omitted.

columns

The columns to be checked, given as a sequence of indices, names or Orange.data.Variable.
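For a dense storage, the check can be pictured as follows. This is a minimal sketch, assuming that undefined values are represented as NaN and that columns are given as indices only; the function name is_defined is hypothetical and the sparse-matrix behaviour described above is not modelled.

```python
import math


def is_defined(row, columns=None, negate=False):
    """Accept a row if none of the checked values is undefined (NaN here)."""
    checked = row if columns is None else [row[i] for i in columns]
    defined = not any(math.isnan(v) for v in checked)
    return defined != negate


rows = [
    [1.0, 2.0, 3.0],
    [1.0, float("nan"), 3.0],
    [float("nan"), 2.0, float("nan")],
]

# Check every column, then only the second one.
all_columns = [r for r in rows if is_defined(r)]
only_second = [r for r in rows if is_defined(r, columns=[1])]
print(len(all_columns), len(only_second))
```

Restricting the check to a subset of columns accepts rows that would be rejected when all columns are examined.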

class Orange.data.filter.HasClass(negate=False)

Return all rows for which the class value is known.

On sparse data, Orange.data.Table implements the filter so that it returns all rows for which all class values are defined, even if they equal zero.

class Orange.data.filter.Random(prob=None, negate=False)

Return a random selection of data instances.

prob

The proportion (if below 1) or the probability (if 1 or above) of selected instances

class Orange.data.filter.SameValue(column, value, negate=False)

Return the data instances with the given value in the specified column.

column

The column, described by an index, a string or Orange.data.Variable.

value

The reference value.

class Orange.data.filter.Values(conditions, conjunction=True, negate=False)

Select the data instances based on a conjunction or disjunction of conditions. Each condition is either a filter derived from ValueFilter, which checks the values of an individual feature, or another (nested) Values filter.

conditions

A list of conditions, each derived from ValueFilter or Values.

conjunction

If True, the filter computes a conjunction; otherwise, a disjunction.

negate

Inverts the selection.
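How the three parameters interact can be sketched with ordinary predicates. This is an illustrative model only: combine, is_adult, is_member and is_minor are hypothetical helpers, and rows are plain dictionaries rather than Orange data instances.

```python
def combine(conditions, conjunction=True, negate=False):
    """Build one predicate from several, mirroring the three parameters."""
    aggregate = all if conjunction else any

    def accept(row):
        return aggregate(cond(row) for cond in conditions) != negate

    return accept


def is_adult(row):
    return row["age"] >= 18


def is_member(row):
    return row["member"]


def is_minor(row):
    return row["age"] < 18


rows = [
    {"age": 30, "member": True},
    {"age": 30, "member": False},
    {"age": 12, "member": False},
]

both = [r for r in rows if combine([is_adult, is_member])(r)]
either = [r for r in rows if combine([is_adult, is_member], conjunction=False)(r)]
neither = [
    r for r in rows
    if combine([is_adult, is_member], conjunction=False, negate=True)(r)
]

# Nesting: a combined predicate is itself usable as a condition.
nested = combine([combine([is_adult, is_member]), is_minor], conjunction=False)
selected = [r for r in rows if nested(r)]
print(len(both), len(either), len(neither), len(selected))
```

Negation is applied after the conjunction or disjunction has been evaluated, so a negated disjunction accepts only rows that satisfy none of the conditions.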

class Orange.data.filter.ValueFilter(column)

The base class for subfilters that check individual values of data instances. Derived classes handle discrete, continuous and string attributes. These filters are used to compose conditions in Orange.data.filter.Values.

The internal implementations of filter.Values in data storages, like Orange.data.Table, recognize these filters and retrieve their attributes, like operators and reference values, but do not call them.

The fallback implementation of Orange.data.filter.Values calls the subfilters with individual data instances, which is very inefficient.

column

The column to which the filter applies (int, str or Orange.data.Variable).

class Orange.data.filter.FilterDiscrete(column, values)

Subfilter for discrete variables, which selects the instances whose value matches one of the given values.

column

The column to which the filter applies (int, str or Orange.data.Variable).

values

The list (or a set) of accepted values. If None, it checks whether the value is defined.

class Orange.data.filter.FilterContinuous(position, oper, ref=None, max=None, min=None)

Subfilter for continuous variables.

column

The column to which the filter applies (int, str or Orange.data.Variable); it is passed as position in the constructor.

ref

The reference value; also aliased to min for operators Between and Outside.

max

The upper threshold for operators Between and Outside.

oper

The operator; should be FilterContinuous.Equal, NotEqual, Less, LessEqual, Greater, GreaterEqual, Between, Outside or IsDefined.
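The operators can be pictured as the following checks on a single value. This is a sketch, not the library's code: the function check is hypothetical, operators are given as plain strings instead of the FilterContinuous constants, undefined values are assumed to be NaN, and the bounds of Between and Outside are assumed to be inclusive.

```python
import math
import operator

# Operator names follow the list above; the comparison semantics are assumptions.
SIMPLE = {
    "Equal": operator.eq,
    "NotEqual": operator.ne,
    "Less": operator.lt,
    "LessEqual": operator.le,
    "Greater": operator.gt,
    "GreaterEqual": operator.ge,
}


def check(value, oper, ref=None, max=None, min=None):
    """Evaluate one condition; max and min mirror the constructor arguments."""
    if min is not None:
        ref = min  # min is an alias for ref
    if oper == "IsDefined":
        return not math.isnan(value)
    if oper == "Between":
        return ref <= value <= max  # inclusive bounds assumed
    if oper == "Outside":
        return not ref <= value <= max
    return SIMPLE[oper](value, ref)


print(check(2.5, "Between", ref=1.0, max=3.0))
print(check(2.5, "Outside", min=1.0, max=3.0))
print(check(2.5, "Less", ref=3.0))
```

Because min is only an alias, check(x, "Between", ref=a, max=b) and check(x, "Between", min=a, max=b) describe the same interval.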

class Orange.data.filter.FilterString(position, oper, ref=None, max=None, case_sensitive=True, **a)

Subfilter for string variables.

column

The column to which the filter applies (int, str or Orange.data.Variable); it is passed as position in the constructor.

ref

The reference value; also aliased to min for operators Between and Outside.

max

The upper threshold for operators Between and Outside.

oper

The operator; should be FilterString.Equal, NotEqual, Less, LessEqual, Greater, GreaterEqual, Between, Outside, Contains, StartsWith, EndsWith or IsDefined.

case_sensitive

Whether the comparisons are case sensitive.
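The string-specific operators and the effect of case_sensitive can be sketched as below. The helper check_string is hypothetical, operators are plain strings rather than the FilterString constants, and lower-casing both sides is an assumed way of making the comparison case insensitive.

```python
def check_string(value, oper, ref, case_sensitive=True):
    """Evaluate Contains, StartsWith or EndsWith on one string value."""
    if not case_sensitive:
        # Assumed normalisation: lower-case both sides before comparing.
        value, ref = value.lower(), ref.lower()
    if oper == "Contains":
        return ref in value
    if oper == "StartsWith":
        return value.startswith(ref)
    if oper == "EndsWith":
        return value.endswith(ref)
    raise ValueError(f"unsupported operator: {oper}")


print(check_string("Iris-Setosa", "EndsWith", "setosa"))
print(check_string("Iris-Setosa", "EndsWith", "setosa", case_sensitive=False))
```

With the default case_sensitive=True the first call rejects the value, while the second accepts it.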

class Orange.data.filter.FilterStringList(column, values, case_sensitive=True)

Subfilter for string variables that checks whether the value is in the given list of accepted values.

column

The column to which the filter applies (int, str or Orange.data.Variable).

values

The list (or a set) of accepted values.

case_sensitive

Whether the comparisons are case sensitive.

class Orange.data.filter.FilterRegex(column, pattern, flags=0)

Filter that checks whether the values match the regular expression.
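The role of pattern and flags can be illustrated with the standard re module. This is a sketch under assumptions: regex_filter is a hypothetical helper, and it treats a match anywhere in the value (re.search semantics) as acceptance, which the description above does not specify.

```python
import re


def regex_filter(pattern, flags=0):
    """Return a predicate that accepts strings matching the pattern."""
    compiled = re.compile(pattern, flags)

    def accept(value):
        # Assumption: a match anywhere in the value counts.
        return compiled.search(value) is not None

    return accept


starts_with_digit = regex_filter(r"^\d")
any_case = regex_filter(r"setosa", flags=re.IGNORECASE)

print(starts_with_digit("3 apples"), starts_with_digit("apples 3"))
print(any_case("Iris-Setosa"))
```

The flags argument takes the usual constants of the re module, such as re.IGNORECASE, combined with the bitwise OR operator when more than one is needed.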