Distances
=========
Computes distances between rows/columns in a dataset.

Inputs
    Data
        input dataset

Outputs
    Distances
        distance matrix

The **Distances** widget computes distances between rows or
columns in a dataset.
.. figure:: images/Distances-stamped.png
1. Choose whether to measure distances between rows or columns.
2. Choose the *Distance Metric*:

   - Euclidean
     ("straight line", distance between two points)
   - Manhattan
     (the sum of absolute differences for all attributes)
   - Cosine
     (the cosine of the angle between two vectors of an inner product
     space)
   - Jaccard
     (the size of the intersection divided by the size of the union of
     the sample sets)
   - Spearman
     (linear correlation between the rank of the values, remapped as a
     distance in a [0, 1] interval)
   - Spearman absolute
     (linear correlation between the rank of the absolute values,
     remapped as a distance in a [0, 1] interval)
   - Pearson
     (linear correlation between the values, remapped as a distance in
     a [0, 1] interval)
   - Pearson absolute
     (linear correlation between the absolute values, remapped as a
     distance in a [0, 1] interval)

   In case of missing values, the widget automatically imputes the average
   value of the row or the column.

   Since the widget cannot compute distances between discrete and
   continuous attributes, it only uses continuous attributes and ignores
   the discrete ones. If you want to use discrete attributes, continuize
   them with the :doc:`Continuize <../data/continuize>` widget first.

3. Produce a report.
4. Tick *Apply Automatically* to send changes to connected widgets as soon as they are made. Alternatively, press *Apply* to send them manually.
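
The metrics and the missing-value handling described above can be illustrated with a minimal sketch in plain Python. It is not the widget's implementation. The function names are hypothetical, and two details are assumptions rather than documented behavior: the cosine and Jaccard values are turned into distances by subtracting them from 1, and the Pearson correlation ``r`` is remapped to the [0, 1] interval as ``(1 - r) / 2``.

```python
import math


def impute_mean(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences over all attributes."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus the cosine of the angle between a and b (assumed remapping)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def jaccard_distance(a, b):
    """One minus intersection size over union size of two sets (assumed remapping)."""
    a, b = set(a), set(b)
    return 1.0 - len(a & b) / len(a | b)


def pearson_distance(a, b):
    """Pearson correlation r remapped to [0, 1] as (1 - r) / 2 (assumed remapping).

    Constant vectors (zero variance) are not handled in this sketch.
    """
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    dev_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    dev_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return (1.0 - cov / (dev_a * dev_b)) / 2.0


# The missing value in the first row is replaced by that row's mean, 2.0.
row_a = impute_mean([1.0, None, 3.0])
row_b = [4.0, 6.0, 3.0]
d_euclidean = euclidean(row_a, row_b)   # 5.0
d_manhattan = manhattan(row_a, row_b)   # 7.0
```

Under these assumptions, perfectly correlated rows get a Pearson distance close to 0 and perfectly anti-correlated rows a distance close to 1.
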
Example
-------
The widget only outputs a distance matrix, so it needs to be connected to
another widget to display results: for instance, to
:doc:`Distance Map <../unsupervised/distancemap>` to visualize the distances,
to :doc:`Hierarchical Clustering <../unsupervised/hierarchicalclustering>` to
cluster the attributes, or to :doc:`MDS <../unsupervised/mds>` to visualize
the distances in a plane.
.. figure:: images/DistancesExample.png
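
To make concrete what downstream widgets receive, the following sketch builds a square, symmetric distance matrix from a small table, either between rows or between columns. It is only an illustration in plain Python, not the widget's code. The names ``distance_matrix``, ``axis`` and ``metric`` are hypothetical, and Euclidean distance is used as the example metric.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(table, axis="rows", metric=euclidean):
    """Return pairwise distances between the rows or the columns of a table."""
    if axis == "rows":
        items = [list(row) for row in table]
    elif axis == "columns":
        # Transpose the table so that each item is one column.
        items = [list(column) for column in zip(*table)]
    else:
        raise ValueError("axis must be 'rows' or 'columns'")

    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Distances are symmetric, so each pair is computed once.
            matrix[i][j] = matrix[j][i] = metric(items[i], items[j])
    return matrix


table = [
    [0.0, 0.0, 1.0],
    [3.0, 4.0, 1.0],
]
by_rows = distance_matrix(table)                     # 2 x 2 matrix
by_columns = distance_matrix(table, axis="columns")  # 3 x 3 matrix
```

With two rows and three columns, measuring between rows yields a 2 x 2 matrix, while measuring between columns yields a 3 x 3 matrix. The size of the matrix is what determines how many items a connected clustering or MDS widget will show.
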