public class MutualInformationVectorSimilarity extends Object implements VectorSimilarity, Serializable
Similarity function that treats the two vectors as paired samples from two correlated random variables and estimates the mutual information between those variables.
Note, this uses the naive estimator of mutual information, which can be heavily biased when the two vectors have little overlap.
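As an illustration of what a naive (plug-in) estimator computes, the following is a minimal sketch, not this class's actual implementation. It assumes the vector values have already been quantized to integer bins, counts how often each bin and each pair of bins occurs among the paired samples, and sums p(x,y) * log(p(x,y) / (p(x) * p(y))). The class name `NaiveMutualInformation`, the method `estimate`, and the use of natural logarithms are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, for illustration only; not part of the documented API. */
public class NaiveMutualInformation {
    /**
     * Plug-in estimate of the mutual information (in nats) between two
     * discrete variables, given paired samples xs[i], ys[i].
     */
    public static double estimate(int[] xs, int[] ys) {
        if (xs.length != ys.length) {
            throw new IllegalArgumentException("samples must be paired");
        }
        int n = xs.length;
        if (n == 0) {
            return 0.0; // no shared samples: nothing to estimate
        }

        Map<Integer, Integer> xCounts = new HashMap<>();
        Map<Integer, Integer> yCounts = new HashMap<>();
        Map<Long, Integer> jointCounts = new HashMap<>();
        for (int i = 0; i < n; i++) {
            xCounts.merge(xs[i], 1, Integer::sum);
            yCounts.merge(ys[i], 1, Integer::sum);
            // Pack the (x, y) pair into a single long key.
            long key = ((long) xs[i] << 32) | (ys[i] & 0xFFFFFFFFL);
            jointCounts.merge(key, 1, Integer::sum);
        }

        double mi = 0.0;
        for (Map.Entry<Long, Integer> e : jointCounts.entrySet()) {
            long key = e.getKey();
            int x = (int) (key >> 32);
            int y = (int) key;
            double pxy = e.getValue() / (double) n;
            double px = xCounts.get(x) / (double) n;
            double py = yCounts.get(y) / (double) n;
            mi += pxy * Math.log(pxy / (px * py));
        }
        return mi;
    }
}
```

With the samples `{0, 1, 0, 1}` paired against themselves, the sketch returns ln 2 (about 0.693). Pairing them against `{1, 0, 1, 0}` gives the same value, because mutual information does not distinguish positive from negative association. With very few paired samples the empirical probabilities are unreliable, which is the source of the bias noted above.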
Constructor and Description |
---|
MutualInformationVectorSimilarity(Quantizer quantizer)
Construct a new mutual information similarity.
|
Modifier and Type | Method and Description |
---|---|
boolean |
isSparse()
Query whether this similarity function is sparse (returns 0 for vectors with disjoint key sets).
|
boolean |
isSymmetric()
Query whether this similarity function is symmetric.
|
double |
similarity(SparseVector vec1,
SparseVector vec2)
Compute similarity using mutual information.
|
@Inject public MutualInformationVectorSimilarity(Quantizer quantizer)
Construct a new mutual information similarity.
quantizer
- A quantizer to allow discrete mutual information to be computed.
public double similarity(SparseVector vec1, SparseVector vec2)
Compute similarity using mutual information.
Note that this similarity function measures the absolute correlation between two vectors, so it does not distinguish positive from negative association. Its values therefore lie in [0, inf), not in [-1, 1] as specified by the VectorSimilarity interface. Use it only where the consuming code accepts values in this range.
similarity
in interface VectorSimilarity
vec1
- The first vector.
vec2
- The second vector.
VectorSimilarity.similarity(SparseVector, SparseVector)
public boolean isSparse()
Description copied from interface: VectorSimilarity
Query whether this similarity function is sparse (returns 0 for vectors with disjoint key sets).
isSparse
in interface VectorSimilarity
true
iff VectorSimilarity.similarity(SparseVector, SparseVector)
will always return 0 when applied to two vectors with no keys in common.
public boolean isSymmetric()
Description copied from interface: VectorSimilarity
Query whether this similarity function is symmetric. Symmetric similarity functions return the same result when called on (A,B) and (B,A).
isSymmetric
in interface VectorSimilarity
true
if the function is symmetric.