public final class IdMeanAccumulator extends Object
An accumulator for means associated with IDs.
Constructor and Description |
---|
IdMeanAccumulator() |
Modifier and Type | Method and Description |
---|---|
ImmutableSparseVector | computeIdMeans(double offset, double damping): Compute the means for each ID. |
double | globalMean(): Get the global mean. |
ImmutableSparseVector | idMeanOffsets(): Compute offsets from the global mean for each ID. |
ImmutableSparseVector | idMeanOffsets(double damping): Compute mean offsets for each item. |
ImmutableSparseVector | idMeans(): Get the per-ID means. |
void | put(long id, double val): Accumulate a value with an ID. |
public void put(long id, double val)
Accumulate a value with an ID.
id - The ID.
val - The value.
public double globalMean()
Get the global mean.
public ImmutableSparseVector idMeans()
Get the per-ID means. Equivalent to computeIdMeans(0, 0).
See also: computeIdMeans(double, double)
public ImmutableSparseVector computeIdMeans(double offset, double damping)
Compute the means for each ID. This is a generalized mean function, capable of offsetting the individual values and damping the overall mean. For an ID with \(n\) values \(x_1,\dots,x_n\), offset \(y\), and damping \(\gamma\), it computes \(\frac{\sum_{i=1}^n x_i - ny}{n + \gamma}\). If \(y\) is the global mean, this computes each ID's average deviation from the global mean. If \(\gamma\) is additionally positive, these average deviations are damped towards 0, effectively pretending that each ID has an additional \(\gamma\) values at exactly the global mean. If \(y = 0\) and \(\gamma > 0\), it pretends each ID has \(\gamma\) additional values at 0.
The prior (the assumed value of the additional values) is always 0 in the output domain. If \(y > 0\), the values are offset first and then damped towards 0. This method does not yet support damping towards some other value; if you need actual damped means, where each mean is damped towards the global mean, add the global mean to the resulting vector.
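The formula described above can be sketched in plain Java for a single ID. This is only an illustration of the arithmetic, not the library's code: the class `DampedMeanSketch`, the method `dampedOffsetMean`, and the sample values are all hypothetical.

```java
// Illustrative sketch only; not LensKit code. All names here are hypothetical.
final class DampedMeanSketch {
    private DampedMeanSketch() {}

    /**
     * Computes (sum(x_i) - n * offset) / (n + damping) for one ID's values.
     * With no values and zero damping the result is NaN (0 / 0).
     */
    static double dampedOffsetMean(double[] values, double offset, double damping) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        int n = values.length;
        return (sum - n * offset) / (n + damping);
    }

    public static void main(String[] args) {
        double[] ratings = {4.0, 5.0, 3.0};   // n = 3, sum = 12

        // Offset 0 and damping 0 give the ordinary mean.
        System.out.println(dampedOffsetMean(ratings, 0.0, 0.0));

        // Offset by an assumed global mean of 3.5, damped with gamma = 2.
        double offsetMean = dampedOffsetMean(ratings, 3.5, 2.0);
        System.out.println(offsetMean);

        // Adding the global mean back yields a mean damped towards 3.5.
        System.out.println(offsetMean + 3.5);
    }
}
```

With these sample values the damped offset is (12 - 10.5) / (3 + 2), and adding the assumed global mean back gives the same result as averaging the three ratings together with two extra values at 3.5.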
offset - An offset to subtract from each value prior to averaging.
damping - The damping term (see idMeanOffsets(double)).
public ImmutableSparseVector idMeanOffsets(double damping)
Compute mean offsets for each item. Equivalent to computeIdMeans(globalMean(), damping).
damping - The damping term.
See also: computeIdMeans(double, double)
public ImmutableSparseVector idMeanOffsets()
Compute offsets from the global mean for each ID. This is equivalent to calling idMeanOffsets(0).
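
The documented equivalences between these methods can be sketched with a small stand-in class. This is a hypothetical illustration, not the library's implementation: `IdMeanSketch` mirrors the method names on this page but returns a `Map<Long, Double>` instead of an ImmutableSparseVector, and it assumes the global mean is the mean of all accumulated values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not the LensKit class. It mirrors the documented
// method names but returns a Map instead of an ImmutableSparseVector.
final class IdMeanSketch {
    // id -> {sum of values, number of values}
    private final Map<Long, double[]> stats = new HashMap<>();
    private double totalSum = 0;
    private long totalCount = 0;

    void put(long id, double val) {
        double[] s = stats.computeIfAbsent(id, k -> new double[2]);
        s[0] += val;
        s[1] += 1;
        totalSum += val;
        totalCount += 1;
    }

    // Assumption: the global mean is taken over every accumulated value.
    double globalMean() {
        return totalSum / totalCount;
    }

    // (sum - n * offset) / (n + damping) for each ID.
    Map<Long, Double> computeIdMeans(double offset, double damping) {
        Map<Long, Double> out = new HashMap<>();
        for (Map.Entry<Long, double[]> e : stats.entrySet()) {
            double sum = e.getValue()[0];
            double n = e.getValue()[1];
            out.put(e.getKey(), (sum - n * offset) / (n + damping));
        }
        return out;
    }

    // The three convenience methods delegate as the documentation describes.
    Map<Long, Double> idMeans() {
        return computeIdMeans(0, 0);
    }

    Map<Long, Double> idMeanOffsets(double damping) {
        return computeIdMeans(globalMean(), damping);
    }

    Map<Long, Double> idMeanOffsets() {
        return idMeanOffsets(0);
    }
}
```

For example, after accumulating the values 4 and 5 for ID 1 and the values 2 and 1 for ID 2, this sketch has a global mean of 3, per-ID means of 4.5 and 1.5, and undamped offsets of 1.5 and -1.5; a damping of 2 halves those offsets because each ID has two values.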