be.hogent.tarsos.util.histogram
Class Histogram

java.lang.Object
  extended by be.hogent.tarsos.util.histogram.Histogram
All Implemented Interfaces:
java.lang.Cloneable
Direct Known Subclasses:
PitchClassHistogram, PitchHistogram

public class Histogram
extends java.lang.Object
implements java.lang.Cloneable

A histogram is defined by a start value, a stop value and a number of classes. The 'key' of a class is the middle of the class. E.g. the keys of a histogram that starts at 0, stops at 5 and has 5 classes are {0.5,1.5,2.5,3.5,4.5}. The intervals for each key are {[0,1[;[1,2[;[2,3[;[3,4[;[4,5[} with [0,1[ meaning the interval between 0 inclusive and 1 exclusive.
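The relation between start, stop, number of classes and keys can be sketched as follows. This is an illustrative stand-alone helper, not the library's code; the class and method names are hypothetical.

```java
import java.util.Arrays;

final class HistogramKeysSketch {
    /** Returns the class middles ("keys") for a histogram with the given bounds. */
    static double[] keys(double start, double stop, int numberOfClasses) {
        double classWidth = (stop - start) / numberOfClasses;
        double[] keys = new double[numberOfClasses];
        for (int i = 0; i < numberOfClasses; i++) {
            // Class i covers [start + i * classWidth, start + (i + 1) * classWidth[,
            // so its key is the lower bound plus half a class width.
            keys[i] = start + i * classWidth + classWidth / 2.0;
        }
        return keys;
    }

    public static void main(String[] args) {
        // prints [0.5, 1.5, 2.5, 3.5, 4.5]
        System.out.println(Arrays.toString(keys(0, 5, 5)));
    }
}
```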

The histogram uses a red-black tree as underlying structure: search, insert and delete are O(log n). The tree keeps the keys in order and makes in-order iteration easy. Optimization is possible by replacing the tree with arrays.
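As an illustration of a tree-backed bin store, the sketch below uses java.util.TreeMap, which is a red-black tree implementation. Whether the class uses TreeMap internally is an assumption; the helper names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class TreeBackedBinsSketch {
    /** Builds an ordered map from class key to count, with every count at 0. */
    static TreeMap<Double, Long> emptyBins(double start, double stop, int numberOfClasses) {
        TreeMap<Double, Long> bins = new TreeMap<>();
        double classWidth = (stop - start) / numberOfClasses;
        for (int i = 0; i < numberOfClasses; i++) {
            bins.put(start + (i + 0.5) * classWidth, 0L);
        }
        return bins;
    }

    public static void main(String[] args) {
        TreeMap<Double, Long> bins = emptyBins(0, 5, 5);
        bins.merge(2.5, 1L, Long::sum); // O(log n) lookup and update
        // Iteration visits the keys in ascending order without extra sorting.
        for (Map.Entry<Double, Long> entry : bins.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```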

The histogram uses doubles as key values. Java doubles are prone to rounding errors. To prevent rounding errors the keys are rounded to a predefined number of decimals. The number can be found in PRECISION_FACTOR. E.g. if PRECISION_FACTOR is 10000 then the number of significant decimals is 4; the minimum classWidth is 0.0001.
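A minimal sketch of rounding keys to a fixed number of decimals, assuming PRECISION_FACTOR is 10000 as in the example above. The rounding formula itself is an assumption about how such a factor is typically applied, not the library's actual code.

```java
final class KeyRoundingSketch {
    // Assumed value, taken from the example in the text.
    static final double PRECISION_FACTOR = 10000.0;

    /** Rounds a key to the number of decimals implied by PRECISION_FACTOR. */
    static double roundKey(double key) {
        return Math.round(key * PRECISION_FACTOR) / PRECISION_FACTOR;
    }

    public static void main(String[] args) {
        double raw = 0.1 + 0.2; // not exactly 0.3 in binary floating point
        System.out.println(raw == 0.3);                     // prints false
        System.out.println(roundKey(raw) == roundKey(0.3)); // prints true
    }
}
```

After rounding, two values that differ only by floating-point noise map to the same key, which is what a key lookup needs.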

Author:
Joren Six

Constructor Summary
Histogram(double startVal, double stopVal, int totalClasses)
           Creates a Histogram with a certain number of classes with values in the range ]start - classWidth / 2, stop + classWidth / 2[. If the histogram wraps, values outside the range are mapped to values inside using a modulo calculation.
Histogram(double startVal, double stopVal, int totalClasses, boolean wrapping)
           Creates a Histogram with a certain number of classes with values in the range ]start - classWidth / 2, stop + classWidth / 2[. If the histogram wraps, values outside the range are mapped to values inside using a modulo calculation.
Histogram(double startVal, double stopVal, int totalClasses, boolean wrapping, boolean ignoreOutsideRange)
           Creates a Histogram with a certain number of classes with values in the range ]start - classWidth / 2, stop + classWidth / 2[. If the histogram wraps, values outside the range are mapped to values inside using a modulo calculation.
Histogram(Histogram original)
          Creates a new, empty histogram using the same parameters of the original histogram.
 
Method Summary
 Histogram add(double value)
          Adds a value to the Histogram.
 Histogram add(Histogram other)
          Calculates the sum of two histograms.
 Histogram add(Histogram other, int offset)
          Calculates the sum of two histograms.
 Histogram addToEachBin(long value)
          Adds a number of items to each bin.
 Histogram baselineHistogram()
          Finds the minimum number of items in a bin and subtracts this value from all bins.
 void clear()
          Sets each bin to 0.
 Histogram clone()
           
 double correlation(Histogram otherHistogram)
          Return the correlation of this histogram with another one.
 double correlation(Histogram otherHistogram, CorrelationMeasure correlationMeasure)
          Return the correlation of this histogram with another one.
 double correlationWithDisplacement(int displacement, Histogram otherHistogram)
           
 double correlationWithDisplacement(int displacement, Histogram otherHistogram, CorrelationMeasure correlationMeasure)
           
 void displace(int displacement)
           
 int displacementForOptimalCorrelation(Histogram otherHistogram)
           
 int displacementForOptimalCorrelation(Histogram otherHistogram, CorrelationMeasure correlationMeasure)
          Returns the number of classes the other histogram needs to be displaced to get optimal correlation with this histogram.
 void export(java.lang.String fileName)
          Export the histogram data as a plain text file.
 void exportMatLab(java.lang.String fileName)
          Export the histogram data as a MATLAB text file.
 Histogram gaussianSmooth(double standardDeviation)
          Smooth the histogram using Gaussians.
 long getAbsoluteSumFreq()
          Returns the sum of all frequencies.
 double getClassWidth()
           
 long getCount(double value)
          Returns the number of values equal to the given value.
 long getCountForClass(int i)
          Returns the number of items in the class with index i.
 long getCumFreq(java.lang.Double v)
          Returns the cumulative frequency of values less than or equal to v.
 double getCumPct(java.lang.Double v)
          Returns the cumulative percentage of values less than or equal to v (as a proportion between 0 and 1).
 double getEntropy()
          Returns the entropy of the histogram.
 double getKeyForClass(int i)
          Returns the key for the class with index i.
 long getMaxBinCount()
           
 double getMean()
          Calculates the mean count of each bin.
 double getMedian()
          Calculates the median bin count.
 int getNumberOfClasses()
           
 double getPct(java.lang.Double v)
          Returns the percentage of values that are equal to v (as a proportion between 0 and 1).
 double getStart()
          The starting value is not the same as the first key.
 double getStop()
          The stopping value is not the same as the last key.
 long getSumFreq()
          Returns the sum of all frequencies (bin counts).
 Histogram invert()
          Inverts this histogram.
 boolean isWrapped()
           
 java.util.Set<java.lang.Double> keySet()
           
 Histogram max(Histogram other)
          Takes the maximum of the bin value in each histogram and changes the current histogram to this maximum value.
static Histogram mean(java.util.List<Histogram> histograms)
          Calculates a histogram mean of a list of histograms.
 Histogram multiply(double factor)
          Multiplies each class (bin) count with a factor.
 Histogram normalize()
          Normalizes the peaks in a histogram.
 void plot(java.lang.String fileName, java.lang.String title)
          Plots the histogram to an x-y plot.
 void plotCorrelation(Histogram otherHistogram, CorrelationMeasure correlationMeasure, java.lang.String fileName, java.lang.String title)
           
 Histogram raise(double exponent)
          Raises each class count to the power of exponent.
 void setCount(double value, long count)
          Sets the number of values for a key (bin). The value is automatically mapped to a key.
 Histogram smooth(boolean isWeighted, int k)
          Computes a smoothed version of the histogram.
 Histogram subtract(Histogram other)
          Subtracts two histograms.
 java.lang.String toString()
          Return a string representation of this histogram.
 java.lang.String toString(boolean asciiArt)
          Returns a string representation of the histogram.
protected  void valueAddedHook(double value)
          A hook to intercept added values.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Histogram

public Histogram(double startVal,
                 double stopVal,
                 int totalClasses,
                 boolean wrapping,
                 boolean ignoreOutsideRange)

Creates a Histogram with a certain number of classes with values in the range ]start - classWidth / 2, stop + classWidth / 2[. If the histogram wraps, values outside the range are mapped to values inside using a modulo calculation.

Parameters:
startVal - the starting value of the histogram. The starting value is not the same as the first class middle. The starting value is equal to the first class middle - classWidth / 2
stopVal - the stopping value of the histogram. The stopping value is not the same as the last class middle. The stopping value is equal to the last class middle + classWidth / 2
totalClasses - the number of classes between the starting and stopping values. Also defines the classWidth.
wrapping - indicates if the histogram wraps around the edges. More formally: if the histogram wraps, values outside the range ]start - classWidth / 2, stop + classWidth / 2[ are mapped to values inside the range using a modulo calculation.
ignoreOutsideRange - if true, values outside the valid range are ignored. Otherwise, if a value outside the valid range is added, an IllegalArgumentException is thrown.
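The two ways of handling out-of-range values can be sketched as below. The sketch assumes a plain half-open range [start, stop[ and hypothetical helper names; the exact boundaries and rounding used by the class may differ.

```java
final class WrappingSketch {
    /** Maps a value into [start, stop[ by wrapping around the edges (modulo). */
    static double wrap(double value, double start, double stop) {
        double range = stop - start;
        double offset = (value - start) % range; // Java's % keeps the sign of the dividend
        if (offset < 0) {
            offset += range;
        }
        return start + offset;
    }

    /**
     * Non-wrapping case: returns true if the value may be added, false if it
     * should be silently ignored, and throws if it is out of range and not ignored.
     */
    static boolean accept(double value, double start, double stop, boolean ignoreOutsideRange) {
        if (value >= start && value < stop) {
            return true;
        }
        if (ignoreOutsideRange) {
            return false;
        }
        throw new IllegalArgumentException("Value outside histogram range: " + value);
    }

    public static void main(String[] args) {
        // Arbitrary illustrative range [0, 1200[
        System.out.println(wrap(1250.0, 0.0, 1200.0)); // prints 50.0
        System.out.println(wrap(-100.0, 0.0, 1200.0)); // prints 1100.0
        System.out.println(accept(1250.0, 0.0, 1200.0, true)); // prints false
    }
}
```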

Histogram

public Histogram(Histogram original)
Creates a new, empty histogram using the same parameters as the original histogram. The parameters are start, stop, wrapping and number of classes.

Parameters:
original - the original histogram

Histogram

public Histogram(double startVal,
                 double stopVal,
                 int totalClasses)

Creates a Histogram with a certain number of classes with values in the range ]start - classWidth / 2, stop + classWidth / 2[. If the histogram wraps, values outside the range are mapped to values inside using a modulo calculation.

Parameters:
startVal - the starting value of the histogram. The starting value is not the same as the first class middle. The starting value is equal to the first class middle - classWidth / 2
stopVal - the stopping value of the histogram. The stopping value is not the same as the last class middle. The stopping value is equal to the last class middle + classWidth / 2
totalClasses - the number of classes between the starting and stopping values. Also defines the classWidth.

Histogram

public Histogram(double startVal,
                 double stopVal,
                 int totalClasses,
                 boolean wrapping)

Creates a Histogram with a certain number of classes with values in the range ]start - classWidth / 2, stop + classWidth / 2[. If the histogram wraps, values outside the range are mapped to values inside using a modulo calculation.

Parameters:
startVal - the starting value of the histogram. The starting value is not the same as the first class middle. The starting value is equal to the first class middle - classWidth / 2
stopVal - the stopping value of the histogram. The stopping value is not the same as the last class middle. The stopping value is equal to the last class middle + classWidth / 2
totalClasses - the number of classes between the starting and stopping values. Also defines the classWidth.
wrapping - indicates if the histogram wraps around the edges. More formally: if the histogram wraps, values outside the range ]start - classWidth / 2, stop + classWidth / 2[ are mapped to values inside the range using a modulo calculation.
Method Detail

getKeyForClass

public final double getKeyForClass(int i)
Returns the key for the class with index i.

Parameters:
i - a class index. If i lies outside the interval [0,getNumberOfClasses()[, it is mapped to a value inside the interval using a modulo calculation.
Returns:
the key for the class with index i
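Mapping an arbitrary index into [0,getNumberOfClasses()[ can be sketched with Math.floorMod, which, unlike the % operator, never returns a negative result for a positive divisor. The helper names and the key formula are illustrative assumptions.

```java
final class ClassIndexSketch {
    /** Maps any index, including negative ones, into [0, numberOfClasses[. */
    static int wrapIndex(int i, int numberOfClasses) {
        return Math.floorMod(i, numberOfClasses);
    }

    /** Key (class middle) for the class with the possibly out-of-range index i. */
    static double keyForClass(int i, double start, double classWidth, int numberOfClasses) {
        return start + (wrapIndex(i, numberOfClasses) + 0.5) * classWidth;
    }

    public static void main(String[] args) {
        System.out.println(keyForClass(-1, 0.0, 1.0, 5)); // prints 4.5 (index -1 maps to 4)
        System.out.println(keyForClass(7, 0.0, 1.0, 5));  // prints 2.5 (index 7 maps to 2)
    }
}
```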

getCountForClass

public final long getCountForClass(int i)
Returns the number of items in the class with index i.

Parameters:
i - a class index. If i lies outside the interval [0,getNumberOfClasses()[, it is mapped to a value inside the interval using a modulo calculation.
Returns:
the number of items in the bin with index i

keySet

public final java.util.Set<java.lang.Double> keySet()
Returns:
the set of histogram keys. Do not add keys to the set directly; use histogram methods instead. For performance reasons it is not wrapped in an immutable set, so handle with care.

add

public final Histogram add(double value)
Adds a value to the Histogram. Assigns the value to the right bin automatically.

Parameters:
value - The value to add.
Returns:
This histogram with the added value.
Throws:
java.lang.IllegalArgumentException - when the value is not in the range of the histogram.

valueAddedHook

protected void valueAddedHook(double value)
A hook to intercept added values.

Parameters:
value - The value added

getCount

public final long getCount(double value)
Returns the number of values equal to the given value.

Parameters:
value - the value to look up.
Returns:
the frequency of the value.

setCount

public final void setCount(double value,
                           long count)
Sets the number of values for a key (bin). The value is automatically mapped to a key.

Parameters:
value - the value mapped to a key of the class to set the count for.
count - the number of items in the bin

getClassWidth

public final double getClassWidth()
Returns:
the width of a class (bin)

getNumberOfClasses

public final int getNumberOfClasses()
Returns:
the number of classes

getStart

public final double getStart()
The starting value is not the same as the first key. It is equal to firstKey - classWidth / 2.0

Returns:
the starting value

getStop

public double getStop()
The stopping value is not the same as the last key. It is equal to lastKey + classWidth / 2.0

Returns:
the stop value

isWrapped

public boolean isWrapped()
Returns:
true if values outside the interval are wrapped, false otherwise.

getCumFreq

public long getCumFreq(java.lang.Double v)
Returns the cumulative frequency of values less than or equal to v.

Returns 0 if v is not comparable to the values set.

Uses code from Apache Commons Math, licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.

Parameters:
v - the value to lookup.
Returns:
the cumulative frequency of values less than or equal to v

getCumPct

public double getCumPct(java.lang.Double v)
Returns the cumulative percentage of values less than or equal to v (as a proportion between 0 and 1).

Returns Double.NaN if no values have been added.

Parameters:
v - the value to lookup
Returns:
the proportion of values less than or equal to v
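The cumulative frequency and cumulative proportion can be sketched on top of an ordered map as follows. This is a stand-alone illustration using java.util.TreeMap with hypothetical helper names, not the library's implementation.

```java
import java.util.TreeMap;

final class CumulativeFrequencySketch {
    /** Sum of the counts of all keys less than or equal to v. */
    static long cumFreq(TreeMap<Double, Long> bins, double v) {
        long sum = 0;
        // headMap(v, true) is the view of all entries with key <= v.
        for (long count : bins.headMap(v, true).values()) {
            sum += count;
        }
        return sum;
    }

    /** Cumulative proportion in [0, 1], or NaN when the histogram is empty. */
    static double cumPct(TreeMap<Double, Long> bins, double v) {
        long total = 0;
        for (long count : bins.values()) {
            total += count;
        }
        if (total == 0) {
            return Double.NaN;
        }
        return (double) cumFreq(bins, v) / total;
    }

    public static void main(String[] args) {
        TreeMap<Double, Long> bins = new TreeMap<>();
        bins.put(0.5, 2L);
        bins.put(1.5, 3L);
        bins.put(2.5, 5L);
        System.out.println(cumFreq(bins, 1.5)); // prints 5
        System.out.println(cumPct(bins, 1.5));  // prints 0.5
    }
}
```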

getSumFreq

public long getSumFreq()
Returns the sum of all frequencies (bin counts). If there are negative bin counts, the sum is smaller than getAbsoluteSumFreq().

Returns:
the total frequency count.

getAbsoluteSumFreq

public long getAbsoluteSumFreq()
Returns the sum of all frequencies. The absolute values of the bin counts are used.

Returns:
the total frequency count.

getPct

public double getPct(java.lang.Double v)
Returns the percentage of values that are equal to v (as a proportion between 0 and 1).

Returns Double.NaN if no values have been added.

Parameters:
v - the value to lookup
Returns:
the proportion of values equal to v

getEntropy

public double getEntropy()
Returns the entropy of the histogram.

The histogram entropy is defined to be the negation of the sum of the products of the probability associated with each bin with the base-2 logarithm of that probability.

Uses code from https://jai-core.dev.java.net/ The source code for the core Java Advanced Imaging API reference implementation is licensed under the Java Research License (JRL) for non-commercial use. The JRL allows users to download, build, and modify the source code in the jai-core project for research use, subject to the terms of the license.

Returns:
The entropy of the histogram.
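The definition above corresponds to H = -sum(p_i * log2(p_i)), with p_i the count of bin i divided by the total count. A minimal stand-alone sketch on a plain array of counts follows; skipping empty bins and returning 0 for an empty histogram are assumptions of this sketch.

```java
final class EntropySketch {
    /** Base-2 entropy of the distribution given by the bin counts. */
    static double entropy(long[] counts) {
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        if (total == 0) {
            return 0.0; // assumption: an empty histogram has zero entropy
        }
        double entropy = 0.0;
        for (long c : counts) {
            if (c > 0) { // empty bins contribute nothing (p * log p tends to 0)
                double p = (double) c / total;
                entropy -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return entropy;
    }

    public static void main(String[] args) {
        // Four equally filled bins: about 2 bits.
        System.out.println(entropy(new long[] {1, 1, 1, 1}));
        // All items in one bin: no uncertainty, entropy 0.
        System.out.println(entropy(new long[] {4, 0, 0, 0}));
    }
}
```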

getMean

public double getMean()
Calculates the mean count of each bin. It iterates over each bin, stores the bin count temporarily and returns the mean bin count. It does not cache the result.

Returns:
the mean bin count.

getMedian

public double getMedian()
Calculates the median bin count. It iterates over each bin, stores the bin count temporarily and returns the median bin count. It does not cache the result.

Returns:
the median bin count.

toString

public java.lang.String toString()
Return a string representation of this histogram.

Overrides:
toString in class java.lang.Object
Returns:
a string representation.

toString

public java.lang.String toString(boolean asciiArt)
Returns a string representation of the histogram.

Uses code from Apache Commons Math, licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.

Parameters:
asciiArt - If true it generates an ascii representation of a histogram, otherwise it generates a frequency table
Returns:
a string representation.

normalize

public Histogram normalize()
Normalizes the peaks in a histogram. Every peak is reduced to its relative weight (percent).

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Returns:
a Histogram with normalized peak.

addToEachBin

public Histogram addToEachBin(long value)
Adds a number of items to each bin. Use a negative number to subtract a value from each bin.

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Parameters:
value - the number of items to add.
Returns:
returns the current histogram so it is possible to chain modifications.

baselineHistogram

public Histogram baselineHistogram()
Finds the minimum number of items in a bin and subtracts this value from all bins.
 *
 * *              *
 * *   *          * *
 * * * *    =>    * *   *
 -------          -------
 

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Returns:
a baselined histogram
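The baseline operation can be sketched on a plain array of counts. Unlike the method above, which changes the current histogram, this illustrative helper (hypothetical name) returns a new array.

```java
import java.util.Arrays;

final class BaselineSketch {
    /** Subtracts the smallest bin count from every bin. */
    static long[] baseline(long[] counts) {
        long min = Long.MAX_VALUE;
        for (long c : counts) {
            min = Math.min(min, c);
        }
        long[] result = new long[counts.length];
        for (int i = 0; i < counts.length; i++) {
            result[i] = counts[i] - min;
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [2, 1, 0, 1]
        System.out.println(Arrays.toString(baseline(new long[] {3, 2, 1, 2})));
    }
}
```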

add

public Histogram add(Histogram other)
Calculates the sum of two histograms. The value for each bin of other is added to the corresponding bin of this histogram. The other histogram must have the same start, stop and binWidth, otherwise adding histograms makes no sense!

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Parameters:
other - The other histogram
Returns:
the changed histogram with more (or the same) number of items in the bins.

add

public Histogram add(Histogram other,
                     int offset)
Calculates the sum of two histograms. The value for each bin of other is added to the corresponding bin of this histogram. The other histogram must have the same start, stop and binWidth, otherwise adding histograms makes no sense!

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Parameters:
other - The other histogram
offset - The offset.
Returns:
the changed histogram with more (or the same) number of items in the bins.

max

public Histogram max(Histogram other)
Takes the maximum of the bin value in each histogram and changes the current histogram to this maximum value.

Parameters:
other - Another histogram with the same number of bins.
Returns:
A histogram with the maximum values. It is the changed current histogram, not a new one.

subtract

public Histogram subtract(Histogram other)
Subtracts two histograms. The value for each bin of other is subtracted from the corresponding bin of this histogram. The other histogram must have the same start, stop and binWidth, otherwise subtracting histograms makes no sense!

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Parameters:
other - The other histogram
Returns:
The changed histogram with fewer (or the same) number of items in the bins.

invert

public Histogram invert()
Inverts this histogram. The value for each bin is multiplied by -1.

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Returns:
The changed histogram with each bin multiplied by -1.

multiply

public Histogram multiply(double factor)
Multiplies each class (bin) count with a factor.

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Parameters:
factor - the factor to multiply each bin value with.
Returns:
histogram with each bin value multiplied by the factor.

raise

public Histogram raise(double exponent)
Raises each class count to the power of exponent.

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Parameters:
exponent - The exponent to raise each bin count to.
Returns:
Histogram with each bin count raised to the power of exponent.

clone

public Histogram clone()
                throws java.lang.CloneNotSupportedException
Overrides:
clone in class java.lang.Object
Throws:
java.lang.CloneNotSupportedException

mean

public static Histogram mean(java.util.List<Histogram> histograms)
Calculates a histogram mean of a list of histograms. All histograms must have the same start, stop and binWidth otherwise the mean histogram makes no sense!

Parameters:
histograms - a list of histograms
Returns:
a histogram with the mean values. If the list is empty it returns null.

smooth

public Histogram smooth(boolean isWeighted,
                        int k)
Computes a smoothed version of the histogram.

The histogram is smoothed by averaging over a moving window of a size specified by the method parameter: if the value of the parameter is k then the width of the window is 2*k + 1. If the window runs off the end of the histogram only those values which intersect the histogram are taken into consideration. The smoothing may optionally be weighted to favor the central value using a "triangular" weighting. For example, for a value of k equal to 2 the central bin would have weight 1/3, the adjacent bins 2/9, and the next adjacent bins 1/9.

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Uses code from https://jai-core.dev.java.net/ The source code for the core Java Advanced Imaging API reference implementation is licensed under the Java Research License (JRL) for non-commercial use. The JRL allows users to download, build, and modify the source code in the jai-core project for research use, subject to the terms of the license.

Parameters:
isWeighted - Whether bins will be weighted using a triangular weighting scheme favoring bins near the central bin.
k - The smoothing parameter which must be non-negative or an IllegalArgumentException will be thrown. If zero, the histogram object will be returned with no smoothing applied.
Returns:
A smoothed version of the histogram.
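The moving-window smoothing described above can be sketched as follows. This is a stand-alone illustration on a plain array with hypothetical names, not the JAI-derived code. It assumes that at the edges the weights of the bins that do fall inside the histogram are renormalised to sum to 1.

```java
final class MovingWindowSmoothSketch {
    /** Smooths counts with a window of width 2*k + 1, optionally triangular-weighted. */
    static double[] smooth(double[] counts, boolean isWeighted, int k) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        int n = counts.length;
        double[] smoothed = new double[n];
        for (int b = 0; b < n; b++) {
            double weightedSum = 0.0;
            double weightTotal = 0.0;
            // Only bins inside the histogram contribute when the window runs off an edge.
            for (int i = Math.max(0, b - k); i <= Math.min(n - 1, b + k); i++) {
                // Triangular weight: k + 1 at the centre, falling to 1 at distance k.
                // For k = 2 the full window is 1,2,3,2,1, i.e. 1/9, 2/9, 1/3, 2/9, 1/9.
                double w = isWeighted ? (k + 1 - Math.abs(i - b)) : 1.0;
                weightedSum += w * counts[i];
                weightTotal += w;
            }
            smoothed[b] = weightedSum / weightTotal;
        }
        return smoothed;
    }

    public static void main(String[] args) {
        double[] s = smooth(new double[] {0, 0, 0, 9, 0, 0, 0}, true, 2);
        // The spike of 9 is spread out: 2.0, 3.0, 2.0 around the centre.
        System.out.println(s[2] + " " + s[3] + " " + s[4]);
    }
}
```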

gaussianSmooth

public Histogram gaussianSmooth(double standardDeviation)
Smooth the histogram using Gaussians.

Each band of the histogram is smoothed by discrete convolution with a kernel approximating a Gaussian impulse response with the specified standard deviation.

Changes the current histogram and returns it so it is possible to chain modifications. E.g. histo.normalize().addToEachBin(10)

Uses code from https://jai-core.dev.java.net/ The source code for the core Java Advanced Imaging API reference implementation is licensed under the Java Research License (JRL) for non-commercial use. The JRL allows users to download, build, and modify the source code in the jai-core project for research use, subject to the terms of the license.

Parameters:
standardDeviation - The standard deviation of the Gaussian smoothing kernel which must be non-negative or an IllegalArgumentException will be thrown. If zero, the histogram object will be returned with no smoothing applied.
Returns:
A Gaussian smoothed version of the histogram.
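Discrete convolution with a Gaussian kernel can be sketched as below. The kernel radius of three standard deviations and the renormalisation at the edges are assumptions of this sketch. It is not the JAI-derived implementation, and the names are hypothetical.

```java
final class GaussianSmoothSketch {
    /** Smooths counts by convolution with a truncated, normalised Gaussian kernel. */
    static double[] gaussianSmooth(double[] counts, double standardDeviation) {
        if (standardDeviation < 0.0) {
            throw new IllegalArgumentException("standardDeviation must be non-negative");
        }
        if (standardDeviation == 0.0) {
            return counts.clone(); // no smoothing applied
        }
        // Assumption: truncate the kernel at three standard deviations.
        int radius = (int) Math.ceil(3.0 * standardDeviation);
        double[] kernel = new double[2 * radius + 1];
        for (int i = -radius; i <= radius; i++) {
            kernel[i + radius] = Math.exp(-(i * i) / (2.0 * standardDeviation * standardDeviation));
        }
        int n = counts.length;
        double[] smoothed = new double[n];
        for (int b = 0; b < n; b++) {
            double sum = 0.0;
            double weightTotal = 0.0;
            for (int i = -radius; i <= radius; i++) {
                int j = b + i;
                if (j < 0 || j >= n) {
                    continue; // kernel taps outside the histogram are skipped
                }
                sum += kernel[i + radius] * counts[j];
                weightTotal += kernel[i + radius];
            }
            smoothed[b] = sum / weightTotal; // renormalise over the taps actually used
        }
        return smoothed;
    }

    public static void main(String[] args) {
        double[] s = gaussianSmooth(new double[] {0, 0, 10, 0, 0}, 1.0);
        // The spike is spread symmetrically around index 2.
        System.out.println(java.util.Arrays.toString(s));
    }
}
```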

displacementForOptimalCorrelation

public int displacementForOptimalCorrelation(Histogram otherHistogram)

displace

public void displace(int displacement)

correlationWithDisplacement

public double correlationWithDisplacement(int displacement,
                                          Histogram otherHistogram,
                                          CorrelationMeasure correlationMeasure)

correlationWithDisplacement

public double correlationWithDisplacement(int displacement,
                                          Histogram otherHistogram)

correlation

public double correlation(Histogram otherHistogram,
                          CorrelationMeasure correlationMeasure)
Return the correlation of this histogram with another one.

Parameters:
otherHistogram - the other histogram
correlationMeasure - the correlation measure to use
Returns:
the correlation between this histogram with another histogram.

correlation

public double correlation(Histogram otherHistogram)
Return the correlation of this histogram with another one. By default it uses the INTERSECTION correlation measure.

Parameters:
otherHistogram - the other histogram
Returns:
the computed correlation

displacementForOptimalCorrelation

public int displacementForOptimalCorrelation(Histogram otherHistogram,
                                             CorrelationMeasure correlationMeasure)
Returns the number of classes the other histogram needs to be displaced to get optimal correlation with this histogram. The correlation is defined by the chosen correlation measure.

Parameters:
otherHistogram - The other histogram.
correlationMeasure - The correlation strategy.
Returns:
the number of classes the other histogram needs to be displaced to get optimal correlation with this histogram.
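Searching for the best displacement can be sketched as a brute-force loop over all cyclic shifts. The intersection formula used here (sum of bin-wise minima divided by the larger total) and the shift direction are assumptions for illustration. The actual INTERSECTION measure and displacement convention of the library may differ.

```java
final class DisplacementSketch {
    /** Intersection-style similarity of a with b cyclically shifted by displacement bins. */
    static double intersection(long[] a, long[] b, int displacement) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Histograms must have the same number of classes");
        }
        int n = a.length;
        long overlap = 0;
        long sumA = 0;
        long sumB = 0;
        for (int i = 0; i < n; i++) {
            // Bin i of the shifted histogram is bin (i - displacement) of the original.
            long other = b[Math.floorMod(i - displacement, n)];
            overlap += Math.min(a[i], other);
            sumA += a[i];
            sumB += other;
        }
        long denominator = Math.max(sumA, sumB);
        return denominator == 0 ? 0.0 : (double) overlap / denominator;
    }

    /** Tries every cyclic shift and returns the one with the highest similarity. */
    static int displacementForOptimalCorrelation(long[] a, long[] b) {
        int best = 0;
        double bestCorrelation = -1.0;
        for (int d = 0; d < a.length; d++) {
            double c = intersection(a, b, d);
            if (c > bestCorrelation) {
                bestCorrelation = c;
                best = d;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        long[] a = {0, 5, 1, 0};
        long[] b = {5, 1, 0, 0}; // the same shape as a, one bin to the left
        System.out.println(displacementForOptimalCorrelation(a, b)); // prints 1
    }
}
```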

plotCorrelation

public void plotCorrelation(Histogram otherHistogram,
                            CorrelationMeasure correlationMeasure,
                            java.lang.String fileName,
                            java.lang.String title)

plot

public void plot(java.lang.String fileName,
                 java.lang.String title)
Plots the histogram to an x-y plot. The file is saved in PNG format, so fileName should end in PNG.

Parameters:
fileName - Where to save the plot. The file is saved in PNG format, so the name should end in PNG.
title - The title of the histogram. Use an empty string or null for an empty title.

export

public final void export(java.lang.String fileName)
Export the histogram data as a plain text file. The format uses ; to separate keys from values.

Parameters:
fileName - Where to save the text file.

exportMatLab

public final void exportMatLab(java.lang.String fileName)
Export the histogram data as a MATLAB text file. The format can be read using Octave or MATLAB.

Parameters:
fileName - Where to save the MATLAB (.m) file.

getMaxBinCount

public final long getMaxBinCount()
Returns:
The maximum bin count.

clear

public void clear()
Sets each bin to 0.