~ Reproduction of speech using MIDI

Tarsos is now capable of reproducing speech using MIDI. The idea of converting speech into MIDI comes from the blog of Corban Brook, where the following video, a work by Peter Ablinger, can be found:

[Embedded video]

Another example of music inspired by speech is this interview with Louis Van Gaal:

Tarsos sends out MIDI data based on an FFT analysis of the signal. It maps the spectrogram to MIDI messages and uses the power spectrum to calculate the velocity of each note-on message.
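The mapping described above can be sketched as follows. This is a minimal illustration, not the Tarsos implementation: the class name `SpectrumToMidi`, the decibel-based velocity curve and the 60 dB dynamic range are assumptions made for the example. Only the conversion from frequency to MIDI note number (A4 = 440 Hz = note 69) is standard.

```java
final class SpectrumToMidi {

    /** Converts a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69). */
    static int frequencyToMidiNote(double frequencyInHz) {
        double semitonesFromA4 = 12.0 * Math.log(frequencyInHz / 440.0) / Math.log(2.0);
        return (int) Math.round(69.0 + semitonesFromA4);
    }

    /**
     * Maps a power value to a MIDI velocity (0..127) on a decibel scale.
     * The loudest bin gets 127; anything dynamicRangeInDb below it gets 0.
     * This curve is an assumption made for the sketch.
     */
    static int powerToVelocity(double power, double maxPower, double dynamicRangeInDb) {
        double db = 10.0 * Math.log10(power / maxPower);
        double normalized = Math.max(0.0, Math.min(1.0, 1.0 + db / dynamicRangeInDb));
        return (int) Math.round(normalized * 127.0);
    }

    /**
     * Returns one velocity per MIDI note (index = note number, 0 = silent).
     * powerSpectrum holds the first half of the FFT bins, DC bin included.
     */
    static int[] toVelocities(double[] powerSpectrum, double sampleRate, double dynamicRangeInDb) {
        int[] velocities = new int[128];
        double maxPower = 0.0;
        for (double power : powerSpectrum) {
            maxPower = Math.max(maxPower, power);
        }
        if (maxPower <= 0.0) {
            return velocities; // silence: no notes at all
        }
        double binWidthInHz = sampleRate / (2.0 * powerSpectrum.length);
        for (int bin = 1; bin < powerSpectrum.length; bin++) { // skip the DC bin
            int note = frequencyToMidiNote(bin * binWidthInHz);
            if (note < 0 || note > 127) {
                continue; // outside the MIDI note range
            }
            int velocity = powerToVelocity(powerSpectrum[bin], maxPower, dynamicRangeInDb);
            // Several bins can fall on the same note: keep the loudest one.
            velocities[note] = Math.max(velocities[note], velocity);
        }
        return velocities;
    }

    public static void main(String[] args) {
        double[] powerSpectrum = new double[512];
        powerSpectrum[10] = 1.0;  // about 431 Hz at a 44.1 kHz sample rate
        powerSpectrum[20] = 0.01; // about 861 Hz, 20 dB quieter
        int[] velocities = toVelocities(powerSpectrum, 44100.0, 60.0);
        for (int note = 0; note < velocities.length; note++) {
            if (velocities[note] > 0) {
                System.out.println("note " + note + " velocity " + velocities[note]);
            }
        }
    }
}
```

Each non-zero entry could then be sent as a note-on message. At high frequencies many FFT bins map to the same note, which is why the sketch keeps only the loudest bin per note.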

The implementation can run in real time, but the output has some delay: calculating the FFT, constructing MIDI messages, calculating velocities and synthesizing sound all take time.

To use this capability, Tarsos supports the following syntax. If a MIDI file is given, the MIDI messages are written to that file. If an audio file is given, Tarsos uses the audio as input. If the --pitch switch is used, only the fundamental frequency (F0) is considered when constructing MIDI messages, instead of the complete FFT.

```shell
java -jar tarsos.jar pitch_to_midi [--pitch] [midi_out.midi] [audio_in.wav]
```


~ Tone Scale Matching With Tarsos

Tarsos can be used to search for music that uses a certain tone scale or tone interval(s). Tone scales can be defined by a Scala tone scale file or an exemplifying audio file. This text explains how you can use Tarsos for this task.

Search Using Scala Tone Scale Files

Scala files are text files with information about a tone scale. They are used to share and exchange tone scales. The file format originates from the Scala program:

Scala is a powerful software tool for experimentation with musical tunings, such as just intonation scales, equal and historical temperaments, microtonal and macrotonal scales, and non-Western scales. It supports scale creation, editing, comparison, analysis, …

The Scala file format is popular because there is a library with more than 3000 tone scales available on the Scala website.

Tarsos also understands Scala files. It is able to create a pitch class histogram using a Gaussian mixture model, a technique described in A. C. Gedik and B. Bozkurt, 2010, “Pitch Frequency Histogram Based Music Information Retrieval for Turkish Music”, Signal Processing, vol. 90, pp. 1049-1063 (doi:10.1016/j.sigpro.2009.06.017).
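As a rough illustration of the idea, the sketch below places one Gaussian on each pitch class of a scale and sums them into a 1200-bin histogram (one bin per cent) that wraps around the octave. The class name `ScaleHistogram`, the unit peak height and the 15-cent standard deviation are assumptions made for the example. They are not taken from Tarsos or from the paper.

```java
final class ScaleHistogram {

    static final int CENTS_PER_OCTAVE = 1200;

    /**
     * Builds a pitch class histogram with one bin per cent.
     * Every pitch class contributes a Gaussian of height 1 centred on it;
     * distances are measured around the octave, so 1190 and 10 are 20 cents apart.
     */
    static double[] fromPitchClasses(double[] pitchClassesInCents, double standardDeviation) {
        double[] histogram = new double[CENTS_PER_OCTAVE];
        double twoVariance = 2.0 * standardDeviation * standardDeviation;
        for (double pitchClass : pitchClassesInCents) {
            for (int bin = 0; bin < CENTS_PER_OCTAVE; bin++) {
                double distance = Math.abs(bin - pitchClass) % CENTS_PER_OCTAVE;
                distance = Math.min(distance, CENTS_PER_OCTAVE - distance);
                histogram[bin] += Math.exp(-(distance * distance) / twoVariance);
            }
        }
        return histogram;
    }

    public static void main(String[] args) {
        // The two pitches of a scale with a 300 cent interval: 900 and 1200 (= 0) cents.
        double[] histogram = fromPitchClasses(new double[] {900.0, 1200.0}, 15.0);
        System.out.println("bin 0:   " + histogram[0]);
        System.out.println("bin 600: " + histogram[600]);
        System.out.println("bin 900: " + histogram[900]);
    }
}
```

A wider standard deviation makes the histogram more tolerant of small tuning differences, at the cost of blurring pitch classes that lie close together.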

An example should make things clear. Let's search for an interval of 300 cents, or exactly three semitones. A Scala file with this interval is easy to define:

```text
! example.scl
! An example of a tone interval of 300 cents
Tone interval of 300 cents
2
!
900.0
1200.0
```

The next step is to create a histogram from this 300-cent interval. In the block diagram this step is called “Peak histogram creation”. The similarity calculation step expects a list of histograms to compare with the newly defined histogram. Feeding it the Western 12-tone equal temperament scale (western12ET) and a pentatonic Indonesian slendro scale shows that a 300-cent interval occurs in the Western scale but not in the slendro scale.

Because this example only uses Scala files, creating histograms is not actually needed: the intervals can be calculated from the Scala files themselves. This changes when audio files are compared with each other or with Scala files.
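Calculating intervals directly from the pitches of a scale could look like the sketch below. The class name `IntervalSearch`, the 10-cent tolerance and the scale values are assumptions made for the example. In particular, the slendro scale is idealized here as five equal steps of 240 cents, while real gamelan tunings vary.

```java
final class IntervalSearch {

    static final double CENTS_PER_OCTAVE = 1200.0;

    /**
     * Returns true if any two pitch classes of the scale lie intervalInCents apart,
     * within the given tolerance. Differences are folded into one octave, so both
     * the interval going up and the interval going down between two pitches are checked.
     */
    static boolean containsInterval(double[] scaleInCents, double intervalInCents,
                                    double toleranceInCents) {
        for (int i = 0; i < scaleInCents.length; i++) {
            for (int j = 0; j < scaleInCents.length; j++) {
                if (i == j) {
                    continue;
                }
                double difference = scaleInCents[j] - scaleInCents[i];
                difference = ((difference % CENTS_PER_OCTAVE) + CENTS_PER_OCTAVE) % CENTS_PER_OCTAVE;
                if (Math.abs(difference - intervalInCents) <= toleranceInCents) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        double[] western12Et = {0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100};
        double[] idealizedSlendro = {0, 240, 480, 720, 960}; // assumption: equal 240 cent steps
        System.out.println("12ET has 300 cents:    " + containsInterval(western12Et, 300.0, 10.0));
        System.out.println("slendro has 300 cents: " + containsInterval(idealizedSlendro, 300.0, 10.0));
    }
}
```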

Search Using Audio Files

When audio files are fed to the algorithm, additional steps need to be taken.

  1. First of all, pitch detection is executed on the audio file. Currently two pitch extractors are implemented in pure Java. It is also possible to use an external pitch extractor such as aubio.

  2. Using pitch annotations a Pitch Histogram is created.

  3. Peak detection on the Pitch Histogram results in a number of peaks. These should represent the distinct pitch classes used in the musical piece.

  4. With the pitch classes a clean peak histogram is created during the Peak Histogram construction phase.

  5. Finally the Peak histogram is matched with other histograms.

The last two steps are the same for audio files and Scala files.
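Steps 2 and 3 of the list above can be sketched as follows. This is an illustration under stated assumptions rather than the Tarsos implementation: the class name `PitchClassPeaks`, the reference frequency of 8.1758 Hz (MIDI note 0), the one-cent bin width and the simple local-maximum peak picker are all choices made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class PitchClassPeaks {

    static final int CENTS_PER_OCTAVE = 1200;
    static final double REFERENCE_HZ = 8.1758; // assumption: MIDI note 0 as the zero point

    /** Folds a pitch in Hz into one octave and returns its histogram bin (0..1199). */
    static int pitchClassBin(double pitchInHz) {
        double cents = CENTS_PER_OCTAVE * Math.log(pitchInHz / REFERENCE_HZ) / Math.log(2.0);
        double pitchClass = ((cents % CENTS_PER_OCTAVE) + CENTS_PER_OCTAVE) % CENTS_PER_OCTAVE;
        return (int) Math.round(pitchClass) % CENTS_PER_OCTAVE;
    }

    /** Step 2: counts how often each pitch class occurs in the pitch annotations. */
    static int[] histogram(double[] pitchesInHz) {
        int[] counts = new int[CENTS_PER_OCTAVE];
        for (double pitch : pitchesInHz) {
            counts[pitchClassBin(pitch)]++;
        }
        return counts;
    }

    /**
     * Step 3: a bin is a peak if its count is at least minCount and it is the
     * largest within windowInCents on both sides, wrapping around the octave.
     * Ties inside a window are resolved in favour of the lower bin.
     */
    static List<Integer> peaks(int[] counts, int windowInCents, int minCount) {
        List<Integer> result = new ArrayList<>();
        for (int bin = 0; bin < counts.length; bin++) {
            if (counts[bin] < minCount) {
                continue;
            }
            boolean isPeak = true;
            for (int offset = 1; offset <= windowInCents && isPeak; offset++) {
                int left = counts[(bin - offset + counts.length) % counts.length];
                int right = counts[(bin + offset) % counts.length];
                isPeak = counts[bin] > left && counts[bin] >= right;
            }
            if (isPeak) {
                result.add(bin);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical pitch annotations in Hz, as a pitch detector might produce them.
        double[] annotations = {440.0, 440.0, 880.0, 441.0, 523.25, 523.25};
        System.out.println(peaks(histogram(annotations), 50, 2));
    }
}
```

On real recordings the histogram would normally be smoothed before peak picking. The sketch leaves that out to keep the two steps visible.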

Real audio files can produce noisy histograms. Determining how many distinct pitch classes are used is no trivial task, even for an expert (human) listener. Tarsos should provide a semi-automatic way of extracting peaks: a best guess by an algorithm that can easily be corrected by a user. For the moment, Tarsos does not allow manual intervention.

Tarsos

To use Tarsos you need a recent Java runtime (1.6) and the following command line arguments:

```shell
java -jar tarsos.jar rank --detector TARSOS_MPM \
  --needle audio.wav --haystack scala.scl other_audio.wav other_scala_file.scl
```


~ Static Code Analysis For Java Using Eclipse

This post is about the tools I use to keep the source code of Tarsos reasonably clean, consistent and readable. Static code analysis can be of great help if you want to maintain strict coding standards and follow language idioms. Some of the patterns these tools can detect for you:

And even more subtle, but equally important:

In a previous life I used .NET and the static code analysis tools FxCop & StyleCop. FxCop operates on the bytecode level (intermediate language in .NET parlance), while StyleCop analyses the source code itself. Tarsos uses Java, so I looked for Java alternatives and found a few.

On freesoftwaremagazine.com there is an article series on Java static code analysis software. It covers PMD and FindBugs and their integration in Eclipse. It does not cover Checkstyle. Checkstyle is essentially the same as PMD but it is better integrated in Eclipse: it checks code on save and uses the standard ‘Problems’ interface, which PMD does not.

To fix problems, Eclipse save actions can save you some time. IBM has an article on how to keep your code clean using Eclipse.

Continuous testing is also a really nice thing to have: detecting unexpected behavior while refactoring/programming can prevent unnecessary bug hunts. A video about immediate feedback using continuous testing makes this clear.

Another tip is a more philosophical one: making your code and code revisions publicly available makes you think twice before implementing (and subsequently publishing) a quick and dirty hack. Tarsos is available on GitHub.


~ Tarsos demos

I just finished creating a first release of Tarsos. The release contains several demo applications, some more useful than others. Tarsos is a work in progress: not all functionality is exposed through the CLI (Command Line Interface) demo applications. The demos should, however, give a taste of the possibilities. All demo applications follow this pattern:

```shell
java -jar tarsos.jar subcommand [--option [argument] …]
```

To get help the --help switch can be used. It generates contextual help for either the subcommand or for Tarsos itself.

```shell
java -jar tarsos.jar --help
java -jar tarsos.jar subcommand --help
```

Detect Pitch

```shell
java -jar tarsos.jar detect_pitch --in flute.novib.mf.C5B5.wav
```

Midi to Audio Using a Scala Tone Scale

```shell
java -jar tarsos.jar midi_to_wav --midi satie_gymno1.mid --scala 120.scl
```

Audio to Scala Tone Scale

```shell
java -jar tarsos.jar audio_to_scala --in out.wav
```

Annotate a File

```shell
java -jar tarsos.jar annotate --in out.wav
```

Pitch table

```shell
java -jar tarsos.jar pitch_table
```


~ Tarsos Spectrogram

Today I created a spectrogram application using Tarsos. The application listens to an audio input, computes an FFT and at the same time calculates the pitch. The estimated pitch is overlaid on the spectrogram. All this happens in real time and is implemented in Java.

spectrum with pitch information (red)

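This is the most recent version of the spectrogram implementation in Java.

```java
// Estimate the pitch first: the FFT below transforms the buffer in place.
float pitch = Yin.processBuffer(buffer, (float) sampleRate);
fft.transform(buffer);
double maxAmplitude = 0;
// As the indexing suggests, the first half of the buffer holds the real
// parts and the second half the imaginary parts of the FFT bins.
for (int j = 0; j < buffer.length / 2; j++) {
    double real = buffer[j];
    double imaginary = buffer[j + buffer.length / 2];
    // Magnitude of the complex coefficient of bin j.
    double amplitude = Math.sqrt(real * real + imaginary * imaginary);
    colorIndexes[j] = amplitude;
    maxAmplitude = Math.max(amplitude, maxAmplitude);
}
```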

If you want to test it yourself, download the spectrogram jar package (spectrogram.jar) and execute:

```shell
java -jar spectrogram.jar
```


~ Tarsos on GitHub

The Java software program we are developing is called Tarsos and can now be found on GitHub. GitHub is a web-based hosting service for projects that use the Git version control system.

Currently Tarsos is a collection of Java classes to create, compare and process pitch-frequency data using histograms. In its current state it is not usable for end users.

Credits

Tarsos is developed at University College Ghent, Faculty of Music and uses a number of open source libraries:


~ Dataset

The dataset we use is the sound archive of the department of Ethnomusicology of the Royal Museum for Central Africa at Tervuren, Belgium. The archive was digitized during the DEKKMMA (Digitization of the Ethnomusicological Sound Archive of the Royal Museum for Central Africa - it works better in Dutch) project. More information about the dataset can be found on the website of the DEKKMMA project:

The archive is a collection of sound recordings of traditional music from Central Africa, with a particular focus on Congo and Rwanda. The sound archive contains about 3,000 hours of music recordings, the oldest of which date from 1910: Edison cylinders recorded by Hutereau in the Uele-province in Congo. The archive contains several sound carriers (Edison cylinders, Sonofil wire, magnetic tapes, audiocassettes, disks, CDs ...) with associated metadata (paper files) and contextual data (photographs, films, videos, books, documents of all kinds). The collection was created during and after the colonial era of the Belgian Kingdom in Central Africa. The RMCA collection forms an important part of the musical memory of Central Africa and in terms of size, documentation and musical quality, it is -- without any doubt -- the world's most important sound archive for this region.

Using the metadata, we did a rough geocoding of each recording to create an interactive map of the dataset (dataset_geocodes.html).


~ Development and Application of MIR Techniques on Ethnic Music

About

The aim of this research project is to gain novel musicological insights into a large dataset of music from Central Africa. While practising ethnomusicological research on this dataset, we aim to develop and publish useful software and methodologies for the (ethno)musicological research community.

From November 2009 until November 2013 this research project was organised at the School of Arts, University College Ghent, under the supervision of Olmo Cornelis. From November 2013 onwards, the project turned into a two-year doctoral research project hosted at IPEM, University Ghent, under the supervision of Marc Leman.

Partners


Royal Museum For Central Africa

University Ghent, Institute for Psychoacoustics and Electronic Music

University College Ghent, Hogeschool Gent

School of Arts, Ghent
