0110.be logo

~ TarsosDSP Christmas Edition: Jingle Cats

The DSP library for Taros, aptly named TarsosDSP, now includes an example showing how to synthesize cat sounds. The inspration came from this youtube video

To hear what exactly it does, listen to the following audio example.

There is also a command line interface, the following command does

\ java -jar Catify-latest.jar in.mid\

 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     

----------------------------------------------------
Name:
    TarsosDSP catify'er
----------------------------------------------------
Synopsis:
    java -jar Catify-latest.jar input.mid
----------------------------------------------------
Description:

The source code of the Java implementation of the catify’er can be found on the TarsosDSP github page.


~ TarsosDSP Pitch Estimation Synthesizer

The DSP library for Taros, aptly named TarsosDSP, now includes an example showing how to synthesize pitch estimations. The goal of the example is to show which errors are made by different pitch detectors.

Pitch estimation synthesizer

To test the application, download and execute the Resynthesizer.jar file and load an audio file. For the moment only 44.1kHz mono wav is allowed. To hear what exactly it does, compare the following two audio fragments:

There is also a command line interface, the following command does pitch tracking, and follows the envelope of in.wav and immediately plays it on the default audio device. If you want to save the audio, see the command line options. The “flute example”:[flute.wav] is provided for your convenience.

\ java -jar Resynthesizer-latest.jar in.wav\

 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     

----------------------------------------------------
Name:
    TarsosDSP resynthesizer
----------------------------------------------------
Synopsis:
    java -jar CommandLineResynthesizer.jar [--detector DETECTOR] [--output out.wav] [--combined combined.wav] input.wav
----------------------------------------------------
Description:
    Extracts pitch and loudnes from audio and resynthesises the audio with that information.
    The result is either played back our written in an output file. 
    There is als an option to combine source and synthezized material
    in the left and right channels of a stereo audio file.

    input.wav       a readable wav file.

    --output out.wav        a writable file.

    --combined combined.wav     a writable output file. One channel original, other synthesized.
    --detector DETECTOR defaults to FFT_YIN or one of these:
                YIN
                MPM
                FFT_YIN
                DYNAMIC_WAVELET
                AMDF

The source code of the Java implementation of the synthesizer can be found on the TarsosDSP github page.


~ Phase Vocoding: Time Stretching and Pitch Shifting with TarsosDSP Java

The DSP library for Taros, aptly named TarsosDSP, now includes an implementation of a pitch shifting algorithm (as of version 1.4) and a time stretching algorithm. Combined, the two can be used for something like phase vocoding. With a phase vocoder you can load an audio snippet, change the pitch and duration and e.g. create a library of snippets. E.g. by recording one piano key stroke, it is possible to generate two octaves of samples of different lengths, and use those in stead of synthesized samples. The following example application shows exactly that, implemented in the java programming language.

The example application below shows how to pitch shift and time stretch a sample to create a sample library with the TarsosDSP library.

<a href=/releases/TarsosDSP/TarsosDSP-latest/TarsosDSP-latest-Examples/SampleExtractor-latest.jar">Pitch shifting in Java</a>

Find your oven fresh baked binaries at the TarsosDSP Release Repository.


~ Tarsos 1.0: Transcription Features

Today marks the reslease of Tarsos 1.0 . The new Tarsos release contains practical transcription features. As can be seen in the screenshot below, a time stretching feature makes it easy to loop a certain audio fragment while it is playing in a slow tempo. The next loop can be played with by pressing the n key, the one before by pressing b.

Since the pitch classes can be found in a song, and there is a feature that lets you play a MIDI keyboard in the tone scale of the song under analysis, transcription of ethnic music is made a lot easier.

\ Tarsos 1.0\

The new release of Tarsos can be found in the Tarsos release repository. From now on, nightly releases are uploaded there automatically.


~ Pitch Shifting - Implementation in Pure Java with Resampling and Time Stretching

The DSP library for Taros, aptly named TarsosDSP, now includes an implementation of a pitch shifting algorithm (as of version 1.4). The goal of pitch shifting is to change the pitch of a piece of audio without affecting the duration. The algorithm implemented is a combination of resampling and time stretching. Resampling changes the pitch of the audio, but affects the total duration. Consecutively, the duration of the audio is stretched to the original (without affecting pitch) with time stretching. The result is very similar to phase vocoding.

The example application below shows how to pitch shift input from the microphone in real-time, or pitch shift a recorded track with the TarsosDSP library.

Pitch shifting in Java

To test the application, download and execute the PitchShift.jar file and load an audio file. For the moment only 44.1kHz mono wav is allowed. To get started you can try “this piece of audio”:[08._Ladrang_Kandamanyura_10s-20s.wav].

There is also a command line interface, the following command lowers the pitch of in.wav by two semitones.

java -jar in.wav out.wav -200

----------------------------------------------------
 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     

----------------------------------------------------
Name:
    TarsosDSP Pitch shifting utility.
----------------------------------------------------
Synopsis:
    java -jar PitchShift.jar source.wav target.wav cents
----------------------------------------------------
Description:
    Change the play back speed of audio without changing the pitch.

        source.wav  A readable, mono wav file.
        target.wav  Target location for the pitch shifted file.
        cents       Pitch shifting in cents: 100 means one semitone up, 
                -100 one down, 0 is no change. 1200 is one octave up.

The resampling feature was implemented with libresample4j by Laszlo Systems. libresample4j is a Java port of Dominic Mazzoni’s libresample 0.1.3, which is in turn based on Julius Smith’s Resample 1.7 library.


~ ISMIR 2012 - Highlights

Logo ISMIR 2012The 13th International Society for Music Information Retrieval Conference took place in Porto, Portugal, October 8th-12th, 2012. This text contains links to some papers, toolkits, software presented there which are interesting for my research. Basically it contains my personal highlights of the conference. The ISMIR 2012 is described as follows:

The annual Conference of the International Society for Music Information Retrieval (ISMIR) is the world’s leading research forum on processing, searching, organizing and accessing music-related data. The revolution in music distribution and storage brought about by digital technology has fueled tremendous research activities and interests in academia as well as in industry. The ISMIR Conference reflects this rapid development by providing a meeting place for the discussion of MIR-related research, developments, methods, tools and experimental results. Its main goal is to foster multidisciplinary exchange by bringing together researchers and developers, educators and librarians, as well as students and professional users.

Tutorials

I saw an interesting tutorial on Jazz music and a tutorial on source separation. After an introduction, which detailed the experimental basis of the system, a source separator was introduced. The REPET source separator is a relatively simple system that yields reasonable results to split accompaniment from foreground melody.

Posters & Talks

The approach and the dataset used in N-gram Based Statistical Makam Detection on Makam Music in Turkey Using Symbolic Data is very interesting. More than 800 pieces of makam music where transcribed manually and analysed. Details about the dataset are available in the following paper: A Turkish Makam Music Symbolic Database for Music Information Retrieval: SymbTr.

Assigning a Confidence Threshold on Automatic Beat Annotation in Large Datasets by Zapata et al. shows a very interesting way to do exactly what the title says. Descriptive titles are descriptive.

A very practical tool to do melody extraction was presented by Justin Salamon. He created a Vamp Plugin with the name Melodia. Unfortunately the plugin is currently only available for windows, but Linux and OS X versions are in the pipeline. More about the algorithm implemented and background information can be found in the paper Justin presented: Statistical Characterisation of Melodic Pitch Contours and its Application for Melody Extraction. Another Vamp Plugin for melody visualization was also presented: Pitch Content Visualization Tools for Music Performance Analysis.

The ongoing work by Ceril Bohak and Matija Marolt on segmentation of folk music could be very useful to apply on Afican musics. The paper is called Finding Repeating Stanzas in Folk Songs.


~ CIM 2012 - Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means

Logo Universiteit UtrechtWhat follows is about the Conference on Interdisciplinary Musicology (CIM2012) and the 15th international Conference of the Gesellschaft fur Musikfoschung. First this text will give information about our contribution to CIM2012: Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means and then a number of highlights of the conference follow. The joint conference took place from the 4th to the 8th of september 2012.

In 2012, CIM will tackle the subject of History. Hosted by the University of Göttingen, whose one time music director Johann Nikolaus Forkel is widely regarded as one of the founders of modern music historiography, CIM12 aims to promote collaborations that provoke and explore new methods and methodologies for establishing, evaluating, preserving and communicating knowledge of music and musical practices of past societies and the factors implicated in both the preservation and transformation of such practices over time.

### Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means

Our contribution ton CIM 2012 is titled “Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means”:[CIM12_Submission.pdf]. The aim was to show how tone scales of the past, e.g. organ tuning, can be extracted and sonified. During the demo special attention was given to historic Central African tuning systems. The presentation I gave is included below and or available for “download”:[2012.09.05-Revealing_and_listening_to_scales_from_the_past__tone_scale_analysis_of_archived_Central-African_music_using_computational_means..ppt]

Highlights

What follows are some personal highlights for the Conference on Interdisciplinary Musicology (CIM2012) and the 15th international Conference of the Gesellschaft fur Musikfoschung. The joint conference took place from the 4th to the 8th of september 2012.

The work presented by Rytis Ambrazevicius et al. Modal changes in traditional Lithuanian singing: Diachronic aspect has a lot in common with our research, it was interesting to see their approach. Another highlight of the conference was the whole session organized by Klaus-Peter Brenner around Mbira music.

Rainer Polak gave a talk titled ‘Swing, Groove and Metre. Asymmetric Feels, Metric Ambiguity and Metric Transformation in African Musics’. He showed how research about rhythm in jazz research, music theory and empirical musicology ( amongst others) could be bridged and applied to ethnic music.

The overview Eleanore Selfridge-Field gave during her talk Between an Analogue Past and a Digital Future: The Evolving Digital Present was refreshing. She had a really clear view on all the different ways musicology and digital media can benifit from each-other.

From the concert programme I found two especially interesting: the lecture-performance by Margarete Maierhofer-Lischka and Frauke Aulbert of Lotofagos, a piece by Beat Furrer and Burdocks composed and performed by Christian Wolff and a bunch of enthusiastic students.


~ ICMC 2012 - Sound to Scale to Sound, a Setup for Microtonal Exploration and Composition

Logo Universiteit UtrechtAt this years ICMC Conference, ICMC 2012 we presented a paper describing a way to experiment with tone scales and how to use Tarsos as a compositional tool. What follows are some pointers to the presentation, paper and to other interesting talks that were presented there.

ICMC 2012 was organized in Ljubljana from the 9 to 14 septembre and had a very dense program of talks, posters, presentations, demos and concerts.

Since 1974 the International Computer Music Conference has been the major international forum for the presentation of the full range of outcomes from technical and musical research, both musical and theoretical, related to the use of computers in music. This annual conference regularly travels the globe, with recent conferences in the Americas, Europe and Asia. This year we welcome the conference to Slovenia for the first time.

### Sound to Scale to Sound, a Setup for Microtonal Exploration and Composition

Our contribution to the conference was a paper titled “Sound to Scale to Sound, a Setup for Microtonal Exploration and Composition”:[icmc2012_submission_45.pdf].

If you want to cite our work, this BibTeX entry is included for your convenience:

```ruby\ \@inproceedings{cornelis2012sound_to_scale,\ author = {Olmo Cornelis and Joren Six},\ title = {{Sound to Scale to Sound, a Setup for Microtonal Exploration and Composition}},\ booktitle = {{Proceedings of the 2012 International Computer Music Conference,\ (ICMC 2012)}},\ year = {2012},\ publisher = {The International Computer Music Association}\ }\ ```

Program highlights

What follows are a number of pointers to my personal program highlights.

Verena Thomas presented two very well polished software tools. One to detect patterns in scores, called motifviewer and a tool to search in score databases in a multi-modal way. The Probado tool does score-to-audio alignment and much more.

Gibber is an impressive live-coding environment with an easy syntax. Since it is all done with javascript you can start playing with it immediately. Overtone Another live-coding environment, presented at the conference by Sam Aaron, was equally impressive. It is programmed using the Closure language.

At ICMC there were a number of tools to assist in composition. One of those is The Bach Project, by Andrea Agostini. Togheter with CatART by Diemo Swartz it forms a very expressive platform to work with sound, which was demonstrated by Aaron Einbond and Christopher Trapani in their paper titled Precise Pitch Control In Real Time Corpus-Based Concatenative Synthesis. Diemo Swartz presented work on Audio Mosaicing, it can be seen as a follow-up to AuidioGuild by Ben Hackbarth.

I also got to know the work by Thomas Grill, on his website a nice piece of software can be found a Python implementation of the Non Stationary Gabor Transform (NSGT). Another software system I got to know is the functional signal processing programming language FAUST

My personal highlights of the concert programme include the works by Johannes Kreidler, Aura Pon, Daniel Mayer, Alexander Schubert and the remarkable performance by Dexter Ford. The concept behind Soundlog by Johannes Kretz was also interesting.


~ Analytical Approaches To World Music - Microtonal Scale Exploration in Central Africa

At the 2012 AAWM (Analytical Approaches To World Music) conference we presented a way to explore tone scales in the music of Central Africa. Since the audience consisted of (ethno)musicologists, the main focus of the presentation was on the applicication part, the technical aspects were only briefly mentioned.

The extended abstract can be consulted: Towards the tangible: microtonal scale exploration in Central-African music

The conference program itself was very diverse and interesting.


~ TarsosDSP Release 1.2

Today a new version of the TarsosDSP library was released. TarsosDSP is a small library to do audio processing in Java. It features two new pitch detectors. An AMDF (Average Magnitude Difference Function) pitch detector, contributed by Eder Souza of Brazil and a faster implementation of YIN kindly provided by Matthias Mauch of Queen Mary University, London.

[![Pitch Detector in Java](/files/attachments/336/PitchDetector.png "Pitch Detector in Java")](/releases/TarsosDSP/TarsosDSP-1.2/TarsosDSP-1.2-Examples/PitchDetector-1.2.jar)

Find your oven fresh baked binaries at the TarsosDSP Release Repository.


~ Guest Lecture at MIT - Ethnic Music Analysis: Challenges & Opportunities - Tarsos as a Case Study

Thursday the 3th of May I gave a guest lecture titled ‘Ethnic Music Analysis: Challenges & Opportunities’ it featured Tarsos as a Case Study. The goal was to identify the difficulties when dealing with ethnic music and to show a possible approach, the approach implemented by Tarsos.

The invitation to give the guest lecture came from Michael Cuthbert who is one of the driving forces behind music21. The audience was a small group of double majors in both musicology and computer science: the ideal profile to gather useful feedback.


~ TarsosDSP Release 1.0

After about a year of development and several revisions TarsosDSP has enough features and is stable enough to slap the 1.0 tag onto it. A ‘read me’, manual, API documentation, source and binaries can be found on the TarsosDSP release directory. The source is present in the\ What follows below is the information that can be found in the read me file:

TarsosDSP is a collection of classes to do simple audio processing. It features an implementation of a percussion onset detector and two pitch detection algorithms: Yin and the Mcleod Pitch method. Also included is a Goertzel DTMF decoding algorithm and a time stretch algorithm (WSOLA).

Its aim is to provide a simple interface to some audio (signal) processing algorithms implemented in pure JAVA. Some TarsosDSP example applications are available.

The following example filters a band of frequencies of an input file testFile. It keeps the frequencies form startFrequency to stopFrequency.

<code>AudioInputStream inputStream = AudioSystem.getAudioInputStream(testFile);
AudioDispatcher dispatcher = new AudioDispatcher(inputStream,stepSize,overlap);
dispatcher.addAudioProcessor(new HighPass(startFrequency, sampleRate, overlap));
dispatcher.addAudioProcessor(new LowPassFS(stopFrequency, sampleRate, overlap));
dispatcher.addAudioProcessor(new FloatConverter(format));
dispatcher.addAudioProcessor(new WaveformWriter(format,stepSize, overlap, "filtered.wav"));
dispatcher.run();
</code>

Quickly Getting Started with TarsosDSP

Head over to the TarsosDSP release repository and download the latest TarsosDSP library. To get up to speed quickly, check the TarsosDSP Example applications for inspiration and consult the API documentation. If you, for some reason, want to build from source, you need Apache Ant and git installed on your system. The following commands fetch the source and build the library and example jars:
git clone https://JorenSix@github.com/JorenSix/TarsosDSP.git cd TarsosDSP/build ant tarsos_dsp_library #Builds the core TarsosDSP library ant build_examples #Builds all the TarsosDSP examples ant javadoc #Creates the documentation in TarsosDSP/doc
\ When everything runs correctly you should be able to run all example applications and have the latest version of the TarsosDSP library for inclusion in your projects. Also the Javadoc documentation for the API should be available in TarsosDSP/doc. Drop me a line if you use TarsosDSP in your project. Always nice to hear how this software is used.

Source Code Organization and Examples of TarsosDSP

The source tree is divided in three directories:


~ Text to Speech to Speech Recognition - Am I Sitting in a Room?

This post is about a hack I did for the 2012 Amsterdam music hack days. From the website:

The Amsterdam Music Hack Day is a full weekend of hacking in which participants will conceptualize, create and present their projects. Music + software + mobile + hardware + art + the web. Anything goes as long as it’s music related

The hackathon was organized at the NiMK(Nederlands instituut voor Media Kunst) the 25th and 24th of May. My hack tries to let a phone start a conversation on its own. It does this by speaking a text and listening to the spoken text with speech recognition. The speech recognition introduces all kinds of interesting permutations of the original text. The recognized text is spoken again and so a dreamlike, unique nonsensical discussion starts. It lets you hear what goes on in the mind of the phone.

The idea is based on Alvin Lucier’s I am Sitting in a Room form 1969 which is embedded below. He used analogue tapes to generate a similar recursive loop. It is a better implementation of something I did a couple of years ago.

The implementation is done with Android and its API’s. Both speech recognition and text to speech are available on android. Those API’s are used and a user interface shows the recognized text. An example of a session can be found below:

To install the application you can download “Tryalogue.apk”:[Tryalogue.apk] of use the QR-code below. You need Android 2.3 with Voice Recognition and TTS installed. Also needed is an internet connection. “The source”:[Tryalogue.zip] is also up for grabs.


~ Oscilloscope in TarsosDSP

The DSP library for Taros, aptly named TarsosDSP, now includes an implementation of an oscilloscope.

"![Oscilloscope in Java](/files/attachments/330/oscilloscope.png "Oscilloscope in Java")":\[OscilloscopeExample.jar\]

The source code of the Java implementation can be found on the TarsosDSP github page. That is all.


~ Dan Ellis' Robust Landmark-Based Audio Fingerprinting - With Octave

This blog post documents how to get the Matlab implementation by Dan Ellis of Avery Wangs Industrial-Strength Audio Search Algorithm running with GNU Octave on Ubuntu (and similar Linux distributions).

The Dan Ellis implementation is nicely documented here: Robust Landmark-Based Audio Fingerprinting . To download, get info about and decode mp3’s some external binaries are needed:

```bash\ #install octave if needed\ sudo apt-get install octave3.2\ #Install the required dependencies for the script\ sudo apt-get install mp3info curl

mpg123 is not present as a package, install from source:\

wget http://www.mpg123.de/download/mpg123-1.13.5.tar.bz2\ tar xvvf mpg123-1.13.5.tar.bz2\ cd mpg123-1.13.5/\ ./configure\ make\ sudo make install\ ```

In mp3read.m the following code was changed (line 111 and 112):

```matlab\ mpg123 = ‘mpg123’; % was fullfile(path,[‘mpg123.’,ext]);\ mp3info = ‘mp3info’; % was fullfile(path,[‘mp3info.’,ext]);\ ```

Then, the demo program runs flawlessly when executing octave -q demo_fingerprint.m.

Running the demo with the original code with GNU Octave, version 3.2.3 takes 152 seconds on a PC with a Q9650 @ 3GHz processor. A small tweak can make it run almost 8 times faster. When working with larger data sets (10k audio files) this makes a big difference. I do not know why but storing a hash in the large hash table was really slow (0.5s per hash, with 900 hashes per song…). Caching the hashes and adding them all at once makes it faster (at least in Octave, YMMV). The optimized version of “record_hashes.m”:[record_hashes.m.txt] can be found attached. With this alteration the same demo ran in 20s. When caching the data locally the difference is 11.5s to 141s or 12 times faster. The code with all the changes can be found here: “Robust Landmark-Based Audio Fingerprinting - optimized for Octave 3.2”:[fingerprint_fast.zip]. Please note again that the implementation is done by Dan Ellis (2009) ( available on Robust Landmark-Based Audio Fingerprinting) and I did only some small tweaks.


~ Harmony and Variation in Music Information Retrieval

Logo Universiteit UtrechtThe 29th of February 2012 there was a symposium on Music Information Retreival in Utrecht. It was organized on the occasion of Bas de Haas’ PhD defense. The title of the study day was Harmony and variation in music information retrieval.

During the talk by Xavier Serra rasikas.org was mentioned a forum with discussions about Carnatic Music. Since I could find a couple of discussions about pitch use on that forum I plugged Tarsos there to see if I could gather some feedback.


~ Echo or Delay Audio Effect in Java With TarsosDSP

The DSP library for Taros, aptly named TarsosDSP, now includes an implementation of an audio echo effect. An echo effect is very simple to implement digitally and can serve as a good example of a DSP operation.

"![Echo or delay effect in Java](/files/attachments/398/echo_or_delay_effect.png "Echo or delay effect in Java")":\[Delay.jar\]

The implementation of the effect can be seen below. As can be seen, to achieve an echo one simply needs to mix the current sample i with a delayed sample present in echoBuffer with a certain decay factor. The length of the buffer and the decay are the defining parameters for the sound of the echo. To fill the echo buffer the current sample is stored (line 4). Looping through the echo buffer is done by incrementing the position pointer and resetting it at the correct time (lines 6-9).

```java\ //output is the input added with the decayed echo\ audioFloatBuffer[i] = audioFloatBuffer[i] + echoBuffer[position] * decay;\ //store the sample in the buffer;\ echoBuffer[position] = audioFloatBuffer[i];\ //increment the echo buffer position\ position;\ //loop in the echo buffer\ if(position == echoBuffer.length)\ position = 0;\ ```

To test the application, download and execute the “Delay.jar”:[Delay.jar] file and start singing in a microphone.

The source code of the Java implementation can be found on the TarsosDSP github page.


~ Spectrogram in Java with TarsosDSP

This is post presents a better version of the spectrogram implementation. Now it is included as an example in TarsosDSP, a small java audio processing library. The application show a live spectrogram, calculated using an FFT and the detected fundamental frequency (in red).

Spectrogram and pitch detection in Java

To test the application, download and execute the “Spectrogram.jar”:[Spectrogram.jar] file and start singing in a microphone.

There is also a command line interface, the following command shows the spectrum for in.wav:

\ java -jar Spectrogram.jar in.wav\

The source code of the Java implementation can be found on the TarsosDSP github page.


~ Démonstration de Tarsos

Nous avons creé une video pour expliquer des possibilités de Tarsos, et maintenant en français.


~ Audio Time Stretching - Implementation in Pure Java Using WSOLA

The DSP library for Taros, aptly named TarsosDSP, now includes an implementation of a time stretching algorithm. The goal of time stretching is to change the duration of a piece of audio without affecting the pitch. The algorithm implemented is described in An Overlap-add Technique Based On Waveform Similarity (WSOLA) for High Quality Time-Scale Modification of Speech.

Time Stretching (WSOLA) in Java

To test the application, download and execute the “WSOLA jar”:[TimeStretch.jar] file and load an audio file. For the moment only 44.1kHz mono wav is allowed. To get started you can try “this piece of audio”:[08._Ladrang_Kandamanyura_10s-20s.wav].

There is also a command line interface, the following command doubles the speed of in.wav:

\ java -jar TimeStretch.jar in.wav out.wav 2.0\

 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     

----------------------------------------------------
Name:
    TarsosDSP Time stretch utility.
----------------------------------------------------
Synopsis:
    java -jar TimeStretch.jar source.wav target.wav factor
----------------------------------------------------
Description:
    Change the play back speed of audio without changing the pitch.

        source.wav  A readable, mono wav file.
        target.wav  Target location for the time stretched file.
        factor      Time stretching factor: 2.0 means double the length, 0.5 half. 1.0 is no change.

The source code of the Java implementation of WSOLA can be found on the TarsosDSP github page.


Previous blog posts

03-02-2012 ~ Tarsos CLI: Detect Pitch

17-01-2012 ~ A Robust Audio Fingerprinter Based on Pitch Class Histograms - Applications for Ethnic Music Archives

21-12-2011 ~ Pitch, Pitch Interval, and Pitch Ratio Representation

15-12-2011 ~ TarsosDSP sample application: Utter Asterisk

12-12-2011 ~ TarsosDSP used in jAM - Java Automatic Music Transcription

12-12-2011 ~ Kinderuniversiteit - Muziek onder de microscoop!

07-12-2011 ~ How To: Generate an Audio Fingerprinting Data Set With Sox Audio Effects

06-12-2011 ~ The Power of the Pentatonic Scale

02-12-2011 ~ Software for Music Analysis

09-11-2011 ~ Robust Audio Fingerprinting with Tarsos and Pitch Class Histograms

09-11-2011 ~ PeachNote Piano demo at ISMIR 2011

25-10-2011 ~ Tarsos at 'Study Day: Tuning and Temperament - Insitute of Musical Research, London'

25-10-2011 ~ Tarsos presentation at 'ISMIR 2011'

18-10-2011 ~ Tarsos at 'WASPAA 2011'

04-10-2011 ~ Bruikbare software voor muziekanalyse

27-09-2011 ~ Dual-Tone Multi-Frequency (DTMF) Decoding with the Goertzel Algorithm in Java

26-09-2011 ~ PeachNote Piano at the ISMIR 2011 demo session

21-09-2011 ~ Simplify Collaboration on a LaTeX Documents with Dropbox and a Build Server

21-09-2011 ~ The Pidato Experiment: Vibrato on a Digital Piano Using an Arduino

20-09-2011 ~ Rendering MIDI Using Arbitrary Tone Scales - Revisited