Devel/Chord Extraction TODO's
From Clam
Stand alone application (GSoC Pawel)
Big project milestones
- Having an application that seeks along the song.
- Launching the analysis whenever you change the song
- The playback controlling the time displayed on the views
- Additional features/views
- ...
Tasks
- Preparation
- Implement seeking
- Simple hack
- Take a look at the FileReader classes
- Write a processing which loads whole files using internal MonoAudioFileReader
- Add internal MonoAudioFileReader error handling
- Add position variable to it
- Add output of current position to an OutControl
- Add a position InControl to it
- Add support for the loop option
- Write a multichannel version
- Final version (?)
- TDD a seek control on the mono audio file reader
- Extend it to the Multi channel file reader
- Build a Control sender with feedback (to build a progress slider)
- Simple hack
- Widget library
- splitting monitors and widgets in NetworkEditor/src/monitors
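The "simple hack" steps above (load the whole file, keep a position, accept a seek, support looping) could be sketched roughly as follows. This is a CLAM-independent sketch: `LoopingBufferReader`, `seek`, `position` and `read` are hypothetical names, the samples are assumed to be already decoded into memory, and the real MonoAudioFileReader and control classes are not used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reader that has loaded a whole mono file.
class LoopingBufferReader
{
public:
    LoopingBufferReader(std::vector<float> samples, bool loop)
        : _samples(std::move(samples)), _position(0), _loop(loop) {}

    // What a position InControl would trigger: jump to a sample index,
    // clamped to the end of the file.
    void seek(std::size_t position)
    {
        _position = std::min(position, _samples.size());
    }

    // What a position OutControl would publish after each read.
    std::size_t position() const { return _position; }

    // Fills 'out' with up to out.size() samples and returns how many were
    // taken from the file. Any remaining tail is zero-padded.
    std::size_t read(std::vector<float> & out)
    {
        std::size_t written = 0;
        while (written < out.size())
        {
            if (_position == _samples.size())
            {
                if (!_loop || _samples.empty()) break;
                _position = 0; // wrap around when the loop option is set
            }
            out[written++] = _samples[_position++];
        }
        assert(written <= out.size());
        std::fill(out.begin() + static_cast<std::ptrdiff_t>(written),
                  out.end(), 0.0f);
        return written;
    }

private:
    std::vector<float> _samples;
    std::size_t _position;
    bool _loop;
};
```

A progress slider would then read `position()` after each `read` and call `seek` when the user drags it.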
Realtime segmentation (GSoC 2007 Roman)
- Improve segmentation in ChordExtractor
- Extract the code responsible for segmentation and make a separate class out of it
- Set up a framework for comparisons between different implementations of segmentation algorithms
- Decide on a method for allowing use of different chord extraction algorithms
- A chord similarity based implementation
- Removing small segments
- Joining segments with identical chords
- Realtime segmentation in chord extraction
- Add a time output control to AudioFileReaders
- Add a time input control to TonalAnalysis
- Use this time input to inform ChordExtractor of the time position (if ever we use AudioFileReaders with seeking)
- Internal time in TonalAnalysis, reset to 0 when the network starts
- Add a new port to TonalAnalysis for the segmentation from ChordExtractor
- Some changes to the Segmentation classes to allow use through a port
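The two post-processing steps listed above (removing small segments and joining segments with identical chords) might look like the sketch below. `Segment`, `joinIdentical` and `removeSmall` are hypothetical names, not the CLAM Segmentation classes, and the segments are assumed to be contiguous and sorted by time.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical segment: a time span (in seconds) with a chord label.
struct Segment
{
    double begin;
    double end;
    std::string chord;
};

// Joins consecutive segments that carry the same chord label.
std::vector<Segment> joinIdentical(const std::vector<Segment> & in)
{
    std::vector<Segment> out;
    for (const Segment & s : in)
    {
        if (!out.empty() && out.back().chord == s.chord)
            out.back().end = s.end; // extend the previous segment
        else
            out.push_back(s);
    }
    return out;
}

// Removes segments shorter than minDuration by letting the previous
// segment absorb their time span, then joins identical neighbours.
// A short segment at the very beginning is kept, since there is
// nothing before it to absorb it.
std::vector<Segment> removeSmall(const std::vector<Segment> & in,
                                 double minDuration)
{
    assert(minDuration >= 0.0);
    std::vector<Segment> out;
    for (const Segment & s : in)
    {
        const bool tooShort = (s.end - s.begin) < minDuration;
        if (tooShort && !out.empty())
            out.back().end = s.end;
        else
            out.push_back(s);
    }
    return joinIdentical(out);
}
```

Absorbing a short segment into its predecessor is only one possible policy; splitting it between both neighbours or picking the more similar chord would be alternatives to compare in the evaluation framework.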
Some other smaller/different/non-GSoC tasks:
- Add an in control to AudioFileReaders to enable setting the time position within the file
- Enhance the evaluation framework for comparisons between different implementations of chord extraction algorithms
- write unit tests
- allow different attributes as Segmentation (not only Chords_Harte)
- allow different hop sizes
- generally there's room for improvement/rethinking:
- http://www.music-ir.org/mirex2007/index.php?title=Audio_Chord_Detection
- http://en.wikipedia.org/wiki/Recall_(information_retrieval)
- so a clear path needs to be decided upon...
Some short term goals for getting accustomed to the code:
- Make the parameters of TonalAnalysis configurable in NetworkEditor, one by one:
- tunningEnabled
- peakWindowingEnabled
- hopRatio
- filter (PCPSmoother)
- Change BinsPerOctave into BinsPerSemitone...
- ...first making BinsPerOctave configurable, to check if everything works
- Make the algorithm "change the tuning" when starting from a minimum frequency other than 98 Hz
- Quick (hopefully) and dirty (hopefully not) solution (hopefully): change the reference tuning while reconfiguring TonalAnalysis
- Find the cause of the "delete _implementation; _implementation = new..." crash
- Fix the .clamnetwork files!
- Make a precomputed SparseKernel for the default configuration
- Separate the inner workings of TonalAnalysis into Processing Composites
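The BinsPerOctave/BinsPerSemitone and minimum frequency items above can be illustrated with a little arithmetic. The sketch assumes twelve-tone equal temperament with A4 = 440 Hz, under which 98 Hz is almost exactly G2; `binsPerOctaveFrom` and `semitoneOffset` are hypothetical helpers, not TonalAnalysis code, and how the real implementation should use the offset is left open.

```cpp
#include <cassert>
#include <cmath>

// Semitones per octave in twelve-tone equal temperament.
const unsigned semitonesPerOctave = 12;

// A BinsPerSemitone setting implies this many bins per octave.
unsigned binsPerOctaveFrom(unsigned binsPerSemitone)
{
    assert(binsPerSemitone > 0);
    return semitonesPerOctave * binsPerSemitone;
}

// Offset, in semitones within [-0.5, 0.5], of minFrequency from the
// nearest equal-tempered pitch for the given A4 reference (assumed 440 Hz).
double semitoneOffset(double minFrequency, double referenceA4 = 440.0)
{
    assert(minFrequency > 0.0 && referenceA4 > 0.0);
    const double semitonesFromA4 =
        semitonesPerOctave * std::log2(minFrequency / referenceA4);
    return semitonesFromA4 - std::round(semitonesFromA4);
}
```

Under these assumptions 98 Hz gives an offset close to zero, while 100 Hz sits about 0.35 semitones above G2, which is the kind of shift the reference tuning would have to compensate for.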
Old Done Tasks
Exercise in using tests with CppUnit:
- Adapt InstantTunningEstimator's tests to the changes in the class
- Change assertFoundCenterIs to use the vector<pair> interface
- Divide any position (positions and expectedCenter) by 3 and pass 1 as the second constructor parameter
- Adapt the last two tests to use the vector<pair> interface
- Remove the useless doIt
- Change any occurrences of _binsPerSemitone within the class to a 1
- Adapt user interface and user code
- Change last two tests to use a special helper function
Detection algorithm enhancements
Several algorithm enhancements are to be considered:
- Preprocessing
- [done] Compute instant tuning by phasor addition on chroma peaks mapped to a semitone
- [done] Limit the time scope of the global tuning computation (done but improvements needed)
- Improve the limited scope tuning
- Processing
- Find faster and more precise algorithms than the current one
- Emilia's algorithm (peak detect before folding)
- Wavelet based
- Self-Correlation based
- Postprocessing
- [done] Consider the None chord (all pitches) so that we can detect non chord segments and use it as reference for pitchness.
- Symbolic analysis: Instead of correlation, analyze the PCP content using heuristic reasoning (Harte did plain filtering and some )
- Double scope for analysis: Windows that are too large blur transitions, but small ones fail to detect arpeggio-based chords. We could choose depending on the number of high pitches on the PCP.
- Onset alignment: Use realtime onset detection (aubio?) to 'quantize' the chord segment boundaries.
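The phasor addition mentioned in the preprocessing items can be sketched as follows: each chroma peak is folded into a single semitone, turned into a phasor whose angle is its fractional position and whose length is its magnitude, and the angle of the sum gives the tuning offset. This is an illustration only, not the InstantTunningEstimator code; `TuningEstimate` and `estimateTuning` are hypothetical names, and peaks are assumed to be (position in semitones, magnitude) pairs.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <utility>
#include <vector>

// Hypothetical result: offset in semitones within (-0.5, 0.5] and a
// strength in [0, 1] telling how well the peaks agree on that offset.
struct TuningEstimate
{
    double offset;
    double strength;
};

TuningEstimate estimateTuning(
    const std::vector<std::pair<double, double>> & peaks)
{
    const double twoPi = 2.0 * std::acos(-1.0);
    std::complex<double> sum(0.0, 0.0);
    double totalMagnitude = 0.0;
    for (const auto & peak : peaks)
    {
        assert(peak.second >= 0.0);
        // Fold the peak position into one semitone: keep the fraction.
        const double fraction = peak.first - std::floor(peak.first);
        // One full turn of the phasor corresponds to one semitone.
        sum += std::polar(peak.second, twoPi * fraction);
        totalMagnitude += peak.second;
    }
    if (totalMagnitude == 0.0) return {0.0, 0.0};
    return {std::arg(sum) / twoPi, std::abs(sum) / totalMagnitude};
}
```

Because the angle wraps around, a peak at 3.9 semitones counts as 0.1 semitones flat rather than 0.9 sharp, which is the main reason to add phasors instead of averaging the fractions directly.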
Helper information
Enrich the algorithm output so that the user can still benefit when the algorithm is imperfect or the music does not use recognized chords (fifths, rare chords...)
- Diffuse guessing: Minimize the impact of false positives on the user by computing a confidence value for each guess.
- Keeping several candidates so the user can see that there is more than one option.
- Rectified guess: Do a first realtime guess and correct it later if needed as the song goes on.
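One possible confidence value for a guess is the relative margin between the best and the second best chord score. This definition is an assumption made for illustration (the page does not fix one); `ChordScore`, `Guess` and `bestGuess` are hypothetical names, and the scores are assumed to be non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical (chord name, score) pair; scores assumed non-negative.
using ChordScore = std::pair<std::string, double>;

struct Guess
{
    std::string chord;
    double confidence; // 0: tie with the runner-up, 1: no competitor
};

Guess bestGuess(std::vector<ChordScore> scores)
{
    assert(!scores.empty());
    // Highest score first.
    std::sort(scores.begin(), scores.end(),
              [](const ChordScore & a, const ChordScore & b)
              { return a.second > b.second; });
    const double best = scores[0].second;
    const double second = scores.size() > 1 ? scores[1].second : 0.0;
    // Relative margin between the winner and the runner-up.
    const double confidence = best > 0.0 ? (best - second) / best : 0.0;
    return {scores[0].first, confidence};
}
```

A rectified guess could reuse the same value: keep showing the realtime guess, and replace it later only when a new guess for the same span has a clearly higher confidence.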
Visualization
In parallel with enhancing the algorithm for realtime use, some views must be developed. Some view ideas:
- [done] KeySpace (Emilia and Jordi's)
- Augmented KeySpace (not just major and minor)
- [done] Tonnetz (pcp)
- Add chord figures to Tonnetz
- Chord torus (map pcp into the tonal torus space)
- Vectors in chord torus (needed to disambiguate dim chords)
- [done] Chord ranking: all chords displayed as sorted probability bars
- Highlight or filter candidates on chord ranking
- Chord candidates: just the ones before the first strong decay
- Realtime segment construction:
- Instant chord segment: Display segment based on the best one on each instant.
- Delayed segment: don't display a segment until we have enough information on the future to make a post processing
- Guessed segments: Until sure, display the guess
- Instrument fingering (playback suggest
- Integrate a tuner
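The "chord candidates" idea above (only the chords before the first strong decay) could be approximated by cutting the ranking where a score falls below a fraction of the previous one. The sketch assumes scores already sorted in descending order and non-negative; `countCandidates` and the `decayRatio` threshold are hypothetical, and what counts as a "strong" decay would need tuning on real data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns how many leading entries of a descending score ranking come
// before the first strong decay, i.e. before the first score that is
// lower than decayRatio times its predecessor.
std::size_t countCandidates(const std::vector<double> & sortedScores,
                            double decayRatio)
{
    assert(decayRatio > 0.0 && decayRatio < 1.0);
    if (sortedScores.empty()) return 0;
    std::size_t count = 1; // the best chord is always a candidate
    while (count < sortedScores.size()
           && sortedScores[count] >= decayRatio * sortedScores[count - 1])
    {
        ++count;
    }
    return count;
}
```

The chord ranking view could then highlight the first `countCandidates(...)` bars and dim the rest.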
