Knowing which sounds a simulated vocal model can produce is not trivial. Mapping this out is valuable for applications that exploit the extended capabilities of the voice, e.g. singing and voice acting. In previous work, submitted to the Genetic and Evolutionary Computation Conference (GECCO) at the end of January, we developed, implemented and validated a method to explore and steer the expressive capabilities of a state-of-the-art articulatory vocal model using recent Quality-Diversity optimization combined with multimodal embeddings. Now that a basic implementation of the method is working, we intend to: 1) test and validate the current method more extensively; 2) calibrate its hyperparameters; 3) improve various aspects of the current method; 4) extend it with new functionality; and 5) apply it to questions in vocal science, music and the performing arts.
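For concreteness, the following is a minimal, self-contained sketch of the kind of Quality-Diversity loop involved (here a basic MAP-Elites variant). The `synthesize`, `embed` and `descriptor` functions are toy placeholders standing in for the articulatory vocal model, the multimodal embedding and its projection to a behaviour descriptor; they are illustrative assumptions, not the actual implementation.

```python
# Minimal MAP-Elites sketch. All domain-specific functions are placeholders
# (assumptions for illustration), not the project's actual vocal model or embedding.
import numpy as np

rng = np.random.default_rng(0)

N_PARAMS = 10        # dimensionality of the (hypothetical) articulatory control vector
GRID = (20, 20)      # resolution of the behaviour-descriptor grid
ITERATIONS = 5000

def synthesize(params):
    """Placeholder for the vocal model: maps control parameters to 'audio'."""
    t = np.linspace(0, 1, 256)
    return np.sin(2 * np.pi * (1 + 5 * params[0]) * t) * params[1]

def embed(audio):
    """Placeholder for a multimodal audio embedding (e.g. an audio-text model)."""
    spectrum = np.abs(np.fft.rfft(audio))
    return spectrum / (np.linalg.norm(spectrum) + 1e-9)

def descriptor(embedding):
    """Project the embedding to a low-dimensional behaviour descriptor in [0, 1]^2."""
    return np.clip([embedding[:4].sum(), embedding[4:8].sum()], 0, 1)

def objective(embedding, target):
    """Steering signal: similarity of the sound's embedding to a target embedding."""
    return float(embedding @ target)

# Archive: one elite (params, fitness) per descriptor cell.
archive = {}
target = embed(synthesize(rng.random(N_PARAMS)))  # stand-in steering target

def insert(params):
    emb = embed(synthesize(params))
    cell = tuple((descriptor(emb) * (np.array(GRID) - 1)).astype(int))
    fit = objective(emb, target)
    if cell not in archive or fit > archive[cell][1]:
        archive[cell] = (params, fit)

# Initialise with random solutions, then repeatedly mutate existing elites.
for _ in range(100):
    insert(rng.random(N_PARAMS))
for _ in range(ITERATIONS):
    parent, _ = archive[list(archive)[rng.integers(len(archive))]]
    insert(np.clip(parent + rng.normal(0, 0.1, N_PARAMS), 0, 1))

print(f"Coverage: {len(archive)} / {GRID[0] * GRID[1]} cells filled")
```

In the actual method the descriptor and objective would be derived from the vocal model's audio passed through the multimodal embedding, so the filled archive maps out which regions of the embedding space the model can reach and which parameter settings reach them.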
This research is meant to result in several conference papers, the first of which is aimed at the AI Music Creativity Conference (AIMC), with a submission deadline of April 11. Eventually we also envisage a journal paper that ties the different aspects of this research together before it is incorporated into the dissertation of co-investigator JG.