research

Research in sound and music computing, with interests in sound analysis and retrieval, audio source separation, environmental sound recognition, interactive systems and participatory music.

2023

ROMA, G., (2023) Agent-based Music Live Coding: Sonic adventures in 2D. Organised Sound 28 (2), pp. 231-240.

2022

TREMBLAY, P.A., ROMA, G., GREEN, O., (2022). The Fluid Corpus Manipulation Toolkit: enabling programmatic data mining as musicking. Computer Music Journal 45 (2), pp. 9-23.
GREEN, O., TREMBLAY, P.A., MOORE, T., BRADBURY, J., HART, J., HARKER, A. and ROMA, G., (2022). Architecture about Dancing: Creating a Cross Environment, Cross Domain Framework for Creative Coding Musicians. Proceedings of the 33rd Annual Workshop of the Psychology of Programming Interest Group (PPIG).
ROMA, G., (2022). Comparing approaches for new AudioWorklets. Proceedings of the 7th Web Audio Conference (WAC).

2021

ROMA, G., XAMBÓ, A., GREEN, O. and TREMBLAY, P.A., (2021). A General Framework for Visualization of Sound Collections in Musical Interfaces. Applied Sciences, 2021, 11 (24) (online)
TREMBLAY, P.A., ROMA, G., GREEN, O., (2021). Digging it: Programmatic Data Mining as Musicking. Proceedings of the 2021 International Computer Music Conference (ICMC).
ROMA, G., GREEN, O. and TREMBLAY, P.A. (2021) Graph-based audio looping and granulation. Proceedings of the 24th International Conference on Digital Audio Effects (DAFx-21).

2020

ROMA, G., GREEN, O. and TREMBLAY, P.A. (2020) Audio morphing using matrix decomposition and optimal transport. Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20).
XAMBÓ, A., and ROMA, G. (2020) Performing Audiences: Composition Strategies for Network Music using Mobile Phones. Proceedings of the 20th International Conference on New Interfaces for Musical Expression (NIME).

2019

ROMA, G., GREEN, O. and TREMBLAY, P.A. (2019). Time scale modification of audio using non-negative matrix factorization. In: Proceedings of the 22nd International Conference on Digital Audio Effects (DAFX).
TREMBLAY, P.A., GREEN, O. and ROMA, G., (2019). From Collections to Corpora: Exploring Sounds through Fluid Decomposition. In: Proceedings of the ICMC-2019.
ROMA, G., GREEN, O. and TREMBLAY, P.A. (2019). Adaptive Mapping of Sound Collections for Data-driven Musical Interfaces. In: Proceedings of the Conference on New Interfaces for Musical Expression (NIME).

2018

ROMA, G., GREEN, O. and TREMBLAY, P.A. (2018). Stationary / transient separation using convolutional autoencoders. In: Proceedings of the 21st International Conference on Digital Audio Effects (DAFX).
ROMA, G., GREEN, O. and TREMBLAY, P.A. (2018). Improving single-network single-channel separation of musical audio with convolutional layers. In: Proceedings of the 14th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA).
XAMBÓ, A., PAUWELS, J., ROMA, G., BARTHET, M. and FAZEKAS G. (2018) Jam with Jamendo: Querying a Large Music Collection by Chords from a Learner’s Perspective. In: Proceedings of the 13th International Audio Mostly Conference.
ROMA, G., XAMBÓ, A., GREEN, O. and TREMBLAY, P.A., (2018). A Javascript Library for Flexible Visualization of Audio Descriptors. In: Proceedings of the 4th Web Audio Conference (WAC).
XAMBÓ, A., PAUWELS, J., ROMA, G., BARTHET, M. and FAZEKAS G., (2018) Exploring Real-time Visualisations to Support Chord Learning with a Large Music Collection. In: Proceedings of the 4th Web Audio Conference (WAC).
ROMA, G., XAMBÓ, A. and FREEMAN, J. (2018). User-independent Accelerometer Gesture Recognition for Participatory Mobile Music. Journal of the Audio Engineering Society (JAES) 66 (6), pp. 430-438.
XAMBÓ, A., ROMA, G., SHAH, P., TSUCHIYA, T., FREEMAN, J. and MAGERKO, B. (2018). Turn-taking and online chatting in Co-located and remote collaborative music live coding. Journal of the Audio Engineering Society (JAES) 66 (4), pp. 253-266.

2017

ROMA, G., XAMBÓ, A. and FREEMAN, J. (2017). Handwaving: Gesture Recognition for Participatory Mobile Music. In: Proceedings of the 12th International Audio Mostly Conference.
GRAIS, E., ROMA, G., SIMPSON, A. JR. and PLUMBLEY, M. (2017) Discriminative enhancement for single channel audio source separation using deep neural networks. In: Proceedings of the 13th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA).
SIMPSON, A. JR, ROMA, G., GRAIS, E. and PLUMBLEY, M. (2017) Psychophysical evaluation of audio source separation methods. In: Proceedings of the 13th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA).
ROMA, G., XAMBÓ, A. and FREEMAN, J. (2017). Loop-aware Audio Recording for the Web. In: Proceedings of the 3rd Web Audio Conference (WAC).
ROMA, G., HERRERA, P. and NOGUEIRA W. (2017) Environmental sound recognition using short-time feature aggregation. Journal of Intelligent Information Systems (JIIS), 2017 (online).
GRAIS, E., ROMA, G., SIMPSON, A. and PLUMBLEY, M. (2017). Two-stage single-channel audio source separation using deep neural networks. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) 25 (9), pp. 1773-1783.
XAMBÓ, A., ROMA, G., SHAH, P., FREEMAN, J. and MAGERKO, B. (2017) Computational Challenges of Co-creation in Collaborative Music Live Coding: An Outline. In: 2017 Co-Creation Workshop at the International Conference on Computational Creativity (ICCC).

2016

ROMA, G., GRAIS, E.M., SIMPSON, A. JR. and PLUMBLEY, M.D. (2016). Music remixing and upmixing using source separation. 2nd AES Workshop on Intelligent Music Production (WIMP).
ROMA, G. (2016). Colliding: a SuperCollider environment for synthesis-oriented live coding. In: Proceedings of the 2nd International Conference on Live Interfaces (ICLI).
ROMA, G., SIMPSON, A. JR., GRAIS, E. and PLUMBLEY, M. (2016). Remixing musical audio on the web using source separation. In: Proceedings of the 2nd Web Audio Conference (WAC).

2015

SIMPSON, A. JR., ROMA, G. and PLUMBLEY, M. (2015). Deep Karaoke: Extracting Vocals from Musical Mixtures Using a Convolutional Deep Neural Network. In: Proceedings of the 12th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA).
ROMA, G. Algorithms and representations for supporting online music creation with large-scale audio databases. PhD thesis, Universitat Pompeu Fabra, 2015.
ROMA, G., & SERRA X. (2015). Querying Freesound with a microphone. In: Proceedings of the 1st Web Audio Conference (WAC).
ROMA, G., & SERRA X. (2015). Music performance by discovering community loops. In: Proceedings of the 1st Web Audio Conference (WAC).

2014

XAMBÓ, A., ROMA G., LANEY R., DOBBYN C. and JORDÀ S. (2014). SoundXY4: Supporting Tabletop Collaboration and Awareness with Ambisonics Spatialisation. In: Proceedings of the 14th International Conference on New Interfaces for Musical Expression (NIME).
BOGDANOV, D., WACK N., GÓMEZ E., GULATI S., HERRERA P., MAYOR O., et al. (2014). ESSENTIA: an open source library for audio analysis. ACM SIGMM Records. 6 (1), (online)

2013

BOGDANOV, D., WACK N., GÓMEZ E., GULATI S., HERRERA P., MAYOR O., et al. (2013). ESSENTIA: an Open-Source Library for Sound and Music Analysis. In: Proceedings of the ACM International Conference on Multimedia.
FONT, F., ROMA G. and SERRA X. (2013). Freesound Technical Demo. In: Proceedings of the ACM International Conference on Multimedia (MM).
ROMA, G., NOGUEIRA W. and HERRERA P. (2013). Recurrence Quantification Analysis Features for Environmental Sound Recognition. In: Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
MAYOR O., et al. (2013). ESSENTIA: an Audio Analysis Library for Music Information Retrieval. In: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR).
ROMA, G., and HERRERA P. (2013). Representing Music as Work in Progress. In: STEYN, J., ed. Structuring Music through Markup Language: Designs and Architectures. IGI Global, 2013, pp. 119-134.

2012

FONT, F., ROMA G., HERRERA P. and SERRA X. (2012). Characterization of the Freesound Online Community. In: Proceedings of the 3rd International Workshop on Cognitive Information Processing.
ROMA, G., XAMBÓ A., HERRERA P. and LANEY R. (2012). Factors in human recognition of timbre lexicons generated by data clustering. In: Proceedings of the Sound and Music Computing Conference (SMC).
ROMA, G., ZANIN M., HERRERA P., TORAL S. L., FONT F. and SERRA X. (2012). Small world networks and creativity in audio clip sharing. International Journal of Social Network Mining (IJSNM), 2012, 1(1), pp. 112-127.

2011

JANER, J., ROMA G. and KERSTEN, S. (2011). Authoring augmented soundscapes with user-contributed content. In: Proceedings of the ISMAR Workshop on Authoring Solutions for Augmented Reality.
AKKERMANS, V., FONT F., FUNOLLET J., DE JONG B., ROMA G., TOGIAS S., et al. (2011). Freesound 2: An Improved Platform for Sharing Audio Clips. In: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR).
JANER, J., KERSTEN S., SCHIROSA M. and ROMA G. (2011). An online platform for interactive soundscapes with user-contributed content. In: Proceedings of the AES 41st International Conference on Audio for Games.

2010

ROMA, G. and HERRERA P. (2010). Community structure in audio clip sharing. In: Proceedings of the International Conference on Intelligent Networking and Collaborative Systems (INCoS).
ROMA, G. and HERRERA P. (2010). Graph grammar representation for collaborative sample-based music creation. In: Proceedings of the 5th Audio Mostly Conference.
SCHIROSA, M., JANER J., KERSTEN S. and ROMA G. (2010). A system for soundscape generation, composition and streaming. XVII CIM-Colloquium of Musical Informatics.
ROMA, G., JANER J., KERSTEN S., SCHIROSA M. and HERRERA P. (2010). Content-based retrieval from unstructured databases using an ecological acoustics taxonomy. In: Proceedings of the International Community for the Auditory Display (ICAD).
ROMA, G., JANER J., KERSTEN S., SCHIROSA M., HERRERA P. and SERRA X. (2010). Ecological acoustics perspective for content-based retrieval of environmental sounds. EURASIP Journal on Audio, Speech, and Music Processing, 2010 (online).

2009

JANER, J., FINNEY N., ROMA G., KERSTEN S. and SERRA X. (2009). Supporting Soundscape Design in Virtual Environments with Content-based Audio Retrieval. Journal of Virtual Worlds Research, 2009, 2(3) (online).
ROMA, G., HERRERA P. and SERRA X. (2009). Freesound Radio: supporting music creation by exploration of a sound database. Computational Creativity Support Workshop CHI09.
JANER, J., HARO M., ROMA G., FUJISHIMA T. and KOJIMA N. (2009). Sound Object Classification for Symbolic Audio Mosaicing: A Proof-of-Concept. In: Proceedings of the Sound and Music Computing Conference (SMC).

2008

ROMA, G. and XAMBÓ A. (2008). A tabletop waveform editor for live performance. In: Proceedings of the 8th International Conference on New Interfaces for Musical Expression (NIME).
ROMA, G. Freesound Radio: supporting collective organization of sounds. Master's thesis, Universitat Pompeu Fabra, 2008.