Audio Augmented Reality to Enhance Visual Art Experiences
- 예지 이
- October 29, 2024
- 6 min read

Augmenting visual art in art galleries can be an effective Audio Augmented Reality (AAR) application for indoor
exploration. In the current study, eight paintings from four genres were augmented with audio through their
sonification. Basic Audio was generated using a sonification algorithm by identifying the major colors of the
paintings, and Enhanced Audio was generated by a musician enhancing the Basic Audio; these were presented
with the paintings to compare against No Audio. Twenty-six participants viewed each painting in all three
conditions; eye gaze metrics and qualitative data were collected. Results showed that Enhanced Audio led to
significantly greater engagement and positive sentiments, compared to Basic Audio. Thematic analysis showed
semantic and syntactic relationships of the audio with the paintings, and a tendency to guide users’ gaze over
time. Findings from this study can guide future AAR developments to improve auditory display designs to
enhance visual experiences.
All Python code, basic audio files, enhanced audio files, and paintings are available here: Open Science Framework link.
1. Basic audio
The Basic Audio was generated by a sonification algorithm. The algorithm was created in Jython Music (Manaris et al., 2016) in prior work that sought to create and validate it (Nadri et al., 2022). That work followed a systematic approach involving interviews with visual art, sonification, and psychoacoustic experts. The experts identified visual parameters such as structure, color, and lightness, auditory parameters such as tempo, pitch, amplitude, rhythm, and timbre, and contextual parameters such as the artist’s background and time period as important considerations when developing a sonification algorithm.
Based on the interviews, generating audio from the most visually salient regions of the artwork was recognized as a novel way to recreate naturalistic gaze behavior in the auditory modality (Nadri et al., 2022). This approach could be considered musification, where visual data is auralized to create a musical performance (Baratè et al., 2023; Visi et al., 2014), designed to provide viewers with a multimodal immersive experience that engages them to a greater extent.
The first step in creating the Basic Audio from the paintings was a visual parameter analysis, carried out at https://www.geotests.net/couleurs/v2/. Image files of the paintings were uploaded to the website, and default settings were used with two exceptions: “Working color space” was changed to “HSL”, and the “Threshold of color difference” was adjusted to produce a .csv file with no more than 60 rows of unique hue values. The .csv file contained the color palette information of the painting. Next, in a new column in the .csv file, the percentage of each hue was calculated by dividing the ‘number of pixels’ in each row by the sum of the ‘number of pixels’. In a separate column, the cumulative sum of the hue percentages was then calculated. Finally, the top hues that accumulated to at least 80% of the number of pixels in the painting were selected to be fed to the sonification algorithm, thus ensuring that at least 80% of the area of the painting was sonified. This approach could be considered similar to the sonification method in Photone (Rönnberg and Löwgren, 2018), but instead of one main hue, the current method identified multiple main hues present in the paintings.
In the current work, the algorithm was adapted from Nadri et al. (2022). The top hues, which comprised at least 80% of the painting area, were mapped to pitches, or notes. Additionally, two lists of chords were generated over four octaves each, with the lower note of a chord being the same note as the one for the corresponding hue. In a loop, a random function generated numbers that were converted to percentages; each value was compared with the hue percentages to identify the closest corresponding hue and play the corresponding note. If the percentage value was the same for two hues, the corresponding note would be more likely to play (Nadri et al., 2022). All of the Python code for producing the Basic Audio files can be found here: (https://osf.io/698sq/?view_only=1382088cb7e94b05a6f8f5536fb66775). In this manner, it was anticipated that the audio would reflect the way a viewer might view the painting, by focusing their gaze on the most salient objects and colors.
Other musical parameters consisted of the Aeolian scale, a pitch range of 12 to 60 in MIDI note numbers, a note duration range of 0.75 to 2.0 seconds, and a volume range of 30 to 40 in MIDI velocity values. The output MIDI files were played with two grand pianos, to manage a broad range of notes and dynamics, and one percussion instrument (SoCal) through GarageBand software on a 2021 MacBook Pro 16″. The pianos and percussion were selected as default instrument options, following the options used in the past work from which the algorithm was adapted (Nadri et al., 2022). Further, these are relatively simple instrument options, which provide greater equity and access to those who are new to sonification, students, early researchers, and young practitioners. In addition, the piano is the default instrument option in the Jython Music application for sonifying visual art (Manaris et al., 2016).
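The hue-selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code (which is available at the OSF link): the function name `select_top_hues`, the `(hue, pixel_count)` row layout, and the sorting by pixel count are assumptions made here for clarity.

```python
def select_top_hues(palette, coverage=0.80):
    """Return (hue, share) pairs for the top hues covering at least `coverage`.

    `palette` is a list of (hue, pixel_count) rows, standing in for the
    rows of the exported .csv file (hypothetical layout).
    """
    total = sum(count for _, count in palette)
    if total <= 0:
        raise ValueError("palette must contain at least one pixel")

    # Assumption: "top hues" means hues ranked by pixel count, largest first.
    ranked = sorted(palette, key=lambda row: row[1], reverse=True)

    selected = []
    cumulative_pixels = 0
    for hue, count in ranked:
        selected.append((hue, count / total))  # share of the painting area
        cumulative_pixels += count
        if cumulative_pixels >= coverage * total:
            break  # at least `coverage` of the pixels are now represented
    return selected


# Illustrative palette: hue in degrees, pixel counts are made-up values.
palette = [(0, 1000), (210, 5000), (120, 1500), (30, 2500)]
top = select_top_hues(palette)
print([hue for hue, _ in top])  # [210, 30, 120]
```

In this made-up palette the first two hues cover only 75% of the pixels, so a third hue is added to pass the 80% threshold.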

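The note-selection loop of the Basic Audio algorithm might look roughly like the sketch below. It follows the description in the text (draw a random percentage, find the hue whose percentage is closest, play its note) and uses the stated parameter ranges. The root of the Aeolian scale (A), the linear hue-to-pitch mapping, and all function names are assumptions for illustration only; the chord lists are omitted.

```python
import random

AEOLIAN_STEPS = (0, 2, 3, 5, 7, 8, 10)  # natural minor intervals in semitones


def aeolian_pitches(root=9, low=12, high=60):
    """MIDI pitches in [low, high] belonging to the Aeolian mode.

    Assumption: the root pitch class is A (9); the source does not state it.
    """
    allowed = {(root + step) % 12 for step in AEOLIAN_STEPS}
    return [p for p in range(low, high + 1) if p % 12 in allowed]


def hue_to_pitch(hue, pitches):
    """Hypothetical linear mapping from hue (0-360 degrees) to a scale pitch."""
    index = int((hue % 360) / 360 * len(pitches))
    return pitches[index]


def generate_notes(hue_shares, n_notes, seed=None):
    """Return (pitch, duration_seconds, volume) tuples.

    `hue_shares` is a list of (hue, share) pairs with share in [0, 1].
    """
    rng = random.Random(seed)
    pitches = aeolian_pitches()
    notes = []
    for _ in range(n_notes):
        draw = rng.random()  # random value read as a percentage
        # Pick the hue whose share is closest to the drawn value.
        hue, _share = min(hue_shares, key=lambda hs: abs(hs[1] - draw))
        duration = rng.uniform(0.75, 2.0)  # seconds, range from the text
        volume = rng.randint(30, 40)       # volume range from the text
        notes.append((hue_to_pitch(hue, pitches), duration, volume))
    return notes


notes = generate_notes([(210, 0.50), (30, 0.25), (120, 0.15)], n_notes=8, seed=1)
```

Under this closest-match rule, hues with larger shares tend to be chosen more often, although not in exact proportion to their share of the painting.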
2. Enhanced audio
To enhance the musical quality of the obtained MIDI files, and to engross users in the artwork, Ableton Live 11 was used to design the sonification according to the following principles. These principles indicate how the characteristics of each painting genre, the content of the paintings, and the overall atmosphere depicted in the paintings were mapped to musical parameters. Color in artwork could be considered similar to tone color in music, which is the timbre of an instrument, and it tends to vary according to the type and range of the instrument used (Adams, 1995). Furthermore, musical meaning, structures, and elements have a significant relationship with culture, era, and genre (Vadén & Torvinen, 2014). Given this perspective, the instruments for each painting genre were chosen by considering the genre’s characteristics, to enhance the emotion felt when viewing paintings of that genre. For example, the characteristics of Abstract paintings are considered to be the opposite of those of Realistic paintings (Sparks Gallery, n.d.), which led to the use of electronic instruments for Abstract paintings to express a surreal mood. On the other hand, classical instruments could be used to create sonification that gives a relatively more traditional mood for Renaissance, Realism, and Impressionist paintings. Within each painting, the overall mood of the sonification was set by considering the color combination of the painting and assessing whether the brightness was high or low, which led to a choice between a major or minor scale.
To provide a more detailed explanation of the characteristics specific to each genre, the sonification was designed using the following criteria. In terms of instrument selection, electronic instruments were used for Abstract paintings, while classical instruments were chosen for the other genres.
1) Impressionism was characterized by the movement of light and color, often using bold brushstrokes and vibrant colors (Callen, 2000). To recreate the dreamlike feeling expressed in the painting, a soft and warm instrument was used, along with major chords and reverb effects, to create a smooth and surreal sonification.
2) For the Realist paintings, which focused on depicting the real world (“Characteristics of Realism Art,” The Artchive, n.d.), the sonification was designed to aid in realistic depictions. For instance, in the painting “Napoleon Crossing the Alps”, rhythmic patterns reminiscent of a marching atmosphere were used to emphasize the heroic feeling of war, and the horn, an instrument related to calling out the horse, was incorporated to enhance the painting’s realistic ambience.
3) For the Abstract paintings, electronic instruments were used to emphasize the unreal feeling depicted in the artwork, and various sound effects were applied to enhance a transcendent sensation. For instance, in the painting “Guernica”, given its limited color palette, electronic instruments were used to express the absence of varied timbral sensation. Additionally, spatial qualities were applied to the sonification to portray the multiple viewpoints expressed in the painting (Ginev, 2020).
4) For the Renaissance paintings, the sonification aimed to assist in understanding the story of the painting, with the focus on highlighting the symbolic and religious meanings within it (Hope, 1986). For example, in “The School of Athens”, to depict the scene where many scholars gathered and engaged in discussion, the sonification was intended to express murmuring sounds coming from various places. To convey the seriousness and profundity of their conversation, the overall notes were lowered by one octave to represent the weight of the dialogue. Additionally, to portray the sensation of voices resonating from the dome, an echo effect was used, creating an immersive experience for the audience as if they were inside the dome.
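Two of the musical decisions described above, choosing a major or minor scale from the painting's brightness and lowering notes by one octave, can be expressed compactly in Python. In the study these choices were made by a musician in Ableton Live 11, not by code, so the sketch below is only an illustration: the 0.5 lightness threshold, the bright-to-major direction, and both function names are assumptions.

```python
def choose_mode(mean_lightness, threshold=0.5):
    """Pick a scale type from mean HSL lightness in [0, 1].

    Assumption: brighter paintings map to major, darker to minor,
    with a hypothetical threshold of 0.5.
    """
    if not 0.0 <= mean_lightness <= 1.0:
        raise ValueError("mean_lightness must be in [0, 1]")
    return "major" if mean_lightness >= threshold else "minor"


def lower_octave(pitches, octaves=1):
    """Transpose MIDI note numbers down by whole octaves (12 semitones each)."""
    shifted = [p - 12 * octaves for p in pitches]
    if any(p < 0 for p in shifted):
        raise ValueError("transposition would fall below MIDI note 0")
    return shifted


print(choose_mode(0.7))            # major
print(lower_octave([60, 64, 67]))  # [48, 52, 55]
```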
