Effects of Sentiment-based Sonification on Fairy Tale Listening Experience
- Yeji Lee
- October 29, 2024
- 5 min read
Sonification translates emotional data into sound, allowing us to interpret emotional information. However, the effectiveness of sonification varies depending on its application. The main purpose of the current study is to investigate how emotion-reflecting versus emotion-mitigating sonification influences the experience of listening to fairy tales. We analyzed three fairy tales using sentiment analysis and created two types of sonification: one that reflects the representative emotion of each paragraph (emotion-reflecting sonification), and another that reflects the opposite of that emotion (emotion-mitigating sonification). As the story-telling agent, we used a humanoid robot, NAO, and designed three voices with Azure. We conducted five focus groups with 22 participants. The results showed that participants' ratings in the “pleasing”, “empathy”, and “immersiveness” categories were higher for emotion-mitigating sonification than for emotion-reflecting sonification. Moreover, participants preferred the male voice over the female and child voices. The results are discussed with implications and future directions.
All sonification files, robot speech recordings, and sample video files are available here: Open Science Framework link.
1. Green Frog (Emotion-mitigating Sonification)
2. Green Frog (Emotion-reflecting Sonification)
3. Green Frog (No Sonification)
1. Sonification Design Based on Sentiment Analysis Results
We divided each fairy tale into nine to ten paragraphs and assigned three to four paragraphs to each sonification condition (emotion-reflecting sonification, emotion-mitigating sonification, and no sonification). To design emotion-reflecting sonification corresponding to the representative emotion of each paragraph, the paragraphs of the fairy tales were analyzed using sentiment analysis. Based on the results, we mapped musical parameters to emotions using the arousal-valence model (Kim et al., 2011). To design emotion-mitigating sonification that soothes the emotions derived from the sentiment analysis, we again used the arousal-valence model (Kim et al., 2011) to select the emotion precisely opposite to the representative emotion of each paragraph, and then designed the sonification based on that opposite emotion. In the arousal-valence model, arousal corresponds to the amount of energy in an emotion, while valence indicates whether the emotion is positive or negative (Griffiths et al., 2021). The arousal-valence model is shown in Figure 1.

Figure 1. Arousal-Valence Model
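The opposite-emotion selection described above can be sketched as a nearest-neighbour lookup in the arousal-valence plane: mirror an emotion's coordinates through the origin and pick the closest labelled emotion. The numeric coordinates below are illustrative assumptions for this sketch (the study does not publish them), so the pairings it produces may differ from the authors' choices.

```python
import math

# Illustrative (valence, arousal) coordinates in [-1, 1]^2.
# These positions are assumptions, not values from the study.
EMOTION_COORDS = {
    "joy":         ( 0.8,  0.6),
    "happiness":   ( 0.9,  0.3),
    "surprise":    ( 0.2,  0.8),
    "anger":       (-0.7,  0.6),
    "fear":        (-0.6,  0.7),
    "sadness":     (-0.6, -0.1),
    "calmness":    ( 0.4, -0.6),
    "contentment": ( 0.7, -0.4),
}

def opposite_emotion(emotion: str) -> str:
    """Mirror the emotion through the origin of the arousal-valence
    plane and return the nearest other labelled emotion."""
    valence, arousal = EMOTION_COORDS[emotion]
    target = (-valence, -arousal)
    return min(
        (e for e in EMOTION_COORDS if e != emotion),
        key=lambda e: math.dist(EMOTION_COORDS[e], target),
    )
```

With these coordinates, for example, `opposite_emotion("anger")` yields `"contentment"`, consistent with the anger-to-contentment pairing used for "The Tiger and the Persimmon".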
Table 1 compares the sonification conditions used for each fairy tale with the corresponding emotions, listing each emotion with its frequency. We randomized the order of sonification conditions that each group of participants experienced, and we also changed the sequence of the fairy tales for each group.
Table 1. Comparison of emotions for each sonification condition in Three Fairy Tales
| | The Rabbit’s Liver | The Green Frog | The Tiger and the Persimmon |
| --- | --- | --- | --- |
| Emotion-Reflecting Sonification | Joy (2), Surprise (1), Anger (1) | Sadness (3) | Fear (1), Anger (2) |
| Emotion-Mitigating Sonification | Calmness (2), Contentment (1), Sadness (1) | Calmness (3) | Contentment (2), Happiness (2) |
| No Sonification | – | – | – |
We selected three relatively lesser-known classical pieces as the basis for the fairy tales: “Ballade No. 2 in F major, Op. 38” by Chopin for “The Tiger and the Persimmon”, “Impromptu” by Anatoly Lyadov for “The Rabbit’s Liver”, and “Gymnopédie No. 1” by Erik Satie for “The Green Frog.”
We used Ableton Live 11 (Hein, 2021) to design the sonification from MIDI files, adopting the musical parameters corresponding to each emotion. We applied these parameters to the three classical pieces mentioned above; all three pieces were modified as shown in Table 2.
Table 2. Mapping musical parameters to six emotions
| Emotion | Mode | Tempo (bpm) | Instrument | Effect |
| --- | --- | --- | --- | --- |
| Happiness | Phrygian Modes | 50 or 100 | Basic Brushed Bells | Ubiquitous |
| Joy | Major Pentatonic, Phrygian | 90–120 | Basic Brushed Bells, Obelisk Bells | Ubiquitous, Arp Streets Echo |
| Anger | Minor Blues, Minor Melodic Down | 300 | Grand Piano | Grandiose, Fast Attack, Eternal Sunshine |
| Sadness | Minor Blues, Kumoi, Insen | 300 | Synthetic String, Cruiser String | −12 Pitch |
| Fear | Minor Pentatonic | 90 | Synthetic String, Cruiser String | +12 Pitch |
| Surprise | Lydian Augmented | 120 | Organ Retro Harmonic | +12 Pitch |
| Calmness | Major Pentatonic, Phrygian | 110–130 | Basic Brushed Bells | −12 Pitch |
| Contentment | Major Phrygian Mode | 50–60 | Basic Brushed Bells, Basic Bells | N/A |
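Transcribed as code, the mapping in Table 2 becomes a simple lookup table. In this sketch, where a table cell lists a range or several options, one representative value is kept; pitch shifts are expressed in semitones (±12 = one octave).

```python
# Lookup table transcribing Table 2 (one representative value per cell
# where the table lists a range or several options).
SONIFICATION_PARAMS = {
    # emotion:     (mode,                  bpm, instrument,             pitch_shift)
    "happiness":   ("Phrygian",            100, "Basic Brushed Bells",    0),
    "joy":         ("Major Pentatonic",    120, "Basic Brushed Bells",    0),
    "anger":       ("Minor Blues",         300, "Grand Piano",            0),
    "sadness":     ("Minor Blues",         300, "Synthetic String",     -12),
    "fear":        ("Minor Pentatonic",     90, "Synthetic String",     +12),
    "surprise":    ("Lydian Augmented",    120, "Organ Retro Harmonic", +12),
    "calmness":    ("Major Pentatonic",    130, "Basic Brushed Bells",  -12),
    "contentment": ("Major Phrygian Mode",  55, "Basic Brushed Bells",    0),
}

def params_for(emotion: str) -> dict:
    """Return the musical parameters for an emotion as a dict."""
    mode, bpm, instrument, pitch_shift = SONIFICATION_PARAMS[emotion]
    return {"mode": mode, "bpm": bpm,
            "instrument": instrument, "pitch_shift": pitch_shift}
```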
1. Mode
The mode is a harmonic structure related to musical valence (Wallis et al., 2008); it determines the overall mood and tonality, that is, whether the music evokes positive or negative emotions (Webster & Weir, 2005). According to previous research, people can feel certain emotions by listening to a specific musical mode. For example, the major mode is associated with feelings of brightness, warmth, and cuteness (Hoshino, 1996), while the minor mode is related to melancholy, darkness, sadness, and anxiety. Furthermore, Japanese pentatonic scales such as the YOH mode and IN mode are associated with a vague, mysterious impression (Abe & Hoshino, 1990) due to their unique sets of intervals (e.g., the Kumoi scale). Additionally, the Lydian mode is related to ethereal, dreamy, and futuristic qualities (Hein, 2010). The Phrygian dominant mode is associated with an exotic quality, while the Phrygian mode is linked to Spanish characteristics (Hein, 2010). Meanwhile, the blues scale is associated with a bluesy feeling (Hein, 2010). Based on this research, we used the Major pentatonic, Phrygian, Lydian, and Augmented scales for positive-valence emotions, and the Minor blues, Minor melodic down, Kumoi, Insen, and Minor pentatonic scales for negative-valence emotions.
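The valence-to-scale rule above can be written as a two-way lookup. The scale names are taken from Table 2 and the text; grouping them into exactly two pools is a simplification of this sketch.

```python
# Candidate scale pools per valence sign, from Table 2 and the text.
POSITIVE_SCALES = ["Major Pentatonic", "Phrygian", "Lydian Augmented"]
NEGATIVE_SCALES = ["Minor Blues", "Minor Melodic Down",
                   "Kumoi", "Insen", "Minor Pentatonic"]

def scales_for_valence(valence: float) -> list:
    """Return candidate scales for a valence value in [-1, 1]."""
    return POSITIVE_SCALES if valence >= 0 else NEGATIVE_SCALES
```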
2. Tempo
Tempo is related to musical arousal (Wallis et al., 2008): a high tempo is associated with high-arousal emotions such as surprise and anger, while a low tempo is associated with low-arousal emotions like drowsiness and calmness. Note that perceived tempo can vary depending on the context (Wallis et al., 2008). For example, in “Gymnopédie No. 1” by Erik Satie, the emotion of the first paragraph of “The Green Frog” was calmness, so we used 130 bpm. The emotion of the second paragraph was sadness, so we used 300 bpm, because sadness has relatively higher arousal than calmness. In terms of perceived tempo, “Gymnopédie No. 1” is characterized by sparse musical events, so the difference between 150 bpm and 300 bpm was not that noticeable.
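One way to sketch the arousal-to-tempo relationship is a linear interpolation between the slowest and fastest tempi that appear in Table 2 (50 and 300 bpm). The linear form and the normalised arousal input are assumptions of this sketch, not the authors' procedure.

```python
def arousal_to_bpm(arousal: float, low_bpm: int = 50, high_bpm: int = 300) -> int:
    """Linearly map normalised arousal in [0, 1] onto a tempo range.

    50 and 300 bpm are the extreme tempi appearing in Table 2.
    """
    arousal = max(0.0, min(1.0, arousal))  # clamp out-of-range input
    return round(low_bpm + arousal * (high_bpm - low_bpm))
```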
3. Instrument
According to previous research, percussion instruments are associated with happy emotions, while string instruments are associated with sad emotions (Rajesh & Nalini, 2020). Also, the flute is associated with neutral emotion, while brass instruments are associated with fear (Rajesh & Nalini, 2020). Another study investigated the emotions people felt when string instruments, including the double bass, cello, viola, and violin, were played at the same pitch and dynamic level (Chan et al., 2018). For example, the viola was particularly effective in expressing happiness and calmness compared to the other instruments (Chan et al., 2018). The same study also showed that different pitches and dynamics can change the emotional character of an instrument's sound (Chan et al., 2018); for example, fear was intense at extremely high pitches and loud notes within the string instruments (Chan et al., 2018).
Based on these results, we selected Virtual Studio Technology Instruments (VSTi) (Tanev & Bozhinovski, 2013) in Ableton Live 11 (Hein, 2021). Because instruments within VSTi vary considerably in their inherent dynamics and pitches, we selected the instruments most appropriate for each emotion, taking the aforementioned aspects into account. For joy and calmness, we used percussion instruments such as Basic Brushed Bells and Obelisk Bells. For anger, we used a percussion instrument, Grand Piano. For sadness, we used string instruments such as Synthetic String and Cruiser String. For fear, we used Synthetic String and Organ Vibrato. For surprise, we used Organ Retro Harmonic.
4. Effect
Effects were used to enhance positive or negative affect (valence), or to increase or release the feeling of tension (arousal). We adjusted pitch and dynamics using effects, which enhanced the expression of specific emotions (Chan et al., 2018). For example, for joy, we used bouncing effects such as Ubiquitous or Arp Streets Echo. For anger, we used effects that exaggerate the sound texture, such as Grandiose, Fast Attack, or Eternal Sunshine. For sadness and calmness, we applied an effect that lowers the pitch by one octave; for fear and surprise, we raised the pitch by one octave.
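The octave shifts applied to sadness, calmness, fear, and surprise amount to transposing MIDI note numbers by ±12 semitones. A minimal sketch, operating on plain lists of note numbers rather than Ableton's internal representation:

```python
def shift_octave(midi_notes, semitones=12):
    """Transpose a melody by a whole octave (+/-12 semitones),
    clamping to the valid MIDI note range 0-127."""
    return [max(0, min(127, note + semitones)) for note in midi_notes]
```

For example, shifting a C major triad (C4, E4, G4) down an octave: `shift_octave([60, 64, 67], -12)` returns `[48, 52, 55]`.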
Based on these criteria, the emotion-reflecting and emotion-mitigating sonifications of each paragraph were designed.