An Approach to Discover Similar Musical Patterns

An algorithm has been developed to find the similarity between given songs. The song pattern similarity is determined from the note structures and the fundamental frequencies of each note of the two songs under consideration. The statistical concept of the Correlation Coefficient is used in this work, and it is determined by applying the 16 Note-Measure Method. If the Correlation Coefficient is close to 1, the patterns of the two songs under consideration are similar; otherwise, the songs are similar only to a certain percentage. This basic principle is applied to a set of Indian Classical Music (ICM) based songs. The proposed algorithm can determine the similarity between songs, so that alternatives to some well-known songs can be identified in terms of the embedded raga patterns. A digital music library has been constructed as a part of this work. The library consists of different songs, their raga names, and their corresponding healing capabilities in terms of music therapy. The proposed work may find application in music therapy, an area of research that has been explored significantly in recent times. This work can also be exploited for developing an intelligent multimedia tool applicable in the healthcare domain. A multimedia based mobile app encapsulating the above idea has been developed; it can recommend alternative or similar songs to existing ICM based songs, and may be used for different purposes including entertainment and healthcare. By applying the proposed algorithm, songs that are similar in terms of raga patterns can be discovered within a pool of songs. A music recommendation system built on this algorithm can retrieve an alternative song from the pool as a replacement for a well-known song that would otherwise be used for a particular music therapy.
Results are reported and analyzed thoroughly. Future scope of the work is outlined.


I INTRODUCTION
Music has the capability to heal some illnesses of the human body; thus music is said to have therapeutic capabilities. Indian Classical Music (ICM) consists of one basic component known as raga. The seven basic notes of music, i.e., Sa, Re, Ga, Ma, Pa, Da and Ni, are exploited to create a particular raga [10]. Computational Musicology is an emerging field which draws various basic principles from Computer Science. In the 16 Note-Measure System, the notes Re, Ga, Da, and Ni have three variations each and the note Ma has two variations. The 16 different notes are: Sa, Re1, Re2, Re3, Ga1, Ga2, Ga3, Ma1, Ma2, Pa, Da1, Da2, Da3, Ni1, Ni2, and Ni3 [11]. Music therapy is a paramedical field in which music is employed for different therapeutic applications. Music therapy can be used in curing even psychological and physiological problems like mesothelioma, peritoneal mesothelioma, asthma, asbestos cancer, depression, etc. [11]. The raga of a piece of music is the primary element considered for music therapy. There are various ragas that can be used for different purposes; for example, Ahirbhairav and Todi are used for hypertension, Punnagavarali is used to control anger and violence, Todi is used to relieve cold and headache, Shivaranjani is used for memory related problems, and Bhairavi is used to get relief from sinus, cold, phlegm, toothache, etc. [12]. Similarly, Chandrakauns raga is used to treat heart problems and diabetes, and Darbari is used to reduce tension and provide relaxation [12]. Therefore, a specific ICM based song of a specific raga is applicable for some health and mind issues [15]. This work introduces an approach by which two songs that are similar in terms of raga can be identified. Thus similar musical patterns can be identified computationally. This approach can also be applied to develop an intelligent multimedia mobile application. In turn, such a mobile application may be applied in the electronic healthcare field.
This mobile application can recommend alternate music, having similar healing capabilities, in place of the ICM based songs. The challenge is to identify similar songs which are suitable for music therapy. ICM is the backbone of this work, as there are a lot of ragas known to be applicable for different therapies. The song pattern similarity can be established by knowing the note structures and the fundamental frequencies of each note of the songs under consideration. The Correlation Coefficient is determined by applying the 16 Note-Measure Method. If the Correlation Coefficient is close to 1, then the patterns of the two songs under consideration are similar. Otherwise, it indicates only a certain percentage of similarity between the songs. This method has been applied to a set of ICM based songs. The ICM based songs are stored in a digital music library along with their raga names and the corresponding healing capabilities. After applying the algorithm reported in this paper, a set of new songs is discovered as alternatives to the ICM based songs. A multimedia based mobile app, also reported in this paper, has been developed that can recommend alternative songs in place of the established ICM based songs for a particular music therapy. The system reported here has the potential to act as an electronic healthcare system for some specific purposes based on music therapy.
Correlation is a statistical concept that helps to analyse and determine the degree of relationship that exists among series of data variables. The degree of relationship among different series of data is expressed by the Correlation Coefficient r, which lies in the range -1 ≤ r ≤ +1. The direction of change is determined by the sign of r. Fig. 1 depicts one example each of positive, negative, and zero correlation. Following are the possible correlations:
[1] If r = +1, then the relation of the two series of data variables is said to be positively similar.
[2] If r = -1, then the relation of the two series of data variables is said to be negatively similar.
[3] If r = 0, then there exists no similarity between the two series of data variables.
The Correlation Coefficient deals with the association among series of variables, and for that reason it has been taken as the basis of similarity mapping between two song structures. The paper proposes a statistical approach for computing the similarity between two different songs, music structures, or music patterns by using the Correlation Coefficient of the fundamental frequencies of the two given songs.
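The three cases of r listed above can be illustrated numerically. The following is a minimal Python sketch, not the implementation used in this work; the function name `pearson_r` and the sample series are assumptions chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of every value from the mean of its own series.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_of_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_sq_x = sum(dx * dx for dx in dev_x)
    sum_sq_y = sum(dy * dy for dy in dev_y)
    if sum_sq_x == 0 or sum_sq_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sum_of_products / math.sqrt(sum_sq_x * sum_sq_y)

# Illustrative series only (not data from this work).
rising = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(rising, [2.0, 4.0, 6.0, 8.0]))    # series move together
print(pearson_r(rising, [8.0, 6.0, 4.0, 2.0]))    # series move in opposite directions
print(pearson_r(rising, [1.0, -1.0, -1.0, 1.0]))  # no linear relationship
```

The three calls correspond to the cases r = +1, r = -1, and r = 0, respectively.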
A music recommendation system plays an important role in music therapy. It recommends music to users depending on various factors such as human moods, human behaviours, choices, similarities, fundamental frequencies, time slots, etc. Fig. 2 depicts the overall relationship between a music recommendation system and music therapy.
Motivation: There are different folk songs available in different parts of the world. Folk songs are less explored in terms of computational musicology. It has already been explored and recommended by competent authorities that Indian Classical Music has therapeutic capabilities and can be used for different treatments, e.g., to treat health and mental problems. A very fundamental question that motivates this work is "Does the Indian Folk Music (IFM) too have music therapy capabilities like the Indian Classical Music (ICM)?". Although addressing this question requires a larger study with wider scope, in this work we are motivated to develop a method to find the similarity between songs (for example, ICM based and IFM based songs) from a computational musicology perspective. If alternate songs can be recommended from a computing perspective as alternatives to ICM based songs, then in the next stage the music therapy capabilities of the alternate songs may be examined.
This fact motivates to develop an approach to identify similar music patterns and recommend alternate songs for music therapy.
The contributions made in this work are as mentioned below.
i) An algorithm is proposed to identify similar song patterns. The statistical measure Correlation Coefficient has been exploited in order to find similar music patterns.
ii) A music database has been created comprising ICM based and IFM based songs. In this song repository, the raga information of the ICM based songs and the healing capabilities of different ragas are also stored.
iii) A mobile app has been developed that works as a Music Recommendation System. The app implements the proposed algorithm (as mentioned in i) to identify similar songs, and it works over the database developed (as mentioned in ii).
iv) The process of finding similar song patterns has been demonstrated through rigorous experiments and statistical analysis of the obtained experimental results.
The proposed algorithm may be used to identify two similar songs considering the frequency as the basis. Thus this is the first step toward identifying ICM music or other music (for example, IFM) with similar therapeutic effects. In fact, establishing a similar therapeutic effect of IFM in comparison to ICM is the next step of this work, for which rigorous experiments involving human participants have already been planned by this group.
The rest of the paper is organized as follows. Section II reports a few related works. The proposed algorithm is detailed in Section III, followed by Section IV, in which experimental results are analysed. The paper is concluded in Section V.

II RELATED WORK
There are quite a lot of research works that motivate further work and the exploration of new dimensions of musicology research. Computational musicology is an emerging area that depends on different concepts of computer science. Indian Classical Music (ICM) is a relatively complex and vast area which has not been explored significantly in terms of computational musicology.
A music recommendation system has been explored in [27]. The work provides a personalized music recommendation service with the help of polyphonic music objects in MIDI (Musical Instrument Digital Interface) format. The profiles of the users are analysed for user grouping based on the behaviours and interests of the users. Pitch density is used to select the track that contains the melody; it can be calculated as:
$$\text{Pitch density} = \frac{NP}{AP}$$
where NP = number of distinct pitches in the track and AP = number of all distinct pitches in the MIDI standard. The pitch entropy (PE) can be derived as follows:
$$PE = -\sum_{j=1}^{NP} P_j \log P_j$$
where $P_j$ is represented as follows:
$$P_j = \frac{N_j}{T}$$
with $N_j$ = total number of notes with the corresponding pitch in the representative track and T = total number of notes in the representative track.
The music group containing highly accessed musical objects holds a higher weight than other groups. The weight of music group $G_i$ ($GW_i$) can be calculated as:
$$GW_i = \sum_{j=1}^{n} TW_j \times MO_{j,i}$$
where $TW_j$ = weight of the transaction $T_j$, n = number of latest transactions used for analysis, and $MO_{j,i}$ = number of music objects which belong to music group $G_i$ in transaction $T_j$. Different numbers ($R_i$) of musical objects from the music groups are computed (and recommended) according to $GW_i$ [18]. Although this work is based on a music recommendation system, it does not explore Indian Classical Music and associated ragas.
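The two track-selection features described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: pitch density is taken as the ratio NP/AP, pitch entropy as the Shannon entropy over $P_j = N_j / T$, AP is taken as the 128 MIDI note numbers, and the logarithm base (2) is a choice made here rather than something stated in the text. The function names and the sample track are hypothetical.

```python
import math
from collections import Counter

MIDI_PITCH_COUNT = 128  # AP: MIDI note numbers run from 0 to 127

def pitch_density(track_pitches, all_pitches=MIDI_PITCH_COUNT):
    """NP / AP: share of the available pitches that the track actually uses."""
    return len(set(track_pitches)) / all_pitches

def pitch_entropy(track_pitches):
    """Entropy of the pitch distribution, with P_j = N_j / T (base-2 log assumed)."""
    total_notes = len(track_pitches)          # T
    notes_per_pitch = Counter(track_pitches)  # N_j for every distinct pitch j
    return -sum((n / total_notes) * math.log2(n / total_notes)
                for n in notes_per_pitch.values())

# Hypothetical track given as MIDI note numbers.
track = [60, 62, 64, 60]
print(pitch_density(track))
print(pitch_entropy(track))
```

A track that repeats a single pitch has entropy 0, while a track that spreads its notes evenly over many pitches has a higher entropy, which is why the measure is useful for picking out the melodic track.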
There is a specific relationship between Raga and Rasa. Raga means music origin, and rasa means music emotions. Therefore, music and emotions are directly related to each other, and this relationship has been established. A content-based culture-specific music recommendation system model has been proposed in [13]. The paper [14] describes a research project aimed at developing a music analysis system which presents an analysis of clinical music therapy. Music is a very effective mode of mental treatment, and human mental management can be controlled by music therapy based on ICM [5].
The paper [16] is a reference of website that illustrates different raga names and their respective healing powers. The work reported in [17] introduces music recommendation technique based on content and context information mining. The work reported in [18] introduces a context-aware mobile music recommendation system.
The work reported in [19] is a modeling technique and useful tool that formalizes music composition rules; the technique increases music analysis speed with the help of Music Petri nets that introduce Schoenberg's rules. The work presented in [20] introduces an approach that determines the similarity mapping between two songs; this is achieved using the notes and the fundamental frequencies of each note of the two songs. Pearson's Correlation Coefficient is exploited in this work. The work presented in [21] is a method to generate a song list for listening; the songs may be downloaded according to the age of the online users. It is a web-based application that recommends different songs depending on the listeners' choices, which in turn are based on their age group. Songs are downloaded from the music library, and unknown songs are classified depending on the reviews of the users.
The work presented in [22] introduces a model of musical creativity rather than algorithmic music variations with the help of genetic algorithms. The implementation of this model is based on Genome software. A statistical approach has been exploited to find similar song patterns with the help of coefficient of variance [23]. A time based raga recommendation system has been developed by using Neural Networks in [24]. A Music Recommendation System that classifies different songs appropriate for different time of a day has been reported in [25]. An intelligent mechanism to identify the density of a given music rhythm and complexity of that music rhythm automatically has been proposed in [26]. Different music research areas and their applications are illustrated in [27]. The Chi-square table link is given in [28].
Based on this survey it is found that there is no work available in the context of ICM that can recommend alternate songs keeping applications like music therapy in mind. As mentioned under the motivation subsection of the introduction section of this paper, there is a research gap in order to find equivalent songs that can be used as an alternative to an ICM in the context of music therapy. Such an alternative may be a folk song also, that can have similar healing capability with respect to music therapy. However, the work presented in [10] is in the similar direction as the work reported in this paper, but was at a very nascent stage.
Some important characteristics of ICM are listed in Table 1.
Rhythm: Rhythm is the style or pattern of sound in a musical piece that combines with the recurrence of notes and rests.
Genre: Genre is the style of musical phrases like traditional music, classical music, folk music, devotional music, rock music, pop music, etc.
Aalap (Rendition): Aalap is the melodic distribution of musical notes of a particular raga in which all possible valid note combinations are performed without any fixed rhythm.
Timbre: Music Timbre is the quality of music sound, the quality of music tone, or the quality of music notes. Several instruments can play the same musical pitches or the same notes at the same volume but can produce different timbres.
Though some of the above characteristics of ICM are applicable for measuring the similarity between two songs, pitch is one of the most important features to find the similarity between the music patterns. Therefore, computing pitch or fundamental frequencies of any song is the primary task to find similarity of two songs using pitches of the songs. Pitch values of the songs can be extracted by any standard music software like Wavesurfer. Wavesurfer has been adopted in this work.
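As a rough illustration of the step that follows pitch extraction, the sketch below reads pitch values from a pitch file that has been exported as text. It assumes a whitespace-separated layout in which the first column of each line is the fundamental frequency in Hz and unvoiced frames are written as 0. This layout, the function name `read_f0_values`, the `min_hz` parameter, and the sample lines are all assumptions and should be checked against the file actually produced by the pitch-extraction tool.

```python
def read_f0_values(lines, min_hz=1.0):
    """Collect voiced pitch values (Hz) from the lines of a text pitch file.

    Assumed layout: first whitespace-separated column = F0 in Hz,
    with 0 written for unvoiced frames.
    """
    values = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        f0 = float(fields[0])
        if f0 >= min_hz:  # drop unvoiced (zero) frames
            values.append(f0)
    return values

# Hypothetical file content, one analysis frame per line.
sample_lines = [
    "0.0 0.0 0.0 0.0",
    "220.5 1.0 812.3 0.91",
    "",
    "246.9 1.0 790.1 0.88",
]
print(read_f0_values(sample_lines))
```

Because the function accepts any iterable of lines, an open file object can be passed to it directly in place of the sample list.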
Music Information Retrieval (MIR) involves series of activities like, music recommendation, song detection, music genre recognition, pitch tracking, music score generation, beat tracking, music transcription, music mood similarity mapping, music melodic similarity, musical instrument recognition, tempo estimation, query by humming etc.
A Content based Music Information Retrieval System combines different research fields such as computational musicology, music cognition, and music perception, and these research fields are applied in intelligent music recommendation systems for advanced searching, processing, and retrieval of music. The overall architecture of a Content based Music Retrieval System is depicted in Fig. 3. In the first phase of Fig. 3, the required data are gathered and organized. This phase consists of three subphases: feature extraction, normalization, and storage. Feature extraction includes pitch processing, timbre generation, and computation of loudness from several audio files. In the data normalization phase, the extracted data are normalized using some standard form, and finally, the data are stored in a database using indexing. In the next phase, music similarity mapping can be achieved by using different similarity measuring algorithms [1].
As per [2], the music similarity mapping in MIR has two basic research areas, first, exploring the overall functionalities and application areas of MIR, and second, music similarity mapping through different procedures and comparative performance analysis among these procedures.
Some relevant music similarity mapping approaches using the different features of ICM are listed in Table 2.

III PROPOSED METHOD
In this section, an algorithm has been proposed that can be used to identify a similar song to a benchmark song. For validation of the algorithm, a song library has been created containing different ICM based songs. Different songs, their associated raga, and corresponding healing capabilities of various ragas are also stored in the library. After applying the algorithm on the library / database, a new song database has been created containing the songs having the same healing power. Thus groups of similar songs are getting created automatically. It is expected that songs of a particular group can be used alternatively as they have similarity in terms of their embedded ragas.
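One possible in-memory shape for such a library is sketched below. It is a simplified illustration rather than the database used in this work: the class name `LibrarySong`, the function `group_by_healing_use`, and the song titles are hypothetical, while the raga-to-use pairs are taken from the examples given in the introduction. The grouping here is by stored healing use only; in the proposed approach, membership of a group would additionally depend on the similarity computed by the algorithm.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class LibrarySong:
    title: str           # hypothetical song title
    raga: str            # raga associated with the song
    healing_uses: tuple  # healing capabilities attributed to that raga

def group_by_healing_use(songs):
    """Map each healing use to the titles of the songs recorded for it."""
    groups = defaultdict(list)
    for song in songs:
        for use in song.healing_uses:
            groups[use].append(song.title)
    return dict(groups)

library = [
    LibrarySong("Song A", "Ahirbhairav", ("hypertension",)),
    LibrarySong("Song B", "Todi", ("hypertension", "cold and headache")),
    LibrarySong("Song C", "Darbari", ("tension relief and relaxation",)),
]

for use, titles in group_by_healing_use(library).items():
    print(use, "->", titles)
```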
In order to find the similarities of the frequency patterns of two given songs, a statistical method based on Correlation Coefficient has been proposed which is the core of the proposed algorithm. The primary objective of the Correlation Coefficient is to examine whether the two series of fundamental frequencies of two given song structures, are similar. Moreover, Correlation Coefficient may also be used to determine whether the fundamental pitches of the two songs are significantly similar or not.
Finally, an app (for mobile) has been developed embedding the proposed algorithm, and the song database as mentioned above. The app is able to find the similarity between two songs running the proposed algorithm. Depending on the fundamental frequency patterns, similar songs are discovered. It is already known that some songs based on ICM have healing power that can be used in music therapy for treating different health and mind issues. This app is implemented in such a way that it can function as a search engine and also a music recommender that recommends alternate songs instead of the standard ICM that may have similar healing capability. Thus the app functions as a music recommendation system. Moreover, the app contains a large collection of Indian Folk Music (IFM). Thus as an alternative to the ICM, an IFM from within the database may be recommended. Such an app may be used as an E-Healthcare application. The structure of the app is outlined below.
App Name: The app consists of eight different components:
(1) Playlist: It is a list of video or audio files that can be played on a media player.
(2) Artists: It is a list of artists whose songs can be played on a media player.
(3) Albums: It is a collection of audio or video recordings treated as a collection of songs.
(4) Songs: It is a collection of note structures performed by singers.
(5) TV & Movies: Some media from where music files can be downloaded.
(6) Downloaded Music: It is the digital transfer of music through the Internet into a device capable of storing it locally.
(7) Song Similarity Mapping: It is the primary focus of this app; using the proposed algorithm, the similarity between two or more songs is determined.
(8) Music Therapy in E-Healthcare: It is another focus of this app; it may be used as a tool for music therapy. This part functions as a multimedia based mobile E-Healthcare application.
Proposed Algorithm for Song Similarity Mapping: In this sub-section, the proposed algorithm has been detailed that can be used to find song similarity. It is necessary to know the note structures and the fundamental frequencies of each note of the two songs in order to find the similarity between the two songs through the algorithm designed.
In order to create the pitch file and determine the fundamental frequencies of the songs, the WaveSurfer software has been used. The following procedure describes how to use WaveSurfer to create the pitch file (pre-processing); the proposed algorithm to find song similarity is also presented in this procedure.
Proposed Procedure:
Pre-Processing:
(1) Pick a song of a particular Raga that has some healing power and another normal song from the available song library/repository.
(2) Click on the WaveSurfer button to open the software. Run one song through WaveSurfer, which is used to generate the pitch values of that song. First, the ".mp3" song file is used to build the pitch file of the song with the extension .f0. This file consists of all the pitches that are used in the song.
(3) The basic steps to create the .f0 format file from a given song are:
Step 3: Compute the total pitch value of each individual note of Song 1 using the following expression:
$$S1F_i = (\text{frequency of note } i \text{ of Song 1}) \times (\text{occurrence of note } i \text{ of Song 1}), \quad i = 1, 2, 3, \ldots, n$$
where $S1F_i$ is the total pitch value of individual note i of Song 1.
Step 4: Compute the total pitch value of each individual note of Song 2 using the following expression:
$$S2F_i = (\text{frequency of note } i \text{ of Song 2}) \times (\text{occurrence of note } i \text{ of Song 2})$$
Step 5: Now find the values of $\overline{S1F}$ and $\overline{S2F}$ as follows:
$$\overline{S1F} = \frac{1}{n}\sum_{i=1}^{n} S1F_i, \qquad \overline{S2F} = \frac{1}{n}\sum_{i=1}^{n} S2F_i$$
Now consider the sum of square (SoS) values of a set of n fundamental frequencies $(S1F_i, S2F_i)$ about $\overline{S1F}$ and $\overline{S2F}$, respectively, as:
$$SoS_{S1} = \sum_{i=1}^{n} \left(S1F_i - \overline{S1F}\right)^2, \qquad SoS_{S2} = \sum_{i=1}^{n} \left(S2F_i - \overline{S2F}\right)^2$$
Now the song similarity can be measured by Pearson's Correlation formula using the following expression:
$$r = \frac{\sum_{i=1}^{n} \left(S1F_i - \overline{S1F}\right)\left(S2F_i - \overline{S2F}\right)}{\sqrt{SoS_{S1} \times SoS_{S2}}}$$
Using the above mentioned algorithm, song similarity may be computed in terms of percentage.
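Steps 3 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation embedded in the app: the total pitch value of a note is taken as frequency times occurrence, the two songs are assumed to be described over the same ordered list of notes, and the function names and sample numbers are hypothetical (four notes are used instead of sixteen to keep the example short).

```python
import math

def total_pitch_values(frequencies, occurrences):
    """Steps 3 and 4: total pitch value of each note = frequency x occurrence."""
    if len(frequencies) != len(occurrences):
        raise ValueError("one occurrence count is needed per note frequency")
    return [f * o for f, o in zip(frequencies, occurrences)]

def song_similarity(s1_totals, s2_totals):
    """Step 5: Pearson correlation of the two series of total pitch values."""
    n = len(s1_totals)
    if n != len(s2_totals) or n < 2:
        raise ValueError("both songs must cover the same notes (at least 2)")
    mean_1 = sum(s1_totals) / n
    mean_2 = sum(s2_totals) / n
    # Sum of squares of each series about its own mean.
    sos_1 = sum((v - mean_1) ** 2 for v in s1_totals)
    sos_2 = sum((v - mean_2) ** 2 for v in s2_totals)
    # Sum of products of the paired deviations.
    sop = sum((a - mean_1) * (b - mean_2) for a, b in zip(s1_totals, s2_totals))
    if sos_1 == 0 or sos_2 == 0:
        raise ValueError("similarity is undefined for a constant series")
    return sop / math.sqrt(sos_1 * sos_2)

# Hypothetical note frequencies (Hz) and occurrence counts for two songs.
note_frequencies = [240.0, 270.0, 300.0, 320.0]
song1_totals = total_pitch_values(note_frequencies, [4, 2, 3, 1])
song2_totals = total_pitch_values(note_frequencies, [8, 4, 6, 2])

r = song_similarity(song1_totals, song2_totals)
print(f"similarity = {r:.7f} ({100 * r:.5f}%)")
```

In this example every note of the second song occurs exactly twice as often as in the first, so the two series are proportional and the similarity is at its maximum; real songs would give intermediate values.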
In the following section, results of different experiments implementing this algorithm have been presented. The mobile app that has been developed is based on this algorithm. The mobile app is able to recommend alternate songs to ICM based songs from a pool of IFM based songs, according to this algorithm.
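The recommendation step can be pictured as a ranking over similarity values that have already been computed against a benchmark ICM based song. The sketch below is an assumption-laden illustration: the function name `rank_alternatives`, the `min_similarity` cut-off, the `top_k` limit, and the song labels and scores are all hypothetical, and no particular threshold is prescribed by this work.

```python
def rank_alternatives(similarities, min_similarity=0.7, top_k=3):
    """Return up to top_k (title, similarity) pairs, most similar first.

    similarities: mapping of candidate song title -> similarity to the
    benchmark song. min_similarity is a hypothetical cut-off.
    """
    eligible = [(title, score) for title, score in similarities.items()
                if score >= min_similarity]
    eligible.sort(key=lambda item: item[1], reverse=True)
    return eligible[:top_k]

# Hypothetical similarity values for three candidate folk songs.
candidate_scores = {
    "Folk song A": 0.79,
    "Folk song B": 0.71,
    "Folk song C": 0.32,
}
print(rank_alternatives(candidate_scores))
```

Keeping the cut-off as a parameter leaves the decision of how similar is "similar enough" to the application, for example to a later study of therapeutic effect.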

IV RESULTS AND ANALYSIS
As mentioned above, an app for mobile has been developed that implements the algorithm described in the previous section. A snapshot of the user interface of the "Multimedia Mobile E-Healthcare System" is presented in Fig. 4.
[...] are presented in Fig. 10. Thus the second test case is Test Case 2: Song 1 is compared with Song 3.

Song 4: Oh Jeebon re
Song Type: Indian Folk Music
Singer: Jubin Garg
The Waveform, Spectrogram, and Pitch Contour of Song 4 are depicted in Fig. 11, and the Mean plus Standard Deviation of Song 4 (Oh Jeebon re) are also presented. Thus the third test case is Test Case 3: Song 1 is compared with Song 4.
Thus the fourth test case is Test Case 4: Song 1 is compared with Song 5.
Expected frequencies may be computed depending on the observed frequencies as given in (15).
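The statistical check used in the test cases can be sketched as follows. Equations (15) to (17) are not reproduced in this text, so the sketch assumes the standard contingency-table forms: expected frequency = (row total × column total) / grand total, chi-square = Σ (O − E)² / E, and coefficient of contingency C = √(χ² / (χ² + N)). The function names and the small observed table are hypothetical.

```python
import math

def expected_frequencies(observed):
    """observed: one row per song, one column per note (observed frequencies)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells; cells with E = 0 are skipped."""
    return sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row) if e > 0)

def contingency_coefficient(chi2, grand_total):
    """C = sqrt(chi2 / (chi2 + N)); values near zero indicate weak association."""
    return math.sqrt(chi2 / (chi2 + grand_total))

def degrees_of_freedom(observed):
    """(rows - 1) x (columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)

# Hypothetical observed frequencies: 2 songs x 2 notes.
observed = [[10.0, 20.0], [20.0, 10.0]]
expected = expected_frequencies(observed)
chi2 = chi_square(observed, expected)
n_total = sum(sum(row) for row in observed)
print(expected, chi2, contingency_coefficient(chi2, n_total), degrees_of_freedom(observed))
```

For a table of 2 songs by 16 notes, `degrees_of_freedom` gives (2 − 1) × (16 − 1) = 15, which is the value used in the test cases.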
In order to determine the degree of association between Song 1 and Song 2, the Chi-Square measure and the Coefficient of Contingency are computed by using equations (16) and (17), respectively.
Table 7 lists the pitch frequency, the pitch occurrence, and the total pitch frequency of each note of Song 1 and of Song 3. Table 8 presents the song similarity between Song 1 and Song 3; its value is 0.7144741, i.e., 71.44741%. Table 9 presents the expected frequencies depending on the observed frequencies of Song 1 and Song 3.
In Test Case 2, the Degree of Freedom (df) = 15 and the chi-square value between Song 1 and Song 3 = 17.6666808. From the chi-square table, for df = 15, the chi-square value at the 0.05 level is 24.996 and at the 0.025 level is 27.488. The calculated chi-square value is therefore less than both tabulated values, and the value of the coefficient of contingency is close to zero, so there is no significant difference between the two series. The null hypothesis of no significant difference is thus not rejected, and the conclusion is that Song 1 and Song 3 are similar to a certain percentage. Fig. 16 presents the comparison between Song 1 and Song 3 with respect to Observed Frequency and Expected Frequency.
Table 11 lists the pitch frequency, the pitch occurrence, and the total pitch frequency of each note of Song 1 and of Song 4. Table 12 presents the song similarity between Song 1 and Song 4; its value is 0.7860254, i.e., 78.60254%. Table 13 presents the expected frequencies depending on the observed frequencies of Song 1 and Song 4.
Findings from Test Case 4: Table 15 lists the pitch frequency, the pitch occurrence, and the total pitch frequency of each note of Song 1 and of Song 5.
Comparative Performance Evaluation: In this subsection, a comparative performance analysis is presented.
In addition to the findings from the four test cases above, which use the proposed method (i.e., the Correlation Coefficient), the similarity level between two input songs has also been calculated using two other similarity measures used for sound analysis, namely Mean Similarity and Standard Deviation Similarity. All the computed values are presented in Table 19.
The goal of this work is to develop a methodology to identify similar songs in terms of their corresponding raga contents. Such a methodology may be applicable in the broad area of music information retrieval (MIR). An application of such a method may be found in recommending an alternate but similar song to a standard song having well-known therapy capability or healing power. Based on the similarity value of two songs determined by the proposed method, it may be decided to replace one song by the other in applications such as music therapy.
In future, rigorous experiments shall be carried out to measure the actual impact of the alternate songs identified by the proposed method in terms of their therapy capabilities, so that alternate music therapy can also be applied to the prospective users. These alternate songs are planned to be picked up from IFM like Goalparia Lokgeet, Kamrupia Lokgeet and Boul Geet.