How Google Assistant recognizes half a million hummed songs

The Google Assistant teams explain how they taught the personal assistant to recognize music sung by the user.

Since October, Google Assistant has been able to recognize the songs you hum. It is a handy way to find a song or artist you liked but whose name escapes you. Unsurprisingly, this technical feat rests on Google’s advances in machine learning.

The Mountain View firm seems particularly proud of it and recently devoted a blog post to explaining the technology. The post reveals that Google Assistant can already recognize 500,000 songs when you hum them. The number is impressive in itself, but there is still plenty of room for improvement given the hundreds of millions of songs that exist around the world. The database used for this function is therefore expected to grow.

A story of numbers

The teams at Google also explain what their algorithms do behind the scenes. They built on the work already done for the “Now Playing” feature introduced in 2017 on Pixel smartphones, which identifies music playing near the user, even without an internet connection.

From there, Google used machine learning techniques so that its Assistant can focus only on the hummed melody, ignoring the instruments and the quality of the voice. Google Assistant can thus generate a sequence of numbers representing the melody and compare it with the sequences of the songs stored in its database.
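To give a feel for what "a series of numbers representing the melody" can mean, here is a minimal sketch in Python. It does not reproduce Google's method, which is a model learned from data. It swaps in a much simpler hand-written idea: describing a tune by the steps, in semitones, between consecutive notes. The function names and the sample frequencies are assumptions made up for the illustration.

```python
import math

def to_semitones(freq_hz, ref_hz=440.0):
    """Convert a frequency in Hz to a pitch in semitones relative to ref_hz."""
    return 12.0 * math.log2(freq_hz / ref_hz)

def melody_signature(freqs_hz):
    """Hypothetical signature: rounded semitone steps between consecutive notes.

    Working with intervals rather than absolute pitch makes the result
    independent of the key the person happens to hum in.
    """
    pitches = [to_semitones(f) for f in freqs_hz]
    return [round(b - a) for a, b in zip(pitches, pitches[1:])]

# The same three-note phrase hummed in two different keys (illustrative values)
low = [261.63, 293.66, 329.63]    # C4, D4, E4
high = [392.00, 440.00, 493.88]   # G4, A4, B4

print(melody_signature(low))
print(melody_signature(high))
```

Both phrases produce the same list of intervals, which is the property that matters here: the numbers describe the tune, not the voice or the key.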

Help from Google employees

When the sequence of numbers generated from the hummed melody matches one of the recorded songs closely enough, Google Assistant displays the corresponding result. For the 500,000 songs mentioned above, the web giant prides itself on providing a “high level of precision”.
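The matching step described here can be sketched as a nearest-neighbor search with a cutoff. The Python below is an illustration under stated assumptions, not Google's implementation: the similarity measure (cosine similarity), the 0.9 threshold, the three-number vectors and the song titles are all invented for the example.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two equal-length vectors, from -1 (opposite) to 1 (same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(query, database, threshold=0.9):
    """Return (title, score) for the closest song, or None if nothing is close enough.

    The threshold value is an assumption chosen for this toy example.
    """
    title, score = max(
        ((t, cosine_similarity(query, v)) for t, v in database.items()),
        key=lambda pair: pair[1],
    )
    return (title, score) if score >= threshold else None

# Hypothetical database: each song is stored as a short sequence of numbers
songs = {
    "Song A": [0.9, 0.1, 0.3],
    "Song B": [0.1, 0.8, 0.5],
}

print(best_match([0.88, 0.15, 0.28], songs))  # close to "Song A"
print(best_match([0.5, 0.5, -0.7], songs))    # too far from every song
```

The cutoff is what keeps the assistant from returning a wrong answer when the hum resembles nothing in the database: below the threshold, no result is shown.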

Finally, for the record, some Google employees sent recordings of themselves humming to train the firm’s algorithms.

