Smart Speakers: what they are, how they work

Smart speakers are becoming ever more popular, providing streamed music and much more from voice commands.

Smart speakers have become very popular, and a wide variety is now available from many different manufacturers.

These smart speakers let us do many different things, from controlling wireless-enabled lights to ordering products and takeaways online, as well as providing information such as the weather forecast, the time and the date. They can also stream music. In effect, they are what are termed virtual assistants.

How do smart speakers work

The key to the operation of smart speakers is the voice recognition technology that is used. Using voice recognition, it is possible for the smart speaker to understand what is being said and act upon it.

Different manufacturers use different voice recognition systems: Apple uses its Siri assistant, Microsoft uses Cortana, Google Home speakers use Google Assistant and Amazon Echo speakers use Alexa.

Although each smart speaker system is slightly different, it is possible to generalise a little to see the basic concepts of how they work. Typically the smart speaker listens to all speech and waits for a “wake word.”

There is often a default wake word: for Amazon, the Alexa system waits for the word Alexa, although this can be changed. Other systems use other words.

Once the system hears this word it activates, records what is being said and sends the recording over the Internet to the main processing area or voice recognition service for the system. For the Amazon system, the speech recording is sent to Amazon’s Alexa Voice Service (AVS) in the cloud.

The voice recognition service deciphers the speech and then sends a response back to the smart speaker.
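
The flow described above (wait for the wake word, forward the command, return the response) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: speech is represented as text rather than audio, and the names handle_utterances and fake_cloud_service are hypothetical stand-ins, not part of any manufacturer's software.

```python
import re
from typing import Callable, List

WAKE_WORD = "alexa"  # default wake word mentioned in the text; configurable


def fake_cloud_service(command: str) -> str:
    """Hypothetical stand-in for the remote voice recognition service."""
    if "weather" in command:
        return "Here is the weather forecast."
    if "time" in command:
        return "Here is the current time."
    return "Sorry, I did not understand that."


def handle_utterances(utterances: List[str],
                      cloud: Callable[[str], str] = fake_cloud_service,
                      wake_word: str = WAKE_WORD) -> List[str]:
    """Ignore speech until the wake word is heard, then forward the rest
    of that utterance to the cloud service and collect the response."""
    responses = []
    for utterance in utterances:
        # Split into lower-case words, dropping punctuation.
        words = re.findall(r"[a-z']+", utterance.lower())
        if wake_word not in words:
            continue  # without the wake word, nothing is sent anywhere
        command = " ".join(words[words.index(wake_word) + 1:])
        if command:
            responses.append(cloud(command))
    return responses


if __name__ == "__main__":
    heard = [
        "just chatting about the weather",
        "Alexa, what is the weather today?",
    ]
    for reply in handle_utterances(heard):
        print(reply)
```

Only the second utterance contains the wake word, so only that one is passed to the stand-in service. A real device does this matching on audio rather than text.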

The voice recognition service uses a series of algorithms so that the system becomes more familiar with your use of words and individual speech patterns. In this way it learns how you speak so the system can provide a better service.

In fact, normally when setting up a new smart speaker system it will be necessary to run through a learning process for the smart speaker.

Smart speaker speech recognition process

The technology behind speech recognition has developed hugely in recent years. Only a few years ago, speech recognition was very much a laboratory phenomenon, but now it is being extensively used in many areas, including smart speakers.

Although we all listen to others talking and perform speech recognition ourselves, it is a very complicated process when undertaken by computers.

Computers are programmed to recognise small units of sound, known as “phones.” These are mapped onto “phonemes,” the distinct sounds of a language, and sequences of phonemes are then matched to words.

Although there are variations on this theme, the basic concept is the same for all speech recognition systems.
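
As a rough illustration of how sequences of sound units can be matched to words, the sketch below looks up runs of symbols in a tiny pronunciation dictionary. The symbols, the LEXICON table and the decode function are assumptions made for this example. Real systems use statistical models and far larger dictionaries.

```python
from typing import Dict, List, Tuple

# Toy pronunciation dictionary (illustrative symbols, not a real lexicon).
LEXICON: Dict[Tuple[str, ...], str] = {
    ("P", "L", "EY"): "play",
    ("M", "Y", "UW", "Z", "IH", "K"): "music",
    ("S", "T", "AA", "P"): "stop",
}


def decode(phonemes: List[str]) -> List[str]:
    """Greedy longest-match lookup of sound sequences in the lexicon."""
    words = []
    max_len = max(len(key) for key in LEXICON)
    i = 0
    while i < len(phonemes):
        # Try the longest possible run of sounds first, then shorter ones.
        for n in range(min(max_len, len(phonemes) - i), 0, -1):
            word = LEXICON.get(tuple(phonemes[i:i + n]))
            if word is not None:
                words.append(word)
                i += n
                break
        else:
            i += 1  # skip a sound that does not start any known word
    return words


if __name__ == "__main__":
    print(decode(["P", "L", "EY", "M", "Y", "UW", "Z", "IH", "K"]))
```

The greedy lookup is chosen here only for brevity. It cannot weigh up competing interpretations of the same sounds, which is one reason practical recognisers are much more elaborate.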

Data security

One concern that many people have about smart speakers is data security. People could become victims of voice hacking, which involves recording or mimicking a user’s voice and then hijacking their accounts.

It has been noted that many automatic speaker verification systems are not able to detect whether the speech has been previously recorded, although systems are now being developed to detect this.

There have also been some scares about smart speaker systems responding to speech from nearby radios that the speakers can hear. However, this does not appear to be a major issue.

One of the main issues when setting up smart speaker systems is to ensure that the Wi-Fi network, any items on it, and of course the smart speaker itself all have robust passwords. One of the ways hackers gain access to systems is through people leaving default passwords in place, thinking that nobody would be interested in hacking their system. These systems provide easy prey for unscrupulous hackers.
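
A simple check along these lines can flag default or weak passwords before a device is put on the network. The sketch below is illustrative only: the list of common defaults, the minimum length of 12 and the rule requiring three kinds of character are assumptions chosen for the example, not an official standard.

```python
import string

# A few well-known default passwords (illustrative, far from exhaustive).
COMMON_DEFAULTS = {"admin", "password", "12345678", "default", "letmein"}


def is_robust(password: str, min_length: int = 12) -> bool:
    """Return True if the password is not a known default, is long enough
    and mixes at least three kinds of character."""
    if password.lower() in COMMON_DEFAULTS:
        return False
    if len(password) < min_length:
        return False
    kinds = [
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ]
    return sum(kinds) >= 3


if __name__ == "__main__":
    for candidate in ["admin", "Correct-Horse-42-Battery"]:
        print(candidate, is_robust(candidate))
```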

The technology for smart speakers is evolving at a rapid pace. As a result, their use is likely to increase rapidly. Voice recognition will also become embedded into more of everyday life, with newer systems performing the recognition locally. Smart speaker technology will extend into new applications too, enabling more areas of everyday life to be controlled by voice command.