freeCodeCamp/guide/english/voice/index.md

---
title: Voice
---

# Voice
Speech recognition allows users affected by accessibility difficulties (such as permanent visual impairment or temporary impairment while driving) the ability to navigate content on a website or input text data (such as a form). 

Speech synthesis provides websites the ability to provide information to users by reading text.


## Javascript Web Speech API

The [Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API) enables you to incorporate voice data into web apps using both speech recognition and speech synthesis. 

### How the Web Speech API Works

The Web Speech API uses the device's native microphone system. When an utterance is recognized from a pre-defined grammar (see below) it is returned as a result (or list of results) as a text string and callback functions can be provided to perform further actions. 

### How to use the Speech Recognition API

Here is a simple example of using the Speech Recognition API. Note that the API is initated with the `new SpeechRecognition()` constructor and starts when `recognition.start();` is called. It creates a transcript from what is received and then that is appended to the `<p class="transcript">` element. [Click here for a working demo of this code](https://codepen.io/ashwoodall/pen/MPeyRm).

This is the HTML that the transcript is appended to:

```html
<main class="main">
  <span class="loader"></span>
  <p class="description">What I think you said: <p class="transcript" data-js="varValue"></p></p>
  
</main>
```

And here is the JavaScript: 

```javascript
window.SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;

const span = document.querySelector('[data-js="varValue"]');
const main = document.querySelector('.main');
const loader = document.querySelector('.loader');

const recognition = new SpeechRecognition();
recognition.lang = 'en-US';

recognition.addEventListener('result', e => {
    const transcript = Array.from(e.results)
        .map(result => result[0].transcript)

    span.textContent = transcript;
    loader.textContent = '';
});

recognition.addEventListener('start', () => loader.textContent = 'Listening (enable your microphone)...');

recognition.addEventListener('end', recognition.start);
recognition.start();
```

## Alexa
Alexa is Amazon’s cloud-based voice service available on tens of millions of devices from Amazon and third-party device manufacturers. With Alexa you can build natural voice experiences that offer customers a more intuitive way to interact with the technology they use every day.
Alexa is capable of voice interaction, music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, and providing weather, traffic, sports, and other real-time information such as news.

### Amazon Echo Device Range
- Amazon Echo
- Amazon Echo Plus
- Amazon Echo Dot
- Amazon Echo Look
- Amazon Echo Show
- Amazon Echo Spot

## Far Field Microphones
Speech recognition systems often use multiple microphones to reduce the impact of reverberation and noise. 
The Echo mics are arranged in a hexagonal layout with one microphone at each vertex and one in the center. The delay between each microphone receiving the signal enables the device to identify the source of the voice and cancel out noise coming from other directions. This is a phenomenon known as beamforming.

While state-of-the-art speech recognition systems perform reasonably well in close-talking microphone conditions performance degrades in conditions where the microphone is far from the user.

The audio captured by the Echo will be influenced by:
1) the speaker’s voice against the wall of the room,
2) the background noise from outside, 
3) the acoustic echo coming from the device’s loudspeaker, 
4) the output audio against the wall of the room.

## Software
The software components within the platform include both Natural Language Understanding (NLU) as well as Automated Speech Recognition (ASR).  These software components can be leveraged by custom written "skills" by independent software developers who are then certified to a set of standards by Amazon. There are already more than 20k of these custom skills available through their app store.

## IBM Watson Speech-to-Text API
The IBM Watson Speech-to-Text API uses machine learning to accurately predict speech in real time. Currently seven different languages are supported as well as live voice and pre-recorded audio. The API can be used for free; paid versions are also available for larger scale apps. 

## Siri
Apple's iOS 12 update introduced [Siri](https://en.wikipedia.org/wiki/Siri) Shortcuts, which supports the use of third-party applications through Apple's digital voice assistant. Siri Shortcuts allows developers to add shortcuts and personalized phrases to Siri through their applications by, for example, letting the user record a voice phrase for a particular action and adding that phrase to Siri.

## Additional Resources
- [Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API)
- [Alexa API](https://developer.amazon.com/docs/alexa-voice-service/api-overview.html)
- [IBM Watson API](https://www.ibm.com/watson/services/speech-to-text/)
- [Sirikit](https://developer.apple.com/documentation/sirikit)
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
+								---
 								title: Voice
 								---
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								# Voice
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
+								Speech recognition allows users affected by accessibility difficulties (such as permanent visual impairment or temporary impairment while driving) the ability to navigate content on a website or input text data (such as a form).
 								Speech synthesis provides websites the ability to provide information to users by reading text.
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								## Javascript Web Speech API
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
 								The [Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API) enables you to incorporate voice data into web apps using both speech recognition and speech synthesis.
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								### How the Web Speech API Works
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								The Web Speech API uses the device's native microphone system. When an utterance is recognized from a pre-defined grammar (see below) it is returned as a result (or list of results) as a text string and callback functions can be provided to perform further actions.
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								### How to use the Speech Recognition API
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								Here is a simple example of using the Speech Recognition API. Note that the API is initated with the `new SpeechRecognition()` constructor and starts when `recognition.start();` is called. It creates a transcript from what is received and then that is appended to the `<p class="transcript">` element. [Click here for a working demo of this code](https://codepen.io/ashwoodall/pen/MPeyRm).
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
 								This is the HTML that the transcript is appended to:
 								```html
 								<main class="main">
 								  <span class="loader"></span>
 								  <p class="description">What I think you said: <p class="transcript" data-js="varValue"></p></p>
 								</main>
 								```
 								And here is the JavaScript:
 								```javascript
 								window.SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
 								const span = document.querySelector('[data-js="varValue"]');
 								const main = document.querySelector('.main');
 								const loader = document.querySelector('.loader');
 								const recognition = new SpeechRecognition();
 								recognition.lang = 'en-US';
 								recognition.addEventListener('result', e => {
 								    const transcript = Array.from(e.results)
 								        .map(result => result[0].transcript)
 								    span.textContent = transcript;
 								    loader.textContent = '';
 								});
 								recognition.addEventListener('start', () => loader.textContent = 'Listening (enable your microphone)...');
 								recognition.addEventListener('end', recognition.start);
 								recognition.start();
 								```
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								## Alexa
 								Alexa is Amazon’s cloud-based voice service available on tens of millions of devices from Amazon and third-party device manufacturers. With Alexa you can build natural voice experiences that offer customers a more intuitive way to interact with the technology they use every day.
 								Alexa is capable of voice interaction, music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, and providing weather, traffic, sports, and other real-time information such as news.
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								### Amazon Echo Device Range
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
+								- Amazon Echo
 								- Amazon Echo Plus
 								- Amazon Echo Dot
 								- Amazon Echo Look
 								- Amazon Echo Show
 								- Amazon Echo Spot
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								## Far Field Microphones
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
+								Speech recognition systems often use multiple microphones to reduce the impact of reverberation and noise.
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								The Echo mics are arranged in a hexagonal layout with one microphone at each vertex and one in the center. The delay between each microphone receiving the signal enables the device to identify the source of the voice and cancel out noise coming from other directions. This is a phenomenon known as beamforming.
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								While state-of-the-art speech recognition systems perform reasonably well in close-talking microphone conditions performance degrades in conditions where the microphone is far from the user.
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
 								The audio captured by the Echo will be influenced by:
 ) the speaker’s voice against the wall of the room,
 ) the background noise from outside,
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+) the acoustic echo coming from the device’s loudspeaker,
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
+) the output audio against the wall of the room.
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								## Software
 								The software components within the platform include both Natural Language Understanding (NLU) as well as Automated Speech Recognition (ASR).  These software components can be leveraged by custom written "skills" by independent software developers who are then certified to a set of standards by Amazon. There are already more than 20k of these custom skills available through their app store.
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								## IBM Watson Speech-to-Text API
 								The IBM Watson Speech-to-Text API uses machine learning to accurately predict speech in real time. Currently seven different languages are supported as well as live voice and pre-recorded audio. The API can be used for free; paid versions are also available for larger scale apps.
-												fix: change directory structure

											
										
										
											2018-10-12 15:37:13 -04:00
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								## Siri
-												Add info for Siri Shortcuts (#26414)

* Add info for Siri Shortcuts

Add info for Siri Shortcuts, introduced in iOS's Update 12

* Created list formatting

											
										
										
											2019-01-02 08:25:59 -08:00
+								Apple's iOS 12 update introduced [Siri](https://en.wikipedia.org/wiki/Siri) Shortcuts, which supports the use of third-party applications through Apple's digital voice assistant. Siri Shortcuts allows developers to add shortcuts and personalized phrases to Siri through their applications by, for example, letting the user record a voice phrase for a particular action and adding that phrase to Siri.
-												edit for punctuation and capitalization (#30161)

Removed multiple unnecessary commas, added one necessary comma, added one capital in a title.
											
										
										
											2019-02-14 16:03:39 -05:00
+								## Additional Resources
-												Add info for Siri Shortcuts (#26414)

* Add info for Siri Shortcuts

Add info for Siri Shortcuts, introduced in iOS's Update 12

* Created list formatting

											
										
										
											2019-01-02 08:25:59 -08:00
+								- [Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API)
 								- [Alexa API](https://developer.amazon.com/docs/alexa-voice-service/api-overview.html)
 								- [IBM Watson API](https://www.ibm.com/watson/services/speech-to-text/)
 								- [Sirikit](https://developer.apple.com/documentation/sirikit)