Last Updated on March 4, 2026 by Laura Coronel
The way people interact with computers continues to evolve. Touch, gestures, predictive text, and voice input are now everyday parts of using phones, laptops, and smart devices. For many users, speaking is often faster and more natural than typing, especially on mobile devices or in accessibility-focused interfaces.
Modern browsers provide APIs that allow web applications to accept voice input and convert speech into text. While early implementations were limited and browser-specific, today’s approaches are more standardized, permission-based, and flexible.
In this article, you’ll learn how voice input works on the modern web, when it makes sense to use it, and how to implement speech-to-text functionality in a way that degrades gracefully across browsers.
A Brief Note on Older Approaches
If you’ve worked with web speech features in the past, you may remember browser-specific attributes like x-webkit-speech. These early experiments made it possible to dictate text directly into form fields, but they were never standardized and are no longer supported.
Today, voice input is handled through the Web Speech API, which provides a programmatic way to capture speech and convert it into text. Rather than being tied to a single browser or HTML attribute, this approach gives developers more control and better long-term compatibility.
How Voice Input Works on the Web Today
Modern voice input relies on three core pieces:
- The browser’s Speech Recognition interface
- User permission to access the microphone
- A speech-to-text service provided by the browser vendor
When voice recognition is active, the browser records audio from the user’s microphone and sends it to a speech recognition service. The resulting text is then returned to your application, where you can insert it into a form field or process it further.
Because this relies on external services, an internet connection is typically required.
Using the Web Speech API for Speech Recognition
The Web Speech API exposes a SpeechRecognition interface (often prefixed in some browsers). Before using it, you should always check for browser support.
Checking for Browser Support
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;

if (!SpeechRecognition) {
  console.log("Speech recognition is not supported in this browser.");
}
If the API is not available, your application should fall back to standard text input.
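One way to sketch that fallback: wrap the detection in a small helper and hide the voice trigger when it fails, leaving the plain text input as the only path. The `speechRecognitionSupported` helper and the `start-voice` element id below are illustrative names, not part of any API.

```javascript
// Illustrative helper: returns true if a SpeechRecognition constructor
// (standard or webkit-prefixed) exists on the given global object.
function speechRecognitionSupported(globalObj) {
  return Boolean(globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition);
}

// In the browser, hide the hypothetical voice button when unsupported;
// the regular text input keeps working either way.
if (typeof window !== "undefined" && !speechRecognitionSupported(window)) {
  const startButton = document.getElementById("start-voice");
  if (startButton) startButton.hidden = true;
}
```

Keeping the check in a helper also makes the detection logic easy to test without a browser.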
Creating a Speech Recognition Instance
const recognition = new SpeechRecognition();
recognition.lang = "en-US";
recognition.interimResults = false;
recognition.continuous = false;
- lang sets the recognition language
- interimResults controls whether partial results are returned
- continuous determines whether recognition stops after one phrase
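When interimResults is enabled, each result event carries the full list of results recognized so far, and each result's first alternative holds its transcript. A small sketch of joining them into one string (the `collectTranscript` name is ours, not part of the API):

```javascript
// Joins the top alternative of each recognition result into one string.
// `results` mirrors the shape of a SpeechRecognitionResultList: an
// iterable of results, each indexable, with `.transcript` on alternatives.
function collectTranscript(results) {
  let text = "";
  for (const result of results) {
    text += result[0].transcript;
  }
  return text;
}
```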
Starting and Stopping Voice Input
Speech recognition should be triggered by a user action, such as clicking a button, so the browser's microphone permission prompt appears in a context the user understands.
startButton.addEventListener("click", () => {
  recognition.start();
});
To stop recognition:
recognition.stop();
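In practice, a single button often toggles between the two. One way to sketch that pattern (the `listening` flag and `toggleRecognition` name are ours, not part of the API):

```javascript
// Tracks whether recognition is currently running so one button can
// toggle it. Returns the new state for convenience.
let listening = false;

function toggleRecognition(recognition) {
  if (listening) {
    recognition.stop();
  } else {
    recognition.start();
  }
  listening = !listening;
  return listening;
}
```

In a real app you would also reset the flag in the recognition's end event, since recognition can stop on its own after a pause in speech.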
Handling Recognition Results
When speech is successfully recognized, the API emits a result event.
recognition.addEventListener("result", (event) => {
  const transcript = event.results[0][0].transcript;
  inputField.value = transcript;
});
This allows you to populate a text input, textarea, or other form field with the spoken text.
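With continuous recognition, event.results accumulates across the whole session, event.resultIndex marks where the newest results begin, and each result exposes an isFinal flag. A sketch of extracting only newly finalized text, so interim guesses are never committed to the field (the `newFinalText` name is ours):

```javascript
// Collects only results that are both new (at or after resultIndex)
// and final, skipping interim guesses that may still change.
function newFinalText(event) {
  let text = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    if (event.results[i].isFinal) {
      text += event.results[i][0].transcript;
    }
  }
  return text;
}
```

Inside the result handler you would then append this text to the field rather than overwriting its value.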
Handling Errors and Edge Cases
Speech recognition can fail for a variety of reasons: microphone access denied, network issues, or unclear speech.
recognition.addEventListener("error", (event) => {
  console.error("Speech recognition error:", event.error);
});
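The error property holds a short code defined by the spec, such as "no-speech", "audio-capture", "not-allowed", or "network". Mapping these to user-facing messages is worth the small effort; a sketch (the wording and `describeSpeechError` name are ours):

```javascript
// Maps Web Speech API error codes to friendlier messages; unknown
// codes fall back to a generic prompt to type instead.
function describeSpeechError(code) {
  const messages = {
    "no-speech": "No speech was detected. Please try again.",
    "audio-capture": "No microphone was found, or it is unavailable.",
    "not-allowed": "Microphone access was denied.",
    "network": "A network error interrupted speech recognition.",
  };
  return messages[code] || "Speech recognition failed. Please type your text instead.";
}
```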
Always assume speech input may not work and design your forms so typing remains the primary interaction.
Accessibility and UX Considerations
Voice input should enhance—not replace—traditional form interactions.
Good practices include:
- Making voice input optional
- Clearly indicating when recording is active
- Providing visual feedback during recognition
- Respecting user privacy and permissions
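A simple way to indicate when recording is active is to swap the trigger's label when the recognition object fires its start and end events. A sketch, assuming the `recognition` and `startButton` variables from the earlier snippets exist; the label text is illustrative:

```javascript
// Returns the label a voice button should show for a given state.
function recordingLabel(isActive) {
  return isActive ? "Listening…" : "Start voice input";
}

// Browser-only wiring; guarded so the helper above stays testable.
if (typeof window !== "undefined" && typeof recognition !== "undefined") {
  recognition.addEventListener("start", () => {
    startButton.textContent = recordingLabel(true);
  });
  recognition.addEventListener("end", () => {
    startButton.textContent = recordingLabel(false);
  });
}
```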
Voice input is especially valuable for:
- Mobile users
- Hands-free workflows
- Accessibility use cases
- Long-form or repetitive text entry
Browser Support and Limitations
Speech recognition support varies by browser and platform. Some browsers offer full support, others partial, and some none at all. This is why feature detection and graceful fallback are essential.
Rather than designing your application around voice input, treat it as an enhancement layered on top of standard form controls.
Summary
Voice input on the web has matured significantly since its early experiments. While older browser-specific solutions are no longer viable, the Web Speech API provides a modern, flexible way to capture spoken input and convert it into text.
By using feature detection, handling permissions responsibly, and designing with accessibility in mind, you can add voice input to your forms in a way that feels natural, useful, and future-friendly.
Voice won’t replace keyboards anytime soon, but in the right context it can make web interfaces faster, more inclusive, and easier to use.
