Speech-to-text in the browser can be implemented in about 10 lines of JavaScript using the Web Speech API.
Docs: https://developers.google.com/web/updates/2013/01/Voice-Driven-Web-Apps-Introduction-to-the-Web-Speech-API

Simple example: https://codepen.io/renanpupin/pen/yVpBRq/
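
To give a rough idea of what those few lines look like, here is a minimal sketch. The function name `startDictation`, the `onText` callback and the `scope` parameter are my own choices for illustration, not part of the API; `scope` exists only so the function can be tried outside a browser. The language is assumed to be `en-US`.

```javascript
// Minimal sketch: one-shot speech-to-text with the Web Speech API.
// `startDictation`, `onText` and `scope` are hypothetical names for this example.
function startDictation(onText, scope = globalThis) {
  // Chrome exposes the constructor with a webkit prefix.
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  if (!Recognition) {
    return null; // API not available in this environment
  }

  const recognition = new Recognition();
  recognition.lang = 'en-US';          // assumed language
  recognition.interimResults = false;  // only deliver final results

  recognition.onresult = (event) => {
    // Top alternative of the first result is the recognizer's best guess.
    onText(event.results[0][0].transcript);
  };
  recognition.onerror = (event) => {
    console.error('Speech recognition error:', event.error);
  };

  recognition.start(); // the browser asks for microphone permission here
  return recognition;
}
```

In a page you would call something like `startDictation((text) => console.log(text))` from a button click handler. The function returns `null` when the browser does not support the API, so you can fall back to a plain text input.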
You can then send the recognized text to the server, which processes it and sends back a result.
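
Sending the text could look roughly like this. It is only a sketch: the `/api/speech` endpoint, the `{ text }` JSON payload and the assumption that the server replies with JSON are all made up for illustration, so adjust them to whatever your backend expects. The `fetchImpl` parameter is just there so the function can be exercised without a real server.

```javascript
// Hypothetical endpoint; replace with the route your server actually exposes.
const SPEECH_ENDPOINT = '/api/speech';

// POST the recognized text as JSON and return the parsed JSON reply.
// `sendTranscript` and `fetchImpl` are names chosen for this example.
async function sendTranscript(text, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(SPEECH_ENDPOINT, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  if (!response.ok) {
    throw new Error(`Server responded with status ${response.status}`);
  }
  return response.json(); // assumes the server answers with JSON
}
```

Usage would be something like `sendTranscript(text).then((result) => showResult(result))`, where `showResult` stands for whatever your page does with the server's answer.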