Sunday, November 14, 2021

AWS Polly - Text to Speech Service

Online service (browser-based)

https://console.aws.amazon.com/polly/home/SynthesizeSpeech

(Log on to AWS and go to Polly. Very simple.)


Command Line interface (CLI) service

1. Log on to AWS and create an IAM user

  • add a user named administrator with management console access
  • create an admin group with an administrator access policy (AdministratorAccess) and add the user to the group
  • create an access key ID and secret access key
    sign in: https://console.aws.amazon.com/iam/
    user - security credentials - access keys - create access key
    download the key pair file (.csv) to your local computer
2. Install the AWS CLI: https://awscli.amazonaws.com/AWSCLIV2.msi
  • check after installation: c:\> aws --version
3. Configure the AWS CLI
  • aws configure
    copy/paste the access key ID and secret access key, then enter the default region
4. Start using Polly from the command line. Examples:

aws polly synthesize-speech ^
    --output-format mp3 ^
    --voice-id Joanna ^
    --text "Hello, my name is Joanna. I learned about the W3C on 10/3 of last year." ^
    hello.mp3

hello.mp3

rem load text from a file
aws polly synthesize-speech ^
  --output-format mp3 ^
  --voice-id Brian ^
  --text file://war1.txt ^
  war1.mp3
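
The two synthesize-speech calls above can also be driven from a script. The following is only a minimal Python sketch, assuming the aws CLI is installed, on the PATH, and already configured; the helper name polly_command is made up for illustration and is not part of the AWS CLI.

```python
import subprocess


def polly_command(text, voice_id, outfile, output_format="mp3"):
    """Build the argument list for `aws polly synthesize-speech`.

    `text` is either literal text or a file:// reference such as
    file://war1.txt, exactly as in the command-line examples.
    (polly_command is a hypothetical helper, not an AWS API.)
    """
    return [
        "aws", "polly", "synthesize-speech",
        "--output-format", output_format,
        "--voice-id", voice_id,
        "--text", text,
        outfile,
    ]


if __name__ == "__main__":
    cmd = polly_command("file://war1.txt", "Brian", "war1.mp3")
    print(" ".join(cmd))
    # Uncomment to call the service (needs configured credentials):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids the shell quoting and ^ line continuations needed at the cmd prompt.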

rem save the output to S3 (the endpoint must be in the same region as --region)
aws polly start-speech-synthesis-task ^
  --engine neural ^
  --region us-west-2 ^
  --endpoint-url "https://polly.us-west-2.amazonaws.com/" ^
  --output-format mp3 ^
  --output-s3-bucket-name chris.hare ^
  --voice-id Joanna ^
  --text file://war1.txt
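
Unlike synthesize-speech, start-speech-synthesis-task runs asynchronously: the CLI prints a JSON description of the task, and the MP3 appears in the bucket later. The sketch below is one possible way to script that, under a few assumptions: the helper names are hypothetical, and the JSON field names (SynthesisTask, TaskId, TaskStatus, OutputUri) are what I understand the CLI to return, so check them against your own output.

```python
import json


def start_task_command(text, voice_id, bucket, region,
                       engine="neural", output_format="mp3"):
    """Argument list for `aws polly start-speech-synthesis-task` (hypothetical helper)."""
    return [
        "aws", "polly", "start-speech-synthesis-task",
        "--engine", engine,
        "--region", region,
        "--output-format", output_format,
        "--output-s3-bucket-name", bucket,
        "--voice-id", voice_id,
        "--text", text,
    ]


def task_summary(cli_json):
    """Pull task ID, status and output URI out of the CLI's JSON output.

    Assumes the fields sit under a top-level "SynthesisTask" object.
    """
    task = json.loads(cli_json)["SynthesisTask"]
    return task["TaskId"], task["TaskStatus"], task.get("OutputUri")


def status_command(task_id, region):
    """Argument list for checking on a task that was started earlier."""
    return [
        "aws", "polly", "get-speech-synthesis-task",
        "--task-id", task_id,
        "--region", region,
    ]


if __name__ == "__main__":
    print(" ".join(start_task_command(
        "file://war1.txt", "Joanna", "chris.hare", "us-west-2")))
```

The argument lists can be handed to subprocess.run(..., capture_output=True, text=True), and the captured stdout passed to task_summary.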

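With SSML (Speech Synthesis Markup Language), you can add tags to the text to select a speaking style (news, conversational), insert pauses (break), add emphasis, and so on; custom pronunciations can be supplied through lexicons.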
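
As a small illustration of the SSML idea, here is a Python sketch that builds a document with a pause and an emphasized sentence and checks that it is well-formed XML. The function name build_ssml and the sample sentences are made up for illustration; the break and emphasis tags are standard SSML, though as far as I know emphasis only takes effect with standard (non-neural) voices.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_ssml(first, second, pause="500ms"):
    """Return SSML: first sentence, a pause, then an emphasized second sentence.

    build_ssml is a hypothetical helper; the text is XML-escaped so that
    characters such as & and < do not break the markup.
    """
    ssml = (
        "<speak>"
        f"{escape(first)}"
        f'<break time="{pause}"/>'
        f'<emphasis level="strong">{escape(second)}</emphasis>'
        "</speak>"
    )
    ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
    return ssml


if __name__ == "__main__":
    print(build_ssml("Hello, my name is Joanna.",
                     "I learned about the W3C last year."))
```

Save the printed markup to a file (say hello.ssml, a name chosen here) and pass it to synthesize-speech with --text-type ssml --text file://hello.ssml.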