How to use Google's Text-to-Speech API in Python

3 Answers

达令说

Configure your Python application with the JSON key file and install the client library:

1. Create a service account.
2. Create a service account key for that service account.
3. The JSON key file is downloaded; save it somewhere secure.
4. Include the Google application credentials in your Python application.
5. Install the library: pip install --upgrade google-cloud-texttospeech

Use Google's Python example, found at https://cloud.google.com/text-to-speech/docs/reference/libraries and https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/texttospeech/cloud-client/quickstart.py

Note: Google's example does not include the name parameter.

Below is the example modified to set the Google application credentials and to use a female WaveNet voice.

    import os

    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/home/yourproject-12345.json"

    from google.cloud import texttospeech

    # Instantiates a client
    client = texttospeech.TextToSpeechClient()

    # Set the text input to be synthesized
    synthesis_input = texttospeech.types.SynthesisInput(text="Do no evil!")

    # Build the voice request: select the language code ("en-US"),
    # ****** the NAME, and the SSML voice gender (FEMALE here)
    voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-US',
        name='en-US-Wavenet-C',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)

    # Select the type of audio file you want returned
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)

    # Perform the text-to-speech request on the text input with the selected
    # voice parameters and audio file type
    response = client.synthesize_speech(synthesis_input, voice, audio_config)

    # The response's audio_content is binary.
    with open('output.mp3', 'wb') as out:
        # Write the response to the output file.
        out.write(response.audio_content)
        print('Audio content written to file "output.mp3"')

The list of voices, with names, language codes, and SSML genders, is at https://cloud.google.com/text-to-speech/docs/voices

In the code above, I changed the voice from Google's sample code to include the name parameter, to use a WaveNet voice (much better, but more expensive at $16 per million characters), and to set the SSML gender to FEMALE:

    voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-US',
        name='en-US-Wavenet-C',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)
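A note for readers on newer releases: as far as I know, google-cloud-texttospeech 2.0 and later dropped the texttospeech.types and texttospeech.enums namespaces and expect keyword arguments to synthesize_speech. Below is a minimal sketch of the same request under that assumption. The function name synthesize_to_file and its parameters are made up for illustration, and the credentials are assumed to be supplied through GOOGLE_APPLICATION_CREDENTIALS as described above.

```python
def synthesize_to_file(text, out_path="output.mp3",
                       voice_name="en-US-Wavenet-C", language_code="en-US"):
    """Hypothetical helper: synthesize `text` and write the MP3 to `out_path`.

    Assumes google-cloud-texttospeech >= 2.0 is installed and that
    GOOGLE_APPLICATION_CREDENTIALS points at a service account key file.
    """
    # Imported lazily so that defining the helper does not require the library.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()

    # In 2.x the request types live directly on the texttospeech module.
    synthesis_input = texttospeech.SynthesisInput(text=text)
    voice = texttospeech.VoiceSelectionParams(
        language_code=language_code,
        name=voice_name,
        ssml_gender=texttospeech.SsmlVoiceGender.FEMALE,
    )
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )

    # 2.x expects keyword arguments rather than positional ones.
    response = client.synthesize_speech(
        input=synthesis_input, voice=voice, audio_config=audio_config
    )

    # audio_content is raw bytes, so open the file in binary mode.
    with open(out_path, "wb") as out:
        out.write(response.audio_content)
    return out_path
```

With valid credentials in place, calling synthesize_to_file("Do no evil!") should write output.mp3 to the current directory.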

慕妹3146593

If you want to avoid the Google Python API, you can simply do the following:

    import requests
    import json

    url = "https://texttospeech.googleapis.com/v1beta1/text:synthesize"

    text = "This is a text"

    data = {
        "input": {"text": text},
        "voice": {"name": "fr-FR-Wavenet-A", "languageCode": "fr-FR"},
        "audioConfig": {"audioEncoding": "MP3"}
    }

    headers = {"content-type": "application/json", "X-Goog-Api-Key": "YOUR_API_KEY"}

    r = requests.post(url=url, json=data, headers=headers)
    content = json.loads(r.content)

It is similar to what you did, but you need to include your API key.
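The JSON that comes back is not the audio itself. As far as I recall from the REST reference, the MP3 bytes sit in an audioContent field, base64-encoded, so one more step is needed to get a playable file. Here is a minimal sketch using only the standard library. The name save_audio_content is made up, and the fake response at the bottom exists only so the snippet runs without a network call.

```python
import base64
import os
import tempfile


def save_audio_content(response_json, out_path):
    """Decode the base64 'audioContent' field and write it to out_path.

    Assumption: response_json is the parsed body of a successful
    text:synthesize call, i.e. a dict with an 'audioContent' key.
    Returns the number of bytes written.
    """
    if "audioContent" not in response_json:
        # Failed calls typically return an 'error' object instead of audio.
        raise ValueError("no audioContent in response: %r" % (response_json,))
    audio_bytes = base64.b64decode(response_json["audioContent"])
    with open(out_path, "wb") as out:
        out.write(audio_bytes)
    return len(audio_bytes)


# Stand-in for `content` from the request above (placeholder bytes, not real audio).
fake_content = {"audioContent": base64.b64encode(b"not really mp3").decode("ascii")}
demo_path = os.path.join(tempfile.gettempdir(), "tts_demo_output.mp3")
bytes_written = save_audio_content(fake_content, demo_path)
```

In the real flow you would pass the parsed response instead, for example save_audio_content(content, "output.mp3").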

拉丁的传说

Found the answer, and lost the link somewhere among the 150 Google documentation pages I had open.

    # (Since I'm using a Jupyter Notebook)
    import os
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/Path/to/JSON/file/jsonfile.json"

    from google.cloud import texttospeech

    # Instantiates a client
    client = texttospeech.TextToSpeechClient()

    # Set the text input to be synthesized
    synthesis_input = texttospeech.types.SynthesisInput(text="Hello, World!")

    # Build the voice request, select the language code ("en-US") and the ssml
    # voice gender ("neutral")
    voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-US',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.NEUTRAL)

    # Select the type of audio file you want returned
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)

    # Perform the text-to-speech request on the text input with the selected
    # voice parameters and audio file type
    response = client.synthesize_speech(synthesis_input, voice, audio_config)

    # The response's audio_content is binary.
    with open('output.mp3', 'wb') as out:
        # Write the response to the output file.
        out.write(response.audio_content)
        print('Audio content written to file "output.mp3"')

My time-consuming quest was trying to send the request as JSON from Python, but the API turns out to have its own module, which works fine. Note that the default voice gender is "neutral".
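For anyone who still wants to go the JSON route: my understanding is that the REST body mirrors the three objects built above, with camelCase field names (languageCode, ssmlGender, audioEncoding). Here is a rough sketch of that mapping, standard library only. build_synthesize_body is a hypothetical helper, not part of any Google library.

```python
import json


def build_synthesize_body(text, language_code="en-US",
                          ssml_gender="NEUTRAL", voice_name=None,
                          audio_encoding="MP3"):
    """Build the JSON body for a text:synthesize request as a plain dict."""
    voice = {"languageCode": language_code, "ssmlGender": ssml_gender}
    if voice_name is not None:
        # Only include a specific voice name when the caller asks for one.
        voice["name"] = voice_name
    return {
        "input": {"text": text},
        "voice": voice,
        "audioConfig": {"audioEncoding": audio_encoding},
    }


# Same settings as the client-library call above: en-US, neutral gender, MP3.
body = build_synthesize_body("Hello, World!")
print(json.dumps(body, indent=2))
```

The resulting dict can be sent with any HTTP client as the request body. Authentication (an API key or a service account token) still has to be supplied separately.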