alicloud-ai-audio-tts

v1.0.0

About this Skill

For AI agents that need text-to-speech from Alibaba Cloud services. Part of the alicloud-skills collection, which covers Alibaba Cloud, Qwen, Wan, and related skills.

Author: cinience
Updated: 3/5/2026

Agent Capability Analysis

alicloud-ai-audio-tts is an open-source community skill by cinience for Claude Code and other IDE-based agents. It gives an agent a documented, repeatable procedure for generating speech with Alibaba Cloud Model Studio's Qwen TTS models.

Core Value

Lets agents generate speech audio with Alibaba Cloud Model Studio's Qwen TTS models through the DashScope Python SDK. The skill ships a generate_tts.py script, uses Python's standard py_compile module to check that the script compiles, and saves audio links, sample audio files, and request payloads under a fixed output directory.

Capabilities Granted for alicloud-ai-audio-tts

  • Generating audio files for voice assistants
  • Converting text-based content to audio for accessibility
  • Creating audio samples for marketing campaigns

Prerequisites & Limits

  • Requires an Alibaba Cloud account and a DashScope API key (DASHSCOPE_API_KEY)
  • Requires Python 3
  • Script validation uses py_compile from the Python standard library, which is a compile check only

alicloud-ai-audio-tts

alicloud-ai-audio-tts is an agent skill for Claude Code, Cursor, and Windsurf that installs with a single command.

SKILL.md

Category: provider

Model Studio Qwen TTS

Validation

bash
mkdir -p output/alicloud-ai-audio-tts
python -m py_compile skills/ai/audio/alicloud-ai-audio-tts/scripts/generate_tts.py && echo "py_compile_ok" > output/alicloud-ai-audio-tts/validate.txt

Pass criteria: command exits 0 and output/alicloud-ai-audio-tts/validate.txt is generated.
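
The same check can be scripted in Python. The following is a minimal sketch, not the skill's own tooling: validate_script is a hypothetical helper, and the paths in the usage comment are the ones from the command above. Note that py_compile only confirms the script compiles; it does not call the TTS service.

```python
import py_compile
from pathlib import Path


def validate_script(script: Path, out_dir: Path) -> bool:
    """Byte-compile `script`; write validate.txt to `out_dir` only on success."""
    out_dir.mkdir(parents=True, exist_ok=True)
    try:
        # doraise=True raises PyCompileError on a syntax error instead of
        # only printing the error to stderr.
        py_compile.compile(str(script), doraise=True)
    except (py_compile.PyCompileError, OSError):
        # OSError covers a missing or unreadable script file.
        return False
    (out_dir / "validate.txt").write_text("py_compile_ok\n")
    return True


# Example usage with the paths from the shell command:
# ok = validate_script(
#     Path("skills/ai/audio/alicloud-ai-audio-tts/scripts/generate_tts.py"),
#     Path("output/alicloud-ai-audio-tts"),
# )
```

A False return corresponds to the shell command exiting non-zero, in which case validate.txt is not written.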

Output And Evidence

  • Save generated audio links, sample audio files, and request payloads to output/alicloud-ai-audio-tts/.
  • Keep one validation log per execution.
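
One way to keep this evidence is a small JSON record per call. This is a sketch under assumptions: save_evidence, the tts_<timestamp>.json file name, and the record fields are all hypothetical and are not defined by the skill.

```python
import json
import time
from pathlib import Path


def save_evidence(out_dir: Path, payload: dict, audio_url: str) -> Path:
    """Write one JSON evidence file for a TTS call and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Do not persist credentials alongside the request payload.
    safe_payload = {k: v for k, v in payload.items() if k != "api_key"}
    record = {
        "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "request": safe_payload,
        "audio_url": audio_url,
    }
    # A nanosecond timestamp keeps file names distinct between executions.
    path = out_dir / f"tts_{time.time_ns()}.json"
    path.write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```

Calling save_evidence(Path("output/alicloud-ai-audio-tts"), payload, audio_url) after each request would leave one file per execution in the directory named above.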

Critical model names

Use one of the recommended models:

  • qwen3-tts-flash
  • qwen3-tts-instruct-flash
  • qwen3-tts-instruct-flash-2026-01-26

Prerequisites

  • Install the SDK (a venv is recommended, because PEP 668 blocks pip installs into an externally managed system Python):
bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
  • Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials (env takes precedence).
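
The lookup order described above can be sketched as follows. resolve_api_key is a hypothetical helper, and the credentials file is assumed to be INI-formatted with dashscope_api_key under a [default] section; the skill's own script may parse it differently.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the DashScope API key, preferring the environment variable."""
    env_key = os.environ.get("DASHSCOPE_API_KEY")
    if env_key:
        return env_key
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    parser = configparser.ConfigParser()
    # read() silently skips files that do not exist.
    parser.read(path, encoding="utf-8")
    # Assumption: INI layout with the key stored under [default].
    return parser.get("default", "dashscope_api_key", fallback=None)
```

The function returns None when neither source provides a key, so a caller can fail early with a clear message instead of sending an unauthenticated request.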

Normalized interface (tts.generate)

Request

  • text (string, required)
  • voice (string, required)
  • language_type (string, optional; default Auto)
  • instruction (string, optional; recommended for instruct models)
  • stream (bool, optional; default false)

Response

  • audio_url (string, when stream=false)
  • audio_base64_pcm (string, when stream=true)
  • sample_rate (int, 24000)
  • format (string, wav or pcm depending on mode)
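
As an illustration, the request and response fields listed above could be modeled with dataclasses. The class names TTSRequest and TTSResponse and the to_kwargs method are hypothetical; only the field names, types, and defaults come from the lists above.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TTSRequest:
    text: str                           # required
    voice: str                          # required
    language_type: str = "Auto"
    instruction: Optional[str] = None   # recommended for instruct models
    stream: bool = False

    def __post_init__(self) -> None:
        if not self.text.strip():
            raise ValueError("text is required")
        if not self.voice.strip():
            raise ValueError("voice is required")

    def to_kwargs(self) -> dict:
        """Drop unset optional fields so they are not sent as None."""
        return {k: v for k, v in asdict(self).items() if v is not None}


@dataclass
class TTSResponse:
    audio_url: Optional[str] = None         # set when stream=False
    audio_base64_pcm: Optional[str] = None  # set when stream=True
    sample_rate: int = 24000
    format: str = "wav"                     # "wav" or "pcm" depending on mode
```

Validating in __post_init__ rejects an empty text or voice before any request is made.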

Quick start (Python + DashScope SDK)

python
import os
import dashscope

# Prefer env var for auth: export DASHSCOPE_API_KEY=...
# Or use ~/.alibabacloud/credentials with dashscope_api_key under [default].
# Beijing region; for Singapore use: https://dashscope-intl.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"

text = "Hello, this is a short voice line."
response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-instruct-flash",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    language_type="English",
    instruction="Warm and calm tone, slightly slower pace.",
    stream=False,
)

audio_url = response.output.audio.url
print(audio_url)

Streaming notes

  • stream=True returns Base64-encoded PCM chunks at 24 kHz.
  • Base64-decode each chunk, then play it or append it to a PCM buffer.
  • The response contains finish_reason == "stop" when the stream ends.
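
The decode-and-concatenate step can be sketched with the standard library alone. pcm_chunks_to_wav is a hypothetical helper, and the 16-bit mono sample layout is an assumption; this section only states the 24 kHz sample rate.

```python
import base64
import wave
from typing import Iterable


def pcm_chunks_to_wav(chunks_b64: Iterable[str], wav_path: str,
                      sample_rate: int = 24000) -> int:
    """Decode Base64 PCM chunks into one WAV file; return the PCM byte count."""
    pcm = bytearray()
    for chunk in chunks_b64:
        if chunk:  # skip empty chunks
            pcm.extend(base64.b64decode(chunk))
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(1)   # assumption: mono
        wav.setsampwidth(2)   # assumption: 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(pcm))
    return len(pcm)
```

In a real stream, each Base64 string would come from the chunks returned when stream=True. The attribute that holds it is not shown in this section, so extracting it is left to the caller.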

Operational guidance

  • Keep requests concise; split long text into multiple calls if you hit size or timeout errors.
  • Use language_type consistent with the text to improve pronunciation.
  • Use instruction only when you need explicit style/tone control.
  • Cache by (text, voice, language_type) to avoid repeat costs; include the model and instruction in the key if they vary between calls.
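
Two of these points lend themselves to small helpers. This is a sketch with hypothetical names: split_text and cache_key are not part of the skill, max_chars=300 is an arbitrary placeholder rather than a documented service limit, and the splitter only recognizes sentences ending in ".", "!", or "?" followed by whitespace.

```python
import hashlib
import json
import re
from typing import List


def split_text(text: str, max_chars: int = 300) -> List[str]:
    """Group sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than
    being cut mid-word.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in filter(None, sentences):
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def cache_key(text: str, voice: str, language_type: str = "Auto") -> str:
    """Stable SHA-256 key for the (text, voice, language_type) tuple."""
    # JSON-encoding the list keeps field boundaries unambiguous.
    raw = json.dumps([text, voice, language_type], ensure_ascii=False)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Each chunk from split_text would be sent as its own request, and cache_key could name the stored audio file so that an identical request is served from disk. If the model or instruction changes between calls, they would need to be added to the key as well.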

Output location

  • Default output: output/alicloud-ai-audio-tts/audio/
  • Override base dir with OUTPUT_DIR.
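
A sketch of how the override might be resolved. resolve_audio_dir is a hypothetical helper, and it assumes OUTPUT_DIR replaces the leading output directory while the alicloud-ai-audio-tts/audio suffix stays fixed; the skill's script may interpret the variable differently.

```python
import os
from pathlib import Path


def resolve_audio_dir(create: bool = True) -> Path:
    """Return the audio output directory, honoring OUTPUT_DIR if it is set."""
    # Assumption: OUTPUT_DIR replaces only the leading "output" base directory.
    base = Path(os.environ.get("OUTPUT_DIR") or "output")
    audio_dir = base / "alicloud-ai-audio-tts" / "audio"
    if create:
        audio_dir.mkdir(parents=True, exist_ok=True)
    return audio_dir
```

With OUTPUT_DIR unset this resolves to output/alicloud-ai-audio-tts/audio, matching the default above.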

Workflow

  1. Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
  2. Run one minimal read-only query first to verify connectivity and permissions.
  3. Execute the target operation with explicit parameters and bounded scope.
  4. Verify results and save output/evidence files.

References

  • references/api_reference.md for parameter mapping and streaming example.

  • Realtime mode is provided by skills/ai/audio/alicloud-ai-audio-tts-realtime/.

  • Voice cloning/design are provided by skills/ai/audio/alicloud-ai-audio-tts-voice-clone/ and skills/ai/audio/alicloud-ai-audio-tts-voice-design/.

  • Source list: references/sources.md

FAQ & Installation Steps

Frequently Asked Questions

How do I install alicloud-ai-audio-tts?

Run the command: npx killer-skills add cinience/alicloud-skills/alicloud-ai-audio-tts. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

Which IDEs are compatible with alicloud-ai-audio-tts?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

How To Install

  1. Open your terminal

    Open the terminal or command line in your project directory.

  2. Run the install command

    Run: npx killer-skills add cinience/alicloud-skills/alicloud-ai-audio-tts. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. Start using the skill

    The skill is now active. Your AI agent can use alicloud-ai-audio-tts immediately in the current project.
