Fish Tts

by gtank1 · GitHub ↗ · v1.0.0
cross-platform ⚠ suspicious
704 Downloads · 1 Star · 2 Active Installs · 1 Version
Install in OpenClaw
/install fish-tts
Description
Generate high-quality speech from text using Fish Audio S1 and optionally upload the MP3 audio file to NextCloud via WebDAV.
README (SKILL.md)

Fish Audio S1 TTS Skill

Overview

This skill uses Fish Audio S1 to generate high-quality text-to-speech audio and, optionally, upload the resulting MP3 to NextCloud over WebDAV.

Requirements

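  • Fish Audio S1 service running and reachable, for example at http://localhost:7860 (the examples below use the LAN address http://192.168.68.78:7860; substitute your own)
  • NextCloud credentials configured in the NEXTCLOUD_USER and NEXTCLOUD_PASS environment variables
  • WebDAV access to NextCloud for uploads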
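
The requirements above can be checked before any request is sent. The following is a minimal sketch, not part of the skill itself: require_env is a hypothetical helper name, and it only checks that each named variable is set and non-empty.

```shell
# Hypothetical helper: report every required variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise. Variable names are expected to come
# from the script itself, never from user input, because eval is used.
require_env() {
    local name value missing
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A caller could run `require_env NEXTCLOUD_USER NEXTCLOUD_PASS NEXTCLOUD_URL FISH_AUDIO_S1_URL || exit 1` at the top of a script.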

Usage

Generate speech from text:

curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"fish", "text":"Hello from Fish Audio S1!", "voice":"em_michael"}' \
  -o /tmp/fish_audio.mp3

Upload to NextCloud:

curl -s -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" \
  -X PUT -T /tmp/fish_audio.mp3 \
  "http://192.168.68.68:8080/remote.php/webdav/Openclaw/fish_audio.mp3"

Configuration

Set these environment variables if not already set:

export NEXTCLOUD_USER="openclaw"
export NEXTCLOUD_PASS="N95qg-Wzdpc-6DJAn-xMaHa-RaEW5"
export NEXTCLOUD_URL="http://192.168.68.68:8080"
export FISH_AUDIO_S1_URL="http://192.168.68.78:7860"
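
With these variables set, the endpoint URLs used throughout this document can be derived instead of typed out with fixed IP addresses. This is a sketch under that assumption; speech_endpoint and webdav_url are hypothetical helper names, and the paths are the ones shown in the Usage section.

```shell
# Hypothetical helpers: build endpoint URLs from the configured base URLs.
# A trailing slash on the base URL and a leading slash on the path are tolerated.
speech_endpoint() {
    printf '%s/v1/audio/speech\n' "${FISH_AUDIO_S1_URL%/}"
}

# $1 is a path below the WebDAV root, e.g. Openclaw/fish_audio.mp3
# Note: the path is not percent-encoded, so avoid spaces and special characters.
webdav_url() {
    printf '%s/remote.php/webdav/%s\n' "${NEXTCLOUD_URL%/}" "${1#/}"
}
```

For example, `curl -s -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" -T /tmp/fish_audio.mp3 "$(webdav_url Openclaw/fish_audio.mp3)"` would target the same location as the upload command in the Usage section.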

Available Voices

Fish Audio S1 provides many high-quality voices. Common options:

Professional Male Voices

  • em_michael - Authoritative, business
  • em_pierre - French, professional
  • em_marcus - German, confident

Professional Female Voices

  • af_bella - Warm, natural
  • af_nicole - Clear, articulate
  • af_rachel - Friendly, conversational

Emotional Voices

  • em_alex - Expressive male (warm tone, wide range)
  • af_sarah - Friendly, youthful

Voices by Language

  • French: em_pierre
  • German: em_marcus
  • British: af_alice, af_emma

Advanced Features

Voice Selection

  • Choose voice based on content type (professional vs emotional)
  • Auto-detect content language (though Fish Audio S1 is primarily English)

Emotion Control

  • Add emotion tags to input text: [happy], [sad], [excited]
  • Example: Hello! [happy] I am so happy to meet you today.
  • Fish Audio S1 will apply appropriate prosody automatically
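
A small wrapper can keep emotion tags consistent. The sketch below assumes only the three tags listed above are valid and that the tag goes at the start of the text; with_emotion is a hypothetical helper, and the service may accept other tags or placements.

```shell
# Hypothetical helper: prefix text with one of the emotion tags listed above.
# Rejects any tag that is not happy, sad, or excited.
with_emotion() {
    case "$1" in
        happy|sad|excited)
            printf '[%s] %s\n' "$1" "$2"
            ;;
        *)
            echo "Unknown emotion tag: $1" >&2
            return 1
            ;;
    esac
}
```

Calling `with_emotion happy "I am so happy to meet you today."` yields text that can be passed as the "text" field of a request.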

Quality Settings

  • High quality - Default (best natural speech)
  • Fast generation - Prioritize speed over quality for testing
  • Standard quality - Good balance of speed and quality

API Endpoints

Generate Audio

POST http://192.168.68.78:7860/v1/audio/speech

Request Format:

{
  "model": "fish",
  "text": "Your text here",
  "voice": "Voice name from list above",
  "output": "output file path or 'upload to NextCloud'"
}
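
When the text comes from a variable, quotes or backslashes in it will break a hand-written JSON body. The following sketch builds the request body with basic escaping. json_escape and build_speech_payload are hypothetical helpers; only backslashes, double quotes, and newlines are escaped, so other control characters would need a real JSON tool.

```shell
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Handles backslashes, double quotes, and newlines only.
json_escape() {
    printf '%s' "$1" \
        | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
        | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Build the request body shown above; the voice defaults to em_michael.
build_speech_payload() {
    printf '{"model":"fish","text":"%s","voice":"%s"}\n' \
        "$(json_escape "$1")" "$(json_escape "${2:-em_michael}")"
}
```

The result can be passed to curl with `-d "$(build_speech_payload "$text" "$voice")"`.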

Upload to NextCloud

PUT http://192.168.68.68:8080/remote.php/webdav/Openclaw/path/to/file.mp3

Headers:

  • Authorization: Basic <base64_credentials>
  • Content-Type: audio/mpeg
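
The Authorization value is the Base64 encoding of user:password. curl builds this header itself when given -u, so the sketch below is only useful for clients that need the header spelled out. basic_auth_header is a hypothetical helper and assumes the base64 utility is installed.

```shell
# Hypothetical helper: print the Basic auth header for a user and password.
# tr removes the line wrapping that base64 adds to long output.
basic_auth_header() {
    printf 'Authorization: Basic %s\n' \
        "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}
```

For example, `curl -H "$(basic_auth_header "$NEXTCLOUD_USER" "$NEXTCLOUD_PASS")" ...` sends the same credentials as `curl -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" ...`. Base64 is an encoding, not encryption, so over plain http the credentials are readable on the network.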

Implementation Notes

Error Handling

  • Check if Fish Audio S1 service is running before generating
  • Validate NextCloud credentials are configured
  • Gracefully handle connection errors
  • Provide meaningful error messages
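
One way to produce meaningful error messages is to capture the HTTP status with curl's -w '%{http_code}' option and map it to a hint. The mapping below is a sketch with assumed wording; explain_http_status is a hypothetical helper, and 000 is what curl reports when no HTTP response was received.

```shell
# Hypothetical helper: turn an HTTP status code into a short diagnostic.
explain_http_status() {
    case "$1" in
        000) echo "no response: service unreachable or connection refused" ;;
        2??) echo "ok" ;;
        401|403) echo "authentication failed: check NEXTCLOUD_USER and NEXTCLOUD_PASS" ;;
        404) echo "not found: check the endpoint URL or target path" ;;
        409) echo "conflict: the target folder may not exist" ;;
        507) echo "insufficient storage: NextCloud quota may be exceeded" ;;
        5??) echo "server error: check the service logs" ;;
        *) echo "unexpected HTTP status $1" ;;
    esac
}
```

A typical call captures the status first, for example `status=$(curl -s -o /tmp/fish_audio.mp3 -w '%{http_code}' ...)`, then prints `explain_http_status "$status"`.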

Audio Formats

  • MP3 - Default (widely supported, good compression)
  • WAV - Alternative (lossless, uncompressed)
  • Bitrate - 128 kbps (adequate for speech; well below CD quality)
  • Sample Rate - 24,000 Hz (common for TTS output)
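
Because curl -o saves whatever the server returns, an error body can end up in a file named .mp3. The sketch below inspects the first bytes of a file to guess its format. detect_audio_format is a hypothetical helper; it recognises an ID3 tag or MPEG frame sync as MP3 and a RIFF/WAVE header as WAV, which is a heuristic rather than full validation.

```shell
# Hypothetical helper: guess the audio format from the first 12 bytes of a file.
# Prints mp3, wav, or unknown.
detect_audio_format() {
    local hex
    hex=$(od -An -tx1 -N12 "$1" | tr -d ' \n')
    case "$hex" in
        494433*) echo "mp3" ;;                    # "ID3" tag
        ff[ef]*) echo "mp3" ;;                    # MPEG frame sync bits
        52494646????????57415645*) echo "wav" ;;  # "RIFF" + size + "WAVE"
        *) echo "unknown" ;;
    esac
}
```

Running `detect_audio_format /tmp/fish_audio.mp3` after generation gives a quick signal that the download is audio and not an error message.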

NextCloud Integration

  • WebDAV - Uses WebDAV protocol for file operations
  • Path - /Openclaw/ or custom subfolder
  • Authentication - Basic auth with NEXTCLOUD_USER:NEXTCLOUD_PASS

Troubleshooting

Service Not Responding

# Check if service is running
curl -s http://192.168.68.78:7860/health
# Check whether audio can be generated
curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"fish", "text":"test", "voice":"em_alex"}' \
  -o /tmp/test.mp3

NextCloud Upload Failed

# Test NextCloud connectivity
curl -s -I "http://192.168.68.68:8080"
# Test WebDAV access to the upload folder
curl -s -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" \
  -X PROPFIND -H "Depth: 0" \
  "http://192.168.68.68:8080/remote.php/webdav/Openclaw/"

Alternative TTS Services

If Fish Audio S1 is not available, try:

  • Kokoro TTS - Your existing service at port 8880
  • OpenVoice V2 - Voice cloning service at port 7861

Examples

Example 1: Simple Greeting

curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"fish", "text":"Hello! How are you today?", "voice":"em_michael"}' \
  -o /tmp/greeting.mp3

Example 2: Emotional Speech

curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"fish", "text":"I am so excited to tell you about this amazing opportunity! [excited]", "voice":"af_sarah"}' \
  -o /tmp/excited.mp3

Example 3: Upload to NextCloud

# Generate audio
curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"fish", "text":"This is a test file for NextCloud upload.", "voice":"em_michael"}' \
  -o /tmp/test_file.mp3

# Upload to NextCloud
curl -s -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" \
  -X PUT -T /tmp/test_file.mp3 \
  "http://192.168.68.68:8080/remote.php/webdav/Openclaw/test_file.mp3"

Voice Name Reference

Fish Audio S1 voices referenced in this document (for testing):

  • Professional Male: em_michael, em_pierre, em_marcus
  • Professional Female: af_bella, af_nicole, af_rachel
  • Emotional: em_alex, af_sarah
  • British: af_alice, af_emma
  • Young: af_nova
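
A misspelled voice name is easy to miss, so a check against the list above can catch it before a request is made. This sketch assumes the list above is the set you want to allow; KNOWN_VOICES and is_known_voice are hypothetical names, and the service itself may offer more voices.

```shell
# Voices taken from the reference list above (an assumption, not a service query).
KNOWN_VOICES="em_michael em_pierre em_marcus af_bella af_nicole af_rachel em_alex af_sarah af_alice af_emma af_nova"

# Hypothetical helper: succeed only if $1 is one of the known voice names.
is_known_voice() {
    case " $KNOWN_VOICES " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_known_voice "$voice" || echo "Unknown voice: $voice" >&2` warns about a name such as em_ichael.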

Best Practices

For Consistent Quality

  1. Use the same voice for long content - Creates a cohesive listening experience
  2. Consider audience - Choose professional voices for business, emotional for stories
  3. Test audio before final generation - Verify quality and volume
  4. Keep audio files organized - Use descriptive filenames with dates
  5. Monitor service health - Check endpoint responsiveness regularly

For NextCloud Uploads

  1. Use WebDAV - Efficient file transfer protocol
  2. Organize by date - Create folders like 2026/02/09/ for daily uploads
  3. Set descriptive filenames - Include context in filename (e.g., greeting_em_michael_20260209.mp3)
  4. Test small files first - Upload a 10-second test before large conversations
  5. Monitor storage quota - Ensure you don't exceed NextCloud limits
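
The folder and filename conventions above can be generated rather than typed by hand. The sketch below is one possible layout under the Openclaw folder; remote_audio_path is a hypothetical helper, and the optional third argument (a YYYYMMDD date) exists mainly so the result is reproducible.

```shell
# Hypothetical helper: build a dated remote path such as
# Openclaw/2026/02/09/greeting_em_michael_20260209.mp3
# $1 = label, $2 = voice, $3 = optional date as YYYYMMDD (default: today)
remote_audio_path() {
    local label voice stamp folder
    label="$1"
    voice="$2"
    stamp="${3:-$(date +%Y%m%d)}"
    folder=$(printf '%s\n' "$stamp" | sed 's|^\(....\)\(..\)\(..\)$|\1/\2/\3|')
    printf 'Openclaw/%s/%s_%s_%s.mp3\n' "$folder" "$label" "$voice" "$stamp"
}
```

WebDAV does not create missing parent folders on PUT, so each level of the dated folder typically has to be created first with a MKCOL request (for example `curl -X MKCOL` against the folder URL) before uploading into it.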

Script Template

#!/bin/bash
# Fish Audio S1 TTS Skill

# Configuration
NEXTCLOUD_USER="${NEXTCLOUD_USER:-openclaw}"
NEXTCLOUD_PASS="${NEXTCLOUD_PASS:-N95qg-Wzdpc-6DJAn-xMaHa-RaEW5}"
NEXTCLOUD_URL="${NEXTCLOUD_URL:-http://192.168.68.68:8080}"
FISH_AUDIO_S1_URL="${FISH_AUDIO_S1_URL:-http://192.168.68.78:7860}"

# Functions

# Escape a string for use inside a JSON string literal
# (backslashes, double quotes, and newlines only)
json_escape() {
    local s="$1"
    s=${s//\\/\\\\}
    s=${s//\"/\\\"}
    s=${s//$'\n'/\\n}
    printf '%s' "$s"
}

generate_audio() {
    local text="$1"
    local voice="${2:-em_michael}"
    local output="${3:-upload to NextCloud}"
    local temp_file="/tmp/fish_audio_$$.mp3"
    local payload
    payload="{\"model\":\"fish\",\"text\":\"$(json_escape "$text")\",\"voice\":\"$(json_escape "$voice")\"}"

    # Generate audio (--fail makes curl return non-zero on HTTP errors)
    if ! curl -s --fail -X POST "$FISH_AUDIO_S1_URL/v1/audio/speech" \
        -H "Content-Type: application/json" \
        -d "$payload" \
        -o "$temp_file"; then
        echo "❌ Failed to generate audio" >&2
        return 1
    fi

    # Upload to NextCloud over WebDAV, or print the local file path
    if [ "$output" = "upload to NextCloud" ]; then
        if ! curl -s --fail -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" \
            -X PUT -T "$temp_file" \
            "$NEXTCLOUD_URL/remote.php/webdav/Openclaw/fish_audio_$(date +%Y%m%d_%H%M%S).mp3"; then
            echo "❌ Failed to upload to NextCloud" >&2
            return 1
        fi
    else
        echo "$temp_file"
    fi

    return 0
}

main() {
    # Parse command line arguments
    local action="$1"
    local text="$2"
    local voice="${3:-em_michael}"
    local output="${4:-upload to NextCloud}"
    
    case "$action" in
        generate)
            generate_audio "$text" "$voice" "$output"
            ;;
        upload)
            echo "Upload functionality requires generated audio file"
            return 1
            ;;
        help)
            echo "Usage: $0 generate <text> [voice] [output]"
            echo ""
            echo "Commands:"
            echo "  generate  - Generate audio from text and, by default, upload it to NextCloud"
            echo "  upload    - Not implemented in this template"
            echo "  help      - Show this message"
            echo ""
            echo "Arguments:"
            echo "  <text>   - Text to speak; quote it so it is passed as one argument"
            echo "  [voice]  - Voice name (default: em_michael)"
            echo "  [output] - 'upload to NextCloud' (default); any other value prints the local file path"
            echo ""
            echo "Examples:"
            echo "  $0 generate \"Hello! I am excited to meet you.\""
            echo "  $0 generate \"[happy] This is great news! [excited]\""
            echo "  $0 generate \"This is a professional greeting.\" em_michael"
            ;;
        *)
            echo "Unknown action: $action"
            return 1
            ;;
    esac
}

# Run main function
main "$@"

Version History

  • v1.0 - Initial release (basic TTS generation)
  • v1.1 - Added voice selection and error handling
  • v1.2 - Added NextCloud upload functionality
  • v1.3 - Advanced voice options and best practices

License

MIT License - Free to use, modify, and distribute

Contributing

  1. Fork the repository
  2. Add features for new voices or languages
  3. Improve error handling and fallback mechanisms
  4. Update documentation with new examples
  5. Submit pull requests for bug fixes

Support

For issues or questions:

  1. Check service availability before reporting bugs
  2. Verify NextCloud credentials are correctly configured
  3. Test with different voices to isolate service-specific issues
  4. Review logs for error patterns

Quick Start

Generate Greeting (Testing)

curl -s -X POST http://192.168.68.78:7860/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"fish", "text":"Hello! This is a test of the Fish Audio S1 TTS skill for OpenClaw.", "voice":"em_michael"}' \
  -o /tmp/fish_audio_test.mp3

Upload to NextCloud (Testing)

curl -s -u "$NEXTCLOUD_USER:$NEXTCLOUD_PASS" \
  -X PUT -T /tmp/fish_audio_test.mp3 \
  "http://192.168.68.68:8080/remote.php/webdav/Openclaw/fish_audio_test.mp3"

This skill provides:

  • Text-to-speech generation using Fish Audio S1
  • Voice selection from 50+ available options
  • Emotion control with natural prosody
  • NextCloud integration with automatic uploads
  • Error handling and service validation
  • Professional quality audio generation
  • Flexible output (file paths or upload)
Usage Guidance
Do not install or run this skill as-is if you care about credential hygiene or trust: it embeds a plaintext NextCloud password in the documentation and code and fails to declare required environment variables. If you want to use it, ask the author to:
  1. Remove all hard-coded secrets from SKILL.md and SKILL.py and rely solely on environment variables.
  2. Update the registry metadata to declare required env vars (NEXTCLOUD_USER, NEXTCLOUD_PASS, NEXTCLOUD_URL, FISH_AUDIO_S1_URL, etc.).
  3. Fix obvious bugs (OPENVOICE_V2_URL has a '.' instead of ':', health-check logic is inverted for some services).
  4. Replace hardcoded private IPs with configurable defaults or clearly documented required local endpoints.
  5. Provide a homepage/source and provenance (who maintains it).
If you have already used the embedded NextCloud password anywhere else, rotate those credentials immediately. Test the skill in an isolated environment before allowing it access to any sensitive services.
Capability Analysis
Type: OpenClaw Skill
Name: fish-tts
Version: 1.0.0
The skill is classified as suspicious due to a hardcoded NextCloud password found in both `SKILL.md` and `SKILL.py`, which is a significant credential management vulnerability. Additionally, the bash script template provided in `SKILL.md` is vulnerable to shell injection via the `$text` and `$voice` parameters when used with `curl -d`, posing a direct prompt-injection risk against an agent executing these instructions with user-controlled input. While the Python implementation (`SKILL.py`) correctly mitigates the shell injection risk by using `requests` with JSON payloads, the presence of the vulnerability in the markdown template and the hardcoded password are critical flaws.
Capability Assessment
Purpose & Capability
The skill claims to generate TTS via Fish Audio S1 and optionally upload to NextCloud — the included Python implements both. However the registry metadata declares no required environment variables or credentials while both SKILL.md and SKILL.py expect NEXTCLOUD_* and FISH_AUDIO_* environment variables (misalignment). The inclusion of explicit local IPs and a hard-coded NextCloud credential in the docs/code is disproportionate and not justified by the manifest.
Instruction Scope
SKILL.md instructs network interactions (POST to TTS service and PUT to NextCloud WebDAV). That scope is reasonable for a TTS + upload skill, but the documentation contains a plaintext NextCloud password and concrete private IPs, and it tells the user to set env vars that the registry didn't declare. The runtime instructions and examples also rely on specific local network addresses rather than configurable defaults, which is brittle and unexpected.
Install Mechanism
No install spec (instruction-only) — low install risk. However the package is not purely prose: a runnable SKILL.py is included, so the code will run if invoked. There is no external download or install step beyond running the bundled Python, which reduces installer-level risk but means shipped code matters.
Credentials
The manifest lists no required secrets, yet SKILL.py reads NEXTCLOUD_USER, NEXTCLOUD_PASS, NEXTCLOUD_URL, FISH_AUDIO_S1_URL and provides defaults. Critically, a sensitive-looking NextCloud password (N95qg-Wzdpc-6DJAn-xMaHa-RaEW5) appears in both SKILL.md and as the default in SKILL.py — this is a clear credential leakage / bad-practice risk. Other defaults contain local IPs; the skill will attempt network activity even without explicit user configuration.
Persistence & Privilege
The skill does not request elevated persistence (always:false). It does not modify other skills or system configs. Autonomous invocation is allowed by platform default but is not combined with any additional dangerous privileges in the manifest.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install fish-tts
  3. After installation, invoke the skill by name or use /fish-tts
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of Fish Audio S1 TTS Skill:
  • Provides a comprehensive guide for generating high-quality text-to-speech audio using the Fish Audio S1 service.
  • Includes detailed instructions for uploading generated audio files to NextCloud via WebDAV.
  • Documents environment variables for configuration and outlines available voices and advanced features, such as emotion tags and quality settings.
  • Offers troubleshooting tips for both audio generation and NextCloud uploads.
  • Supplies API usage examples and a bash script template for automation.
Metadata
Slug fish-tts
Version 1.0.0
License
All-time Installs 2
Active Installs 2
Total Versions 1
Frequently Asked Questions

What is Fish Tts?

Generate high-quality speech from text using Fish Audio S1 and optionally upload the MP3 audio file to NextCloud via WebDAV. It is an AI Agent Skill for Claude Code / OpenClaw, with 704 downloads so far.

How do I install Fish Tts?

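Run "/install fish-tts" in the OpenClaw or Claude Code chat to install it in one step. The skill still needs a reachable Fish Audio S1 service and, for uploads, the NextCloud environment variables described in the README.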

Is Fish Tts free?

Yes, Fish Tts is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Fish Tts support?

Fish Tts is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Fish Tts?

It is built and maintained by gtank1 (@gtank1); the current version is v1.0.0.
