Artificial Echoes - AI in Sound and Music

Artificial Echoes: AI in Music and Sound Generation


Workshop Notes from Klub Montage Camp 2025

1. Introduction & Overview


What are AI Music Tools?

  • Output is generated from probabilities learned from a dataset
  • The probabilities come from a model that has been trained beforehand
  • The same algorithms used for text and image generation can be used for sound generation too
  • Example: using Stable Diffusion to generate spectrograms
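To make the spectrogram idea concrete: an image model can only work on sound once the audio has been turned into a picture-like grid of time versus frequency. The following is a minimal sketch of that conversion with NumPy, not the pipeline of any particular tool. The function name `magnitude_spectrogram`, the frame size, the hop size, and the 440 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np

FRAME_SIZE = 1024   # samples per analysis window (assumed value)
HOP = 256           # step between windows (assumed value)

def magnitude_spectrogram(signal, frame_size=FRAME_SIZE, hop=HOP):
    """Return short-time Fourier magnitudes with shape (frequency_bins, frames)."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    # Cut the signal into overlapping, windowed frames
    frames = np.stack([
        signal[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Hypothetical test signal: one second of a 440 Hz sine tone
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = magnitude_spectrogram(tone)

# The strongest frequency bin should sit close to 440 Hz
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / FRAME_SIZE
print(spec.shape, peak_hz)
```

Each column of `spec` is one moment in time and each row is one frequency band, so the array can be saved or processed like a grayscale image. Note that the phase information is discarded here, which is one reason turning a generated spectrogram back into audio needs an extra reconstruction step.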

![[Pasted image 20250815124826.png]]

a. Categories of AI Tools in Audio

Prompt-Based Generation

Tools that use a text prompt to create sound/music

  • Non-real-time
  • Text-to-audio/music (e.g. Suno, Udio)
  • Text-to-voice (e.g. ElevenLabs)

Audio Manipulation

  • Offline (Post-Production)
    • AI mastering, restoration, stem separation, style transfer
  • Real-Time (Performance/Interactive)
    • Real-time effects, voice cloning, adaptive mixing

Event-Based (MIDI/Score AI)

  • AI-assisted composition, accompaniment generation, auto-arrangement
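As a toy illustration of event-based generation, the sketch below learns note-to-note probabilities from a short melody and samples a new one from them. It is a first-order Markov chain, which is far simpler than the neural models behind real composition tools, and is only meant to show the "probabilities learned from a dataset" idea on MIDI note numbers. The training melody and the function names `train_transitions` and `generate` are made up for this example.

```python
import random
from collections import Counter, defaultdict

def train_transitions(notes):
    """Count how often each note is followed by each other note."""
    counts = defaultdict(Counter)
    for current, following in zip(notes, notes[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, start, length, rng):
    """Sample a melody by repeatedly drawing the next note from the learned counts."""
    melody = [start]
    for _ in range(length - 1):
        options = counts.get(melody[-1])
        if not options:
            # Dead end: this note never had a successor in the training data
            break
        pitches = list(options.keys())
        weights = list(options.values())
        melody.append(rng.choices(pitches, weights=weights, k=1)[0])
    return melody

# Hypothetical training data: a C major phrase as MIDI note numbers (60 = middle C)
training_melody = [60, 62, 64, 65, 67, 65, 64, 62, 60, 64, 67, 72, 67, 64, 60]

model = train_transitions(training_melody)
melody = generate(model, start=60, length=16, rng=random.Random(0))
print(melody)
```

The generated melody can only contain transitions that occurred in the training phrase, which also hints at why the choice of training data matters so much for what a model can produce.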

b. Common Use Cases

  • Sound & Sample Generation (e.g. percussion, ambient textures)
  • Voice-Overs
  • Mixing & Mastering (AI-assisted mastering, balance tools)
  • Audio Restoration & Extraction (stem separation, noise reduction)
  • Audio Manipulation & Style Transfer (genre translation, timbre shift)
  • Composition Support (AI-based improvisation, harmony suggestion)

2. Key Tools & Platforms

a. Prompt-Based Audio Tools

Voice

  • ElevenLabs – Text-to-speech and voice cloning
    • Voice Cloning
    • Text to Speech
    • SoundFX
    • Problem with impersonation
    • Trains on your data
    • What happens if it gets hacked?

Music Generation

  • Music creation via prompts
  • Lyrics can be provided too
  • Attractive for the companies that use it: no royalties to pay to musicians, so it is cheaper (Ikea, Lidl, Wiener Eistraum)
  • Individualized music for your event
  • Suno was sued by GEMA
  • https://www.derstandard.at/story/3000000253877/wiener-eistraum-musik-kuenstliche-intelligenz
  • https://harpers.org/archive/2025/01/the-ghosts-in-the-machine-liz-pelly-spotify-musicians/

  • Suno – Text-to-song with lyrics
  • Udio – High-quality music generation
  • SynthGPT – Prompt-based synth patch generation

b. Sample & Sound Design Tools

  • Emergent Drums (Audialab) – AI-generated drum samples
  • Synplant 2 (Sonic Charge) – AI-guided synthesis and mutation
  • Dance Diffusion (Harmonai) – Open-source music diffusion model
  • ElevenLabs SFX – Sound effects from text prompts

c. Mixing, Mastering, and Restoration

  • iZotope Ozone / Neutron – AI-powered mastering and mixing
  • LANDR – AI-based online mastering
  • LALAL.AI / Spleeter / StemRoller – Stem separation tools
  • iZotope RX – Audio cleanup and restoration
  • Ultimate Vocal Remover (UVR) – AI-based vocal separation tool

d. Real-Time / Performance Tools

  • Concatenator (Datamind) – Real-time sound to sample tool
  • RAVE (IRCAM) – Real-time neural audio synthesis. Training possible!
  • DDSP-VST – Real-time neural audio effects plugin. Training possible!
  • Voicemod – Real-time voice manipulation
    • https://www.instagram.com/reel/CxYNNfbgfpE/
  • Neutone – Real-time tone-morphing plugin

3. Demos & Hands-On Activities

  • Play around with ElevenLabs
  • Create a song with Suno
  • Create samples with ElevenLabs SFX
  • Voice removal with UVR (Ultimate Vocal Remover)
  • Use RAVE via nn~

4. Training

  • How do you train your own models?
  • Training from scratch or starting from pretrained models
  • Requires a powerful computer with a high-end graphics card
  • Alternative: renting a server
  • Steep learning curve if you have no experience with server setups and Python

5. Discussion: Creative and Ethical Implications

  • Authorship & originality
  • The model you are using was probably trained on data that was not paid for
  • Consent, deepfakes, and voice/data rights
  • https://futurism.com/soundcloud-ai-terms-of-service

6. Showcase