Question 1

What is Segment Anything for Audio?

Accepted Answer

Segment Anything for Audio (SAM-Audio) is an AI model developed by Meta that can isolate a sound from a mixed audio recording based on a text prompt. Unlike traditional stem splitters, which only separate fixed tracks such as vocals or drums, SAM-Audio lets you describe any sound (a dog barking, a siren, a specific instrument) and extract or remove it. Sievin is built on this model and makes it accessible to creators without any audio engineering skills.

Question 2

What is audio separation?

Accepted Answer

Audio separation is the process of extracting a specific sound from a mixed recording, such as isolating vocals, removing background music, separating speech from noise, or extracting drums, so you can reuse it in video editing, podcasts, or music production.

Question 3

Do I need audio engineering skills?

Accepted Answer

No, but writing a clear prompt helps. SAM-Audio is prompt-sensitive by design, so the quality of your results depends on how you describe what to extract. We include tips inside the app to help you write tighter prompts (like "male speech" instead of just "voice").

Question 4

Is Sievin like a stem splitter?

Accepted Answer

Yes, but it is designed for non-engineers. Traditional stem splitters separate predefined tracks (vocals, drums, bass). Sievin uses Meta's Segment Anything for Audio (SAM-Audio) for prompt-based extraction: you describe the sound you want to isolate or remove. The tradeoff is that prompts need to be specific for best results. Learn more about SAM-Audio at ai.meta.com/blog/sam-audio.

Question 5

Can I remove vocals from a song?

Accepted Answer

Yes. You can remove vocals to create an instrumental, or isolate vocals to extract an acapella. For best results, be specific — "female singing voice" works better than just "vocals". The app includes prompt tips to help you get cleaner separations. Note: SAM-Audio works best with distinct sounds; overlapping voices or similar instruments may require more precise prompts.

Question 6

What audio formats do you support?

Accepted Answer

We support MP3, WAV, FLAC, and M4A files up to 10 minutes in length. This covers most audio clips from video editing, podcasts, and music production workflows.
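If you want to check a file against these limits before uploading, the following is a minimal sketch in Python. It is not part of Sievin: the function and constant names are hypothetical, the 10-minute limit is assumed to be inclusive, and the duration helper only reads WAV files because that is the one listed format Python's standard library can parse directly.

```python
import wave
from pathlib import Path

# Limits copied from the answer above. The names are illustrative only
# and do not correspond to any Sievin API.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a"}
MAX_DURATION_SECONDS = 10 * 60  # assumed inclusive


def has_supported_extension(path: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (frames / frame rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_within_length_limit(duration_seconds: float) -> bool:
    """Return True if the duration does not exceed the 10-minute limit."""
    return duration_seconds <= MAX_DURATION_SECONDS
```

For MP3, FLAC, or M4A files you would need a third-party audio library to read the duration; the extension check works for all four formats.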

Question 7

Is my audio private?

Accepted Answer

Yes. Sievin is designed for privacy-first audio processing. Your files are processed on isolated GPU instances, automatically deleted after 24 hours, and never used to train AI models. Your audio stays yours.

Separate audio with a prompt
— no plugins, no engineering.

Built for creators like you

For video creators & editors

For podcasters

For educators & presenters

For musicians & practice

How it works

Upload your audio

Describe the sound

Download the result

Privacy you can trust

Secure processing

24-hour auto-delete

No AI training

Simple, transparent pricing

Starter

Creator

Frequently asked questions

Ready to clean up your audio?
