Stem Separation Quality is Unusable for Production

I have been exploring and experimenting with Tunee AI, and I am writing to provide critical feedback on the quality of the downloaded stems. The stem separation feature needs significant improvement: the downloaded stems are of poor quality, exhibiting heavy bleeding between tracks.

The individual stems (e.g., Vocals, Drums, Instruments) are highly compromised by artifacts, primarily severe bleeding. I have found the isolated tracks to be heavily contaminated with audio from other elements of the mix. For example, the vocal stem contains audible instrumental bleed, and the instrumental stems contain significant vocal remnants.

As a producer/remixer, this level of bleed and contamination renders the stems completely unusable for clean production work. They cannot be reliably mixed, mastered, or repurposed in an external DAW, which defeats the entire purpose of offering a stem download feature. The current quality is far below the standard required for professional, or even serious hobbyist, music creation.

I strongly recommend prioritizing the retraining or upgrade of your underlying AI model for stem separation. The core quality of the isolated tracks must be improved to achieve a cleaner separation with minimal cross-talk and artifacts. This is a crucial feature for your user base, and its current state significantly limits the platform’s value for producers.

Thank you for considering this feedback. I look forward to seeing this feature improve in future updates.


Hi there, thank you so much for the detailed feedback — we truly appreciate it.

We’ve taken note of your suggestions, and will be releasing an improved version of our stem separation feature later this week. Once it’s live, we’d love for you to try it out and let us know whether the new quality meets your needs.


Stem separation will never be as good as raw stems recorded in a DAW. This is because when a song is mixed, frequency and dynamics processing affects each instrument. For example, if a frequency band is occupied by both a vocal and a guitar, the mixing engineer will normally prioritise one over the other, so the guitar loses some frequencies as well as level. When you then extract the guitar from the mixed song, it will have those missing frequencies and altered level/dynamics.
Currently, AI creates the song in one process. Only when AI becomes smart enough to create each instrument's raw stem first and then mix them will it be able to provide raw stems. Until then, it won't be possible to get good stems. Maybe in the near future.
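The point above can be sketched numerically: once a frequency band is carved out of an instrument during mixing, that energy is simply not present in the mix, so even a *perfect* separator cannot recover the raw stem. Below is a minimal NumPy sketch under toy assumptions — the 440/880 Hz "vocal" and "guitar" tones and the `band_energy` helper are invented for illustration, not taken from any real separation model.

```python
import numpy as np

SR = 44100
t = np.arange(SR) / SR  # 1 second of audio

# Hypothetical raw stems: a "vocal" at 440 Hz, and a "guitar" with
# energy at both 440 Hz (clashing with the vocal) and 880 Hz.
vocal = np.sin(2 * np.pi * 440 * t)
guitar = 0.5 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

def band_energy(x, freq, sr=SR, width=5):
    """Approximate spectral energy within `width` Hz of `freq` via the FFT."""
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / sr)
    return spec[np.abs(freqs - freq) < width].sum()

# Mixing: the engineer carves 440 Hz out of the guitar so the vocal wins
# that band. Modeled crudely here as removing the guitar's 440 Hz partial.
guitar_mixed = 0.5 * np.sin(2 * np.pi * 880 * t)
mix = vocal + guitar_mixed

# An idealised, artifact-free separation of the mix can at best return the
# carved guitar, because its 440 Hz energy was never in the mix at all.
perfect_guitar_stem = mix - vocal

print(band_energy(guitar, 440))               # large: raw stem has 440 Hz energy
print(band_energy(perfect_guitar_stem, 440))  # near zero: that energy is gone
```

Real separators add bleed and artifacts on top of this; the sketch only shows the information-loss floor that no model, however good, can get below.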
