Real-time audio processing is one of the more demanding workloads a CPU handles, because every buffer must be fully processed before the next one arrives. Developers spend years optimizing their code to run at 44.1 kHz or 48 kHz with buffers as small as 32 samples.
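To put those numbers in perspective, here is a small sketch of the arithmetic behind the deadline. The function name `buffer_deadline_ms` is made up for illustration; it simply divides the buffer length by the sample rate.

```python
def buffer_deadline_ms(buffer_size: int, sample_rate_hz: float) -> float:
    """Time available to process one buffer before the next is due, in ms."""
    return 1000.0 * buffer_size / sample_rate_hz


# A 32-sample buffer leaves well under one millisecond per callback.
for rate in (44_100, 48_000):
    print(f"{rate} Hz, 32 samples: {buffer_deadline_ms(32, rate):.3f} ms")
```

Any processing that takes longer than this window causes a dropout, which is why small buffers are so unforgiving.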
If you’ve ever wondered how top-tier YouTubers and streamers keep their voices crystal clear while music pumps in the background, you’re looking for auto-ducking: automatically lowering the music level whenever someone speaks.
No specific paper is referenced here, but the general concepts and challenges involved in implementing real-time auto-ducking, of the kind a technical paper on the subject might cover, can be outlined:
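One common way to approach auto-ducking is an envelope follower on the voice signal that drives a smoothed gain on the music. The sketch below is a minimal illustration under stated assumptions, not a production implementation: the function `duck`, its parameter names, and all default values (threshold, ducked gain, time constants) are hypothetical choices, and it processes whole lists sample by sample rather than in real-time buffers.

```python
import math


def duck(music, voice, sample_rate=48_000, threshold=0.05, duck_gain=0.25,
         attack_ms=10.0, release_ms=300.0, smooth_ms=5.0):
    """Attenuate `music` while the `voice` envelope is above `threshold`.

    All names and defaults are illustrative assumptions.
    """
    def coeff(time_ms):
        # One-pole smoothing coefficient for a given time constant.
        return math.exp(-1.0 / (sample_rate * time_ms / 1000.0))

    attack, release, smooth = coeff(attack_ms), coeff(release_ms), coeff(smooth_ms)
    envelope = 0.0  # smoothed voice level
    gain = 1.0      # gain currently applied to the music
    out = []
    for m, v in zip(music, voice):
        level = abs(v)
        # Envelope follower: rise quickly when the voice starts, fall slowly.
        c = attack if level > envelope else release
        envelope = c * envelope + (1.0 - c) * level
        # Pick the target gain, then glide towards it to avoid clicks.
        target = duck_gain if envelope > threshold else 1.0
        gain = smooth * gain + (1.0 - smooth) * target
        out.append(m * gain)
    return out


# Usage: one second of silence followed by one second of "voice".
sr = 8_000
ducked = duck([1.0] * (2 * sr), [0.0] * sr + [0.5] * sr, sample_rate=sr)
print(f"music gain before voice: {ducked[sr - 1]:.2f}, during voice: {ducked[-1]:.2f}")
```

The slow release keeps the music from pumping back up between words, while the separate gain smoothing avoids audible clicks when the threshold is crossed. A real-time version would keep `envelope` and `gain` as state between buffer callbacks.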