You’re mostly correct, but you’re thinking about the output, when you should be considering all the intermediate data processed in your patch. As soon as you hard-clip a signal, you inevitably add some harmonics that go above the Nyquist frequency (half the sample rate, so 24 kHz at a 48 kHz sample rate). Those harmonics get reflected back from that frequency: if you have a signal with harmonics at 5 kHz, 10 kHz, 15 kHz, 20 kHz, 25 kHz, 30 kHz, etc., then 25 kHz becomes 23 kHz, 30 kHz becomes 18 kHz, and so on. In the worst case the reflected partial reaches 0 Hz and gets reflected from there again, and so on.
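To make the reflection arithmetic concrete, here is a minimal Python sketch. The function name `alias_frequency` and the 48 kHz default are my own assumptions for illustration (the 48 kHz figure is inferred from the 24 kHz Nyquist frequency above), and it only computes where an ideal partial would land, not how loud it ends up.

```python
def alias_frequency(freq_hz, sample_rate_hz=48_000.0):
    """Return the frequency at which a partial is heard after sampling.

    Hypothetical helper for illustration: folds freq_hz back into the
    range 0 .. Nyquist by reflecting at Nyquist and at 0 Hz as needed.
    """
    nyquist = sample_rate_hz / 2.0
    # The sampled spectrum repeats every sample_rate_hz, so wrap first.
    folded = freq_hz % sample_rate_hz
    # The upper half of each period mirrors the lower half.
    if folded > nyquist:
        folded = sample_rate_hz - folded
    return folded


# Harmonics of a 5 kHz fundamental, up to 50 kHz.
harmonics = [5_000.0 * n for n in range(1, 11)]
aliased = [alias_frequency(f) for f in harmonics]
print(aliased)
```

With these assumptions, the harmonics from 25 kHz up to 50 kHz land at 23, 18, 13, 8, 3 and 2 kHz, none of which is a multiple of the 5 kHz fundamental. The 50 kHz partial is the double-reflection case: it folds at Nyquist, passes 0 Hz, and comes back up to 2 kHz.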
So all those aliased partials are usually not harmonically related to your audio, yet they end up in the same frequency range as it. This sounds like a buzzing noise. You can’t just filter it away completely with a BPF, because the aliased partials are interwoven with your audio content. It may not be very prominent, depending on how much of your output signal is left, but it typically raises the noise floor or causes unexpected side effects later on.