Hmm, I suspected that this might happen. I think it means that reading 16 or 32 bits takes more or less the same amount of time, so we gain nothing from the shorter values while still adding extra processing for the conversion.
Something else that’s worth trying is storing sample data in the recording buffer in interleaved format. I mean that instead of writing samples like `l1, l2, … ln` and `r1, r2, … rn` you would use `l1, r1, l2, r2, … ln, rn`. With i16 samples you can read a 32-bit frame and unpack it into 2 samples. This halves the number of buffer reads, but I have no idea if that’s enough to overcome the extra sample conversion overhead.
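Roughly what I mean by unpacking a 32-bit frame, as a standalone sketch. `packFrame` and `unpackFrames` are made-up names, and I’m assuming the left sample sits in the low 16 bits of each word and that i16 maps to float by dividing by 32768; adjust both to whatever your recording code actually does.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: pack one stereo frame into a 32-bit word.
// Assumption: left sample in the low 16 bits, right sample in the high 16 bits.
inline uint32_t packFrame(int16_t l, int16_t r) {
    return static_cast<uint32_t>(static_cast<uint16_t>(l)) |
           (static_cast<uint32_t>(static_cast<uint16_t>(r)) << 16);
}

// Hypothetical helper: one 32-bit read per stereo frame, then unpack
// into two i16 samples and convert each to float in [-1, 1).
void unpackFrames(const uint32_t* frames, float* left, float* right, size_t n) {
    const float scale = 1.0f / 32768.0f;
    for (size_t i = 0; i < n; ++i) {
        uint32_t frame = frames[i];  // single buffer read for both channels
        int16_t l = static_cast<int16_t>(frame & 0xFFFFu);
        int16_t r = static_cast<int16_t>(frame >> 16);
        left[i] = l * scale;
        right[i] = r * scale;
    }
}
```

The mask, shift and two int-to-float conversions are the extra work that has to be paid back by the halved number of reads.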
With an f32 buffer the amount of IO won’t change, but when processing stereo audio half of your reads will be sequential, because each right sample sits directly after its left sample. This should work faster with SDRAM, as it has an IO buffer that makes sequential IO faster than random access. It may also be better to stick to raw pointers for iteration instead of using array lookups.
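To illustrate the raw pointer iteration, here is a minimal sketch that walks an interleaved f32 buffer once, front to back. `applyStereoGain` is a hypothetical function, not something from the library, and whether it actually beats indexed access on your hardware is something you’d have to measure.

```cpp
#include <cstddef>

// Hypothetical example: scale left/right samples of an interleaved
// buffer (l1, r1, l2, r2, ...) in place using a single moving pointer.
void applyStereoGain(float* buf, size_t frames, float gainL, float gainR) {
    float* p = buf;
    float* end = buf + 2 * frames;  // two floats per stereo frame
    while (p != end) {
        *p++ *= gainL;  // left sample
        *p++ *= gainR;  // right sample, adjacent in memory
    }
}
```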
There are 2 possible approaches for using interleaved format, and I’m not sure if either of them gives any performance gain:
- use FloatArray and manage addresses accordingly (left and right channel data are stored as sequential samples)
- use ComplexFloatArray (L/R channel data is stored as real/imaginary values in a ComplexFloat)
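The two options differ only in how you address the same memory. A rough sketch in plain C++, where `Frame` is just my stand-in for a packed real/imaginary pair and the two `mix…` functions are hypothetical; none of this is the library’s own code.

```cpp
#include <cstddef>

// Stand-in for a complex sample: re holds the left channel, im the right.
// Assumption: two tightly packed floats, same layout as the flat buffer.
struct Frame {
    float re;
    float im;
};
static_assert(sizeof(Frame) == 2 * sizeof(float), "Frame must be two packed floats");

// Option 1: flat float buffer, you do the address arithmetic yourself.
void mixFlat(const float* buf, float* mono, size_t frames) {
    for (size_t i = 0; i < frames; ++i)
        mono[i] = 0.5f * (buf[2 * i] + buf[2 * i + 1]);
}

// Option 2: buffer of frames, one element per stereo pair.
void mixFrames(const Frame* buf, float* mono, size_t frames) {
    for (size_t i = 0; i < frames; ++i)
        mono[i] = 0.5f * (buf[i].re + buf[i].im);
}
```

Both loops touch memory in the same order, so I’d expect any difference to come from how the compiler handles the indexing rather than from the layout.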
There’s a “ComplexSignalGenerator” class for working with complex data in packed format if you want to try that. And you can use the `::copyFrom` and `::copyTo` methods in `ComplexFloatArray` to convert between interleaved/non-interleaved channels.
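In case it helps to see what that conversion amounts to, here is a plain C++ sketch of going between the two layouts. `interleave` and `deinterleave` are hypothetical free functions written for illustration; they are not the signatures of the library methods, so check the headers for those.

```cpp
#include <cstddef>

// Hypothetical: merge separate channel buffers into l1, r1, l2, r2, ...
void interleave(const float* left, const float* right, float* out, size_t frames) {
    for (size_t i = 0; i < frames; ++i) {
        *out++ = left[i];
        *out++ = right[i];
    }
}

// Hypothetical: split an interleaved buffer back into separate channels.
void deinterleave(const float* in, float* left, float* right, size_t frames) {
    for (size_t i = 0; i < frames; ++i) {
        left[i] = *in++;
        right[i] = *in++;
    }
}
```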