OWL DSP library updates

Since the early days we’ve built up a growing collection DSP utilities in the form of the OwlProgram library classes, or OpenWareLibrary as we should perhaps call it.

The API is here:
https://www.rebeltech.org/docs/

It includes basic signal processing, FFTs, et c. In the beginning, the library was focused on cross-platform C++ wrappers for optimised DSP functions found in the CMSIS DSP library. But in time it has grown to provide many oscillators, filters, envelopes, windows and other useful things.

But in spite of having many great components, it was always lacking a consistent structure. I’ve tried to address this with a new PR: Refactor DSP library classes.

Summary:

  • Oscillators and Noise generators inherit SignalGenerator abstract base class
  • Filters inherit SignalProcessor abstract base class
  • Envelopes inherit both SignalGenerator and SignalProcessor (can be used as either)
  • Objects are (most conveniently) allocated with a static create(...) function, and deallocated with destroy(...)
  • create() and destroy() should be called from the Patch constructor and destructor, respectively

Most API changes will elicit a deprecated warning, pointing you to the new, preferred method, without breaking any code or changing previous behaviour. But there are some breaking changes:

  • BiquadFilter normalised frequency now follows f/SR norm, instead of 1/Nyquist
  • Oscillator constructors no longer take SR argument, SR must be passed in with setFrequency(freq, sr)
  • Oscillator::setFrequency(float) assumes argument is a normalised frequency

All classes with methods that take a frequency now implement two versions, e.g.:

void setFrequency(float freq, float sr); // set frequency given sample rate 'sr'
void setFrequency(float nfreq); // set normalised frequency, i.e. freq/SR

This means that the sampling rate doesn’t have to be held in every object instance, instead the correct rate can be passed in the setter function by the caller. It also provides the more efficient alternative of setting a normalised frequency nfreq, which is simply the frequency divided by the sample rate. This is typically a value between 0 and 0.5, representing the range 0 Hz to Nyquist frequency.

The docs for the refactored code is here on the dev server.
(I’ve not figured out how to get doxygen to generate a deprecation list for methods and classes marked with C++14 [[deprecated]] tags, anyone knows?)

The next task will be to refactor the mess that is basicmaths.h. Probably the fast math functions will be separated out into an owl namespace, and most of the macros removed. Introducing a namespace will either mean all patches must add a using namespace owl; directive, or we do it in Patch.h by default :wink:. Discussion welcome.

Other changes I’d like to add include multi-channel (e.g. stereo) SignalProcessor alternatives, maybe also for SignalGenerator.

Any other requests?

You should use “@deprecated” tag in comments, possibly with description comment like “@deprecated foo bar”

I think that the setFrequency changes are very confusing and see little benefits in this approach:

  1. We’ll have 2 methods with the same name that have different meaning assigned to the frequency parameter. At the very least the second one should be renamed to .setNormalizedFrequency() or something like that

  2. We don’t expect to change SR in patch, but now it has to passed explicitly or implicitly (as divisor used for nfreq parameter). It would make more sense to pass it to constructor and maybe store 1 / SR as object member. This way we also won’t have to perform division every time block is processed.

Regarding ABC for signal generator/processor - this is a very nice addition. Signal processor should obviously have per sample process() function that would be used as default implementation for block based processing.

you mean like it used to be :slight_smile:
The division can be optimised away by the compiler so that’s not much of an issue.

The argument is that using normalised frequencies is generally preferable, and if something needs to be expressed in Hertz then there’s a simple convenience method for that. So all frequencies are normalised, unless you explicitly specify a sample rate. I don’t think that’s confusing.

As well as [[deprecated]]? I think doxygen should be able to process C++14 tags.

A nice syntactic sure would be a few operator overloads for SignalGenerator / SignalProcessor, i.e. “>>” for passing output to next processor’s input. And some math operators for arrays. I wouldn’t mind to be able to use something like:

buf = (vco >> vcf >> vca) * 0.5;

instead of more ugly things like:

buf = vco.generate();
buf = vcf.process(buf);
buf = vca.process(buf);
buf = buf.multiply(buf);

or

buf = vca.process(vcf.process(vco.generate()).multiply(0.5);

Would it actually be optimized if you’re changing it in runtime? I understand how it would be done automatically with the old approach, but I’m not so sure about the new one. And I’m not sure even how much we can rely on compiler optimization while still using -Os

I still think it would be better to use a separate method for setting normalized frequency and let user decide which one is more suitable. We could just define void setFrequency(freq){ setNormalizedFrequency(freq * sr_multiplier)};

Maybe we need a more generalized solution for some cases, i.e. a processor that accept mono signal and outputs stereo (or more) channel audio could be used for panning. Not sure what would be a good way to handle this. One possible approach is to have multi-channel versions of array classes use templates for SignalProcessor that sets types for its ::process() class. Then we could something like:

template<typename Input, typename Output>
class SignalProcessor {
    virtual void process(Input input, Output output){
        // ...
    }
};

class Panner: public SignalProcessor<FloatArray, StereoFloatArray> {
    void process(FloatArray input, StereoFloatArray output) override {
        //  ...
    }
}

Btw, we don’t really need signal generator if signal processor would be multi-channel. It would be just a synonym to a processor with 0 input channels.

Also, Teensy has a fairly nice audio library based around AudioStream object that is something between whatever SignalProcessor would become and patch object. They can be connected in a graph, of course.

Header file for reference - cores/AudioStream.h at master · PaulStoffregen/cores · GitHub

Audio library that uses it - GitHub - PaulStoffregen/Audio: Teensy Audio Library

The cool thing is that FAUST’s Teensy integration exports them, so it’s possible to generate multiple independent objects from FAUST and use them from C++ code. It would be extremely cool if we could cover this too (generating SignalProcessor objects from FAUST). Teensy arch here - cores/AudioStream.h at master · PaulStoffregen/cores · GitHub

Okay so maybe normalised frequencies is not a good idea - reverted!

I actually think that your ‘ugly’ example is rather lovely! The sequence and results are perfectly clear. Though I’d prefer it block based.

I love all the syntactic sugar that C++ provides but not in this library. The idea is that programmers that are less confident with things like overloaded operators and templates should not be put off, and that the meaning of code is clear. Without being in any way dumbed down. Simple objectives, but hard to achieve!

I actually don’t see a problem with adding normalized frequency support and would prefer to have them in the new API with a few conditions:

  1. They would be used in separate methods, i.e. we would have .setFrequency() and .setFrequencyN() that would distinguish which one is used.

  2. SR wouldn’t be passed to every call that uses it, but given to the constructor and stored as multiplier

  3. There would be just one implementation for both methods, i.e. typically we’ll have .setFrequency(freq) { .setFrequencyN(freq *freq_norm);}

This would allow us to support normalized frequency and deprecate using raw frequencies if necessary (without breaking existing patches)

Speaking of templates, Biquad class is begging to be converted into one (with stages number as parameter)

Potentially we could also have a single template Array class with generic non-optimized implementation and add optimized specialized implementations with math code on ARM. And we can create template aliases for the old version, so FloatArray would be available under the old name in the old header, but we’ll only need to define optimized methods there.

Array.h:

template<typename Element>
class Array{
...
    void add(Array<Element> operand2, Array<Element> destination) {
        // ... Skipped size checks 
        for(size_t n=0; n<size; n++){
            destination[n]=data[n]+operand2[n];
        }
        return destination;
    }
};

FloatArray.h

#include "Array.h"

using FloatArray = Array<float>;

#ifdef ARM_CORTEX
inline FloatArray FloatArray::add(FloatArray operand2, FloatArray destination) {
    // ... Skipped size checks 
    arm_add_f32(data, operand2.data, destination.data, size);
    return destination;
};
#endif

This way we would keep compatibility with the old code, but we’ll also have generic implementations for all types of arrays - currently many of them are only written for FloatArray.

Array template: yes this in general is a good plan, there are just a couple of niggles:

  • defining specialisations the way you suggest means putting the CMSIS DSP includes in the header. No big deal, and I think we already do it elsewhere e.g. BiquadFilter.
  • some functions don’t follow a simple template, e.g. ShortArray::getPower() returns an int64_t, not a short.

Regarding templating SignalProcessor and/or subclasses, I have reservations. The problem with templates is that the parameters must be known at compile time and forms part of the object class, which precludes some use cases.

An example where this doesn’t work so well is if you want to make a patch that can process as many channels as the firmware provides, like a multichannel filter.

At the moment we’ve got this:

template<size_t channels>
class MultiStateVariableFilter : public AbstractStateVariableFilter, MultiSignalProcessor {
...
}

// I want 2 channels:
MultiStateVariableFilter<2>* filter = MultiStateVariableFilter<2>::create(getSampleRate());

So the number of channels has to be known at compile time, and we can’t make a patch that will work for 2, 4 or 8 channels without recompiling.

But if channels was just an argument passed to MultiStateVariableFilter::create() then we could let it allocate as many state variables as needed.

I know, that’s a disgusting way to do it. Actually I’m more annoyed by inlined method implementation than CMSIS being present. But hey, let’s thanks C++ for being so easy to use for metaprogramming :wink:

There are a few possible approaches to this:

  • add separate parameter for large data that can differ from base return type (i.e. Array<float, float> and Array<int16_t, int64_t>)
  • we could inherit from specialized template instead of just using it directly. then we can put extra stuff that don’t belong to template to child class:
class ShortArray : public Array<int16_t> {
    int64_t getPower()  {
        // ...
    };
}

This can be solved in different ways too:

  • instantiate all 3 classes for 2/4/8 channels SVF
  • use an array of 1 / 2 / 4 stereo SVF

Also, we may be solving the wrong problem with templates here. I was thinking about something like this more or less:

template <typename Processor, size_t channels>
class MultiChannelProcessor : public Processor, MultiSignalProcessor {
public:
  MultiChannelProcessor(size_t channels, float sr) : Processor(sr) {}

  void process(const FloatArray& in, FloatArray& out) {
    size_t len = min(channels, min(input.getChannels(), output.getChannels()));
    for(int ch=0; ch<len; ++ch){
      FloatArray in = input.getSamples(ch);
      FloatArray out = output.getSamples(ch);
      size_t nFrames = in.getSize();
      for(size_t s = 0; s < nFrames; s++){
        process(in[s], out[s], ch);
    }
}


template <typename Processor>
class DynamicChannelProcessor : public Processor, MultiSignalProcessor {
public:
  DynamicChannelProcessor(size_t channels, float sr) : Processor(sr), channels(channels) {}

  void process(const FloatArray& in, FloatArray& out) {
    size_t len = min(channels, min(input.getChannels(), output.getChannels()));
    for(int ch=0; ch<len; ++ch){
      FloatArray in = input.getSamples(ch);
      FloatArray out = output.getSamples(ch);
      size_t nFrames = in.getSize();
      for(size_t s = 0; s < nFrames; s++){
        process(in[s], out[s], ch);
    }

Then you’ll need to inherit from them for MultiStateVariableFilter and DynamicStateVariableFilter, implement process method and add extra methods as required. Things would get interesting if we could use the Processor parameter itself to perform computation, but I’m not sure what would be a good approach to this (pass channel state to it somehow?). And it’s probably better to benchmark performance for both versions to see if fixed number of channels gives any measureable improvements.

I’d say delay line implementation belongs to this update too. Mostly like this, but:

  1. array type and element as template parameters

  2. I was going to suggest a few extra methods for interpolation, but actually it would make more sense to have them in separate classes (or just functions? either way, it could be given as template parameter to delay line.). We should have at least Hermite in addition to linear.

  3. shortcut for using it as APF

  1. Yes definitely agree, I’ve just not come up with a class that I’m happy with yet!

  2. and 3. why not

I had a go at templating arrays, see
https://github.com/pingdynasty/OwlProgram/blob/feature/arraytemplate/LibSource/AbstractArray.h
and
https://github.com/pingdynasty/OwlProgram/blob/feature/arraytemplate/LibSource/FloatArray.h

Incomplete still, I just wanted to easily compare before and after. Some patches end up slightly bigger, others slightly smaller.

I’m not convinced the advantages outweigh the disadvantages, but if you want to do some experimentation pls go ahead.

Looks good as a starting point, but I think the name should be something like BaseArray or GenericArray, because abstract has a very specific meaning used for abstract base classes. I think that it should have all methods implemented - specifically to avoid making it an ABC that would use vptr lookup, etc.

Also, you can’t have specialized method implemented in .cpp files. It won’t work like that, because C++ sucks and it has to be done the ugly way by placing them in headers the way I’ve written above. Otherwise other translation units won’t see those implementations and will always use the generic one.

Another thing is that you’re overloading methods in child classes rather than providing specialized versions for parent class. This is not the same semantically, but I’m not sure if there would be practical difference. I would avoid this to have a single approach for method name resolution.

Actually, it may work the way you think specifically because of shadowing of generic version. But this means that methods in base class that call other methods would always use the generic version. So this trick with avoiding virtual methods via specialization works correctly only if specialized methods are always visible in headers.

it may work the way you think specifically because of shadowing of generic version

yes correct, and it depends on how you initialise an object too, ie AbstractArray<float> is not the same as FloatArray.

works correctly only if specialized methods are always visible in headers

Yes they must be declared in the header, but can be defined in a .cpp, because the compiler knows there’s that method to look for.

It all gets very confusing though. I think a stripped-down base class might be useful, with just a few access methods and operators that never need to be overridden/shadowed/specialised. But that’s about it.

Is there any reason why we don’t pass arrays by reference? I understand that it’s not a big deal, because this is just passing 64 bits of data vs 32 bits reference. But generally this is frowned upon, to the extend of requiring disallowing copying and assignment constructors in style guides (i.e. Google). I don’t think it would break compatibility with existing code, so might be a good time to change it?

Also, could we have an Oscillator::setPhase() method rather than in SineOscillator only? It’s not specific to sines and has several applications:

  • phase modulation
  • necessary if morphing with more than 2 VCOs (we need to render just 2 of them at once, sync discontinuity is removed due to morphing)

Other than that, I’m running out of constructive ideas, so it must be almost ready?

Actually it seems like SineOscillator has lots of generic stuff related to phase/frequency that could go to its parent class. Trying to port some BLEP oscillator code and it requires copypaste from SineOscillator.

yes: putting two words on the stack is faster than a pointer, once it comes to loading the values.

not sure I follow you here but I look forward to seeing your patch, and there’s one more reason: hard sync (though that’s also what reset() can be used for).

I’m not sure the phase is in radians for all oscillators but I guess it should be, otherwise getting/setting it at the oscillator level makes little sense.