Soundfile primitive support in Faust?

What is the state of soundfile primitive support in Faust? (A simple DSP program is here: faust/tests/soundfile at master-dev · grame-cncm/faust · GitHub.)

I was looking into this before, and now could be a good time to discuss the preferred approach for loading samples on OWL.

We’ve recently added access to MCU flash storage. There’s a primitive flash-friendly filesystem on it, so data can be accessed by name and deleted by changing a few bits of data. There are several ways we can handle loading audio samples:

  1. Only store raw float arrays and read them as is

  2. Allow more data types and encode the type in the resource name (e.g. “data[u16]” would be converted from unsigned integers to floats; a conversion sketch follows further down). As far as I understand, Faust requires float output, even if we use a more compact format for storage.

  3. Add a primitive WAV parser that can process a single file

  4. Same as above, but use RIFF playlist/cue chunks to store more than one file in a single resource. This would be beneficial as the number of resource slots is limited.

  5. Support other formats in addition to WAVs

The last option (and maybe 3 & 4) would have file parsing handled by the firmware to avoid patch size bloat. This is not a problem, because it would be possible to reuse the same code for C++ and other integrations.
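
To illustrate option 2, the decode step itself would be small. A minimal sketch, assuming a hypothetical helper that the firmware would call after locating the resource (the function name and the 0..65535 to -1..+1 mapping are my assumptions, not existing OWL or Faust code):

    // Hypothetical sketch of option 2: a resource tagged "data[u16]" stores
    // unsigned 16-bit samples, expanded to floats before the buffer is handed
    // to the Faust soundfile structure.
    #include <cstdint>
    #include <cstddef>

    static void u16_to_float(const uint16_t* in, float* out, size_t frames)
    {
        for (size_t i = 0; i < frames; ++i) {
            // Map 0..65535 onto the usual -1.0..+1.0 float sample range.
            out[i] = (static_cast<float>(in[i]) - 32768.0f) / 32768.0f;
        }
    }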

Thanks. First, I can explain the current model in Faust in more detail:

  • at the DSP program level, the developer uses the soundfile language primitive, which can load a set of sounds (up to 256 for the same soundfile item) and then address each of them with the so-called part parameter (the first input of the soundfile block-diagram, see Faust Syntax - Faust Documentation)
  • at the C++ architecture file level, the code has to use a SoundUI controller that takes care of loading all soundfiles described in the DSP, see faust/SoundUI.h at master-dev · grame-cncm/faust · GitHub; so something like SoundUI sound_manager; dsp->buildUserInterface(&sound_manager); in the C++ architecture (see the sketch after this list)
  • the SoundUI uses a concrete audio file loader, a subclass of the base class SoundfileReader described here: faust/Soundfile.h at master-dev · grame-cncm/faust · GitHub. The audio file loader checks each file and progressively fills a Soundfile structure, to be shared with the actual DSP object, which uses it at runtime. There are currently two implementations: LibsndfileReader, which uses the libsndfile library, and JuceReader, for use with the JUCE framework.
  • we also have a much simpler WaveReader class that could be used on the OWL. So my suggestion would be to start from that.
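
To make the architecture point concrete, here is a minimal sketch of the wiring based on the snippet above (mydsp stands for the Faust-generated DSP class, and the audio driver setup is omitted):

    // Minimal sketch of a C++ architecture file using SoundUI.
    // "mydsp" is the Faust-generated DSP class; audio setup is left out.
    #include "faust/gui/SoundUI.h"

    int main()
    {
        mydsp dsp;
        SoundUI sound_manager;                  // default sound directory
        dsp.buildUserInterface(&sound_manager); // loads each soundfile declared in the DSP
        // ... initialize the audio driver and call dsp.compute() as usual ...
        return 0;
    }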

Thanks, I’m already familiar with most of the above, but had somehow missed the WaveReader class. This is what I described as option #3 above. I would definitely agree that it is suitable for storing one WAV per resource. But my preferred approach would be a similar parser that also supports cue points, which would allow loading multiple parts from a single file. This would enable multisample resources, e.g. for something like a drum kit or a multi-layer velocity-sensitive sample.

Some info about WAVs with cue chunks is here:

Such files are not commonly used, but they would be backwards compatible with single-sample WAV files.
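
For reference, the cue chunk itself is tiny; here is a sketch of the structures a parser would read (the field comments follow the usual RIFF documentation, and the struct names are mine, not WaveReader’s):

    // Sketch of the standard RIFF "cue " chunk layout in a WAV file.
    // Each cue point carries a sample offset into the "data" chunk, so one
    // resource could hold several samples separated by cue markers.
    #include <cstdint>

    struct CuePoint {
        uint32_t id;            // unique cue point ID (dwName)
        uint32_t position;      // play-order position (dwPosition)
        uint32_t data_chunk_id; // FOURCC of the chunk holding the audio, usually "data"
        uint32_t chunk_start;   // dwChunkStart, 0 for a single "data" chunk
        uint32_t block_start;   // dwBlockStart, 0 for uncompressed PCM
        uint32_t sample_offset; // frame where this cue point starts
    };

    struct CueChunk {
        uint32_t num_cue_points; // dwCuePoints, followed by that many CuePoint records
    };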

I’ve used the following script to generate sample data: wavfile.py (enhanced) · GitHub

  • then how do you see this “cue points” model fitting with the current soundfile semantics? Possibly by exposing each sub-part of the single WAV + cue points file through the “part” first input of the soundfile block-diagram?
  • can you possibly make the simple WaveReader model work, so that we have something to start with?

I would definitely try to get the simple reader usable soon. The cue point stuff can be added later and should be backwards compatible with loading whole files.

I think that we might encode this as a custom URL scheme, so that just using “foo.wav” would work for reading the whole file without cue points, while something like “owl://foo.wav/cue1” would be used for reading a specific sample part. This would allow us to keep compatibility with other platforms for the most common use case.

Then we can implement several use cases (a parsing sketch follows below):

  • load the whole data chunk as a sample (“url:foo.wav”)
  • load a single cue point as a sample (“url:owl://foo.wav/cue1”)
  • load all cue points as multiple samples (“url:owl://foo.wav/*”)

In the latter case we’d have to use one soundfile object per resource, but this would allow addressing each cue point individually with the part input like you’ve suggested.
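
To make the scheme concrete, here is a rough sketch of how the OWL side could split such a URL into a file name and a cue selector (the function, its name, and the “*” convention are hypothetical, and I’m assuming the url: metadata wrapping has already been handled by SoundUI):

    // Hypothetical parser for the proposed "owl://file.wav/cueN" scheme.
    // Anything without the "owl://" prefix is treated as a plain file name,
    // which keeps patches compatible with other platforms.
    #include <cstring>
    #include <cstddef>

    // Splits "owl://foo.wav/cue1" into file ("foo.wav") and cue ("cue1").
    // A plain "foo.wav" yields an empty cue selector (load the whole file);
    // a cue selector of "*" means "load every cue point as a separate part".
    static void parse_sample_url(const char* url,
                                 char* file, size_t file_len,
                                 char* cue, size_t cue_len)
    {
        static const char prefix[] = "owl://";
        cue[0] = '\0';
        const char* rest = url;
        if (strncmp(url, prefix, sizeof(prefix) - 1) == 0)
            rest = url + sizeof(prefix) - 1;              // skip the scheme
        const char* slash = (rest == url) ? nullptr : strrchr(rest, '/');
        if (slash) {
            size_t n = (size_t)(slash - rest);
            if (n >= file_len) n = file_len - 1;
            memcpy(file, rest, n);
            file[n] = '\0';
            strncpy(cue, slash + 1, cue_len - 1);
            cue[cue_len - 1] = '\0';
        } else {
            strncpy(file, rest, file_len - 1);
            file[file_len - 1] = '\0';
        }
    }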

OK, this makes sense, but if this model of multiple cue points is generic enough to be supported everywhere, then we can imagine having it on other platforms and finally removing the OWL-specific owl: coding in the label.
But let’s proceed step by step :wink:

Cue loading definitely could become an official feature. There’s a chicken and egg problem with cues in WAV files - the format itself supports this, but software authors don’t bother implementing it because it’s uncommon, while content creators don’t use it because it would be ignored by most parsers.

Also, rather than using a special scheme name, we could specify the cue as #cue at the end of the URL to mimic HTTP fragment syntax. But I agree that it’s too early to consider such details until basic loading is usable.

@sletz there are a few minor issues when using current soundfile code on OWL:

  1. Soundfile.h has try/catch blocks and we build code with exceptions disabled. This can be solved by a commonly used ugly hack in our arch file:

    #define try if(true)
    #define catch(x) if(false)

  2. It also uses throw in a few places in the WaveReader.h file. This is used in classes that we don’t actually need, so I’ve made an edited copy of that file. I’ll probably use it for adding more formats and maybe for experimenting with cue points, so this is not a big deal.

  3. Soundfile.h contains #include <iostream> that can’t be used on OWL (we don’t build with the full stdlib). It looks like nothing from that header is actually used in that file, and everything works if the include is disabled. Perhaps it’s just a leftover from an older version of the WaveReader code that could be removed upstream to resolve this remaining problem?

Possibly. Yes, please do a PR with any cleanup that can go upstream.

So I’ve got an early working version of the soundfile API; it will need some cleanup and minor finishing touches, but that won’t take much time to complete.

Performance is usable: 9% CPU load for raw playback, 20% for an interpolated loop on OWL2.

The original code was using a 16k size for empty buffers: faust/Soundfile.h at master-dev · grame-cncm/faust · GitHub

This caused the patch to run out of memory during initialization. This size is only used for empty buffers and is quite wasteful. Changing it to 1k leaves us with 2Mb used, which we can allocate. I’ve checked smaller sizes too; the minimal size that could be used was 27 bytes (requiring ~60kb of extra RAM). I think this crash could be related to parts of the soundfile being allocated in different regions (SRAM/SDRAM). I’ll be checking what exactly happens with a debugger, but my guess is that the Faust-generated code could be trying to access something at an invalid address in that situation. Or it could be an issue in OWL’s memory allocator.

@sletz, is there any particular reason for such a huge (by microcontroller standards) default size for the empty buffers?


No, this value was chosen quite arbitrarily. We can lower it. Send a PR when you have something working for you, so that I can check on other systems.

How many empty buffers are required, and why?

I’m fairly sure that this excessive memory usage is not required; it’s just an oversight that is not noticeable on bigger hardware.

Faust represents a soundfile object as a 2D array of 256 parts (i.e. files) × 64 channels. AFAICT, empty parts still each get a separate buffer allocation. As a worst-case scenario, loading 1 file per soundfile would create 255 empty buffers (of 16k each, currently).

Note that I’m using a slightly edited copy of their code in our integration, so I can alter it freely (the upstream code requires the C++ stdlib and uses exceptions). The plan is to use a pointer to a single empty object to avoid this useless allocation.

Also, Faust does have a global empty object that is used as a replacement in some cases (e.g. if loading fails). So I’ll make sure that this object and all unallocated parts point to a single empty buffer.
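
A minimal sketch of that idea (the structure below is illustrative, not the upstream Soundfile layout; only the sharing of a single zero buffer is the point):

    // Sketch: all unused parts/channels point at one small shared zero buffer
    // instead of each allocating their own 16k empty buffer.
    static float shared_empty_buffer[64] = {0}; // one block of silence

    struct SoundfilePart {
        float** channels; // per-channel sample data
        int length;       // length in frames
    };

    static void init_empty_part(SoundfilePart* part, float** channel_ptrs, int num_chan)
    {
        // Reading an unloaded part then safely returns silence without any allocation.
        for (int c = 0; c < num_chan; ++c) {
            channel_ptrs[c] = shared_empty_buffer;
        }
        part->channels = channel_ptrs;
        part->length   = (int)(sizeof(shared_empty_buffer) / sizeof(shared_empty_buffer[0]));
    }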

OK. Again, feel free to do a PR for all those modifications. If they are better than the original implementation, then there is no reason not to have them everywhere. Thanks.