DONE! A new Audio-Out Approach in BlitzMax FreeAudio RingBuffer

Started by Midimaster, March 22, 2021, 15:16:23


Midimaster

Final Version:

This worklog has a final solution in post #59. You can find it here:
https://www.syntaxbomb.com/index.php/topic,8377.msg347049187.html#msg347049187
The code and the examples work on both BlitzMax NG and BlitzMax 1.50.



Why needed?

As many people here are looking for a more direct approach to playing sounds in BlitzMax, I will try to write something simple like Java has:

Other languages do it

Java uses a FIFO buffer. You fill sample values (SHORTs) in at the top of this stack, and the system takes them away at the bottom. The whole thing works asynchronously. There is no need for a callback or for watching the state of the buffer.

If the buffer gets empty, the system plays SILENCE (as SHORTs with value=0 would). If you fill a lot into the buffer, it will not overrun, but grow. The only effect this has is a longer latency until the SHORTs reach the bottom.
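A minimal sketch of that behavior in BlitzMax (all names hypothetical, this is not Java's real API): Push() appends SHORTs and grows the storage instead of overrunning; Pull() returns 0, i.e. silence, when the buffer is empty.

Code (BlitzMax) Select

' Hypothetical sketch of the described FIFO behavior.
Type TSampleFifo
	Field data:Short[] = New Short[4096]
	Field head:Int, count:Int

	Method Push(value:Short)
		If count = data.length Then Grow()          ' grow instead of overrunning
		data[(head + count) Mod data.length] = value
		count :+ 1
	End Method

	Method Pull:Short()
		If count = 0 Then Return 0                  ' empty buffer plays silence
		Local v:Short = data[head]
		head = (head + 1) Mod data.length
		count :- 1
		Return v
	End Method

	Method Grow()
		Local bigger:Short[] = New Short[data.length * 2]
		For Local i:Int = 0 Until count
			bigger[i] = data[(head + i) Mod data.length]
		Next
		data = bigger
		head = 0
	End Method
End Type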

What is it good for?

This gives you the opportunity to manipulate sample values until the very last moment. E.g. in a synthesizer you would be able to adjust the sound color in realtime. For realtime you need to work with a very small buffer size. If you put only 441 SHORTs on top, they would need only 10msec to reach the bottom. For a user listening to it, this sounds like realtime. On a stage, 3 meters of distance also cause about 10msec of delay. Everything below 40msec (about 13 meters) can be considered realtime.

You could mix playback audio together with an incoming microphone signal. And because only 10msec elapse until the playback leaves the app, and the recording signal is already available 10msec later, the recording appears 20msec after the playback. Fast enough for the listener to perceive it as synchronous.
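The latency numbers above follow from simple arithmetic (speed of sound assumed at roughly 343 m/s):

Code (BlitzMax) Select

Local chunk:Int = 441                              ' SHORTs pushed at once
Local rate:Int = 44100                             ' samples per second
Local latency:Float = 1000.0 * chunk / rate        ' = 10.0 msec
Local distance:Float = 343.0 * latency / 1000.0    ' ~3.4 m of air travel in the same time
Print latency + " msec  ~  " + distance + " m on stage"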

How to do that?
There is an SDK, PortAudio, whose new version 19 offers exactly this approach. Years ago there was already a wrapper for PortAudio V17, done by Simon Armstrong.

But first we want to see if this ringbuffer can already be done with the "standard" audio SDK of BlitzMax: FreeAudio. In a future worklog I will investigate PortAudio.



Who wants to help?
Anybody who wants to help is welcome. I know a lot about music, but I never wrote a wrapper, and I do not really write C. So feel free to join this project.

Step 1 will be to publish the source codes and URLs with the information here.

Sources and information:

Official Port-Audio Homepage:
http://files.portaudio.com/docs/v19-doxydocs/index.html


C Sample code for the new approach without callback:
http://files.portaudio.com/docs/v19-doxydocs/paex__write__sine_8c_source.html


Simon's PortAudio.mod:
https://github.com/nitrologic/axe.mod/tree/master/portaudio.mod


...on the way to Egypt

Scaremonger

Hi,
After a recent post relating to ZXSpectrum sound emulation, I started researching exactly what you have suggested in your post and found that others had worked on it too:

https://mojolabs.nz/codearcs.php?code=3129
https://www.syntaxbomb.com/index.php?topic=5807.0

With BlitzMax NG now including support for SDL, this is also a possibility:

https://wiki.libsdl.org/Tutorials/AudioStream

I'm spending all my spare time on my Competition game entry at the moment but will help when I can.

Si...





Derron

As written to you in private messages already - I am not sure if it requires PortAudio.

FreeAudio has buffers you can write into - I am talking about TSound buffers which you play in a TChannel. You just need to wrap your "FIFO" approach around it, and it would not need many further adjustments.
I explained to you (the OP) that it does not matter what kind of audio data you provide - it might be a streamed audio file (or web stream), or dynamically created sounds.

Working with audio.soloud is a bit different, as they do not expose the buffer (at least I did not get it to work - but I raised an issue for it).



Why am I writing this?
If you used PortAudio, you would surely need to use it for all your audio aspects. You would need to keep it working with current BlitzMax NG versions as things get updated, etc. I would prefer to avoid this workload if possible, so I'd prefer to piggyback on an existing audio solution.


bye
Ron

Scaremonger

@Derron: I saw your post (2014) on one of those threads; it linked to GitHub:

https://github.com/GWRon/Dig/blob/3ea47f1e85680dc4403c490751fdd069a2d28047/base.sfx.soundstream.bmx

I've not had a chance to look at it yet, but would this still run under BlitzMaxNG?

Si...

Derron

Quote from: Scaremonger on March 22, 2021, 21:32:40
I've not had a chance to look at it yet, but would this still run under BlitzMaxNG?

https://github.com/TVTower/TVTower/blob/master/source/Dig/base.sfx.soundmanager.freeaudio.bmx
https://github.com/TVTower/TVTower/blob/master/source/Dig/base.sfx.soundmanager.freeaudio.c

should be the current version of it (I did not put all updates back into "Dig" yet :)).
I already linked this for Midimaster in our private/direct communication. The "c" file is just there so that "vanilla/legacy" has a refill-buffer thread at hand even when doing non-threaded builds (NG is threaded by default).

Placing streamed data in the buffer is nearly the same as placing "created" data in the buffer. Issues I had were that some information is not provided by "TSound" or "TChannel" (e.g. how long a channel has been playing). Also important: FreeAudio has a parameter to pass to TSound so that it stays "in memory" - you cannot pass this param in the TSound creation, so you need to do some "FreeAudio"-specific stuff explicitly.
I am saying this because it then needs a "switch" to support brl.freeaudio and audio.soloud (or others).


bye
Ron

GW

https://www.syntaxbomb.com/index.php/topic,5807.0.html

If your audio buffer is too big, you get a delay. If it's too small, you get stuttering because the sound driver is starved for input. Every computer and program will be different.
Try it with FreeAudio/Raylib/OpenAL first and then make adjustments from there.

Derron

How does the buffer define a delay???

This is only valid if you cannot alter the buffer. The buffer is like a ring - a so-called "ring buffer". Excuse my simple drawing now.




Your buffer data ends where it starts, so once the last bit is read, the first bit follows. The ring is your data. The red area is the part currently sent to your sound backend (DX, OSS, ALSA, PulseAudio ...), which sends it to your sound card / chip / ... / playback device.
The arrow shows that this red ring moves around the data buffer. The gray area (the rest of the ring) is where you can freely alter the content of the buffer.
Each time you send data out to the playback device, you move the "red ring" forward by the amount of sent data.

I am not sure how you will now get a delay (assuming you send out data often enough).
Stuttering could then be avoided by filling the buffer with "silence" after your dynamically generated data.
So if your "play the following tone" occupies a quarter of the ring (buffer), fill the other three quarters with silence.


Blue here is our freshly added data; gray is what you need to fill with "silence" (not just ignore).

So the only way you will then hear "stuttering" is:
- your "playback device feeder" (the one saying "DX, please play this stuff") is still handing out data from the ring buffer,
- and your "data creator" is no longer working (i.e. no longer changing the content of the buffer).
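The scheme in the drawings can be sketched in a few lines (names hypothetical): write the fresh data at the current write position with wraparound, then pad the rest of the ring with silence so an underrun never replays stale data.

Code (BlitzMax) Select

' Hypothetical sketch of the drawings above: write new data into the ring,
' then fill the remaining gray area with silence.
Function WriteToRing:Int(ring:Short[], writePos:Int, fresh:Short[])
	For Local i:Int = 0 Until fresh.length
		ring[(writePos + i) Mod ring.length] = fresh[i]    ' blue part: new data
	Next
	For Local i:Int = fresh.length Until ring.length
		ring[(writePos + i) Mod ring.length] = 0           ' gray part: silence
	Next
	Return (writePos + fresh.length) Mod ring.length       ' next write position
End Function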


Or am I missing something important here?

bye
Ron

Midimaster

Ok, I interrupted my study of using a FIFO stack to explore the direction Derron suggested: a "FreeAudio ringbuffer". I tried to play around with the possibilities of TSound and TAudioSample.

I created audio samples in a TAudioSample and loaded them into a TSound. It was as I expected: LoadSound() produces a real copy of the TAudioSample RAM for playing. So manipulating the TAudioSample RAM afterwards did not show any effect, and I see no way of accessing this TSound RAM.


As I have never used the FreeAudio functions directly, I do not know how to test whether FreeAudio has more capability. And the code Derron offers is too big for learning.

Can somebody show me in a dozen lines how to get write access to a currently playing sound in FreeAudio? I only need to find out the buffer's address. (I know how to code ringbuffers and the timing behavior once I have the access.)

Some questions: How do I get the FreeAudioDriver object? This is wrong:
Local Driver%=SetAudioDriver("FreeAudio" )
Local MySample:TAudioSample=CreateAudioSample(44100,44100,SF_MONO16LE)
Local NewSound:TFreeAudioSound = Driver.CreateSound( MySample,1)


...on the way to Egypt

iWasAdam

#8
FreeAudio doesn't have any of the realtime functions - you will need to go into the C++ code and rewrite/augment the entire subsystem.
This includes TSound, TChannel, TAudio, TAudioSample etc.

But...

There are two main issues you will have:
1. complexity - the more functions you add, the more complex the audio subsystem gets and the fewer users will understand how it all works - they can't even use the current system, and it's simple.
2. abstraction - to make it usable for end users you HAVE to write a proper editor. Just having a load of function calls is NOT enough. You need to wrap it all up into something visual that spits out the data the new audio system requires. And even then, about 70% of users won't understand a thing!
2a. structure - you will need to fully understand and create a base set of rules for any sound creation: VCA, LFO, ADSR, loops - and that is without adding bus FX.

4. The last thing will be a sequencer - because that is what people really want.

It's all perfectly doable - but the general timescale would be six months to a year for writing/testing and developing the initial editors, etc.

Now I'm going to include a few concept thoughts for you (these are real and not just Photoshop - so you get an idea of the undertaking).
This is a pure realtime approach:


For something simpler - just creating waveforms, etc.:



You will also need to think about data construction.
IMHO the best way is to have the following:
1. a sample stack - once loaded this data never changes (unless you are creating new samples internally)
2. a control system - this takes a sample as input and manipulates a new result using a standard set of instructions. This is where all the magic happens.
3. a channel system - this calls a single control and a sample and coordinates things.
4. the core - this handles the channels and any routing/FX, if this is programmed.

Currently NG has a basic sample stack and a basic channel architecture.


Oh, and one last thing...
Sample format. The best approach is to fix this as 16-bit stereo and/or mono,
and convert all incoming audio to this format.
You will have to duplicate the internal code (one path for mono and one for stereo) and handle each - unless you deal with just stereo; then you will have to work out how to pan in stereo, which is not as simple as it first looks.
But this is much simpler than having to do it four times with 8-bit and 16-bit!!

Derron

#9
Quote from: Midimaster on March 23, 2021, 10:35:35
I created audio samples in TAudioSample and did load them into TSound. It was as I expected: LoadSound() produces a real copy of the TAudioSample-RAM for playing it. So an afterwards manipulation of the TAudioSample-RAM did not show any effects. And I see no chance of accessing this TSound-RAM.


Code (BlitzMax) Select

Self.buffer = New TBank
Self.buffer.Resize( bufferLength * 4 ) 'SizeOf( Int(0) ) )
Local audioSample:TAudioSample = CreateStaticAudioSample(buffer.Lock(), GetBufferLength(), freq, format)

'driver specific sound creation
CreateSound(audioSample)
....


'=== CONTAINING FREE AUDIO SPECIFIC CODE ===
?bmxng
Method CreateSound:Byte Ptr(audioSample:TAudioSample)
?Not bmxng
Method CreateSound:Int(audioSample:TAudioSample)
?
'not possible as "Loadsound"-flags are not given to
'fa_CreateSound, but $80000000 is needed for dynamic sounds
Rem
$80000000 = "dynamic" -> dynamic sounds stay in app memory
sound = LoadSound(audioSample, $80000000)
endrem

'LOAD FREE AUDIO SOUND
'?bmxng
Local fa_sound:Byte Ptr = fa_CreateSound( audioSample.length, bits, channels, freq, audioSample.samples, $80000000 )
'?Not bmxng
' Local fa_sound:Int = int(fa_CreateSound( audioSample.length, bits, channels, freq, audioSample.samples, $80000000 ))
'?
'"audioSample" is ignored in the module, so could be skipped
'sound = TFreeAudioSound.CreateWithSound( fa_sound, audioSample)
sound = TFreeAudioSound.CreateWithSound( fa_sound, Null)
End Method


The important part is the creation of a TAudioSample with already allocated memory. "$80000000" is needed to make the "sound" stay "in memory". You cannot pass it to "TSound" generically (e.g. for Soloud). You need to work with the "fa_" stuff (FreeAudio).


so cut down:
Code (BlitzMax) Select

Local format:Int = SF_STEREO16LE
Local bits:Int = 16
Local freq:Int = 44100
Local channels:Int = 2

Local bufferLength:Int = 1024
Local audioBufferBank:TBank = New TBank
audioBufferBank.Resize( bufferLength * 4 ) 'SizeOf( Int(0) ) )

Local audioSample:TAudioSample = CreateStaticAudioSample(audioBufferBank.Lock(), bufferLength, freq, format)
Local fa_sound:Byte Ptr = fa_CreateSound( audioSample.length, bits, channels, freq, audioSample.samples, $80000000 )
Local sound:TSound = TFreeAudioSound.CreateWithSound( fa_sound, Null)


That way you should be able to access the ringbuffer in audioBufferBank via helper methods, or as a byte block with audioBufferBank.Lock().
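To illustrate (a sketch only, not tested): once the bank is created as above, refilling could mean poking SHORTs directly into the locked memory - 2 bytes per sample, 2 channels per frame for SF_STEREO16LE, which matches the `bufferLength * 4` resize above.

Code (BlitzMax) Select

' Sketch: write a test tone into the locked bank memory.
Local buf:Short Ptr = Short Ptr(audioBufferBank.Lock())
For Local i:Int = 0 Until bufferLength
	' BlitzMax Sin() takes degrees: 360 * 440 / 44100 degrees per sample ~ 440 Hz
	Local value:Short = Short(Sin(i * 360.0 * 440.0 / 44100.0) * 8000)
	buf[i * 2]     = value   ' left channel
	buf[i * 2 + 1] = value   ' right channel
Next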



@ IWasAdam
Your knowledge is mind-blowing - as always. My code just allows playing back custom/generated data. Yours is more of a generator itself already. This is a can of worms I hopefully will not have to open :)



Edit: Once you get the ringbuffer "manipulatable" ... you would just put your "FIFO" stack on top of it. You add data, and the buffer refiller takes as much data from it as possible (as much as fits into the ring buffer). If the stack is empty, it fills with "silence" (but does not alter the "last valid added data" position - so you can add to the stack again and it resumes filling, if desired).
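As a sketch (all names hypothetical), such a refiller could look like this - assuming some FIFO object whose Pull() returns 0 (silence) when empty, as described in the opening post:

Code (BlitzMax) Select

' Minimal stand-in for the FIFO described in the opening post.
Type TFifoStub
	Field pending:Short[]
	Field pos:Int
	Method Pull:Short()
		If pos >= pending.length Then Return 0   ' empty -> silence
		pos :+ 1
		Return pending[pos - 1]
	End Method
End Type

' Hypothetical refiller: move `amount` samples from the FIFO into the ring;
' when the FIFO runs dry, Pull() keeps delivering silence.
Function RefillRing:Int(ring:Short[], writePos:Int, amount:Int, stack:TFifoStub)
	For Local i:Int = 0 Until amount
		ring[(writePos + i) Mod ring.length] = stack.Pull()
	Next
	Return (writePos + amount) Mod ring.length
End Function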

bye
Ron

Midimaster

@iWasAdam

I was also convinced that FreeAudio would not help me in my quest, but Derron told me that there is a usable approach to do this with FreeAudio. Still, I would prefer an audio SDK which offers a FIFO buffer. It is easy to use: I can push varying amounts of SHORTs into it, and there is no need to keep an exact timing for this pushing.



For me, a ringbuffer for audio has the charm of a racing course where a sports car turns its laps while I am forced to repaint the center lines on the track.

I am not trying to write any synthesizer software; I only gave this example so that other users understand that I am not working with existing sounds.


I also do not want to write a new mod which other users can use. I'm only interested in one single function: sending 16-bit unsigned 48kHz sample values to an audio device. The sample values need to react to the incoming OpenAL stream and need to be played within 20-40msec after the signal comes in.

So may I ask how you managed this approach? It looks like you also work on music software - that's also my business. Did you already have a look at my 20-track player for practicing for orchestra rehearsals?
https://20tracks.org/ or Google Playstore: https://play.google.com/store/apps/details?id=midimaster.twentytrackd

Here I push 441 SHORTs onto a Java-based audio stack every 20msec, while the audio device fetches 441 SHORTs in the same time. In the middle of the buffer there are another 441 SHORTs, so as not to risk crackles.

I heard about the new PortAudio V19, which now offers a FIFO approach next to the previous CALLBACK approach.
But for BlitzMax there only exists a wrapper for the old V18.1.
...on the way to Egypt

Derron

So your application already connects to OpenAL. I did not check yet whether you can use FreeAudio "coexisting" with it, but it should work - if you only use FreeAudio for "output" and OpenAL for "input" (the brl.audio system does not handle "input", so it should be doable).


Using the code I posted above should be easy to "try out" ... and if it is not feasible you can always go back to wrapping a complete library - I just thought I could help you avoid this big pile of workload.


bye
Ron

Midimaster

#12
Quote from: Derron on March 23, 2021, 12:16:20...so cut down:

Derron, you speak in riddles to me...

When I do this little code:
Code (BlitzMax) Select
Graphics 800,600
Local AudioSample:TAudioSample = CreateAudioSample(44100, 44100, SF_MONO16LE)
For Local i%=0 To 44100-1
AudioSample.samples[i]=Sin(i)*1000.0
Next
Local sound:TSound =LoadSound(AudioSample)
PlaySound Sound
Repeat
Flip 1
Until AppTerminate()

...it works as expected.

Now to your code:
Code (BlitzMax) Select
Graphics 800,600
Local audioSample:TAudioSample = CreateAudioSample(44100, 441000, SF_MONO16LE)
For Local i%=0 To 44100-1
AudioSample.samples[i]=Sin(i)*1000.0
Next

Local fa_sound:Byte Ptr = fa_CreateSound( audioSample.length, 16, 1, 44100, audioSample.samples, $80000000 )
Local sound:TSound = TFreeAudioSound.CreateWithSound( fa_sound, Null)

PlaySound Sound
Repeat

Until AppTerminate()

...does not play anything. What did I forget?

Additionally using the TBanks would make things more complicated, but still not working, right?



...on the way to Egypt

iWasAdam

@ MidiMaster
20 tracks sounds a lot like a STEM mixer - which is great :)

I went directly into the BlitzMax mods and began working out how it all worked and how to modify it. Internally, FreeAudio uses a ring buffer (I think most audio systems do at their base), but it has a sort of queuing voice architecture that freeaudio, audio, channel and sample all link into - so you can't modify one without the results being passed along. All the base code is in C++, so you will need to think in C++, write wrappers around it, and interface with the mod structure to get it all working. Hence there is no NG version, as I didn't want to go through all the trouble of rewriting everything.

So it would be better to have (as you rightly said) something that sits above all that and just goes for the hardware.

A simple ring buffer is the way to go. You can directly output 32 voices with no issues, beyond needing very tight code. The trick here is to only use half your volume in any sample; that way you can simply mix them together and the results won't 'clip' ;)

It's why I suggest keeping your voice architecture separate from the sample bank and the control bank.
That way you could pick a 'free' voice, load it with a control and a sample, and you have a single voice outputted.

If you go down the road of realtime manipulation, I would suggest only using SHORTs as the output, but floats as the internal storage with 0 being the mid point. This gives you much better fidelity, but also allows you to scale/volume and mix just by doing:
sample*0 = no volume
sample*1 = full volume
sample = sampleA + sampleB, etc.
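A sketch of that idea (hypothetical names): floats internally, SHORTs only at the output, with clamping so the conversion never wraps.

Code (BlitzMax) Select

' Internal storage: Float with 0.0 as mid point, -1.0 .. +1.0 full scale.
Function MixToOutput:Short(sampleA:Float, sampleB:Float, volume:Float)
	Local mixed:Float = (sampleA + sampleB) * volume   ' mix, then scale
	If mixed > 1.0 Then mixed = 1.0                    ' clamp so the Short cast never wraps
	If mixed < -1.0 Then mixed = -1.0
	Return Short(mixed * 32767)                        ' only the output is 16-bit
End Function

With samples kept at half amplitude (as suggested above), summing two voices stays within -1.0 .. +1.0 and the clamp rarely triggers.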


iWasAdam

The best way would be to create your own (simple) format that you can test and display, and start from there ;)

Don't expect anyone to write your editor for you - they won't - hehehe