Raw Waveform display/processing (notes + code of failed attempt) #398

Open · teadrinker opened this issue Mar 20, 2024 · 4 comments

Comments

@teadrinker
Contributor

I thought it would be nice to have access to the raw waveform, both for processing (to generate sync events) and for display, feeding waveform data into 3D points and other fun stuff. However, I failed. (This was around Sept/Oct 2023.)

I suspect the Bass library might not be able to do this properly: you can pull waveform data, but you cannot align it relative to the data you pulled previously.

But it might also just be that I totally messed up somewhere...

I'll dump the code here in case it is useful:

Core/Audio/AudioAnalysis.cs

    public static readonly int WaveBufferLength = 44100 * 2 * 4; // 4 seconds (if format is 44100 stereo)
    /// <summary>
    /// circular buffer containing stereo waveform
    /// </summary>
    public static readonly float[] WaveBuffer = new float[WaveBufferLength]; 
    public static int WaveBufferPos = 0; // circular buffer end

    public static int WaveSourceReadFloatCount = 8192 * 2;     // if framerate drops below ~5 fps (for 44100 stereo), WaveBuffer will not be correct
    public static long WaveSourcePos = -1;
    //public static double WaveSourcePosSeconds = -1.0; 
    public static readonly float[] WaveTmpReadBuffer = new float[WaveSourceReadFloatCount];
    public static System.Collections.Generic.List<string> debugSizes = new();

In Core/Audio/AudioEngine.cs, I created UpdateWaveBuffer, which is called right after UpdateFftBuffer and uses the same arguments:

private static void UpdateWaveBuffer(int soundStreamHandle, Playback playback)
{
    if (playback.Settings != null && playback.Settings.AudioSource == PlaybackSettings.AudioSources.ProjectSoundTrack)
    {
        Bass.ChannelGetInfo(soundStreamHandle, out ChannelInfo info);
        var channels = info.Channels;
        var monoIsZeroMultiChannelIsOne = channels > 1 ? 1 : 0;

        long posInBytes = Bass.ChannelGetPosition(soundStreamHandle, PositionFlags.Bytes);
        long newPosSamples = (long) (Bass.ChannelBytes2Seconds(soundStreamHandle, posInBytes) * (double)info.Frequency);
        int diffInSamples = (int)(newPosSamples - AudioAnalysis.WaveSourcePos);
        bool hadValidSourcePos = AudioAnalysis.WaveSourcePos != -1;
        bool validDiff = hadValidSourcePos && diffInSamples > 0 && diffInSamples < AudioAnalysis.WaveSourceReadFloatCount / channels;
        AudioAnalysis.WaveSourcePos = newPosSamples;

        if (!validDiff)
            diffInSamples = 0;

        // Update circular buffer position
        AudioAnalysis.WaveBufferPos += diffInSamples * 2; // * 2 for stereo
        AudioAnalysis.WaveBufferPos %= AudioAnalysis.WaveBufferLength; // keep within bound

        // Bass.ChannelGetData gets data for forward in time,
        // since we don't know how long the next frame will be,
        // we always need to get enough bytes for worst case scenario to avoid gaps
        var byteCountToRead = AudioAnalysis.WaveSourceReadFloatCount * sizeof(float);
        var actualBytesRead = Bass.ChannelGetData(soundStreamHandle, AudioAnalysis.WaveTmpReadBuffer, (int)(DataFlags.Float) | byteCountToRead);

        var samplesRead = actualBytesRead / (sizeof(float) * channels);

#if DEBUG_WAVEFORM
        AudioAnalysis.debugSizes.Add("\n" + (byteCountToRead - actualBytesRead) + " " + byteCountToRead + " " + actualBytesRead+ " samplesRead:" + samplesRead + " diffInSamples:" + diffInSamples + " channels:" + channels + " freq:" + info.Frequency);
#endif                
        var wpos = AudioAnalysis.WaveBufferPos;

        // todo: avoid overwriting stuff we saved from last frame
        //int start_i = validDiff && byteCountToRead == actualBytesRead ? samplesRead - diffInSamples : 0;
        int start_i = 0;
        for (int i = start_i; i < samplesRead; i++)
        {
            AudioAnalysis.WaveBuffer[wpos    ] = AudioAnalysis.WaveTmpReadBuffer[i * channels];
            AudioAnalysis.WaveBuffer[wpos + 1] = AudioAnalysis.WaveTmpReadBuffer[i * channels + monoIsZeroMultiChannelIsOne];
            wpos += 2;
            if (wpos >= AudioAnalysis.WaveBufferLength)
            {
                wpos = 0;
#if DEBUG_WAVEFORM
                var tmp = new List<string>();
                for(var j = 0; j < 3*44100; j++)
                    tmp.Add("" + AudioAnalysis.WaveBuffer[j*2]);
                File.WriteAllText("C:\\_UnityProj\\tmpData.txt", string.Join(",", tmp ));
                File.WriteAllText("C:\\_UnityProj\\tmpInfo.txt", string.Join(",", AudioAnalysis.debugSizes));
#endif
            }
        }
    }
}
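For context (not part of the original dump): if UpdateWaveBuffer worked as intended, a consumer such as a waveform display would read the most recent samples backwards from WaveBufferPos. A rough sketch of such a reader, using only the AudioAnalysis fields from the first snippet; the helper name is made up here:

    /// <summary>
    /// Copy the most recent sampleCount left-channel samples (ending at
    /// WaveBufferPos) out of the circular stereo buffer, oldest first.
    /// </summary>
    public static void CopyRecentLeftChannel(float[] target, int sampleCount)
    {
        // WaveBuffer is interleaved stereo, so one sample frame occupies 2 floats
        var start = AudioAnalysis.WaveBufferPos - sampleCount * 2;
        start = ((start % AudioAnalysis.WaveBufferLength) + AudioAnalysis.WaveBufferLength)
                % AudioAnalysis.WaveBufferLength; // wrap a negative offset back into the buffer

        for (int i = 0; i < sampleCount; i++)
        {
            target[i] = AudioAnalysis.WaveBuffer[(start + i * 2) % AudioAnalysis.WaveBufferLength];
        }
    }

For example, CopyRecentLeftChannel(displayBuffer, 44100) would give the last second of the left channel at 44.1 kHz (as long as sampleCount * 2 stays below WaveBufferLength).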
@teadrinker
Contributor Author

A simple solution would just be to pull all samples from the Project Sound Track up front and keep them globally (see the sketch after this list).
Downsides:

  • Large memory use for lengthy tracks
  • Only works for Project Sound Track, not streams
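A sketch of that approach (written for this comment, not code from the repo, and untested): open a second, non-playing decode stream for the same file with BassFlags.Decode | BassFlags.Float and drain it once at load time; afterwards the playback position of the normal stream can index straight into the decoded array.

    // Hypothetical helper, assuming ManagedBass: decode an entire audio file into
    // one float[] so samples can later be addressed by playback position.
    public static float[] DecodeEntireFile(string path, out ChannelInfo info)
    {
        var decodeHandle = Bass.CreateStream(path, 0, 0, BassFlags.Decode | BassFlags.Float);
        Bass.ChannelGetInfo(decodeHandle, out info);

        var totalBytes = Bass.ChannelGetLength(decodeHandle); // decoded length in bytes
        var samples = new float[totalBytes / sizeof(float)];  // interleaved channels

        var offset = 0;
        var chunk = new float[8192];
        while (true)
        {
            var bytesRead = Bass.ChannelGetData(decodeHandle, chunk,
                                                (int)DataFlags.Float | (chunk.Length * sizeof(float)));
            if (bytesRead <= 0)
                break; // end of stream (or error)

            var floatsRead = bytesRead / sizeof(float);
            Array.Copy(chunk, 0, samples, offset, floatsRead);
            offset += floatsRead;
        }

        Bass.StreamFree(decodeHandle);
        return samples;
    }

The current sample-frame index would then be roughly Bass.ChannelGetPosition(playingHandle) / (sizeof(float) * info.Channels), assuming the playing stream is also a float stream. Memory use is the known downside: about 20 MB per minute of 44.1 kHz stereo float data.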

@pixtur
Collaborator

pixtur commented Mar 27, 2024

Interesting. I was using a similar approach by serializing the result of the FFT as JSON.
I'm honestly not sure if processing the waveform directly on the fly would be fast enough in C#. But I'm not that much into audio. Maybe @HolgerFoerterer has an idea how to do this.

@HolgerFoerterer
Contributor

To get sample-precise output for video rendering, I spent a lot of time trying to convince Bass to switch from its real-time-based approach to another mode where I could access buffered data in a consistent way. To be honest, I failed too. Whatever I did... whenever I repositioned the playback in any way... things screwed up. So at the moment, I position the playback at the very beginning of the recording and avoid repositioning during the render.

So yes, you should theoretically be able to obtain buffered data by using a comparable approach. At least for FFT data there is a flag to fill the FFT without consuming new data.

But in a real-time scenario, don't expect buffered data to be exactly the same every time. Bass will apparently play back faster/slower and even skip as it sees fit to keep sync, and I don't know how to align that data then. When we get new data, it's obviously current, but that seems to be all we know.

And to answer the question by @pixtur: C# should be able to handle manual processing of stereo samples at 44.1-48 kHz easily.
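For a rough sense of scale (illustration added here, not from the thread): at 60 fps a 48 kHz stereo stream delivers only about 800 sample frames, i.e. 1600 floats, per video frame, so even a naive per-sample loop is negligible:

    // Rough illustration of per-frame cost: scanning one video frame's worth of
    // interleaved stereo samples (48000 Hz / 60 fps = 800 frames = 1600 floats).
    public static (float Rms, float Peak) AnalyzeBlock(float[] interleavedStereo)
    {
        double sumOfSquares = 0;
        float peak = 0;
        foreach (var s in interleavedStereo)
        {
            sumOfSquares += s * s;
            peak = Math.Max(peak, Math.Abs(s));
        }
        var rms = (float)Math.Sqrt(sumOfSquares / Math.Max(1, interleavedStereo.Length));
        return (rms, peak);
    }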

@teadrinker
Contributor Author

"whenever I repositioned the playback in any way... things screwed up"

I suspect it might not be possible with the current API, due to the audio running in another thread.
You'd need a function that gives you the data AND the position in the same API call, otherwise there is no guarantee that they are in sync.
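One possible way around this (just an idea, untested against this codebase): BASS DSP callbacks hand over each block of sample data as it flows through the channel, so the blocks are contiguous by construction and there is no separate position query to drift out of sync. With ManagedBass this would look roughly like the sketch below; AttachWaveformTap is a made-up name.

    // Sketch only: capture contiguous sample data via a BASS DSP callback instead of
    // polling Bass.ChannelGetData from the render thread.
    // Requires ManagedBass and System.Runtime.InteropServices.Marshal.
    private static DSPProcedure _dspProc; // keep a reference so the delegate isn't GC'd

    public static void AttachWaveformTap(int soundStreamHandle)
    {
        _dspProc = (handle, channel, buffer, lengthInBytes, user) =>
        {
            var floatCount = lengthInBytes / sizeof(float);
            var samples = new float[floatCount];
            Marshal.Copy(buffer, samples, 0, floatCount); // channel must be a float stream

            // 'samples' is the next contiguous block of interleaved audio.
            // Append it to a ring buffer here; note this runs on BASS's own thread,
            // so shared state needs a lock or a lock-free queue.
        };

        Bass.ChannelSetDSP(soundStreamHandle, _dspProc);
    }

Caveat: the DSP sees data slightly ahead of what is actually audible (by the playback buffer length), so display code would still need to compensate for that fixed offset.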
