docs: Add initial proposal for V2 recording & playback API. (WIP) #791

microbit-carlos · 2023-05-15T10:08:11Z

This initial proposal has been discussed in:

Recording/Getting samples: interaction with CODAL for sound recording microbit-foundation/micropython-microbit-v2#49

But we have some open questions that will likely result in a rework of some of this.

Initial proposal

The initial proposal in this PR was to create a new AudioBuffer class to contain the audio data and sampling rate.
The AudioBuffer.rate property could then be used by microphone.record() and audio.play() to configure recording and playback rates.
This was done to avoid introducing a new parameter to audio.play() to configure the sampling rate, when it could only work with a single type of sound input (as it might not be possible to change the rate of the SoundExpressions or AudioFrames).

Disadvantages

However, changing the rate in a buffer type to change the playback rate in real-time is a bit awkward:

my_recording = audio.AudioBuffer(duration=5000, rate=5500)
microphone.record_into(my_recording)
audio.play(my_recording, wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    my_recording.rate = scale(x, (-1000, 1000), (2250, 11000))
    sleep(50)
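
To make the mapping in the loop above concrete, here is a minimal plain-Python sketch of the linear range mapping that the `scale()` call is assumed to perform. `scale_value` is a hypothetical stand-in written for illustration, not the firmware implementation, and the accelerometer readings are made-up sample values.

```python
def scale_value(value, from_range, to_range):
    """Linearly map value from from_range onto to_range.

    Hypothetical stand-in for the scale() call used in the proposal.
    """
    from_min, from_max = from_range
    to_min, to_max = to_range
    ratio = (value - from_min) / (from_max - from_min)
    return to_min + ratio * (to_max - to_min)


# Made-up accelerometer x readings (tilted left, flat, tilted right).
for x in (-1000, 0, 1000):
    rate = scale_value(x, (-1000, 1000), (2250, 11000))
    print(x, "->", rate)
```

With the board held flat (x near 0) the rate lands in the middle of the range, so tilting either way slows down or speeds up playback.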

An alternative we considered was to have the playback sampling rate modified via the audio module itself:

audio.play(my_recording, wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    audio.set_rate(scale(x, (-1000, 1000), (2250, 11000)))
    sleep(50)

However, this would have to set the same rate for everything played via the audio module, and Sound Expressions have a different default rate (44K) than recordings (11K). So audio.set_rate(22000) would slow down Sound Expressions and speed up recordings.
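
The mismatch can be checked with a small calculation. This is only a sketch: `speed_factor` is a hypothetical helper, and the two native rates are the rounded "44K" and "11K" figures quoted above.

```python
def speed_factor(playback_rate, native_rate):
    """Return how much faster (>1) or slower (<1) a sound plays back."""
    return playback_rate / native_rate


SOUND_EXPRESSION_RATE = 44_000  # "44K" quoted above
RECORDING_RATE = 11_000         # "11K" quoted above

# One shared rate of 22_000 affects the two sources in opposite directions.
print(speed_factor(22_000, SOUND_EXPRESSION_RATE))  # 0.5, half speed
print(speed_factor(22_000, RECORDING_RATE))         # 2.0, double speed
```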

Alternatively, if we wanted to change the playback rate via the audio module, we could set a ratio instead, something equivalent to audio.set_speed(100%) (with different semantics). But a disadvantage would be that it removes some of the math/physics learning opportunity of directly relating the sampling rate value to the effect it has on playback speed.

Alternative proposal: bytearray as the buffer type

In this case a byte array would be returned by microphone.record() and used with microphone.record_into().

As this data type does not include info about the rate, we depend on audio.play() gaining an extra argument that might not work with other sound types like Sound Expressions and Audio Frames.

However, we still have the issue of updating the playback rate in real time during playback, which means we might have to use a similar approach to the previously mentioned audio.set_speed(100%):

sound_in_byte_array = microphone.record(duration=3000, rate=5500)
audio.play(sound_in_byte_array, rate=5500, wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    audio.set_speed(scale(x, (-1000, 1000), (50, 200)))
    sleep(50)

DURATION_SECONDS = 3
SAMPLE_RATE = 5500
recording = bytearray(DURATION_SECONDS * SAMPLE_RATE)
microphone.record_into(recording, rate=SAMPLE_RATE)
audio.play(recording, rate=SAMPLE_RATE)
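
The relationship between buffer length, sampling rate and duration in the snippet above can be sketched as follows. It assumes one byte per sample, which is what `bytearray(DURATION_SECONDS * SAMPLE_RATE)` implies; both helper names are hypothetical.

```python
def buffer_size_bytes(duration_ms, rate_hz):
    """Bytes needed for duration_ms of audio, assuming one byte per sample."""
    return (duration_ms * rate_hz) // 1000


def playback_duration_ms(buffer, rate_hz):
    """How long the whole buffer takes to play at rate_hz."""
    return (len(buffer) * 1000) // rate_hz


recording = bytearray(buffer_size_bytes(3000, 5500))
print(len(recording))                          # 16500 bytes for 3 s at 5500 Hz
print(playback_duration_ms(recording, 5500))   # 3000 ms at the recording rate
print(playback_duration_ms(recording, 11000))  # 1500 ms at double the rate
```

Because the byte array carries no rate of its own, the same buffer plays for a different length of time depending on the rate passed at playback.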

Alternative proposal: AudioFrames as the buffer type

This would be the same as the bytearray proposal, but using the existing AudioFrames instead.

We might need to tweak the AudioFrame class to let us use larger buffers, as the default is 32 samples. As audio.play() can consume an iterable as well, we would need to figure out a good balance between AudioFrame size and number of AudioFrames in a recording buffer.
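
To get a feel for that balance, this sketch counts how many frames a recording would need for a given frame size. `frames_needed` is a hypothetical helper, and the 512-sample frame is an arbitrary example rather than a proposed value.

```python
import math

FRAME_SAMPLES = 32  # default AudioFrame size mentioned above


def frames_needed(duration_ms, rate_hz, frame_samples=FRAME_SAMPLES):
    """Number of frames required to hold duration_ms of audio at rate_hz."""
    total_samples = duration_ms * rate_hz // 1000
    return math.ceil(total_samples / frame_samples)


print(frames_needed(3000, 5500))       # 516 frames of 32 samples
print(frames_needed(3000, 5500, 512))  # 33 frames of 512 samples
```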

microbit-carlos · 2023-08-03T17:34:03Z

Based on the latest discussion we have agreed that we'd prefer to avoid introducing a new data type for this feature.
In that case we have two options:

Use a byte array and change sampling via function in the audio module

sound_in_byte_array = microphone.record(duration=3000, rate=5500)
audio.play(sound_in_byte_array, wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    playback_sampling_rate = scale(x, (-1000, 1000), (2_200, 11_000))
    audio.set_rate(playback_sampling_rate)
    sleep(50)

One previous suggestion was to create a function along the lines of audio.set_speed() that used a percentage value (0 to 100) as the input range. This could solve the issue with different pipelines in the sound mixer having different playback sampling rates; however, we believe it's important to use real numbers to be able to directly compare and understand how changing the sampling rate during recording vs playback affects sound.

To be able to implement something like audio.set_rate() we would probably need to ensure everything that is inputted to audio.play() uses the same default sampling rate. Right now these are the types of input audio.play() takes:

Sound effects via user-created audio.SoundEffect() instances
- CODAL SoundExpressions, default sampling rate: 44_100
Built-in sounds via microbit.Sound pre-generated instances
- CODAL SoundExpressions, default sampling rate: 44_100
audio.AudioFrame
- MicroPython data type, default sampling rate: 8_000?
  - Not sure about this one, the docs list a single frame of 32 samples as taking a bit over 4 ms.
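
A quick arithmetic check, as a sketch rather than anything taken from the firmware: a 32-sample frame at exactly 8_000 Hz would last 4.0 ms, whereas "a bit over 4 ms" matches a rate slightly below 8 kHz, such as the 7812 Hz figure that appears in the docs diff later in this thread. `frame_duration_ms` is a hypothetical helper.

```python
def frame_duration_ms(samples, rate_hz):
    """Time in milliseconds taken to play `samples` samples at rate_hz."""
    return samples * 1000 / rate_hz


print(frame_duration_ms(32, 8000))  # exactly 4.0 ms
print(frame_duration_ms(32, 7812))  # a bit over 4 ms
```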

We'd have to check how it'd affect sound quality, but we could consider decreasing the sampling rate for SoundExpressions to 11K. However, we would still have the issue that we won't be able to increase the default AudioFrame sampling rate, as that would change how old programmes sound.

While this would be the cleanest way to do this for the user API, I'm not seeing a way in which it can be achieved with the current requirements. Does anybody have any ideas to overcome these issues?

Expand AudioFrame to include sampling rate

This option is similar to the original proposals about creating a new AudioBuffer data type, but instead we can expand AudioFrames to be able to have different sizes and different sampling rates.

By default they should still behave as they do in micro:bit V1 (and older MicroPython versions for V2), which is 32 samples at 8K (?) sampling rate.

But this could be changed via constructor parameters to have any size buffer at any sample rate.

Unfortunately, that still leaves us with the awkward case of changing playback sampling rate by changing a variable from the samples class, instead of a method in the audio module.

my_recording = audio.AudioFrames(length=11000, rate=5500)
my_recording = microphone.record_into(my_recording)
# Or
my_recording = microphone.record(duration=2000, rate=5500)

audio.play(my_recording, wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    my_recording.rate = scale(x, (-1000, 1000), (2250, 11000))
    sleep(50)

microbit-carlos · 2023-08-23T08:17:45Z

@dpgeorge the docs have been updated, let me know if something doesn't match our previous conversation.

microbit-carlos · 2023-09-13T09:42:08Z

@dpgeorge there are a couple of issues related to setting the sampling rate, but they shouldn't affect the MicroPython implementation and should be fixed (without changes in the API) in the next CODAL release: https://github.com/lancaster-university/codal-microbit-v2/issues?q=is%3Aopen+milestone%3Av0.2.60+label%3Ap0

dpgeorge · 2023-10-16T10:53:27Z

docs/audio.rst

@@ -69,6 +72,13 @@ Functions

    Stops all audio playback.

+.. py:function:: set_rate(sample_rate)


Perhaps this could be made into a static-method of the AudioFrame class? That way the scope of it is clear, it only acts on these objects. And it would then be possible to write:

AudioFrame.set_rate(8000)
...
my_frame = AudioFrame(...)
my_frame.set_rate(8000)

So, as a static method that would affect all AudioFrame playback instead of individual instances?

I agree that having it as part of AudioFrame makes it a lot more clear that it only affects playback of this type, but might be confusing when accessing the function via instances:

frame_one = AudioFrame(...)
frame_two = AudioFrame(...)
frame_one.set_rate(8000)
frame_two.set_rate(8000)
...
while audio.is_playing():
    frame_one.set_rate(scale(accelerometer values...))
# At this point, I'd expect playback for frame_two to still be 8000?
audio.play(frame_two)

Would it be more intuitive if each instance held its own playback rate?

> So, as a static method that would affect all AudioFrame playback instead of individual instances?

Yes, as a static method calling it once affects all AudioFrame instances.

If we went this way, probably best to document it as AudioFrame.set_rate(...) and never refer to it as frame.set_rate(...). Then it's clear it sets the global rate for all audio frames.

> Would it be more intuitive if each instance held its own playback rate?

I'm not sure... maybe? But then it would be an instance method and you would always call frame.set_rate(...) to set the playback rate of that instance.

I think that's possible to implement.
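
The two behaviours being weighed here can be sketched in plain Python. Both classes are hypothetical mocks written for illustration and are not the AudioFrame implementation; the 8000 default simply mirrors the value used in the examples above.

```python
class GlobalRateFrame:
    """Hypothetical mock: one playback rate shared by every frame."""

    _rate = 8000

    @staticmethod
    def set_rate(rate):
        GlobalRateFrame._rate = rate

    def playback_rate(self):
        return GlobalRateFrame._rate


class PerInstanceRateFrame:
    """Hypothetical mock: each frame holds its own playback rate."""

    def __init__(self, rate=8000):
        self._rate = rate

    def set_rate(self, rate):
        self._rate = rate

    def playback_rate(self):
        return self._rate


frame_one, frame_two = GlobalRateFrame(), GlobalRateFrame()
frame_one.set_rate(4000)          # called via an instance, changes the shared value
print(frame_two.playback_rate())  # 4000, which may surprise the user

frame_one, frame_two = PerInstanceRateFrame(), PerInstanceRateFrame()
frame_one.set_rate(4000)
print(frame_two.playback_rate())  # 8000, frame_two is unaffected
```

The first case shows why documenting the static form only as `AudioFrame.set_rate(...)` matters: calling it through an instance looks local but acts globally.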

dpgeorge · 2023-10-16T10:54:26Z

docs/audio.rst

 AudioFrame
 ==========

 .. py:class::
-    AudioFrame
+    AudioFrame(size=32)


Perhaps this could be extended to take optional keyword arguments duration and rate (with a default), to make it easy to create an AudioFrame of a given duration.

I think in this case, similar to the previous comment about having an AudioFrame.set_rate(), if rate was added to the constructor then we'd need to have set_rate() as a method changing each instance value, instead of a class static function.

I do like the idea of making the arguments of the AudioFrame constructor simpler to work out if the user just wants a specific "size in time".

What would happen if the constructor is provided an argument for size and for duration? Would that throw an exception?

Note that there's a difference between record and play rates. The rate here is related to the recording rate, although the actual recording rate is set when you call microphone.record_into(). And that's different again to the playback rate set by .set_rate().

> What would happen if the constructor is provided an argument for size and for duration? Would that throw an exception?

Yes it could throw an exception.
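
One possible way to resolve the constructor arguments, as a sketch under assumptions: `frame_size` is a hypothetical helper, `duration` is taken to be in milliseconds as in the `record()` examples, and `ValueError` is an assumed choice of exception.

```python
def frame_size(size=None, duration=None, rate=7812):
    """Return the number of samples to allocate (hypothetical resolution logic)."""
    if size is not None and duration is not None:
        raise ValueError("specify either size or duration, not both")
    if duration is not None:
        return duration * rate // 1000  # duration assumed to be in ms
    if size is not None:
        return size
    return 32  # current default AudioFrame size


print(frame_size())                          # 32
print(frame_size(duration=1000, rate=8000))  # 8000 samples for one second
try:
    frame_size(size=64, duration=1000)
except ValueError as exc:
    print("ValueError:", exc)
```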

Discussion about specifying the size of an AudioFrame:

A way to create an AudioFrame of a specific size microbit-foundation/micropython-microbit-v2#193

dpgeorge · 2023-10-16T10:55:30Z

docs/audio.rst

+The ``audio.play()`` function can consume an instance or iterable
+(sequence, like list or tuple, or generator) of ``AudioFrame`` instances.
+Its default playback rate is 7812 Hz, and uses linear interpolation to output
+a PWM signal at 32.5 kHz.


I think we should get rid of this linear interpolation, and just output the 7812 Hz signal directly. I did some tests, and the audio quality is better without the interpolation.

Sounds good, in that case let's do that and remove this from the docs 👍

dpgeorge · 2023-10-16T10:58:07Z

docs/microphone.rst

@@ -70,11 +119,61 @@ Functions
    * **return**: a representation of the sound pressure level in the range 0 to
      255.

+.. py:function:: record(duration=3000, rate=7812, wait=True)


Maybe we can remove this function altogether, and just have record_into(). That way memory management is explicit, it avoids situations where there's a lot of heap fragmentation, and also avoids any difficulties with this function returning an AudioFrame object that is growing over time.

Sorry, I forgot the context in which AudioFrame would be growing. I take it that was the case when wait=False, but if we have to have a known/default value for duration and rate, wouldn't the AudioFrame be allocated at the start before the recording occurs?

I still quite like the idea of being able to do audio.play(microphone.record(duration=3000)).
Would it be simpler if wait was removed from this version and only available in record_into()?

> I take it that was the case when wait=False, but if we have to have a known/default value for duration and rate, wouldn't the AudioFrame be allocated at the start before the recording occurs?

When wait=False the returned AudioFrame will be gradually filling up. If we go with the current implementation where the whole buffer is preallocated at the start of the recording, then I guess it's not too bad, the user just sees blank data that's gradually filling up.

So, we can keep this function. And even support wait=False. Under the hood it will simply allocate an AudioFrame of the desired size, then pass it to record_into().

dpgeorge · 2023-10-16T11:00:51Z

docs/microphone.rst


-Example
-=======
+.. py:function:: record_into(buffer, rate=7812, wait=True)


Perhaps this function can set some new internal state in an AudioFrame object which indicates the length of the recording (as opposed to the total allocated length of the buffer). Then audio.play() would use this state to only play the amount that was recorded.

Could then add a method to get/set this length, eg AudioFrame.get_recording_length(), and AudioFrame.set_recording_length().

Yes, I completely agree here, playing with the current branch and recording a couple of seconds into a 5 second "buffer" results in a few seconds of "silent playback".

Would record_into() then also use this marker to continue recording where it left off?

> Would record_into() then also use this marker to continue recording where it left off?

I think that would be confusing. It would require the user to explicitly reset the marker to the beginning to reuse the buffer. Maybe instead record_into() could have an additional argument which lets the user specify the starting location in the buffer.
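
The idea of a recording-length marker combined with an explicit starting offset can be sketched with a mock. Everything here is hypothetical (`MockFrame`, the `offset` argument, `playable`); it only illustrates the behaviour discussed above, not the real `record_into()` or `audio.play()`.

```python
class MockFrame:
    """Hypothetical stand-in for an AudioFrame with a recording-length marker."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.recording_length = 0


def record_into(frame, samples, offset=0):
    """Copy samples into frame starting at offset and mark where the recording ends."""
    end = min(offset + len(samples), len(frame.data))
    frame.data[offset:end] = samples[:end - offset]
    frame.recording_length = end


def playable(frame):
    """Return only the recorded part, which is what playback would consume."""
    return bytes(frame.data[:frame.recording_length])


frame = MockFrame(10)
record_into(frame, b"\x01\x02\x03")
print(len(playable(frame)))  # 3, no trailing silence from the unused 7 bytes

record_into(frame, b"\x04\x05", offset=3)  # caller chooses where to continue
print(len(playable(frame)))  # 5
```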

Discussion:

Should microphone.record_into() have an additional duration and/or buffer_offset parameters? microbit-foundation/micropython-microbit-v2#197

microbit-matt-hillsdon · 2024-03-20T11:33:30Z

docs/microphone.rst

+
+.. py:function:: stop_recording()
+
+    Stops an a recording running in the background.


Suggested change

Stops an a recording running in the background.

Stops a recording running in the background.

Fixing typo but that also got me wondering what happens if you record more than once with wait=False (before the first has finished).
Do they all work? If so does this function stop them all?

Only one can be recorded at once, so a previously running recording is aborted at the point you call record() or record_into() again (as though you called stop_recording() first).

This issue to discuss some implementation enhancements would cover this as well:

Multiple calls to audio.play(wait=False) and microphone.record_into(wait=False) microbit-foundation/micropython-microbit-v2#198

microbit-carlos · 2024-07-21T19:36:11Z

@dpgeorge the docs have been updated with the conclusion from microbit-foundation/micropython-microbit-v2#205 (comment).

One thing I've changed that we haven't discussed before was to remove the rate argument from microphone.record_into(), as the rate can be set in the input AudioRecording/AudioTrack, and without the argument there isn't any ambiguity as to what takes precedence.

dpgeorge · 2024-08-01T06:04:36Z

docs/audio.rst

+--------------
+
+.. py:class::
+    AudioRecording(duration, rate=7812)


Shall we try with a default rate of 11000?

Sounds good, will update the docs.

dpgeorge · 2024-08-01T06:08:08Z

docs/audio.rst

+
+        :return: The configured sample rate.
+
+    .. py:function:: copy()


Do we need this method? The Python way would be to use the constructor:

new_recording = AudioRecording(old_recording)

(I can't remember if we discussed this or not.)

Yes, we looked at picking between a copy constructor and a copy() method, in the end we decided to go with the method because the micro:bit API hasn't used copy constructors before and the copy() method was already a pattern used in classes like Image.

OK, I've implemented AudioRecording.copy().

(There's actually also the existing SoundEffect.copy(), so that's nice and consistent.)

dpgeorge · 2024-08-01T06:11:26Z

docs/audio.rst

+
+        :returns: a copy of the ``AudioRecording``.
+
+    .. py:function:: track(start_ms=0, end_ms=-1)


Are these keyword-only arguments, or positional?

Ie do we allow recording.track(123) or require recording.track(start_ms=123)?

I think it's fine to leave them as positional, users can still use keywords for clarity.

I don't think they are positional at the moment:

>>> audio.AudioRecording(100).track(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: extra positional arguments given

I've now made them positional.
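
For readers less familiar with the distinction, here is a plain-Python sketch of the two signatures. The function names are hypothetical, and the exact error text differs between CPython and MicroPython.

```python
def track_positional(start_ms=0, end_ms=-1):
    """Arguments may be passed by position or by keyword."""
    return (start_ms, end_ms)


def track_keyword_only(*, start_ms=0, end_ms=-1):
    """The bare * makes both arguments keyword-only."""
    return (start_ms, end_ms)


print(track_positional(123))             # (123, -1)
print(track_positional(start_ms=123))    # (123, -1)
print(track_keyword_only(start_ms=123))  # (123, -1)
try:
    track_keyword_only(123)
except TypeError as exc:
    print("TypeError:", exc)
```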

dpgeorge · 2024-08-01T06:13:53Z

docs/audio.rst

+
+    :param buffer: The buffer containing the audio data.
+    :param rate: The sampling rate at which data will be stored
+        via the microphone, or played via the ``audio.play()`` function.


If you pass in an AudioRecording as the buffer and don't specify the rate, should it take the rate of the AudioRecording?

If the answer to that is "yes" then the signature would need to be:

AudioTrack(buffer, rate=None)

and None means:

if buffer is an AudioRecording (or AudioTrack?) then take the rate from that

otherwise rate is default 7812 (or 11000?)

Right, originally I was thinking to force the default rate, as it's also a simpler implementation and explanation. But after some thought I think most users would expect the rate to be copied. I'll update the docs to use the rate=None approach.
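
The `rate=None` rule described above could look roughly like this. It is a sketch with hypothetical names (`MockRecording`, `resolve_rate`), it assumes the source object exposes its rate as a `rate` attribute, and it uses 7812 as the fallback even though the thread leaves open whether that should be 11000.

```python
DEFAULT_RATE = 7812  # assumption; the thread also mentions 11000


class MockRecording:
    """Hypothetical stand-in for an AudioRecording that carries a rate."""

    def __init__(self, data, rate):
        self.data = data
        self.rate = rate


def resolve_rate(buffer, rate=None):
    """Pick the rate for a new track built on top of buffer."""
    if rate is not None:
        return rate  # an explicit rate always wins
    # Take the rate from the source if it has one, else use the default.
    return getattr(buffer, "rate", DEFAULT_RATE)


print(resolve_rate(MockRecording(b"", 5500)))              # 5500, copied from the source
print(resolve_rate(bytearray(4)))                          # 7812, plain buffers have no rate
print(resolve_rate(MockRecording(b"", 5500), rate=11000))  # 11000, explicit override
```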

dpgeorge · 2024-08-01T06:29:03Z

docs/audio.rst

+        ``AudioTrack``, ``AudioRecording`` or buffer-like object like
+        a ``bytes`` or ``bytearray`` instance.
+
+        :param other: Buffer-like instance from which to copy the data.


If the other is smaller than self, should the remaining bytes be left untouched? Probably.

Yes, I'll add to the docs as well.
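
A sketch of the "leave the remaining bytes untouched" behaviour using plain byte arrays. `copy_from` here is a hypothetical free function, and truncating when the source is larger than the destination is an assumption that the thread does not settle.

```python
def copy_from(dest, other):
    """Copy other into the start of dest; bytes beyond len(other) are untouched."""
    n = min(len(dest), len(other))  # assumption: truncate if other is larger
    dest[:n] = other[:n]


dest = bytearray(b"\x01" * 6)
copy_from(dest, b"\x09\x09")
print(list(dest))  # [9, 9, 1, 1, 1, 1]
```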

dpgeorge · 2024-08-01T06:34:21Z

docs/audio.rst

+        a ``bytes`` or ``bytearray`` instance.
+
+        :param other: Buffer-like instance from which to copy the data.
+


Will need to document access via subscript and slicing, and len(audio_track) operator.

Is that commonly documented via the dunder methods?
That might be a bit hard to understand for some of our audience, maybe we should add an explanation before or after the class documentation, or in the class description?

You can see an example of how CPython documents dict: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict

Eg it uses:

len(d) for length

d[key] for subscript

d | other for binary ops

d |= other for inplace ops

Would be great to document these somehow as otherwise it's trial and error/reading the C.

I think for AudioTrack we have:

@overload
def __getitem__(self, i: int) -> int: ...
@overload
def __getitem__(self, s: slice) -> AudioTrack: ...
def __setitem__(self, i: int, x: int) -> None: ...
def __add__(self, v: AudioTrack) -> AudioTrack: ...
def __iadd__(self, v: AudioTrack) -> AudioTrack: ...
def __sub__(self, v: AudioTrack) -> AudioTrack: ...
def __isub__(self, v: AudioTrack) -> AudioTrack: ...
def __mul__(self, v: float) -> AudioTrack: ...
def __imul__(self, v: float) -> AudioTrack: ...

So slice access but not slice assignment.
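
The subscript behaviour implied by those stubs can be illustrated with a small mock. `MockTrack` is hypothetical: it copies data on slicing for simplicity, whereas the real AudioTrack may well share the underlying buffer, and the arithmetic operators are left out.

```python
class MockTrack:
    """Hypothetical mock showing len(), indexing and slicing only."""

    def __init__(self, data):
        self._data = bytearray(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        if isinstance(key, slice):
            return MockTrack(self._data[key])  # slice -> new track
        return self._data[key]                 # int -> single sample value

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            raise TypeError("slice assignment is not supported")
        self._data[index] = value


track = MockTrack(b"\x00\x01\x02\x03")
print(len(track))       # 4
print(track[1])         # 1
print(len(track[1:3]))  # 2, a new MockTrack
track[0] = 9            # single-sample assignment works
try:
    track[0:2] = b"\x07\x07"
except TypeError as exc:
    print("TypeError:", exc)
```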

microbit-carlos force-pushed the docs-recording branch 3 times, most recently from 52f230c to e635d92 Compare May 15, 2023 10:53

microbit-carlos marked this pull request as ready for review May 19, 2023 11:27

microbit-carlos marked this pull request as draft May 19, 2023 11:27

microbit-carlos force-pushed the docs-recording branch from e635d92 to d430151 Compare May 19, 2023 13:40

microbit-carlos changed the title ~~docs: Add initial proposal for recording & playback API. (WIP)~~ docs: Add initial proposal for V2 recording & playback API. (WIP) May 19, 2023

microbit-carlos force-pushed the docs-recording branch from e57605c to e5c1b73 Compare August 22, 2023 15:27

dpgeorge reviewed Oct 16, 2023

dpgeorge mentioned this pull request Nov 14, 2023

WIP: Audio recording and playback microbit-foundation/micropython-microbit-v2#163

Draft

microbit-carlos force-pushed the v2-docs branch from 1c3fc31 to acee09a Compare February 26, 2024 18:10

microbit-carlos force-pushed the docs-recording branch 2 times, most recently from 102cd8c to 24bd949 Compare March 1, 2024 18:41

microbit-matt-hillsdon reviewed Mar 20, 2024

microbit-grace reviewed Apr 3, 2024

microbit-carlos mentioned this pull request Apr 3, 2024

[DO NOT MERGE] Update for forthcoming beta MicroPython microbit-foundation/micropython-microbit-stubs#97

Draft

microbit-matt-hillsdon reviewed Apr 3, 2024

microbit-carlos force-pushed the docs-recording branch from 24bd949 to 513ddf8 Compare April 12, 2024 11:04

microbit-carlos force-pushed the v2-docs branch from 7a4cfb5 to 2505216 Compare May 7, 2024 15:30

microbit-carlos force-pushed the docs-recording branch 3 times, most recently from 26a1a8f to 874d32c Compare July 15, 2024 18:39

microbit-carlos force-pushed the v2-docs branch from 2505216 to 62ea073 Compare July 18, 2024 13:39

microbit-carlos force-pushed the docs-recording branch 2 times, most recently from 72212bc to d37bfa0 Compare July 21, 2024 19:16

dpgeorge reviewed Aug 1, 2024

microbit-carlos mentioned this pull request Aug 6, 2024

Recording/Getting samples: interaction with CODAL for sound recording microbit-foundation/micropython-microbit-v2#49

Closed

microbit-matt-hillsdon reviewed Aug 21, 2024

microbit-carlos force-pushed the v2-docs branch from 62ea073 to 905577a Compare September 16, 2024 17:23

microbit-carlos added 7 commits September 16, 2024 18:28

docs: Add initial proposal for recording & playback API.

458e5a1

docs: Update recording & playback based on review.

1407c6b

docs: Update Recording & Playback based on the latest implementation.

996f0ae

docs: Add audio.sound_level() and tweak audio descriptions.

abbcfa1

docs: Update Recording & Playback to use AudioRecording & AudioTrack.

bf291ec

docs: update Recording & Playback on review + tweaks to descriptions.

4841b60

docs: Update the recording default rate from 11k to 7812.

b129d16

microbit-carlos force-pushed the docs-recording branch from 14aa028 to b129d16 Compare September 16, 2024 17:44
