docs: Add initial proposal for V2 recording & playback API. (WIP) #791

Draft
wants to merge 7 commits into
base: v2-docs
232 changes: 224 additions & 8 deletions docs/audio.rst
@@ -12,19 +12,36 @@ a speaker to pin 0 and GND on the edge connector to hear the sounds.
The ``audio`` module can be imported as ``import audio`` or accessed via
the ``microbit`` module as ``microbit.audio``.

There are three different kinds of audio sources that can be played using the
There are five different kinds of audio sources that can be played using the
:py:meth:`audio.play` function:

1. `Built in sounds <#built-in-sounds-v2>`_ (**V2**),
e.g. ``audio.play(Sound.HAPPY)``

2. `Sound Effects <#sound-effects-v2>`_ (**V2**), a way to create custom sounds
by configuring its parameters::

my_effect = audio.SoundEffect(freq_start=400, freq_end=2500, duration=500)
audio.play(my_effect)

3. `Audio Frames <#audioframe>`_, an iterable (like a list or a generator)
of Audio Frames, which are lists of 32 samples with values from 0 to 255::
3. `Audio Recordings <#audiorecording-audiotrack-v2>`_, an object that can
be used to record audio from the microphone::

recording = audio.AudioRecording(duration=4000)
microphone.record_into(recording)
audio.play(recording)

4. `Audio Tracks <#audiorecording-audiotrack-v2>`_, a way to point to a portion
of the data in an ``AudioRecording`` or a ``bytearray`` and/or modify it::

recording = audio.AudioRecording(duration=4000)
microphone.record_into(recording)
track = audio.AudioTrack(recording)[1000:3000]
audio.play(track)

5. `Audio Frames <#audioframe>`_, an instance or an iterable (like a list or
generator) of Audio Frames, which are lists of samples with values
from 0 to 255::

square_wave = audio.AudioFrame()
for i in range(16):
@@ -47,8 +64,13 @@ Functions
be found in the `Built in sounds <#built-in-sounds-v2>`_ section.
- ``SoundEffect``: A sound effect, or an iterable of sound effects,
created via the :py:meth:`audio.SoundEffect` class
- ``AudioFrame``: An iterable of ``AudioFrame`` instances as described
in the `AudioFrame Technical Details <#id2>`_ section
- ``AudioRecording``: An instance of ``AudioRecording`` as described
in the `AudioRecording <#audiorecording-audiotrack-v2>`_ section
- ``AudioTrack``: An instance of ``AudioTrack`` as described in the
`AudioTrack <#audiorecording-audiotrack-v2>`_ section
- ``AudioFrame``: An instance or an iterable of ``AudioFrame``
instances as described in the
`AudioFrame Technical Details <#technical-details>`_ section

:param wait: If ``wait`` is ``True``, this function will block until the
source is exhausted.
@@ -69,6 +91,13 @@ Functions

Stops all audio playback.

.. py:function:: sound_level()

Get the sound pressure level produced by audio currently being played.

:return: A representation of the output sound pressure level in the
range 0 to 255.


Built-in sounds **V2**
======================
@@ -215,6 +244,190 @@ Sound Effects Example
.. include:: ../examples/soundeffects.py
:code: python


AudioRecording & AudioTrack **V2**
==================================

To record and play back audio, we need a way to store the audio data and
the sampling rate that has been used to record it and play it back.

Two new classes are introduced in micro:bit V2 for this purpose:

- The ``AudioRecording`` class holds its own audio data and sampling rate.
It is initialised with a size defined in units of time, and it's the object
type that the ``microphone.record()`` function returns.
- The ``AudioTrack`` class contains its sampling rate, but does not hold its
own data. It instead points to an externally created buffer,
such as an ``AudioRecording``, or a basic type like a ``bytearray``.
It's similar to a
`memoryview <https://docs.micropython.org/en/v1.9.3/pyboard/reference/speed_python.html#arrays>`_
and it can be used to easily modify the audio data or chop it into portions
of different sizes.

AudioRecording
--------------

.. py:class::
AudioRecording(duration, rate=7812)
Member

Shall we try with a default rate of 11000?

Collaborator Author

Sounds good, will update the docs.


The ``AudioRecording`` object contains audio data and the sampling rate
associated with it.

The size of the internal buffer will depend on the ``rate``
(number of samples per second) and ``duration`` parameters.
The larger these values are, the more memory will be used.

:param duration: Indicates how many milliseconds of audio this
instance can store.
:param rate: The sampling rate at which data will be stored
via the microphone, or played via the ``audio.play()`` function.

.. py:function:: set_rate(sample_rate)

Configure the sampling rate associated with the data in the
``AudioRecording`` instance.

:param sample_rate: The sample rate to set.

.. py:function:: get_rate()

Return the configured sampling rate for this
``AudioRecording`` instance.

:return: The configured sample rate.

.. py:function:: copy()
Member

Do we need this method? The Python way would be to use the constructor:

new_recording = AudioRecording(old_recording)

(I can't remember if we discussed this or not.)

Collaborator Author

Yes, we looked at picking between a copy constructor and a copy() method, in the end we decided to go with the method because the micro:bit API hasn't used copy constructors before and the copy() method was already a pattern used in classes like Image.

Member

OK, I've implemented AudioRecording.copy().

(There's actually also the existing SoundEffect.copy(), so that's nice and consistent.)


:returns: a copy of the ``AudioRecording``.

.. py:function:: track(start_ms=0, end_ms=-1)
Member

Are these keyword-only arguments, or positional?

Ie do we allow recording.track(123) or require recording.track(start_ms=123)?

Collaborator Author

I think it's fine to leave them as positional, users can still use keywords for clarity.

Contributor

I don't think they are positional at the moment:

>>> audio.AudioRecording(100).track(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: extra positional arguments given

Member

I've now made them positional.


Create an `AudioTrack <#audio.AudioTrack>`_ instance from a portion of
the data in this ``AudioRecording`` instance.

Out-of-range values will be truncated to the recording limits.
If ``end_ms`` is lower than ``start_ms``, an empty track will be
created.

:param start_ms: The start of the track in milliseconds.
:param end_ms: The end of the track in milliseconds.
If the default value of ``-1`` is provided it will end the track
at the end of the AudioRecording.

When an ``AudioRecording`` is used to record data from the microphone,
a higher sampling rate produces better sound quality,
but it also uses more memory.

During playback, increasing the sampling rate speeds up the sound
and decreasing the sample rate slows it down.
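
The trade-off between sampling rate, duration and memory can be sketched
with some plain Python arithmetic. This is only an illustration: it assumes
one byte per sample (consistent with the 0 to 255 sample range used by this
module, but not confirmed for ``AudioRecording``), and ``buffer_size_bytes``
and ``playback_ms`` are hypothetical helper names, not part of the proposed
API.

.. code-block:: python

    def buffer_size_bytes(duration_ms, rate):
        """Approximate bytes needed for duration_ms of audio at rate samples/s.

        Assumes one byte per sample (an assumption, not confirmed).
        """
        return duration_ms * rate // 1000


    def playback_ms(num_samples, rate):
        """Time in milliseconds needed to play num_samples at rate samples/s."""
        return num_samples * 1000 / rate


    # Four seconds at the documented default rate of 7812 samples per second
    size = buffer_size_bytes(4000, 7812)
    print(size)                      # 31248
    # The same data played at twice the rate lasts half as long
    print(playback_ms(size, 7812))   # 4000.0
    print(playback_ms(size, 15624))  # 2000.0

Doubling the playback rate halves the playback time, which is why the sound
is heard sped up.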

The data inside an ``AudioRecording`` is not easy to modify, so the
``AudioTrack`` class is provided to help access the audio data like a list.
The method ``AudioRecording.track()`` can be used to create an ``AudioTrack``,
and its arguments ``start_ms`` and ``end_ms`` can be used to slice portions
of the data.
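
The range rules described for ``AudioRecording.track()`` can be modelled in
plain Python. This is a sketch of the documented behaviour only, not the
actual implementation: ``track_bounds`` is a hypothetical helper, and
treating negative values other than ``-1`` as out of range is an assumption.

.. code-block:: python

    def track_bounds(total_ms, start_ms=0, end_ms=-1):
        """Return the (start, end) range in milliseconds a track would cover."""
        if end_ms == -1:
            # The default value means "until the end of the recording"
            end_ms = total_ms
        # Out-of-range values are truncated to the recording limits
        start = min(max(start_ms, 0), total_ms)
        end = min(max(end_ms, 0), total_ms)
        if end < start:
            # An end before the start produces an empty track
            end = start
        return start, end


    print(track_bounds(4000))                # (0, 4000)
    print(track_bounds(4000, end_ms=2000))   # (0, 2000)
    print(track_bounds(4000, 1000, 9000))    # (1000, 4000)
    print(track_bounds(4000, 3000, 1000))    # (3000, 3000)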

AudioTrack
----------

.. py:class::
AudioTrack(buffer, rate=None)

The ``AudioTrack`` object points to the data provided by the input buffer,
which can be an ``AudioRecording``, another ``AudioTrack``,
or a buffer-like object like a ``bytearray``.

When the input buffer has an associated rate (e.g. an ``AudioRecording``
or ``AudioTrack``), the rate is copied. If the buffer object does not have
a rate, the default value of 7812 is used.

Changes to an ``AudioTrack`` rate won't affect the original source rate,
so multiple instances pointing to the same buffer can have different
rates and the original buffer rate would stay unmodified.

:param buffer: The buffer containing the audio data.
:param rate: The sampling rate at which data will be stored
via the microphone, or played via the ``audio.play()`` function.
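
One way to read the rate rules above, written as a plain Python sketch.
``resolve_rate`` and ``FakeRecording`` are hypothetical names used only for
illustration, and detecting a rate through a ``get_rate()`` method is an
assumption rather than the actual implementation.

.. code-block:: python

    DEFAULT_RATE = 7812  # default value quoted in the text above


    def resolve_rate(buffer, rate=None):
        """Pick the rate a new track would use for the given buffer."""
        if rate is not None:
            # An explicit rate always wins
            return rate
        # Assumption: sources that carry a rate expose get_rate()
        get_rate = getattr(buffer, "get_rate", None)
        if get_rate is not None:
            return get_rate()
        # Plain buffers such as bytearray have no rate of their own
        return DEFAULT_RATE


    class FakeRecording:
        """Stand-in for an AudioRecording, only for this illustration."""

        def __init__(self, rate):
            self._rate = rate

        def get_rate(self):
            return self._rate


    print(resolve_rate(bytearray(10)))               # 7812
    print(resolve_rate(FakeRecording(11000)))        # 11000
    print(resolve_rate(FakeRecording(11000), 5000))  # 5000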
Member

If you pass in an AudioRecording as the buffer and don't specify the rate, should it take the rate of the AudioRecording?

Member

If the answer to that is "yes" then the signature would need to be:

AudioTrack(buffer, rate=None)

and None means:

  • if buffer is an AudioRecording (or AudioTrack?) then take the rate from that
  • otherwise rate is default 7812 (or 11000?)

Collaborator Author

Right, originally I was thinking to force the default rate, as it's also a simpler implementation and explanation. But after some thought I think most users would expect the rate to be copied. I'll update the docs to the rate=None method.


.. py:function:: set_rate(sample_rate)

Configure the sampling rate associated with the data in the
``AudioTrack`` instance.

:param sample_rate: The sample rate to set.

.. py:function:: get_rate()

Return the configured sampling rate for this ``AudioTrack`` instance.

:return: The configured sample rate.

.. py:function:: copyfrom(other)

Overwrite the data in this ``AudioTrack`` with the data from another
``AudioTrack``, ``AudioRecording``, or buffer-like object like
a ``bytearray`` instance.

If the input buffer is smaller than the available space in this
instance, the rest of the data is left untouched.
If it is larger, it will stop copying once this instance is filled.

:param other: Buffer-like instance from which to copy the data.
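
The copy length rule can be illustrated with ordinary ``bytearray`` objects
in plain Python. ``copy_into`` is a hypothetical helper standing in for
``copyfrom()``; it only models the documented behaviour.

.. code-block:: python

    def copy_into(dest, src):
        """Copy bytes from src into dest, stopping at the shorter length."""
        n = min(len(dest), len(src))
        dest[:n] = src[:n]
        return n


    # Source smaller than destination: the remaining bytes are left untouched
    dest = bytearray([9, 9, 9, 9, 9, 9])
    copy_into(dest, bytes([1, 2, 3]))
    print(list(dest))   # [1, 2, 3, 9, 9, 9]

    # Source larger than destination: copying stops when dest is full
    dest = bytearray(4)
    copy_into(dest, bytes([10, 11, 12, 13, 14, 15]))
    print(list(dest))   # [10, 11, 12, 13]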
Member

If the other is smaller than self, should the remaining bytes be left untouched? Probably.

Collaborator Author

Yes, I'll add to the docs as well.


Member

Will need to document access via subscript and slicing, and len(audio_track) operator.

Collaborator Author

Is that commonly documented via the dunder methods?
That might be a bit hard to understand for some of our audience, maybe we should add an explanation before or after the class documentation, or in the class description?

Member

You can see an example of how CPython documents dict: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict

Eg it uses:

  • len(d) for length
  • d[key] for subscript
  • d | other for binary ops
  • d |= other for inplace ops

Contributor

Would be great to document these somehow as otherwise it's trial and error/reading the C.

I think for AudioTrack we have:

    @overload
    def __getitem__(self, i: int) -> int: ...
    @overload
    def __getitem__(self, s: slice) -> AudioTrack: ...
    def __setitem__(self, i: int, x: int) -> None: ...

    def __add__(self, v: AudioTrack) -> AudioTrack: ...
    def __iadd__(self, v: AudioTrack) -> AudioTrack: ...
    def __sub__(self, v: AudioTrack) -> AudioTrack: ...
    def __isub__(self, v: AudioTrack) -> AudioTrack: ...
    def __mul__(self, v: float) -> AudioTrack: ...
    def __imul__(self, v: float) -> AudioTrack: ...

So slice access but not slice assignment.

An ``AudioTrack`` can be created from an ``AudioRecording``, another
``AudioTrack``, or a ``bytearray``, and individual bytes can be accessed and
modified like elements in a list::

    my_track = audio.AudioTrack(bytearray(100))
    # Create a square wave
    half_length = len(my_track) // 2
    for i in range(half_length):
        my_track[i] = 255
    for i in range(half_length, len(my_track)):
        my_track[i] = 0


Smaller ``AudioTrack`` instances can also be created using slices, which is
useful for sending audio in chunks via radio or serial::

    recording = microphone.record(duration=2000)
    track = audio.AudioTrack(recording)
    packet_size = 32
    for i in range(0, len(track), packet_size):
        radio.send_bytes(track[i:i+packet_size])
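
The chunking pattern above does not depend on the micro:bit hardware, so it
can be tried on any Python with a ``bytearray`` in place of the
``AudioTrack``. In this sketch ``split_packets`` and ``join_packets`` are
hypothetical helpers, and the receiving side assumes that packets arrive in
order and that none are lost, which a real radio link does not guarantee.

.. code-block:: python

    def split_packets(data, packet_size=32):
        """Yield consecutive slices of data, each at most packet_size bytes."""
        for i in range(0, len(data), packet_size):
            yield bytes(data[i:i + packet_size])


    def join_packets(packets):
        """Rebuild the original byte sequence from packets received in order."""
        buf = bytearray()
        for packet in packets:
            buf.extend(packet)
        return buf


    data = bytearray(range(100))
    packets = list(split_packets(data))
    print(len(packets))       # 4
    print(len(packets[-1]))   # 4, the last packet is shorter than 32 bytes
    print(join_packets(packets) == data)   # True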

Example
-------

::

    from microbit import *

    # An AudioRecording holds the audio data
    recording = audio.AudioRecording(duration=4000)

    # AudioTracks point to a portion of the data in the AudioRecording
    # We can obtain an AudioTrack from the AudioRecording.track() method
    first_half = recording.track(end_ms=2000)
    # Or we can create an AudioTrack from an AudioRecording and slice it
    full_track = audio.AudioTrack(recording)
    second_half = full_track[len(full_track) // 2:]

    while True:
        if button_a.is_pressed():
            # We can record directly inside the AudioRecording
            microphone.record_into(recording)
        if button_b.is_pressed():
            audio.play(recording, wait=False)
            # The rate can be changed while playing
            first_half.set_rate(
                scale(accelerometer.get_x(), from_=(-1000, 1000), to=(3_000, 30_000))
            )
        if pin_logo.is_touched():
            # We can also play the AudioTrack pointing to the AudioRecording
            audio.play(first_half)


AudioFrame
==========

@@ -241,9 +454,9 @@ Technical Details
It is just here in case you wanted to know how it works.

The ``audio`` module can consume an iterable (sequence, like list or tuple, or
generator) of ``AudioFrame`` instances, each 32 samples at 7812.5 Hz, and uses
linear interpolation to output a PWM signal at 32.5 kHz, which gives tolerable
sound quality.
generator) of ``AudioFrame`` instances, each 32 samples at 7812.5 Hz,
which take just over 4 milliseconds to play each frame
(1/7812.5 * 32 = 0.004096 = 4096 microseconds).
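
The frame timing can be checked with a few lines of plain Python; the
constant names are only illustrative.

.. code-block:: python

    SAMPLES_PER_FRAME = 32
    SAMPLE_RATE_HZ = 7812.5

    # Time taken to play one frame
    frame_seconds = SAMPLES_PER_FRAME / SAMPLE_RATE_HZ
    frame_microseconds = frame_seconds * 1_000_000
    # Number of frames consumed per second of audio
    frames_per_second = SAMPLE_RATE_HZ / SAMPLES_PER_FRAME

    print(round(frame_microseconds))   # 4096
    print(frames_per_second)           # 244.140625

A sound source therefore has to supply roughly 244 frames for every second
of audio.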

The function ``play`` fully copies all data from each ``AudioFrame`` before it
calls ``next()`` for the next frame, so a sound source can use the same
@@ -260,5 +473,8 @@ the buffer. This means that a sound source has under 4ms to compute the next
AudioFrame Example
------------------

Creating and populating ``AudioFrame`` iterables and generators with
different sound waveforms:

.. include:: ../examples/waveforms.py
:code: python
10 changes: 10 additions & 0 deletions docs/microbit_micropython_api.rst
@@ -108,6 +108,16 @@ The Microphone is accessed via the `microphone` object::
set_threshold(128)
# Returns a representation of the sound pressure level in the range 0 to 255.
sound_level()
# Record audio into a new `AudioRecording`
recording = record(duration, rate=7812)
# Record audio into an existing `AudioRecording`
record_into(recording, wait=True)
# Returns `True` if the microphone is currently recording audio
is_recording()
# Stop any active audio recording
stop()
# Set the microphone sensitivity (also referred to as gain)
set_sensitivity(microphone.SENSITIVITY_MEDIUM)

Pins
----