Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers: dma: add dw axi dma controller driver #61444

Merged
merged 4 commits into from
Aug 19, 2024

Conversation

pbalsundar
Copy link
Contributor

  • add driver for dw axi dma controller
  • add sample application to perform memory to memory transfer
  • adding driver support for agilex5 platform

@pbalsundar pbalsundar force-pushed the dma_dw_axi_support branch 2 times, most recently from c2dcd8d to 83eaaa5 Compare August 12, 2023 16:33
@pbalsundar pbalsundar marked this pull request as ready for review August 12, 2023 17:04
@zephyrbot zephyrbot added platform: Intel SoC FPGA Agilex Intel Corporation, SoC FPGA Agilex area: ARM64 ARM (64-bit) Architecture area: Devicetree Binding PR modifies or adds a Device Tree binding area: DMA Direct Memory Access labels Aug 12, 2023
@pbalsundar
Copy link
Contributor Author

Hi @teburd, @carlocaione, @tbursztyka, @nashif, @galak, @npitre, @povergoing, @SgrrZhf could you please review and give necessary comments?

Copy link
Collaborator

@carlocaione carlocaione left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a couple on nits, in general LGTM.

*
* @retval status of the channel
*/
static uint32_t dma_dw_axi_get_ch_status(const struct device *dev, uint32_t channel)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in general if you returning an enum like in this case then your function must return an enum. If you want to keep returning an uint32_t then just use a define instead of enum.

chan_data->cfg |= DMA_DW_AXI_CFG_TT_FC(M2P_DMAC);
lli_desc->ctl |= DMA_DW_AXI_CTL_SRC_MSIZE(msize_src) |
DMA_DW_AXI_CTL_DST_MSIZE(msize_dst);
WRITE_BIT(chan_data->cfg, DMA_DW_AXI_CFG_HW_HS_DST_BIT_POS, 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using sys_set_bit() instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carlocaione
We can use sys_clear_bit but, I think sys_* functions are mainly used to write onto the memory mapped registers as internally it will perform a volatile read or write.

In this case, chan_data->cfg is a global variable, so I thought it would be better to use WRITE_BIT

Copy link
Collaborator

@teburd teburd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The registers seem almost identical to the ones in dma_dw_common so I'd like to see why this needs to be a completely new driver.

The use of a work queue and mutex here is a bit different than all the other drivers. DMA is expected to work from interrupt call context and the caller is expected to manage ownership/mutual exclusion requirements.

Can you help me understand why this needs to be a different driver, why a work queue is needed, and why a mutex is being used here?

@pbalsundar
Copy link
Contributor Author

Hi @teburd
Sorry for late response. To answer your question

  1. The already existing driver dma_dw_common.c is designed for ahb interface and this PR is for axi interface. The register looks similar in naming convention but the register fields and the register offsets are entirely different. So new driver for dma driver with axi interface
  2. Work queue use: We have made an option to the user to choose either to process the callbacks in the interrupt context or the user context through Kconfig configuration. If the dma is used in case of the ethernet cases, user may opt to prcocess the callback in the user context. This is not mandatory that user needs process callback using workqueue, user can configure to process callback in interrupt context.
  3. Mutex Use: We are using mutex in the driver so as to avoid any synchronization issues. In the SMP system, multiple independent thread like i3c, i2c, uart, spi or any other IP which uses dma for data transfer can use the dma and its channel. This could bring up the synchronization issues if multiple thread tries to use dma at the same time. To protect the data structures we are using the mutext in the driver

Copy link
Collaborator

@teburd teburd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drivers in the tree today call their callbacks in ISRs, DMA does not provide a syscall interface and isn't expected to be directly used with user mode today. I somewhat doubt it ever could be safely. It's unclear why the work queue option is needed here. A work queue task could always be submitted in the callback if needed.

Drivers in the tree today assume single ownership semantics over channels. This allows calling DMA API functions from an ISR context and avoiding what is likely a programming mistake in trying to share a DMA channel. If there are a few common shared registers that might be contended on those should be protected with a spin lock rather than a mutex/semaphore. Updating a few registers should not be a scheduling point, and shouldn't prevent calls from being possible in the ISR context.

I hope that clarifies my reasoning here.

help
This flag can be enabled if hardware support Linked List multi-block transfer

config DMA_BOTTOM_HALF_WORK_QUEUE
Copy link
Collaborator

@teburd teburd Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this has a place in the DMA driver, do any drivers using DMA use this feature specifically? An application is perfectly capable of setting up their own work queue task that is submitted in the ISR callback. Or maybe instead a k_sem or k_poll_event is given that a user mode thread is waiting on. There's many options, and specifying this with a Kconfig seems prematurely deciding for an application how to communicate with a thread DMA completion/error events.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought in a different way that the user can always configure the Kconfig option to process the callback in the user context if he has to do the heavy task inside callback. And user can always disable the work queue if user want to process the callback in the isr context.

But as you mentioned, I agree to the fact that user has many options to design their own logic based on the use cases. I will revert these changes and invoke the callback in the interrupt context

/* user data for dma callback for dma block transfer completion */
void *priv_data_blk_tfr;
/* mutex for chan data */
struct k_mutex ch_mutex_lock;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are channels actually being shared among threads? This is a pretty unique feature to this DMA driver. The DMA drivers in the tree assume single ownership semantics, meaning there is a single owner of each DMA channel so synchronization is unnecessary unless the application decides to share a channel (unlikely).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intel_socfpga_agilex5_socdk has 2 instance of dma controller with 4 channels enabled per instance.
And there can be 48 possible number of users (may not be all active at a time but possible to have more than 8 users at a time depending on use cases)

This is the reason we have implemented mutual exclusion to avoid any synchronization issues

Copy link
Collaborator

@teburd teburd Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand what you mean by users in this case? Are you talking about hardware peripheral drivers sharing DMA channels? In this case I can understand why you'd like to have some form of synchronization. Usually Zephyr has dedicated channels assigned to hardware peripheral drivers through devicetree such that a channel solely owned by a peripheral driver and cannot be shared.

If that's not the case here, and the channels are indeed more like a pool of resources thats ok as well actually.

dma channel acquire and dma channel release use atomics to semantically provide a lock free pool of channels.

What is not provided however is a way to pend a thread on acquiring a dma channel. I can understand a desire for that particular feature and would accept that wrapped in a Kconfig optional. Where channel acquire and release is done with a semaphore along with the already existing atomics

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's several test cases that can be ran as samples in tests/drivers/dma including memory to memory, scatter gather, and various other tests. I don't believe we need a new sample replicating the existing test case that only works on a single platform.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in zephyr for each different driver category we do have sample applications and also ztest available under tests directory.

tests directory mainly focuses on the regression tests with different possible combination of test cases(which can include in case of dma tests: setting dma transfer on different channels, changing the burst len, changing data width, etc).

I think main purpose of sample application would be to give the overview or reference code which will enable the user to know how to write the application

I still would like to keep the sample application here. The sample application is generic and anyone can add support of the board to this sample application as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a fair enough point, if you'd like to keep the sample please do not make it exclusively for socfpga then. I won't block the PR on a sample that is usable on multiple devices.

@teburd
Copy link
Collaborator

teburd commented Nov 8, 2023

From my side I'm working on updating existing drivers and adding documentation to better solidify this behavioral expectation #64982

@pbalsundar
Copy link
Contributor Author

pbalsundar commented Jan 2, 2024

Hi @teburd, we further had discussion on the synchronization in the driver(with reference to comment in the same PR).
We finally decide to use the existing implementation after going through the various use cases.

The hooks provided by you in the PR will still useful if the peripheral counts which uses dma is more than the channels supported. We can merge that PR if any dma driver wants to adapt to that in such scenarios.

We are planning to use existing implementation of dma_request_channel to get the channel through the atomic flag marking of the channel being used and release dma channel using dma_release_channel() maeking the channel is free to use.

While adding support in the Peripheral driver for dma transfer, we are planning to implement the devmux implementation, which will enable peripheral to use one of the free channel of any dma controller in the board

I will push the required changes with all the comments addressed.

@pbalsundar
Copy link
Contributor Author

Hi @teburd, @carlocaione @povergoing, Could you please review the latest requested changes that I pushed.

  1. Removed work queue implementation for interrupt handling
  2. Removed mutext synchronization

@carlocaione carlocaione requested a review from teburd February 26, 2024 11:06
@pbalsundar
Copy link
Contributor Author

Hi @carlocaione @teburd @SgrrZhf @nashif could you please review further and give necessary comments?

@pbalsundar pbalsundar force-pushed the dma_dw_axi_support branch from fc66421 to 1b9ba19 Compare June 4, 2024 11:12
@zephyrbot zephyrbot requested a review from decsny June 4, 2024 11:13
decsny
decsny previously requested changes Jun 4, 2024
"#dma-cells":
const: 1

# #dma-cells : Must be <2>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this say dma-cells must be 2 but above dma-cells is const 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially design was with two dma-cells. I missed to update the comment properly. I updated now

@pbalsundar pbalsundar force-pushed the dma_dw_axi_support branch 2 times, most recently from 0cc86b2 to 2b71a21 Compare June 5, 2024 09:26
@pbalsundar
Copy link
Contributor Author

Hi @teburd
I have removed the sample application as this is similar to scatter gather test case. And added support for agilex5 board to the test case

@decsny decsny dismissed their stale review June 5, 2024 16:19

dismiss

@pbalsundar
Copy link
Contributor Author

pbalsundar commented Jun 7, 2024

Thanks @teburd for reviewing and giving valuable comments :-)

@carlocaione, @decsny, @povergoing Could you please review and give your inputs?

@pbalsundar
Copy link
Contributor Author

@carlocaione, @decsny, @povergoing Could you please review and give your inputs?

@pbalsundar
Copy link
Contributor Author

Hi @teburd, could you please help me in adding dev-review label to this PR?

@teburd
Copy link
Collaborator

teburd commented Jul 2, 2024

This week the dev-review meeting is unlikely to happen (US holiday) and we're in the middle of a release. I'm adding the label but you will need to join the dev-review meeting July 11, 2024 to present this PR. Thanks!

@teburd teburd added the dev-review To be discussed in dev-review meeting label Jul 2, 2024
@MaureenHelm MaureenHelm removed the dev-review To be discussed in dev-review meeting label Jul 17, 2024
@pbalsundar
Copy link
Contributor Author

Hi @teburd,
I couldn't attend the meeting last week and I think the dev review label has been removed. what can we do as next step for 2nd review?

@pbalsundar
Copy link
Contributor Author

This week the dev-review meeting is unlikely to happen (US holiday) and we're in the middle of a release. I'm adding the label but you will need to join the dev-review meeting July 11, 2024 to present this PR. Thanks!

Hi @teburd
Currently dev_review lable is removed. what actions can we take further to get this PR merged?

@decsny decsny added the dev-review To be discussed in dev-review meeting label Aug 1, 2024
@MaureenHelm
Copy link
Member

CI results are quite old, could you please rebase to rerun CI?

Adding dma driver source for designware axi dma controller

Signed-off-by: Balsundar Ponnusamy <[email protected]>
add dts support for dma to accomodate dma driver bringup on agilex5

Signed-off-by: Balsundar Ponnusamy <[email protected]>
updated codeowners file

Signed-off-by: Balsundar Ponnusamy <[email protected]>
added overlay and conf file for agilex5 board

Signed-off-by: Balsundar Ponnusamy <[email protected]>
@zephyrbot zephyrbot requested a review from gdengi August 1, 2024 15:34
#dma-cells = <1>;
reg = <0x10DC0000 0x1000>;
interrupt-parent = <&gic>;
interrupts = <GIC_SPI 166 IRQ_TYPE_LEVEL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can add these in one single line if they are with in the limit - 100 lines column

}

/* enable dma controller and global interrupt bit */
sys_write64(DMA_DW_AXI_CFG_INT_EN | DMA_DW_AXI_CFG_EN, reg_base + DMA_DW_AXI_CFGREG);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these MMIO mapped registers are 64bit wide?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these are 64-bit wide registers MMIO mapped registers


LOG_MODULE_REGISTER(dma_designware_axi, CONFIG_DMA_LOG_LEVEL);

#define DEV_CFG(_dev) ((const struct dma_dw_axi_dev_cfg *)(_dev)->config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need use/need these macros?

Copy link
Contributor Author

@pbalsundar pbalsundar Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conflict Requirement

There is one conflict requirement of struct dma_context(from dma.h) and DEVICE_MMIO_RAM(from device_mmio.h) to be a first member of a driver data structure.

Requirement 1:
dma.h expects dma_context structure to be a first member of driver struct data:

/**
 * DMA context structure
 * Note: the dma_context shall be the **first member**
 *       of DMA client driver Data, got by dev->data
 */
struct dma_context {
	/** magic code to identify the context */
	int32_t magic;
	/** number of dma channels */
	int dma_channels;
	/** atomic holding bit flags for each channel to mark as used/unused */
	atomic_t *atomic;
};

Requirement 2:
DEVICE_MMIO_RAM:
snippet from device_mmio.h

 * struct foo_driver_data {
 *	DEVICE_MMIO_RAM;       -->expected as first member of driver data structure
 *	int wibble;
 *	...
 * }

Solution:
So Since this is a conflict requirement and these both conditions can't be satisfied at the same time. We have other variation of DEIVICE_MMIO_* called DEVICE_MMIO_NAMED_*.

These DEVICE_MMIO_NAMED* macros can use the struct type of the driver's data structure to access the virtual address stored in struct data. So decided to use DEVICE_MMIO_NAMED_* macros instead of DEVICE_MMIO_* macros

As we have to use the DEVICE_MMIO_NAMED* macros, device_mmio.h internally expects DEV_CFG and DEV_DATA to be defined by driver so that device_mmio.h can access the proper address by typecasting

@@ -0,0 +1,912 @@
/*
* Copyright (c) 2023 Intel Corporation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can update the copyright header everywhere as '2024 Altera'

@MaureenHelm MaureenHelm removed the dev-review To be discussed in dev-review meeting label Aug 8, 2024
@gdengi gdengi self-requested a review August 17, 2024 13:47
@nashif nashif merged commit 0f7349d into zephyrproject-rtos:main Aug 19, 2024
22 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: ARM64 ARM (64-bit) Architecture area: Devicetree Binding PR modifies or adds a Device Tree binding area: DMA Direct Memory Access area: Samples Samples platform: Intel SoC FPGA Agilex Intel Corporation, SoC FPGA Agilex
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants