All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baochen Qiang <quic_bqiang@quicinc.com>
To: <ath11k@lists.infradead.org>
Cc: <linux-wireless@vger.kernel.org>, <quic_bqiang@quicinc.com>
Subject: [PATCH] wifi: ath11k: enable 36 bit mask for stream DMA
Date: Tue, 23 Jan 2024 09:52:01 +0800	[thread overview]
Message-ID: <20240123015201.28939-1-quic_bqiang@quicinc.com> (raw)

Currently 32 bit DMA mask is used, telling kernel to get us an DMA
address under 4GB when mapping a buffer. This results in a very high
CPU overhead in the case where IOMMU is disabled and more than 4GB
system memory is installed. The reason is, with more than 4GB memory
installed, kernel is likely to allocate a buffer whose physical
address is above 4GB. While with IOMMU disabled, kernel has to involve
SWIOTLB to map/unmap that buffer, which consumes lots of CPU cycles.

We did hit an issue caused by the reason mentioned above: in a system
that disables IOMMU and gets 8GB memory installed, a total of 40.5%
CPU usage is observed in throughput test. CPU profiling shows nearly
60% of CPU cycles are consumed by SWIOTLB.

By enabling 36 bit DMA mask, we can bypass SWIOTLB for any buffer
whose physical address is below 64GB. There are two types of DMA mask
within struct device, named dma_mask and coherent_dma_mask. Here we
only enable 36 bit for dma_mask, because firmware crashes if
coherent_dma_mask is also enabled, due to some unknown hardware
limitations. This is acceptable because coherent_dma_mask is used for
mapping a consistent DMA buffer, which generally does not happen in
a hot path.

With this change, the total CPU usage mentioned in above issue drops
to 18.9%.

Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
---
 drivers/net/wireless/ath/ath11k/mhi.c |  2 +-
 drivers/net/wireless/ath/ath11k/pci.c | 16 +++++++++++++---
 drivers/net/wireless/ath/ath11k/pci.h |  1 +
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath11k/mhi.c b/drivers/net/wireless/ath/ath11k/mhi.c
index 6835c14b82cc..8753dde2fe2b 100644
--- a/drivers/net/wireless/ath/ath11k/mhi.c
+++ b/drivers/net/wireless/ath/ath11k/mhi.c
@@ -423,7 +423,7 @@ int ath11k_mhi_register(struct ath11k_pci *ab_pci)
 			goto free_controller;
 	} else {
 		mhi_ctrl->iova_start = 0;
-		mhi_ctrl->iova_stop = 0xFFFFFFFF;
+		mhi_ctrl->iova_stop = ab_pci->dma_mask;
 	}
 
 	mhi_ctrl->rddm_size = RDDM_DUMP_SIZE;
diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c
index 09e65c5e55c4..1d0f5d6e1cf2 100644
--- a/drivers/net/wireless/ath/ath11k/pci.c
+++ b/drivers/net/wireless/ath/ath11k/pci.c
@@ -18,7 +18,8 @@
 #include "qmi.h"
 
 #define ATH11K_PCI_BAR_NUM		0
-#define ATH11K_PCI_DMA_MASK		32
+#define ATH11K_PCI_DMA_MASK		36
+#define ATH11K_PCI_COHERENT_DMA_MASK	32
 
 #define TCSR_SOC_HW_VERSION		0x0224
 #define TCSR_SOC_HW_VERSION_MAJOR_MASK	GENMASK(11, 8)
@@ -526,13 +527,22 @@ static int ath11k_pci_claim(struct ath11k_pci *ab_pci, struct pci_dev *pdev)
 		goto disable_device;
 	}
 
-	ret = dma_set_mask_and_coherent(&pdev->dev,
-					DMA_BIT_MASK(ATH11K_PCI_DMA_MASK));
+	ret = dma_set_mask(&pdev->dev,
+			   DMA_BIT_MASK(ATH11K_PCI_DMA_MASK));
 	if (ret) {
 		ath11k_err(ab, "failed to set pci dma mask to %d: %d\n",
 			   ATH11K_PCI_DMA_MASK, ret);
 		goto release_region;
 	}
+	ab_pci->dma_mask = DMA_BIT_MASK(ATH11K_PCI_DMA_MASK);
+
+	ret = dma_set_coherent_mask(&pdev->dev,
+				    DMA_BIT_MASK(ATH11K_PCI_COHERENT_DMA_MASK));
+	if (ret) {
+		ath11k_err(ab, "failed to set pci coherent dma mask to %d: %d\n",
+			   ATH11K_PCI_COHERENT_DMA_MASK, ret);
+		goto release_region;
+	}
 
 	pci_set_master(pdev);
 
diff --git a/drivers/net/wireless/ath/ath11k/pci.h b/drivers/net/wireless/ath/ath11k/pci.h
index e9a01f344ec6..21274175f0d2 100644
--- a/drivers/net/wireless/ath/ath11k/pci.h
+++ b/drivers/net/wireless/ath/ath11k/pci.h
@@ -72,6 +72,7 @@ struct ath11k_pci {
 	/* enum ath11k_pci_flags */
 	unsigned long flags;
 	u16 link_ctl;
+	u64 dma_mask;
 };
 
 static inline struct ath11k_pci *ath11k_pci_priv(struct ath11k_base *ab)

base-commit: 8ff464a183f92836d7fd99edceef50a89d8ea797
-- 
2.25.1


             reply	other threads:[~2024-01-23  1:52 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-23  1:52 Baochen Qiang [this message]
2024-01-24  2:35 ` [PATCH] wifi: ath11k: enable 36 bit mask for stream DMA Jeff Johnson
2024-01-25 16:37   ` Kalle Valo
2024-01-25 16:38 ` Kalle Valo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240123015201.28939-1-quic_bqiang@quicinc.com \
    --to=quic_bqiang@quicinc.com \
    --cc=ath11k@lists.infradead.org \
    --cc=linux-wireless@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.