All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicholas Piggin <npiggin@gmail.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>,
	Laurent Dufour <ldufour@linux.ibm.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	"Nysal Jan K . A" <nysal@linux.ibm.com>
Subject: [PATCH 1/6] powerpc/qspinlock: Fix stale propagated yield_cpu
Date: Mon, 16 Oct 2023 22:43:00 +1000	[thread overview]
Message-ID: <20231016124305.139923-2-npiggin@gmail.com> (raw)
In-Reply-To: <20231016124305.139923-1-npiggin@gmail.com>

yield_cpu is a sample of a preempted lock holder that gets propagated
back through the queue. Queued waiters use this to yield to the
preempted lock holder without continually sampling the lock word (which
would defeat the purpose of MCS queueing by bouncing the cache line).

The problem is that yield_cpu can become stale. It can take some time to
be passed down the chain, and if any queued waiter gets preempted then
it will cease to propagate the yield_cpu to later waiters.

This can result in yielding to a CPU that no longer holds the lock,
which is bad, but particularly if it is currently in H_CEDE (idle),
then it appears to be preempted and some hypervisors (PowerVM) can
cause very long H_CONFER latencies waiting for H_CEDE wakeup. This
results in latency spikes and hard lockups on oversubscribed
partitions with lock contention.

This is a minimal fix. Before yielding to yield_cpu, sample the lock
word to confirm yield_cpu is still the owner, and bail out of it is not.

Thanks to a bunch of people who reported this and tracked down the
exact problem using tracepoints and dispatch trace logs.

Fixes: 28db61e207ea ("powerpc/qspinlock: allow propagation of yield CPU down the queue")
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Laurent Dufour <ldufour@linux.ibm.com>
Reported-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Debugged-by: Nysal Jan K.A <nysal@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/lib/qspinlock.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
index 253620979d0c..6dd2f46bd3ef 100644
--- a/arch/powerpc/lib/qspinlock.c
+++ b/arch/powerpc/lib/qspinlock.c
@@ -406,6 +406,9 @@ static __always_inline bool yield_to_prev(struct qspinlock *lock, struct qnode *
 	if ((yield_count & 1) == 0)
 		goto yield_prev; /* owner vcpu is running */
 
+	if (get_owner_cpu(READ_ONCE(lock->val)) != yield_cpu)
+		goto yield_prev; /* re-sample lock owner */
+
 	spin_end();
 
 	preempted = true;
-- 
2.42.0


  reply	other threads:[~2023-10-16 12:45 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-16 12:42 [PATCH 0/6] powerpc/qspinlock: Fix yield latency bug and other Nicholas Piggin
2023-10-16 12:43 ` Nicholas Piggin [this message]
2023-10-16 12:43 ` [PATCH 2/6] powerpc/qspinlock: stop queued waiters trying to set lock sleepy Nicholas Piggin
2023-10-20  9:50   ` Nysal Jan K.A.
2023-10-16 12:43 ` [PATCH 3/6] powerpc/qspinlock: propagate owner preemptedness rather than CPU number Nicholas Piggin
2023-10-16 12:43 ` [PATCH 4/6] powerpc/qspinlock: don't propagate the not-sleepy state Nicholas Piggin
2023-10-16 12:43 ` [PATCH 5/6] powerpc/qspinlock: Propagate sleepy if previous waiter is preempted Nicholas Piggin
2023-10-16 12:43 ` [PATCH 6/6] powerpc/qspinlock: Rename yield_propagate_owner tunable Nicholas Piggin
2023-10-18  7:41 ` [PATCH 0/6] powerpc/qspinlock: Fix yield latency bug and other Shrikanth Hegde
2023-10-20 10:04 ` Nysal Jan K.A.
2023-10-20 12:01 ` (subset) " Michael Ellerman
2023-10-27  9:59 ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231016124305.139923-2-npiggin@gmail.com \
    --to=npiggin@gmail.com \
    --cc=ldufour@linux.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=nysal@linux.ibm.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=sshegde@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.