All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christophe Leroy <christophe.leroy@csgroup.eu>
To: "nathanl@linux.ibm.com" <nathanl@linux.ibm.com>
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>,
	Nick Child <nnac123@linux.ibm.com>,
	Andrew Donnellan <ajd@linux.ibm.com>,
	Scott Cheloha <cheloha@linux.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Laurent Dufour <ldufour@linux.ibm.com>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH 8/8] powerpc/rtas: consume retry statuses in sys_rtas()
Date: Thu, 25 Jan 2024 15:55:09 +0000	[thread overview]
Message-ID: <33ca48b8-f847-4d2b-b95f-741f0e082d2d@csgroup.eu> (raw)
In-Reply-To: <20230220-rtas-queue-for-6-4-v1-8-010e4416f13f@linux.ibm.com>

Hi Nathan,

Le 06/03/2023 à 22:33, Nathan Lynch via B4 Relay a écrit :
> From: Nathan Lynch <nathanl@linux.ibm.com>
> 
> The kernel can handle retrying RTAS function calls in response to
> -2/990x in the sys_rtas() handler instead of relaying the intermediate
> status to user space.

 From this series with still have patches 5, 7 and 8 awaiting in 
patchwork, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?submitter=85747 
and patch 8 doesn't apply anymore.

Are those 3 patches still relevant or should they be discarded ?

Thanks
Christophe


> 
> Justifications:
> 
> * Currently it's nondeterministic and quite variable in practice
>    whether a retry status is returned for any given invocation of
>    sys_rtas(). Therefore user space code cannot be expecting a retry
>    result without already being broken.
> 
> * This tends to significantly reduce the total number of system calls
>    issued by programs such as drmgr which make use of sys_rtas(),
>    improving the experience of tracing and debugging such
>    programs. This is the main motivation for me: I think this change
>    will make it easier for us to characterize current sys_rtas() use
>    cases as we move them to other interfaces over time.
> 
> * It reduces the number of opportunities for user space to leave
>    complex operations, such as those associated with DLPAR, incomplete
>    and diffcult to recover.
> 
> * We can expect performance improvements for existing sys_rtas()
>    users, not only because of overall reduction in the number of system
>    calls issued, but also due to the better handling of -2/990x in the
>    kernel. For example, librtas still sleeps for 1ms on -2, which is
>    completely unnecessary.
> 
> Performance differences for PHB add and remove on a small P10 PowerVM
> partition are included below. For add, elapsed time is slightly
> reduced. For remove, there are more significant improvements: the
> number of context switches is reduced by an order of magnitude, and
> elapsed time is reduced by over half.
> 
> (- before, + after):
> 
>    Performance counter stats for 'drmgr -c phb -a -s PHB 23' (5 runs):
> 
> -          1,847.58 msec task-clock                       #    0.135 CPUs utilized               ( +- 14.15% )
> -            10,867      cs                               #    9.800 K/sec                       ( +- 14.14% )
> +          1,901.15 msec task-clock                       #    0.148 CPUs utilized               ( +- 14.13% )
> +            10,451      cs                               #    9.158 K/sec                       ( +- 14.14% )
> 
> -         13.656557 +- 0.000124 seconds time elapsed  ( +-  0.00% )
> +          12.88080 +- 0.00404 seconds time elapsed  ( +-  0.03% )
> 
>    Performance counter stats for 'drmgr -c phb -r -s PHB 23' (5 runs):
> 
> -          1,473.75 msec task-clock                       #    0.092 CPUs utilized               ( +- 14.15% )
> -             2,652      cs                               #    3.000 K/sec                       ( +- 14.16% )
> +          1,444.55 msec task-clock                       #    0.221 CPUs utilized               ( +- 14.14% )
> +               104      cs                               #  119.957 /sec                        ( +- 14.63% )
> 
> -          15.99718 +- 0.00801 seconds time elapsed  ( +-  0.05% )
> +           6.54256 +- 0.00830 seconds time elapsed  ( +-  0.13% )
> 
> Move the existing rtas_lock-guarded critical section in sys_rtas()
> into a conventional rtas_busy_delay()-based loop, returning to user
> space only when a final success or failure result is available.
> 
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---
>   arch/powerpc/kernel/rtas.c | 28 ++++++++++++++++------------
>   1 file changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 47a2aa43d7d4..c330a22ccc70 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -1798,7 +1798,6 @@ static bool block_rtas_call(int token, int nargs,
>   /* We assume to be passed big endian arguments */
>   SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
>   {
> -	struct pin_cookie cookie;
>   	struct rtas_args args;
>   	unsigned long flags;
>   	char *buff_copy, *errbuf = NULL;
> @@ -1866,20 +1865,25 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
>   
>   	buff_copy = get_errorlog_buffer();
>   
> -	raw_spin_lock_irqsave(&rtas_lock, flags);
> -	cookie = lockdep_pin_lock(&rtas_lock);
> +	do {
> +		struct pin_cookie cookie;
>   
> -	rtas_args = args;
> -	do_enter_rtas(&rtas_args);
> -	args = rtas_args;
> +		raw_spin_lock_irqsave(&rtas_lock, flags);
> +		cookie = lockdep_pin_lock(&rtas_lock);
>   
> -	/* A -1 return code indicates that the last command couldn't
> -	   be completed due to a hardware error. */
> -	if (be32_to_cpu(args.rets[0]) == -1)
> -		errbuf = __fetch_rtas_last_error(buff_copy);
> +		rtas_args = args;
> +		do_enter_rtas(&rtas_args);
> +		args = rtas_args;
>   
> -	lockdep_unpin_lock(&rtas_lock, cookie);
> -	raw_spin_unlock_irqrestore(&rtas_lock, flags);
> +		/*
> +		 * Handle error record retrieval before releasing the lock.
> +		 */
> +		if (be32_to_cpu(args.rets[0]) == -1)
> +			errbuf = __fetch_rtas_last_error(buff_copy);
> +
> +		lockdep_unpin_lock(&rtas_lock, cookie);
> +		raw_spin_unlock_irqrestore(&rtas_lock, flags);
> +	} while (rtas_busy_delay(be32_to_cpu(args.rets[0])));
>   
>   	if (buff_copy) {
>   		if (errbuf)
> 

  parent reply	other threads:[~2024-01-25 15:56 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-06 21:33 [PATCH 0/8] RTAS changes for 6.4 Nathan Lynch via B4 Relay
2023-03-06 21:33 ` Nathan Lynch
2023-03-06 21:33 ` [PATCH 1/8] powerpc/rtas: ensure 8-byte alignment for struct rtas_args Nathan Lynch
2023-03-06 21:33   ` Nathan Lynch via B4 Relay
2023-03-23  4:00   ` Andrew Donnellan
2023-03-06 21:33 ` [PATCH 2/8] powerpc/rtas: use memmove for potentially overlapping buffer copy Nathan Lynch
2023-03-06 21:33   ` Nathan Lynch via B4 Relay
2023-03-23  4:09   ` Andrew Donnellan
2023-03-06 21:33 ` [PATCH 3/8] powerpc/rtas: rtas_call_unlocked() kerneldoc Nathan Lynch via B4 Relay
2023-03-06 21:33   ` Nathan Lynch
2023-03-23  4:15   ` Andrew Donnellan
2023-03-06 21:33 ` [PATCH 4/8] powerpc/rtas: fix miswording in rtas_function kerneldoc Nathan Lynch via B4 Relay
2023-03-06 21:33   ` Nathan Lynch
2023-03-23  0:17   ` Andrew Donnellan
2023-03-06 21:33 ` [PATCH 5/8] powerpc/rtas: rename va_rtas_call_unlocked() to va_rtas_call() Nathan Lynch
2023-03-06 21:33   ` Nathan Lynch via B4 Relay
2023-03-23  4:17   ` Andrew Donnellan
2023-03-23 16:11     ` Nathan Lynch
2023-03-29 12:24   ` Michael Ellerman
2023-03-06 21:33 ` [PATCH 6/8] powerpc/rtas: lockdep annotations Nathan Lynch
2023-03-06 21:33   ` Nathan Lynch via B4 Relay
2023-03-23  6:01   ` Andrew Donnellan
2023-03-06 21:33 ` [PATCH 7/8] powerpc/rtas: warn on unsafe argument to rtas_call_unlocked() Nathan Lynch via B4 Relay
2023-03-06 21:33   ` Nathan Lynch
2023-03-23  4:25   ` Andrew Donnellan
2023-03-23 12:17     ` Nathan Lynch
2023-03-24  0:56       ` Nathan Lynch
2023-03-29 12:20         ` Michael Ellerman
2023-03-29 16:23           ` Nathan Lynch
2023-03-06 21:33 ` [PATCH 8/8] powerpc/rtas: consume retry statuses in sys_rtas() Nathan Lynch via B4 Relay
2023-03-06 21:33   ` Nathan Lynch
2023-03-23  6:26   ` Andrew Donnellan
2023-03-23 19:39     ` Nathan Lynch
2023-03-23  9:44   ` Michael Ellerman
2023-03-23 13:40     ` Nathan Lynch
2024-01-25 15:55   ` Christophe Leroy [this message]
2024-01-25 16:33     ` Nathan Lynch
2024-01-25 16:46       ` Christophe Leroy
2024-01-25 17:23         ` Nathan Lynch
2023-04-06  1:09 ` (subset) [PATCH 0/8] RTAS changes for 6.4 Michael Ellerman
2023-04-26 12:12 ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=33ca48b8-f847-4d2b-b95f-741f0e082d2d@csgroup.eu \
    --to=christophe.leroy@csgroup.eu \
    --cc=ajd@linux.ibm.com \
    --cc=cheloha@linux.ibm.com \
    --cc=ldufour@linux.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=nathanl@linux.ibm.com \
    --cc=nnac123@linux.ibm.com \
    --cc=npiggin@gmail.com \
    --cc=tyreld@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.