From: Christophe Leroy <christophe.leroy@csgroup.eu>
To: Michael Ellerman <mpe@ellerman.id.au>,
	Matthew Wilcox <willy@infradead.org>,
	Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: "linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"npiggin@gmail.com" <npiggin@gmail.com>
Subject: Re: [PATCH v2] powerpc/mm: Avoid calling arch_enter/leave_lazy_mmu() in set_ptes
Date: Sat, 11 Nov 2023 10:33:50 +0000
Message-ID: <e381f776-8284-3720-53dd-7ee08878f56e@csgroup.eu>
In-Reply-To: <87bkccgz9b.fsf@mail.lhotse>



On 02/11/2023 at 12:39, Michael Ellerman wrote:
> Matthew Wilcox <willy@infradead.org> writes:
>> On Tue, Oct 24, 2023 at 08:06:04PM +0530, Aneesh Kumar K.V wrote:
>>>   		ptep++;
>>> -		pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT));
>>>   		addr += PAGE_SIZE;
>>> +		/*
>>> +		 * increment the pfn.
>>> +		 */
>>> +		pte = pfn_pte(pte_pfn(pte) + 1, pte_pgprot((pte)));
>>
>> when i looked at this, it generated shit code.  did you check?
> 
> I didn't look ...
> 
> <goes and looks>
> 
> It's not super clear cut. There's some difference because pfn_pte()
> contains two extra VM_BUG_ONs.
> 
> But with DEBUG_VM *off* the version using pfn_pte() generates *better*
> code, or at least less code, ~160 instructions vs ~200.
> 
> For some reason the version using PTE_RPN_SHIFT seems to be byte
> swapping the pte an extra two times, each of which generates ~8
> instructions. But I can't see why.
> 
> I tried a few other things and couldn't come up with anything that
> generated better code. But I'll keep poking at it tomorrow.

On PPC32 the version using PTE_RPN_SHIFT is better. Here is what the 
main loop of set_ptes() looks like:

  22c:	55 29 f0 be 	srwi    r9,r9,2
  230:	7d 29 03 a6 	mtctr   r9
  234:	39 3f 10 00 	addi    r9,r31,4096
  238:	39 1f 20 00 	addi    r8,r31,8192
  23c:	39 5f 30 00 	addi    r10,r31,12288
  240:	3b ff 40 00 	addi    r31,r31,16384
  244:	91 3e 00 04 	stw     r9,4(r30)
  248:	91 1e 00 08 	stw     r8,8(r30)
  24c:	91 5e 00 0c 	stw     r10,12(r30)
  250:	97 fe 00 10 	stwu    r31,16(r30)
  254:	42 00 ff e0 	bdnz    234 <set_ptes+0x78>

With the version using pfn_pte(), the main loop is:

  218:	54 e9 f8 7e 	srwi    r9,r7,1
  21c:	7d 29 03 a6 	mtctr   r9
  220:	57 e9 00 26 	clrrwi  r9,r31,12
  224:	39 29 10 00 	addi    r9,r9,4096
  228:	57 ff 05 3e 	clrlwi  r31,r31,20
  22c:	7d 29 fb 78 	or      r9,r9,r31
  230:	55 3f 00 26 	clrrwi  r31,r9,12
  234:	3b ff 10 00 	addi    r31,r31,4096
  238:	55 28 05 3e 	clrlwi  r8,r9,20
  23c:	7f ff 43 78 	or      r31,r31,r8
  240:	91 3d 00 04 	stw     r9,4(r29)
  244:	93 fd 00 08 	stw     r31,8(r29)
  248:	3b bd 00 08 	addi    r29,r29,8
  24c:	42 00 ff d4 	bdnz    220 <set_ptes+0x64>

Not only is the loop bigger, it is also unrolled only by 2, while the 
first one is unrolled by 4 (r7 and r9 contain the same value).

Therefore, although the PTE_RPN_SHIFT version is 87 instructions while 
the other one is only 81 instructions, the former looks better.
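
To make the difference concrete, here is a minimal user-space sketch of
the two ways of stepping to the next PTE. It is not the kernel code: the
model_* names and check_forms_agree() are hypothetical, and the layout
(32-bit PTE, 4 KiB pages, PFN in the upper 20 bits, flags in the lower
12) is an assumption chosen to match the addi 4096 and the
clrrwi/clrlwi masks in the listings above. The first form is a single
add, which the compiler can fold into constant offsets and unroll. The
second form splits the PTE, bumps the PFN, and recombines it, which
corresponds to the clrrwi/addi/clrlwi/or sequence. Counting from the
listings, that is 9 instructions per 4 PTEs for the first loop against
12 instructions per 2 PTEs for the second.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only: 32-bit PTE, 4 KiB pages,
 * PFN in the upper 20 bits, protection/flag bits in the lower 12. */
#define MODEL_PAGE_SHIFT 12u
#define MODEL_FLAGS_MASK ((UINT32_C(1) << MODEL_PAGE_SHIFT) - 1)

typedef uint32_t model_pte_t;

/* Models: pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT)); */
static model_pte_t next_pte_shift(model_pte_t pte)
{
	return pte + (UINT32_C(1) << MODEL_PAGE_SHIFT);
}

/* Models: pte = pfn_pte(pte_pfn(pte) + 1, pte_pgprot(pte)); */
static model_pte_t next_pte_pfn(model_pte_t pte)
{
	uint32_t pfn  = pte >> MODEL_PAGE_SHIFT;   /* extract the PFN   */
	uint32_t prot = pte & MODEL_FLAGS_MASK;    /* extract the flags */

	return ((pfn + 1) << MODEL_PAGE_SHIFT) | prot;
}

/* Walks `count` consecutive pages and asserts both forms agree. */
static void check_forms_agree(model_pte_t pte, unsigned int count)
{
	model_pte_t a = pte, b = pte;

	while (count--) {
		a = next_pte_shift(a);
		b = next_pte_pfn(b);
		assert(a == b);
	}
}
```

Under this assumed layout both helpers produce the same value, so the
difference discussed here is purely one of code generation, not of
behaviour.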

Christophe
