Re: [PATCH v14 08/14] mm: multi-gen LRU: support page table walks

From: "Maciej W. Rozycki" <macro@orcam.me.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Matthew Wilcox" <willy@infradead.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	"Yu Zhao" <yuzhao@google.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Andi Kleen" <ak@linux.intel.com>,
	"Aneesh Kumar" <aneesh.kumar@linux.ibm.com>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Hillf Danton" <hdanton@sina.com>, "Jens Axboe" <axboe@kernel.dk>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Mel Gorman" <mgorman@suse.de>,
	"Michael Larabel" <Michael@michaellarabel.com>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Mike Rapoport" <rppt@kernel.org>, "Tejun Heo" <tj@kernel.org>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Will Deacon" <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	page-reclaim@google.com, "Brian Geffon" <bgeffon@google.com>,
	"Jan Alexander Steffens" <heftig@archlinux.org>,
	"Oleksandr Natalenko" <oleksandr@natalenko.name>,
	"Steven Barrett" <steven@liquorix.net>,
	"Suleiman Souhlal" <suleiman@google.com>,
	"Daniel Byrne" <djbyrne@mtu.edu>,
	"Donald Carr" <d@chaos-reins.com>,
	"Holger Hoffstätte" <holger@applied-asynchrony.com>,
	"Konstantin Kharlamov" <Hi-Angel@yandex.ru>,
	"Shuang Zhai" <szhai2@cs.rochester.edu>,
	"Sofia Trinh" <sofia.trinh@edi.works>,
	"Vaibhav Jain" <vaibhav@linux.ibm.com>
Subject: Re: [PATCH v14 08/14] mm: multi-gen LRU: support page table walks
Date: Sun, 23 Oct 2022 18:55:04 +0100 (BST)
Message-ID: <alpine.DEB.2.21.2210211911390.50489@angie.orcam.me.uk>
In-Reply-To: <CAHk-=wjrpH1+6cQQjTO6p-96ndBMiOnNH098vhS2jLybxD+7gA@mail.gmail.com>

On Fri, 21 Oct 2022, Linus Torvalds wrote:

> > > We got rid of i386 support back in 2012. Maybe it's time to get rid of
> > > i486 support in 2022?
> >
> > Arnd suggested removing i486 last year and got a bit of pushback.
> > The most convincing to my mind was Maciej:
> 
> Hmm. Maciej added to the cc.

 Thanks!

> So I *really* don't think i486 class hardware is relevant any more.
> Yes, I'm sure it exists (Maciej being an example), but from a kernel
> development standpoint I don't think they are really relevant.
> 
> At some point, people have them as museum pieces. They might as well
> run museum kernels.
> 
> Moving up to requiring cmpxchg8b doesn't sound unreasonable to me.

 But is it really a problem?  Unlike MIPS R2000/R3000 class gear, which 
has no atomics at all at the CPU level, x86 has had atomics since forever, 
just not 64-bit ones.  (SMP R3000 machines did exist and necessarily had 
atomics, implemented via gating storage in board hardware, but only in 
systems we have never supported, even for UP.)

 Given the presence of generic atomics, we can easily emulate CMPXCHG8B 
LL/SC-style using a spinlock built on XCHG, even on SMP, let alone UP.  So 
all the kernel code can just assume the presence of CMPXCHG8B, and any 
invocation of CMPXCHG8B would be diverted to the emulation, perhaps even 
at the assembly level via a GAS macro called `cmpxchg8b' (why not?).  All 
the maintenance burden is then shifted to that macro and said emulation 
code.
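
 A minimal user-space sketch of that idea, assuming C11 atomics stand in 
for XCHG and using hypothetical names (`emu_lock', `emu_cmpxchg64').  It 
only illustrates the locking scheme and the compare-exchange semantics; 
the actual handler would be the `cmpxchg8b_emu' routine with its own 
register-based calling convention:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical global lock guarding every emulated 64-bit compare-exchange. */
static atomic_int emu_lock;

static void emu_lock_acquire(void)
{
	/* An atomic exchange is what XCHG provides; spin until we replace a 0. */
	while (atomic_exchange_explicit(&emu_lock, 1, memory_order_acquire))
		;
}

static void emu_lock_release(void)
{
	atomic_store_explicit(&emu_lock, 0, memory_order_release);
}

/*
 * CMPXCHG8B semantics: if *ptr equals *expected, store desired and return
 * true (the real instruction sets ZF); otherwise copy the current value
 * into *expected and return false (ZF clear).
 */
bool emu_cmpxchg64(volatile uint64_t *ptr, uint64_t *expected, uint64_t desired)
{
	bool success;

	emu_lock_acquire();
	success = (*ptr == *expected);
	if (success)
		*ptr = desired;
	else
		*expected = *ptr;
	emu_lock_release();
	return success;
}
```

 Callers would retry in a loop until it returns true, just as with the 
real instruction.  Note that the lock only serializes callers that go 
through the emulation; a kernel version would presumably also have to 
keep interrupts off while holding it.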

 Proof of concept wrapper:

#define LOCK_PREFIX ""

#define CC_SET(c) "\n\t/* output condition code " #c "*/\n"
#define CC_OUT(c) "=@cc" #c

#define unlikely(x) __builtin_expect(!!(x), 0)

__extension__ typedef unsigned long long __u64;
typedef unsigned int __u32;
typedef __u64 u64;
typedef __u32 u32;

typedef _Bool bool;

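__asm__(
	/* Shadow the CMPXCHG8B mnemonic with a GAS macro of the same name. */
	".macro		cmpxchg8b arg\n\t"
	/* Save the caller's EAX (low half of the expected value). */
	"pushl		%eax\n\t"
	/* Compute the address of the memory operand. */
	"leal		\\arg, %eax\n\t"
	/* Restore EAX and leave the operand address on the stack. */
	"xchgl		%eax, (%esp)\n\t"
	"call		cmpxchg8b_emu\n\t"
	".endm\n\t");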

bool __try_cmpxchg64(volatile u64 *ptr, u64 *pold, u64 new)
{
	bool success;
	u64 old = *pold;
	/* Expected value in EDX:EAX ("A"), new value in ECX:EBX, result in ZF. */
	asm volatile(LOCK_PREFIX "cmpxchg8b %[ptr]"
		     CC_SET(z)
		     : CC_OUT(z) (success),
		       [ptr] "+m" (*ptr),
		       "+A" (old)
		     : "b" ((u32)new),
		       "c" ((u32)(new >> 32))
		     : "memory");

	/* On failure EDX:EAX holds the current value; hand it back. */
	if (unlikely(!success))
		*pold = old;
	return success;
}

This assembles to:

cmpxchg8b.o:     file format elf32-i386

Disassembly of section .text:

00000000 <__try_cmpxchg64>:
   0:	55                   	push   %ebp
   1:	89 e5                	mov    %esp,%ebp
   3:	57                   	push   %edi
   4:	56                   	push   %esi
   5:	89 d7                	mov    %edx,%edi
   7:	53                   	push   %ebx
   8:	89 c6                	mov    %eax,%esi
   a:	8b 4d 0c             	mov    0xc(%ebp),%ecx
   d:	8b 02                	mov    (%edx),%eax
   f:	8b 5d 08             	mov    0x8(%ebp),%ebx
  12:	8b 52 04             	mov    0x4(%edx),%edx
  15:	50                   	push   %eax
  16:	8d 06                	lea    (%esi),%eax
  18:	87 04 24             	xchg   %eax,(%esp)
  1b:	e8 fc ff ff ff       	call   1c <__try_cmpxchg64+0x1c>
			1c: R_386_PC32	cmpxchg8b_emu
  20:	0f 94 c1             	sete   %cl
  23:	75 0b                	jne    30 <__try_cmpxchg64+0x30>
  25:	5b                   	pop    %ebx
  26:	88 c8                	mov    %cl,%al
  28:	5e                   	pop    %esi
  29:	5f                   	pop    %edi
  2a:	5d                   	pop    %ebp
  2b:	c3                   	ret    
  2c:	8d 74 26 00          	lea    0x0(%esi,%eiz,1),%esi
  30:	5b                   	pop    %ebx
  31:	89 07                	mov    %eax,(%edi)
  33:	5e                   	pop    %esi
  34:	89 57 04             	mov    %edx,0x4(%edi)
  37:	5f                   	pop    %edi
  38:	88 c8                	mov    %cl,%al
  3a:	5d                   	pop    %ebp
  3b:	c3                   	ret    

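Of course there's a minor ABI nit for `cmpxchg8b_emu' to return a result 
in ZF, and the wrapper relies on CONFIG_FRAME_POINTER for correct `arg' 
evaluation in all cases, because the `pushl' in the macro moves the stack 
pointer before `arg' is evaluated, so an %esp-relative operand would be 
off by 4.  But that shouldn't be a big deal, should it?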

 Then long-term maintenance would be minimal to nil, and all the code 
except for the wrapper and the emulation handler need not be concerned 
about the 486 obscurity.  I can volunteer to maintain said wrapper and 
emulation (and for that matter generic 486 support) if that helps keep 
the 486 alive.

 Eventually we may choose to drop 486 support after all, but CMPXCHG8B 
alone seems too small a reason to me for that to happen right now.

 NB MIPS R2000 dates back to 1985, a solid 4 years before the 486, and we 
continue supporting it with minimal effort.  We do have atomic emulation 
for userland of course.

  Maciej
