From: Peter Xu <peterx@redhat.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Minchan Kim <minchan@kernel.org>,
David Hildenbrand <david@redhat.com>,
Nadav Amit <nadav.amit@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>, Vlastimil Babka <vbabka@suse.cz>,
Andrea Arcangeli <aarcange@redhat.com>,
Andi Kleen <andi.kleen@intel.com>,
"Kirill A . Shutemov" <kirill@shutemov.name>
Subject: Re: [PATCH v3 7/7] mm/swap: Cache swap migration A/D bits support
Date: Wed, 10 Aug 2022 13:09:54 -0400
Message-ID: <YvPmYnxmHCxNvPtH@xz-m1.local>
In-Reply-To: <87tu6keh8r.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Wed, Aug 10, 2022 at 02:37:40PM +0800, Huang, Ying wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > Introduce a variable swap_migration_ad_supported to cache whether the arch
> > supports swap migration A/D bits.
> >
> > Here one thing to mention is that SWP_MIG_TOTAL_BITS will internally
> > reference the other macro MAX_PHYSMEM_BITS, which is a function call on
> > x86 (constant on all the rest of archs).
> >
> > It's safe to reference it in swapfile_init() because when reaching here
> > we're already during initcalls level 4 so we must have initialized 5-level
> > pgtable for x86_64 (right after early_identify_cpu() finishes).
> >
> >   - start_kernel
> >     - setup_arch
> >       - early_cpu_init
> >         - get_cpu_cap --> fetch from CPUID (including X86_FEATURE_LA57)
> >         - early_identify_cpu --> clear X86_FEATURE_LA57 (if early lvl5 not enabled (USE_EARLY_PGTABLE_L5))
> >     - arch_call_rest_init
> >       - rest_init
> >         - kernel_init
> >           - kernel_init_freeable
> >             - do_basic_setup
> >               - do_initcalls --> calls swapfile_init() (initcall level 4)
> >
> > This should slightly speed up the migration swap entry handlings.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > include/linux/swapfile.h | 1 +
> > include/linux/swapops.h | 7 +------
> > mm/swapfile.c | 8 ++++++++
> > 3 files changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/linux/swapfile.h b/include/linux/swapfile.h
> > index 54078542134c..87ec5e2cdb02 100644
> > --- a/include/linux/swapfile.h
> > +++ b/include/linux/swapfile.h
> > @@ -9,5 +9,6 @@
> > extern struct swap_info_struct *swap_info[];
> > extern unsigned long generic_max_swapfile_size(void);
> > extern unsigned long max_swapfile_size(void);
> > +extern bool swap_migration_ad_supported;
> >
> > #endif /* _LINUX_SWAPFILE_H */
> > diff --git a/include/linux/swapops.h b/include/linux/swapops.h
> > index 0e9579b90659..e6afc77c51ad 100644
> > --- a/include/linux/swapops.h
> > +++ b/include/linux/swapops.h
> > @@ -301,13 +301,8 @@ static inline swp_entry_t make_writable_migration_entry(pgoff_t offset)
> > */
> > static inline bool migration_entry_supports_ad(void)
> > {
> > - /*
> > - * max_swapfile_size() returns the max supported swp-offset plus 1.
> > - * We can support the migration A/D bits iff the pfn swap entry has
> > - * the offset large enough to cover all of them (PFN, A & D bits).
> > - */
> > #ifdef CONFIG_SWAP
> > - return max_swapfile_size() >= (1UL << SWP_MIG_TOTAL_BITS);
> > + return swap_migration_ad_supported;
> > #else /* CONFIG_SWAP */
> > return false;
> > #endif /* CONFIG_SWAP */
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 794fa37bd0c3..c49cf25f0d08 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -64,6 +64,9 @@ EXPORT_SYMBOL_GPL(nr_swap_pages);
> > long total_swap_pages;
> > static int least_priority = -1;
> > static unsigned long swapfile_maximum_size;
> > +#ifdef CONFIG_MIGRATION
> > +bool swap_migration_ad_supported;
> > +#endif /* CONFIG_MIGRATION */
> >
> > static const char Bad_file[] = "Bad swap file entry ";
> > static const char Unused_file[] = "Unused swap file entry ";
> > @@ -3685,6 +3688,11 @@ static int __init swapfile_init(void)
> >
> > swapfile_maximum_size = arch_max_swapfile_size();
> >
> > +#ifdef CONFIG_MIGRATION
> > + if (swapfile_maximum_size >= (1UL << SWP_MIG_TOTAL_BITS))
> > + swap_migration_ad_supported = true;
> > +#endif /* CONFIG_MIGRATION */
> > +
> > return 0;
> > }
> > subsys_initcall(swapfile_init);
>
> I don't think it's necessary to add a variable for such a simple
> function and it's not a super hot path. But I don't have strong
> opinions here.
Logically, referencing SWP_MIG_TOTAL_BITS needs to check
MAX_PHYSMEM_BITS, which in turn expands to:
# define MAX_PHYSMEM_BITS (pgtable_l5_enabled() ? 52 : 46)
Then, since swapfile.c doesn't have USE_EARLY_PGTABLE_L5 defined:
#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
Then,
#define cpu_feature_enabled(bit) \
(__builtin_constant_p(bit) && DISABLED_MASK_BIT_SET(bit) ? 0 : static_cpu_has(bit))
I think LA57 shouldn't be set in DISABLED_MASK_BIT_SET() at all. In our
case the relevant disable mask is:
#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
DISABLE_ENQCMD)
Here we should have:
#ifdef CONFIG_X86_5LEVEL
# define DISABLE_LA57 0
#else
# define DISABLE_LA57 (1<<(X86_FEATURE_LA57 & 31))
#endif
So DISABLE_LA57 should be 0 when 5-level is enabled (true in my case). Then
we really should land at static_cpu_has().
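To make the chain above concrete, here is a minimal userspace sketch (not
kernel code). The names disabled_mask16(), la57_enabled(),
max_physmem_bits(), runtime_has_la57 and FEATURE_LA57_BIT are hypothetical
stand-ins for the macros quoted above, and the bit index is only
illustrative. It models the two cases: with LA57 compiled out the check
folds to 0, otherwise the answer depends on a runtime flag, roughly as
static_cpu_has() would.

```c
#include <stdbool.h>

/* Illustrative bit index of LA57 within its capability word (assumption). */
#define FEATURE_LA57_BIT 16u

/* Stand-in for the runtime CPU capability that static_cpu_has() would test. */
static bool runtime_has_la57;

/*
 * Stand-in for DISABLED_MASK16, reduced to the LA57 bit only.
 * A set bit means "feature compiled out by the kernel config".
 */
static unsigned int disabled_mask16(bool config_x86_5level)
{
	return config_x86_5level ? 0u : (1u << (FEATURE_LA57_BIT & 31));
}

/* Stand-in for pgtable_l5_enabled() via cpu_feature_enabled(). */
static bool la57_enabled(bool config_x86_5level)
{
	/* Compiled out: in the kernel this branch folds to a constant 0. */
	if (disabled_mask16(config_x86_5level) & (1u << FEATURE_LA57_BIT))
		return false;

	/* Compiled in: fall through to the runtime check. */
	return runtime_has_la57;
}

/* Stand-in for MAX_PHYSMEM_BITS on x86_64. */
static int max_physmem_bits(bool config_x86_5level)
{
	return la57_enabled(config_x86_5level) ? 52 : 46;
}
```

With config_x86_5level == true the result is not a build-time constant in
this model; it follows runtime_has_la57.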
I checked the generated code and, surprisingly, it's fairly fast indeed
(this is after swapfile_maximum_size() has been fetched into %rax; I'll
change that into a variable soon):
0xffffffff83932e41 <+185>: mov $0x1,%edx
0xffffffff83932e46 <+190>: shl $0x24,%rdx
0xffffffff83932e4a <+194>: xor %r8d,%r8d
0xffffffff83932e4d <+197>: cmp %rdx,%rax
0xffffffff83932e50 <+200>: jb 0xffffffff83932e59 <swapfile_init+209>
0xffffffff83932e52 <+202>: movb $0x1,0xeab897(%rip) # 0xffffffff847de6f0 <swap_migration_ad_supported>
0xffffffff83932e59 <+209>: mov %r8d,%eax
On my test host SWP_MIG_TOTAL_BITS is evidently emitted directly as the
immediate $0x24 (which reflects a 4-level pgtable), but frankly I cannot
tell how that was done without checking the boot cpu x86_capabilities
flags. I'm pretty sure my kernel config has CONFIG_X86_5LEVEL=y.
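For reference, the immediate 0x24 matches the 4-level case if one assumes
that SWP_MIG_TOTAL_BITS is the PFN width (MAX_PHYSMEM_BITS - PAGE_SHIFT)
plus two bits for A and D, with PAGE_SHIFT == 12. A small sketch under
those assumptions (swp_mig_total_bits() is a hypothetical helper, not a
kernel function):

```c
/* Assumption: 4 KiB pages, as on x86_64. */
#define PAGE_SHIFT 12u

/*
 * Assumed layout: PFN bits followed by one A (young) bit and one D (dirty)
 * bit, so the total is the PFN width plus 2.
 */
static unsigned int swp_mig_total_bits(unsigned int max_physmem_bits)
{
	return (max_physmem_bits - PAGE_SHIFT) + 2u;
}
```

Under these assumptions swp_mig_total_bits(46) is 36 (0x24), the shift
count seen in the disassembly, while swp_mig_total_bits(52) would be 42
(0x2a) for a 5-level pgtable.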
It'd be great if anyone can already tell why it can be optimized into a
constant, but even so I'm not confident that it'll be a constant on all
hosts, or whether static_cpu_has() will still cost some insns.
Since the change is fairly simple after the previous patch, I think it'll
be nice to keep it too.
Thanks,
--
Peter Xu