From: Dave Hansen <dave.hansen@intel.com>
To: Anthony Yznaga <anthony.yznaga@oracle.com>,
akpm@linux-foundation.org, willy@infradead.org,
markhemm@googlemail.com, viro@zeniv.linux.org.uk,
david@redhat.com, khalid@kernel.org
Cc: andreyknvl@gmail.com, luto@kernel.org, brauner@kernel.org,
arnd@arndb.de, ebiederm@xmission.com, catalin.marinas@arm.com,
linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, mhiramat@kernel.org, rostedt@goodmis.org,
vasily.averin@linux.dev, xhao@linux.alibaba.com, pcc@google.com,
neilb@suse.de, maz@kernel.org,
David Rientjes <rientjes@google.com>
Subject: Re: [RFC PATCH v3 00/10] Add support for shared PTEs across processes
Date: Wed, 2 Oct 2024 16:11:27 -0700
Message-ID: <accf2b4b-2a54-4261-b67e-010cb74082ae@intel.com>
In-Reply-To: <2dffaf2e-8a27-44bb-8d54-ef4cc0b08dc5@oracle.com>
About TLB flushing...
The quick and dirty thing to do is just flush_tlb_all() after you remove
the PTE from the host mm. That will surely work everywhere and it's as
dirt simple as you get. Honestly, it might even be cheaper than the
alternative.
Also, I don't think PCIDs actually complicate the problem at all. We
basically do remote mm TLB flushes using two mechanisms:
 1. If the mm is loaded, use INVLPG and friends to zap the TLB
 2. Bump mm->context.tlb_gen so that the next time it _gets_
    loaded, the TLB is flushed.
flush_tlb_func() really only cares about #1 since if the mm isn't
loaded, it'll get flushed anyway at the next context switch.
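To make the two mechanisms concrete, here is a small userspace C model. It is not kernel code: `model_mm`, `model_cpu`, `model_flush_mm()` and `model_switch_mm()` are hypothetical names made up for this sketch, and the per-CPU `seen_gen[]` array is only a rough stand-in for whatever per-CPU state the real code compares against mm->context.tlb_gen.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_MMS 2		/* hypothetical: number of mms in the toy model */
#define MODEL_NO_MM  (-1)	/* hypothetical: "no mm loaded" marker */

/* Hypothetical stand-in for an mm; tlb_gen plays the role of mm->context.tlb_gen. */
struct model_mm {
	int id;
	uint64_t tlb_gen;
};

/* Hypothetical per-CPU state. */
struct model_cpu {
	int loaded_mm;				/* id of the loaded mm, or MODEL_NO_MM */
	uint64_t seen_gen[MODEL_NR_MMS];	/* generation last flushed to, per mm */
	unsigned int flushes;			/* how many flushes this CPU performed */
};

static void model_cpu_init(struct model_cpu *cpu)
{
	cpu->loaded_mm = MODEL_NO_MM;
	for (int i = 0; i < MODEL_NR_MMS; i++)
		cpu->seen_gen[i] = 0;
	cpu->flushes = 0;
}

/* Remote flush request for one mm. */
static void model_flush_mm(struct model_mm *mm, struct model_cpu *cpus, int ncpus)
{
	mm->tlb_gen++;				/* mechanism 2: bump the generation */
	for (int i = 0; i < ncpus; i++) {
		if (cpus[i].loaded_mm != mm->id)
			continue;		/* not loaded here: nothing to do now */
		cpus[i].seen_gen[mm->id] = mm->tlb_gen;
		cpus[i].flushes++;		/* mechanism 1: flush immediately */
	}
}

/* Context switch: flush only if this CPU is behind the mm's generation. */
static bool model_switch_mm(struct model_cpu *cpu, const struct model_mm *next)
{
	cpu->loaded_mm = next->id;
	if (cpu->seen_gen[next->id] == next->tlb_gen)
		return false;			/* already up to date */
	cpu->seen_gen[next->id] = next->tlb_gen;
	cpu->flushes++;
	return true;				/* deferred flush happened now */
}
```

In this model, a flush request for an mm that is not loaded on a CPU costs that CPU nothing at request time; the work shows up only when the mm is switched back in and the generations are found to differ.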
The alternatives I can think of:
Make flush_tlb_mm_range(host_mm) work somehow. You'd need to somehow
keep mm_cpumask(host_mm) up to date and also do something to
flush_tlb_func() to tell it that 'loaded_mm' isn't relevant and it
should flush regardless.
The other way is to use the msharefs's inode ->i_mmap to find all the
VMAs mapping the file, and find all *their* mm's:
	for each vma in inode->i_mmap
		mm = vma->vm_mm
		flush_tlb_mm_range(<vma range here>)
But that might be even worse than flush_tlb_all() because it might end
up sending more than one IPI per CPU.
You can fix _that_ by keeping a single cpumask that you build up:
	mask = 0
	for each vma in inode->i_mmap
		mm = vma->vm_mm
		mask |= mm_cpumask(mm)
	flush_tlb_multi(mask, info);
Unfortunately, 'info->mm' needs to be more than one mm, so you probably
still need a new flush_tlb_func() flush type to tell it to ignore
'info->mm' and flush anyway.
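The IPI-count difference between the two loops can be sketched with a userspace C model. Again, this is not kernel code: `ipi_mm`, `ipi_vma`, `ipi_flush_each_vma()` and `ipi_flush_union()` are hypothetical names, a plain 64-bit word stands in for a cpumask, and "sending an IPI" is just a counter increment. It does not model the 'info->mm' problem at all, only how many times each CPU gets interrupted.

```c
#include <stddef.h>
#include <stdint.h>

#define IPI_MODEL_NR_CPUS 8	/* hypothetical: CPUs in the toy model */

/* Hypothetical stand-ins: cpumask plays the role of mm_cpumask(mm). */
struct ipi_mm {
	uint64_t cpumask;
};

struct ipi_vma {
	const struct ipi_mm *vm_mm;
};

/* "Send" one IPI to every CPU set in mask; returns how many were sent. */
static unsigned int ipi_send(uint64_t mask, unsigned int ipis[IPI_MODEL_NR_CPUS])
{
	unsigned int sent = 0;

	for (unsigned int cpu = 0; cpu < IPI_MODEL_NR_CPUS; cpu++) {
		if (mask & (UINT64_C(1) << cpu)) {
			ipis[cpu]++;
			sent++;
		}
	}
	return sent;
}

/* First loop: one flush per VMA, so a CPU shared by several mms is hit repeatedly. */
static unsigned int ipi_flush_each_vma(const struct ipi_vma *vmas, size_t n,
				       unsigned int ipis[IPI_MODEL_NR_CPUS])
{
	unsigned int total = 0;

	for (size_t i = 0; i < n; i++)
		total += ipi_send(vmas[i].vm_mm->cpumask, ipis);
	return total;
}

/* Second loop: build one mask first, then send a single round of IPIs. */
static unsigned int ipi_flush_union(const struct ipi_vma *vmas, size_t n,
				    unsigned int ipis[IPI_MODEL_NR_CPUS])
{
	uint64_t mask = 0;

	for (size_t i = 0; i < n; i++)
		mask |= vmas[i].vm_mm->cpumask;
	return ipi_send(mask, ipis);
}
```

With three mms whose masks overlap on one CPU, the per-VMA loop interrupts that CPU once per mm, while the union version interrupts every involved CPU exactly once.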
After all that, I kinda like flush_tlb_all(). ;)