From: David Hildenbrand <david@redhat.com>
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: 32bit architectures and __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Date: Tue, 22 Nov 2022 15:05:24 +0100
Message-ID: <ceb85a8b-d6e8-830f-eddb-69ae1531e10e@redhat.com>
Hi all,
Spoiler: is there a real use case for > 16 GiB of swap in a single file
on 32bit architectures?
I'm currently looking into implementing __HAVE_ARCH_PTE_SWP_EXCLUSIVE
support for all remaining architectures. So far, I only implemented it
for the most relevant enterprise architectures.
With __HAVE_ARCH_PTE_SWP_EXCLUSIVE, when we unmap a page for swapout and
replace the present PTE by a swap PTE, we remember whether the anonymous
page that was mapped was exclusive (PageAnonExclusive(), i.e., not
COW-shared). When refaulting that page, whereby we replace the swap
PTE by a present PTE, we can reuse that information to map that page
writable and avoid unnecessary page copies due to COW, even if there are
still unexpected references on the page.
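
To make the mechanism above concrete, here is a minimal user-space
sketch of how an exclusive marker could live inside a 32bit swap PTE.
The bit layout and every demo_* name are assumptions made up for
illustration; real architectures each pick their own layout, and the
helper names only loosely mirror the kernel's pte_swp_mkexclusive() and
pte_swp_exclusive().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical 32bit swap PTE layout, for illustration only:
 *   bit  0     present bit (always 0 in a swap PTE)
 *   bit  1     exclusive marker (the bit "stolen" from the offset)
 *   bits 2-6   swap type
 *   bits 7-31  swap offset
 */
typedef uint32_t demo_pte_t;

#define DEMO_SWP_EXCLUSIVE	(1u << 1)
#define DEMO_SWP_TYPE_SHIFT	2
#define DEMO_SWP_TYPE_MASK	0x1fu
#define DEMO_SWP_OFFSET_SHIFT	7
#define DEMO_SWP_OFFSET_BITS	25

/* Encode swap type and offset; the exclusive marker starts out clear. */
static inline demo_pte_t demo_mk_swap_pte(uint32_t type, uint32_t offset)
{
	assert(type <= DEMO_SWP_TYPE_MASK);
	assert(offset < (1u << DEMO_SWP_OFFSET_BITS));
	return (type << DEMO_SWP_TYPE_SHIFT) | (offset << DEMO_SWP_OFFSET_SHIFT);
}

/* Swapout side: remember that the unmapped anonymous page was exclusive. */
static inline demo_pte_t demo_swp_mkexclusive(demo_pte_t pte)
{
	return pte | DEMO_SWP_EXCLUSIVE;
}

/* Refault side: was the page exclusive when it got swapped out? */
static inline bool demo_swp_exclusive(demo_pte_t pte)
{
	return (pte & DEMO_SWP_EXCLUSIVE) != 0;
}

static inline uint32_t demo_swp_type(demo_pte_t pte)
{
	return (pte >> DEMO_SWP_TYPE_SHIFT) & DEMO_SWP_TYPE_MASK;
}

static inline uint32_t demo_swp_offset(demo_pte_t pte)
{
	return pte >> DEMO_SWP_OFFSET_SHIFT;
}
```

In this sketch the marker occupies a bit that could otherwise extend the
offset field, which is exactly the trade-off discussed for 32bit.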
While this would usually be a pure optimization, O_DIRECT currently
still (wrongly) uses FOLL_GET instead of FOLL_PIN and can trigger memory
corruptions in corner cases. So for that case, it is also a temporary
fix until O_DIRECT properly uses FOLL_PIN. More details can be found in
[1].
Ideally, I'd just implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE for all
architectures. However, __HAVE_ARCH_PTE_SWP_EXCLUSIVE requires an
additional bit in the swap PTE. While mostly unproblematic on 64bit, for
32bit this implies that we'll have to "steal" one bit from the swap
offset on most architectures, reducing the maximum swap size per file.
Assuming we previously supported 32 GiB per swap file (e.g., hexagon,
csky), this number would get reduced to 16 GiB. The kernel would
automatically truncate the oversized swap area and the system would
continue working by using less space of that swapfile, but ... well, is
there a but?
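
The 32 GiB -> 16 GiB reduction is plain arithmetic: each offset bit
doubles the number of addressable swap slots. A small sketch, assuming
4 KiB pages (page shift 12), under which 32 GiB corresponds to 23 offset
bits; the function name is hypothetical and the actual bit counts differ
per architecture and page size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Maximum bytes addressable in a single swap file when the swap PTE has
 * offset_bits bits for the page offset and pages are 2^page_shift bytes.
 */
static inline uint64_t demo_max_swap_bytes(unsigned int offset_bits,
					   unsigned int page_shift)
{
	assert(offset_bits + page_shift < 64);
	return (uint64_t)1 << (offset_bits + page_shift);
}
```

With these assumptions, demo_max_swap_bytes(23, 12) is 32 GiB and
demo_max_swap_bytes(22, 12), i.e. after giving up one offset bit, is
16 GiB.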
Usually (well, there is PAE on x86 ...), a 32bit system can address 4
GiB of memory. Maximum swap size recommendations seem to be around 2--3
times the memory size (2x without hibernation, 3x with hibernation). So
it sounds like there is barely a use case for more swap space. Of course
one can use multiple swap files.
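
As a rough sanity check of that estimate, the following sketch applies
the 2x/3x rule of thumb to 4 GiB of memory. The helper name and the
DEMO_GIB constant are hypothetical, and the multipliers are only the
recommendation quoted above, not a kernel limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_GIB	((uint64_t)1 << 30)

/* Rule-of-thumb swap size: 2x RAM without hibernation, 3x with it. */
static inline uint64_t demo_recommended_swap_bytes(uint64_t ram_bytes,
						   bool hibernation)
{
	assert(ram_bytes <= UINT64_MAX / 3);
	return ram_bytes * (hibernation ? 3 : 2);
}
```

For 4 GiB of memory this yields 8 GiB without and 12 GiB with
hibernation, both below a 16 GiB per-file limit.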
So, is anybody aware of excessive swap space requirements on 32bit?
Note that I thought about storing the exclusive marker in the swap_map
instead of in the swap PTE, but quickly decided to discard that idea
because it results in significantly more complexity and the swap code is
already horrible enough.
[1] https://lkml.kernel.org/r/20220329164329.208407-1-david@redhat.com
--
Thanks,
David / dhildenb