From: Nadav Amit <nadav.amit@gmail.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins <hughd@google.com>,
Andy Lutomirski <luto@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Nadav Amit <namit@vmware.com>,
Sean Christopherson <seanjc@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
x86@kernel.org
Subject: [RFC 4/6] mm/swap_state: respect FAULT_FLAG_RETRY_NOWAIT
Date: Wed, 24 Feb 2021 23:29:08 -0800
Message-ID: <20210225072910.2811795-5-namit@vmware.com>
In-Reply-To: <20210225072910.2811795-1-namit@vmware.com>
From: Nadav Amit <namit@vmware.com>
Certain use cases (e.g., prefetch_page()) may want to avoid polling
while a page is brought in from swap. Yet, swap_cluster_readahead()
and swap_vma_readahead() do not respect FAULT_FLAG_RETRY_NOWAIT.

Respect FAULT_FLAG_RETRY_NOWAIT in these paths by skipping the
synchronous (polled) swap-in when the flag is set.
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: x86@kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
---
mm/memory.c | 15 +++++++++++++--
mm/shmem.c | 1 +
mm/swap_state.c | 12 +++++++++---
3 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index feff48e1465a..13b9cf36268f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3326,12 +3326,23 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
}
if (!page) {
+ /*
+ * Back out if we failed to bring the page in
+ * while trying to avoid I/O.
+ */
+ if (fault_flag_allow_retry_first(vmf->flags) &&
+ (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
+ ret = VM_FAULT_RETRY;
+ delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
+ goto out;
+ }
+
/*
* Back out if somebody else faulted in this pte
* while we released the pte lock.
*/
- vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
- vmf->address, &vmf->ptl);
+ vmf->pte = pte_offset_map_lock(vma->vm_mm,
+ vmf->pmd, vmf->address, &vmf->ptl);
if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
ret = VM_FAULT_OOM;
delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
diff --git a/mm/shmem.c b/mm/shmem.c
index 7c6b6d8f6c39..b108e9ba9e89 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1525,6 +1525,7 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
shmem_pseudo_vma_init(&pvma, info, index);
vmf.vma = &pvma;
vmf.address = 0;
+ vmf.flags = 0;
page = swap_cluster_readahead(swap, gfp, &vmf);
shmem_pseudo_vma_destroy(&pvma);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 751c1ef2fe0e..1e930f7ff8b3 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -656,10 +656,13 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
unsigned long mask;
struct swap_info_struct *si = swp_swap_info(entry);
struct blk_plug plug;
- bool do_poll = true, page_allocated;
+ bool page_allocated, do_poll;
struct vm_area_struct *vma = vmf->vma;
unsigned long addr = vmf->address;
+ do_poll = !fault_flag_allow_retry_first(vmf->flags) ||
+ !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT);
+
mask = swapin_nr_pages(offset) - 1;
if (!mask)
goto skip;
@@ -838,7 +841,7 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
pte_t *pte, pentry;
swp_entry_t entry;
unsigned int i;
- bool page_allocated;
+ bool page_allocated, do_poll;
struct vma_swap_readahead ra_info = {
.win = 1,
};
@@ -873,9 +876,12 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
}
blk_finish_plug(&plug);
lru_add_drain();
+
skip:
+ do_poll = (!fault_flag_allow_retry_first(vmf->flags) ||
+ !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) && ra_info.win == 1;
return read_swap_cache_async(fentry, gfp_mask, vma, vmf->address,
- ra_info.win == 1);
+ do_poll);
}
/**
--
2.25.1