From: David Rientjes <rientjes@google.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrey Ryabinin <a.ryabinin@samsung.com>,
Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@intel.com>,
Sasha Levin <sasha.levin@oracle.com>,
linux-mm@kvack.org
Subject: Re: [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
Date: Tue, 29 Jul 2014 15:36:57 -0700 (PDT) [thread overview]
Message-ID: <alpine.DEB.2.02.1407291531080.20991@chino.kir.corp.google.com> (raw)
In-Reply-To: <20140729142710.656A9E00A3@blue.fi.intel.com>
On Tue, 29 Jul 2014, Kirill A. Shutemov wrote:
> Things can go wrong if fault_around_bytes will be changed under
> do_fault_around(): between fault_around_mask() and fault_around_pages().
>
> Let's read fault_around_bytes only once during do_fault_around() and
> calculate mask based on the reading.
>
> Note: fault_around_bytes can only be updated via debug interface. Also
> I've tried but was not able to trigger a bad behaviour without the
> patch. So I would not consider this patch as urgent.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> mm/memory.c | 17 +++++++++++------
> 1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 9d66bc66f338..7f4f0c41c9e9 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2772,12 +2772,12 @@ static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
>
> static inline unsigned long fault_around_pages(void)
> {
> - return fault_around_bytes >> PAGE_SHIFT;
> + return ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
Not sure why this is being added here, ACCESS_ONCE() would be needed
depending on the context in which the return value is used,
do_read_fault() won't need it.
> }
>
> -static inline unsigned long fault_around_mask(void)
> +static inline unsigned long fault_around_mask(unsigned long nr_pages)
> {
> - return ~(fault_around_bytes - 1) & PAGE_MASK;
> + return ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
> }
>
>
This patch is corrupted because of the newline here that doesn't exist in
linux-next.
> @@ -2844,12 +2844,17 @@ late_initcall(fault_around_debugfs);
> static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
> pte_t *pte, pgoff_t pgoff, unsigned int flags)
> {
> - unsigned long start_addr;
> + unsigned long start_addr, nr_pages;
> pgoff_t max_pgoff;
> struct vm_fault vmf;
> int off;
>
> - start_addr = max(address & fault_around_mask(), vma->vm_start);
> + nr_pages = fault_around_pages();
> + /* race with fault_around_bytes_set() */
> + if (unlikely(nr_pages <= 1))
> + return;
Why exactly is this unlikely if fault_around_bytes is tunable via debugfs
to equal PAGE_SIZE? I assume we're expecting nobody is going to be doing
that, otherwise we'll hit the unlikely() branch here every time. So
either the unlikely or the tunable should be removed.
The problem is that fault_around_bytes isn't documented so we don't even
know the min value without looking at the source code.
I also don't see how nr_pages can be < 1.
> +
> + start_addr = max(address & fault_around_mask(nr_pages), vma->vm_start);
> off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
> pte -= off;
> pgoff -= off;
> @@ -2861,7 +2866,7 @@ static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
> max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
> PTRS_PER_PTE - 1;
> max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
> - pgoff + fault_around_pages() - 1);
> + pgoff + nr_pages - 1);
>
> /* Check if it makes any sense to call ->map_pages */
> while (!pte_none(*pte)) {
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-07-29 22:37 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-29 11:33 [PATCH 0/2] faultaround updates Kirill A. Shutemov
2014-07-29 11:33 ` [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
2014-07-29 13:32 ` Andrey Ryabinin
2014-07-29 14:27 ` Kirill A. Shutemov
2014-07-29 17:07 ` Andrey Ryabinin
2014-07-29 22:36 ` David Rientjes [this message]
2014-07-29 23:16 ` Kirill A. Shutemov
2014-07-29 11:33 ` [PATCH 2/2] mm: mark fault_around_bytes __read_mostly Kirill A. Shutemov
2014-07-29 22:38 ` David Rientjes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.DEB.2.02.1407291531080.20991@chino.kir.corp.google.com \
--to=rientjes@google.com \
--cc=a.ryabinin@samsung.com \
--cc=akpm@linux-foundation.org \
--cc=dave.hansen@intel.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-mm@kvack.org \
--cc=sasha.levin@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox