Re: [PATCH 1/1] mm: do not increment pgfault stats when page fault handler retries

From: Suren Baghdasaryan <surenb@google.com>
To: Peter Xu <peterx@redhat.com>
Cc: akpm@linux-foundation.org, willy@infradead.org,
	hannes@cmpxchg.org,  mhocko@suse.com, josef@toxicpanda.com,
	jack@suse.cz, ldufour@linux.ibm.com,  laurent.dufour@fr.ibm.com,
	michel@lespinasse.org, liam.howlett@oracle.com,
	 jglisse@google.com, vbabka@suse.cz, minchan@google.com,
	dave@stgolabs.net,  punit.agrawal@bytedance.com,
	lstoakes@gmail.com, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH 1/1] mm: do not increment pgfault stats when page fault handler retries
Date: Fri, 14 Apr 2023 15:14:23 -0700
Message-ID: <CAJuCfpFc2SohkkJnEFqZD-uCpSS9sUzToPcQXOR6dHTTE0Ty5w@mail.gmail.com>
In-Reply-To: <ZDnJ1dOU2tpK6l68@x1n>

On Fri, Apr 14, 2023 at 2:47 PM Peter Xu <peterx@redhat.com> wrote:
>
> Hi, Suren,

Hi Peter,

>
> On Fri, Apr 14, 2023 at 10:54:44AM -0700, Suren Baghdasaryan wrote:
> > If the page fault handler requests a retry, we will count the fault
> > multiple times.  This is a relatively harmless problem as the retry paths
> > are not often requested, and the only user-visible problem is that the
> > fault counter will be slightly higher than it should be.  Nevertheless,
> > userspace only took one fault, and should not see the fact that the
> > kernel had to retry the fault multiple times.
> >
> > Fixes: 6b4c9f446981 ("filemap: drop the mmap_sem for all blocking operations")
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> > Patch applies cleanly over linux-next and mm-unstable
> >
> >  mm/memory.c | 16 ++++++++++------
> >  1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 1c5b231fe6e3..d88f370eacd1 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -5212,17 +5212,16 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> >
> >       __set_current_state(TASK_RUNNING);
> >
> > -     count_vm_event(PGFAULT);
> > -     count_memcg_event_mm(vma->vm_mm, PGFAULT);
> > -
> >       ret = sanitize_fault_flags(vma, &flags);
> >       if (ret)
> > -             return ret;
> > +             goto out;
> >
> >       if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
> >                                           flags & FAULT_FLAG_INSTRUCTION,
> > -                                         flags & FAULT_FLAG_REMOTE))
> > -             return VM_FAULT_SIGSEGV;
> > +                                         flags & FAULT_FLAG_REMOTE)) {
> > +             ret = VM_FAULT_SIGSEGV;
> > +             goto out;
> > +     }
> >
> >       /*
> >        * Enable the memcg OOM handling for faults triggered in user
> > @@ -5253,6 +5252,11 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> >       }
> >
> >       mm_account_fault(regs, address, flags, ret);
>
> mm_account_fault() here already takes care of some other accounting.
> Perhaps it would be good to put this into it?

That seems appropriate. Let me take a closer look.

>
> It also already ignores invalid faults:
>
>         if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
>                 return;

Can there be a case of (!VM_FAULT_ERROR && VM_FAULT_RETRY), where we
need to retry but no error happened? If so, then this condition
would double-count pagefaults in such cases. If such a return code is
impossible, then it's the same as checking for VM_FAULT_RETRY.
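
For reference, the two conditions being compared can be put side by side
in a small standalone C sketch. The flag values and the reduced
VM_FAULT_ERROR mask below are simplified assumptions for illustration (the
kernel's mask covers more bits), and the helper names are hypothetical,
not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real definitions live in the kernel headers. */
#define VM_FAULT_OOM     0x0001u
#define VM_FAULT_SIGBUS  0x0002u
#define VM_FAULT_SIGSEGV 0x0040u
#define VM_FAULT_RETRY   0x0400u
/* Simplified assumption: the kernel's VM_FAULT_ERROR includes more bits. */
#define VM_FAULT_ERROR   (VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV)

/* Condition quoted from mm_account_fault(): skip on any error OR on retry. */
static bool skipped_by_error_or_retry(unsigned int ret)
{
	return (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY)) != 0;
}

/* Condition used by the patch: skip only when a retry was requested. */
static bool skipped_by_retry_only(unsigned int ret)
{
	return (ret & VM_FAULT_RETRY) != 0;
}

/* Self-check contrasting the two conditions on a few return codes. */
static void compare_checks(void)
{
	/* A retry with no error bits is skipped by both conditions. */
	assert(skipped_by_error_or_retry(VM_FAULT_RETRY));
	assert(skipped_by_retry_only(VM_FAULT_RETRY));

	/* An error without retry is skipped only by the combined mask. */
	assert(skipped_by_error_or_retry(VM_FAULT_SIGBUS));
	assert(!skipped_by_retry_only(VM_FAULT_SIGBUS));

	/* A successful fault is accounted by both. */
	assert(!skipped_by_error_or_retry(0));
	assert(!skipped_by_retry_only(0));
}
```

Under these assumptions the two checks agree on a plain retry and differ
only for error results, which the combined mask also skips.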

>
> I see that you may also want to account for sigbus, however I really don't
> know why.  An explanation of when it would matter would be great.  So far
> it makes sense to me if we skip both the RETRY and ERROR cases.

I think accounting in the sigbus case is not affected by this patch.
We account for sigbus or any other error case because there was a
pagefault and we need to account for it. Whether we failed to handle
it or not should not affect the count. We skip the retry case because
we know the same fault will be retried. If we don't skip it, we will
double-count this fault.
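
The intended behavior can be illustrated with a small userspace sketch in
C. Everything here is hypothetical (the struct, the fake handler, and the
illustrative VM_FAULT_RETRY value); it only models the idea that a fault
which is retried several times should bump the counter once, when a final
result is returned.

```c
#include <assert.h>

#define VM_FAULT_RETRY 0x0400u	/* illustrative value */

/* Hypothetical stand-in for the PGFAULT event counter. */
struct fault_stats {
	unsigned long pgfault;
};

/*
 * Hypothetical handler: requests a retry while *retries_left is positive,
 * then returns success (0). It counts the fault only when the result is
 * final, which is the behavior the patch aims for.
 */
static unsigned int fake_handle_fault(struct fault_stats *stats,
				      int *retries_left)
{
	unsigned int ret = 0;

	if (*retries_left > 0) {
		(*retries_left)--;
		ret = VM_FAULT_RETRY;
	}

	if (!(ret & VM_FAULT_RETRY))
		stats->pgfault++;

	return ret;
}

/* Caller loop: keeps re-invoking the handler while it asks for a retry. */
static unsigned long take_one_fault(int retries)
{
	struct fault_stats stats = { 0 };

	assert(retries >= 0);
	while (fake_handle_fault(&stats, &retries) & VM_FAULT_RETRY)
		;

	return stats.pgfault;
}
```

With this model, take_one_fault() returns 1 whether the handler retried
zero times or several times before completing.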

>
> > +out:
> > +     if (!(ret & VM_FAULT_RETRY)) {
> > +             count_vm_event(PGFAULT);
> > +             count_memcg_event_mm(vma->vm_mm, PGFAULT);
>
> One thing worth noting here is that vma may or may not be valid,
> depending on the return value of the fault.
>
> RETRY is exactly one of the cases where accessing vma may be unsafe because
> the mmap read lock has been released.  The other one is the recently added
> VM_FAULT_COMPLETED.  So if we want to move this chunk (or any vma reference)
> later, we first need to make sure a valid vma / mm is still there, or
> we're prone to accessing a vma that has already been released, I think.

Good catch! I think you are right, and I should have stored vma->vm_mm
at the beginning and used it when calling count_memcg_event_mm().
I'll prepare a new patch which handles this correctly.
Thanks,
Suren.
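
A rough userspace sketch of the idea in C, not the follow-up patch itself:
the mm pointer is saved while the vma is known to be valid, and the
accounting afterwards uses only the saved pointer. The struct names,
fake_fault(), and the flag values are hypothetical stand-ins, and the
sketch assumes the mm itself stays alive for the duration of the fault.

```c
#include <assert.h>
#include <stddef.h>

#define VM_FAULT_RETRY     0x0400u	/* illustrative value */
#define VM_FAULT_COMPLETED 0x4000u	/* illustrative value */

/* Hypothetical stand-ins for mm_struct and vm_area_struct. */
struct mm_sketch {
	unsigned long pgfault_events;
};

struct vma_sketch {
	struct mm_sketch *vm_mm;
};

/*
 * Stand-in for the real handler. For RETRY and COMPLETED it models the
 * mmap lock being dropped by clearing the caller's vma pointer, so any
 * later dereference of it would be an obvious bug.
 */
static unsigned int fake_fault(struct vma_sketch **vmap, unsigned int result)
{
	if (result & (VM_FAULT_RETRY | VM_FAULT_COMPLETED))
		*vmap = NULL;
	return result;
}

static unsigned int handle_fault_sketch(struct vma_sketch *vma,
					unsigned int result)
{
	struct mm_sketch *mm;
	unsigned int ret;

	assert(vma != NULL);
	/* Save the mm before the handler can invalidate the vma. */
	mm = vma->vm_mm;

	ret = fake_fault(&vma, result);

	/* Account through the saved mm; vma is not touched after this point. */
	if (!(ret & VM_FAULT_RETRY))
		mm->pgfault_events++;

	return ret;
}
```

In this model a COMPLETED result is still counted even though the vma
pointer is no longer usable, while a RETRY result is not counted at all.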

>
> > +     }
> >
> >       return ret;
> >  }
> > --
> > 2.40.0.634.g4ca3ef3211-goog
> >
> >
>
> Thanks,
>
> --
> Peter Xu
>

