From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <owner-linux-mm@kvack.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 3E8C6C77B76
	for <linux-mm@archiver.kernel.org>; Mon, 17 Apr 2023 21:21:34 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id B20A68E0002; Mon, 17 Apr 2023 17:21:33 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id AD0918E0001; Mon, 17 Apr 2023 17:21:33 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id 9BF9D8E0002; Mon, 17 Apr 2023 17:21:33 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11])
	by kanga.kvack.org (Postfix) with ESMTP id 8D54B8E0001
	for <linux-mm@kvack.org>; Mon, 17 Apr 2023 17:21:33 -0400 (EDT)
Received: from smtpin21.hostedemail.com (a10.router.float.18 [10.200.18.1])
	by unirelay07.hostedemail.com (Postfix) with ESMTP id 40B73160499
	for <linux-mm@kvack.org>; Mon, 17 Apr 2023 21:21:33 +0000 (UTC)
X-FDA: 80692154466.21.6D65B5A
Received: from mail-yw1-f172.google.com (mail-yw1-f172.google.com [209.85.128.172])
	by imf14.hostedemail.com (Postfix) with ESMTP id 750EB100003
	for <linux-mm@kvack.org>; Mon, 17 Apr 2023 21:21:31 +0000 (UTC)
Authentication-Results: imf14.hostedemail.com;
	dkim=pass header.d=google.com header.s=20221208 header.b=gPqjm7ba;
	spf=pass (imf14.hostedemail.com: domain of surenb@google.com designates 209.85.128.172 as permitted sender) smtp.mailfrom=surenb@google.com;
	dmarc=pass (policy=reject) header.from=google.com
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com;
	s=arc-20220608; t=1681766491;
	h=from:from:sender:reply-to:subject:subject:date:date:
	 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
	 content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references:dkim-signature;
	bh=skxzpscAsv87Qax/0bgvCnRhsQO+KKAWjy3PoG439VQ=;
	b=NcBftSmdcXWHUs2DuPB+28sBtP8yaVh94zIBd1yY8oBLMcyJ8jH5NhSHhf+4e8moJx/Y6h
	hoipxDE9rQzsCZZfgewJ/I6Il3WsMCddjzHJTnpWkPiNF7P7suZKw1TYHcOS/UivECDkL0
	f9geenpVxmML6nei2XN1nrWYFzN1B0U=
ARC-Authentication-Results: i=1;
	imf14.hostedemail.com;
	dkim=pass header.d=google.com header.s=20221208 header.b=gPqjm7ba;
	spf=pass (imf14.hostedemail.com: domain of surenb@google.com designates 209.85.128.172 as permitted sender) smtp.mailfrom=surenb@google.com;
	dmarc=pass (policy=reject) header.from=google.com
ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681766491; a=rsa-sha256;
	cv=none;
	b=Oa4XpPt8kl36iM/J/JXhiVcuWH5Tj+mybS8m4A/oJWgzW3Jpg28QPXNZDm0F1RomKB7acI
	JcDL+c5BxTXcVDW01nIKSJIrjNBtSTMVfS2ZWPi+Iz7NWaXpvX5LPJL5YvuCLhQyCytRsk
	IIYkxp1UkMPvZlysFDC5DIPQIfILX2g=
Received: by mail-yw1-f172.google.com with SMTP id 00721157ae682-54c12009c30so540382927b3.9
        for <linux-mm@kvack.org>; Mon, 17 Apr 2023 14:21:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20221208; t=1681766490; x=1684358490;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:from:to:cc:subject:date
         :message-id:reply-to;
        bh=skxzpscAsv87Qax/0bgvCnRhsQO+KKAWjy3PoG439VQ=;
        b=gPqjm7ba7fTmREghp9AWYVRFKr58sh4PS4CfZ/Fnori3qEbcxS/KfNJS0XZwsW6GMt
         TXVFCHSJ7Fn8o4+mQUwJuzw8Xp2NiDUo628lJRW5a9v7nuVqXxflBpIPPu+E2tPFnIyN
         W6IiBRfuAApLpYfoMZRm+fp/BqLRT05Yn8x+14P7AKCmF8kchA2oGoMy60I2IKYzf3To
         uIyxyZYfmT9LqIOpcM/u1oXW0kRh5iF94aoioa3wL1b52JC8lpyn0G+URi0+xcWo6fbE
         5HhM/x6usH5b2BLb48d6c5ugHQCrYB2Y6aVkotUvnHFddkvcoumGa35c786bME/oi6p6
         UFkg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1681766490; x=1684358490;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=skxzpscAsv87Qax/0bgvCnRhsQO+KKAWjy3PoG439VQ=;
        b=DOPPH71lW6CVr11Ujt9evQoDIPLDV+xMlMHmkI6BZaI6zpwFIs2wG6qY7d75uGE0xI
         SxNNqmnEqbJsycmV8MK/gbvYuMg7gMbgtvdNIPMa6mO05enEsGQiMsHxk4TXCPHEybg6
         CnKuEtkoSF70/Kj0uncE11v0LRy9qUChdJ3V63RaMdO4CLEafc+VjTt7mRUHs9qrY22S
         kxwUEjBVzV7bC9xTjOrE+nQ0PJkAPG/NFtRnqqBRSB+W6tcaIceSGNcOOMRm3PXGG/3G
         8TW1+ZXF5RF8ntaITfHh9XevGVnt3PfalQWMQEW1u0LxvUPuFhJ2RdhevOlz4vo7w88u
         1B3w==
X-Gm-Message-State: AAQBX9dtfZMnZELKg+N+/bMVRkOfon+fL9PmYQDzcPBxlW6Tm4g5pAO+
	lt23QTFMoq6G8grzqSB5+XZBL+ILbau/LjZWxUZEcw==
X-Google-Smtp-Source: AKy350blLU+TA2bW6CBbSNbqjHX8LQL1OTHe90yUAeABat6gSKf2OL44RLi4ryGaIDgRHCVtrb0qdgDt2Dwdph7UlPE=
X-Received: by 2002:a81:d441:0:b0:54f:a60c:12eb with SMTP id
 g1-20020a81d441000000b0054fa60c12ebmr9831050ywl.1.1681766490351; Mon, 17 Apr
 2023 14:21:30 -0700 (PDT)
MIME-Version: 1.0
References: <20230415000818.1955007-1-surenb@google.com> <ZD2gsbN2K66oXT69@x1n>
 <CAJuCfpFOSjES2hVM+zZX_i95JBBaZKr6zE9B8tWm_JqROJBPiQ@mail.gmail.com> <ZD22q95Vaex5dfU1@x1n>
In-Reply-To: <ZD22q95Vaex5dfU1@x1n>
From: Suren Baghdasaryan <surenb@google.com>
Date: Mon, 17 Apr 2023 14:21:19 -0700
Message-ID: <CAJuCfpEJph_ZDN-XCQZwrNBu+Ss44At+cEQ7MLAtsyGHGbH3PQ@mail.gmail.com>
Subject: Re: [PATCH v2 1/1] mm: do not increment pgfault stats when page fault
 handler retries
To: Peter Xu <peterx@redhat.com>
Cc: akpm@linux-foundation.org, willy@infradead.org, hannes@cmpxchg.org, 
	mhocko@suse.com, josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com, 
	laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com, 
	jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net, 
	punit.agrawal@bytedance.com, lstoakes@gmail.com, linux-mm@kvack.org, 
	linux-kernel@vger.kernel.org, kernel-team@android.com
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Rspam-User: 
X-Rspamd-Server: rspam04
X-Rspamd-Queue-Id: 750EB100003
X-Stat-Signature: 9m9xp7ckd39g9hegbgkzfqctqzbc1ekp
X-HE-Tag: 1681766491-263147
X-HE-Meta: U2FsdGVkX1/o02CPuW5J0seQdxCzCVUwIumSmMvAKN2aLHcSE0YoWiU3bhBk1RX9ZVK14vY4a0HJzUqvkdHt9rWPw84x8qEjITm4+B7POBsTjupupb03/4Tkdc6xB5ndSvjIlij5DtluVg1w+Oa1uyqL6kfI2pDrkD+6cxiAqGTebeb+FSEdEd+gStC/4LU2cWhPu6guR4eBz5lkStvzT7OKRE9JwqhQHaLKgRgtodJktTeii9cSnMokbM5/TBVexNUJnAPuYA1jFpHsLtMMQ6hHc0LfPgfCSum76vGsgS9WQF+Y9CqwsIXw+95rg1kmMmeIPwU1cBx8eSbFBXt+zDIFr+UHVxcKiZWhxiJSOqH6wPawNXKgRxJPGIeAS1OqMkB1cLVGTxQ5kMvL0dugUG9rLXD1Krmskwcu25IIxvgUq4pgSM00nS8gFo/J194ZBVhq6am+gFrpiblcoOZdsnmtLiwsK8iutHFiABog4ZYC3LvWx5gJLHdd2bHSokevkjVauUUFsJphtKJ8kh8Xi413QfNHxeY7KWh5SJ+DOjY1fOnF97peKiZ9X2VgwNsCZvTxagEHenDarNK071Ox/Fv9aAWfQGDdOutO1rYSENtPNxLRX8Esd9YF15s6JDTR9H+EdNYPATQP77x8h9YP+WOh29CYcTxaX1pHAOUPkxguTKJFB29lppw5hQTIq573lJFaoxoIiA/goYhMaqjVKdodOHh0LA2TAOq89H0pCmj78xpxnqqI+5MZZIL9Ih1XrmdRy42hMXjV8G+TIrgnnT6GWyRF0Z3DCYc3EH2w6Ecx3+4BF46sFHpm/zwqSGLiuyyH2BsGERwyKPKSA4dO4FbTOvz8xiT5daS21mUzMkZKsCuU4LJfvoA3eEzAIP5lhkh0lzTQ22KkQQ/n/+npHVf1hwEX9jG6rCdfg8G4VQtgl7T7CYTTqsHy6ajAExW7YthbvHuUl8FWrDcv0zT
 Bn63FhLn
 L5jRv6bu7l6RqZmXC3GBL0zGQljIyrXNlAtJzZmZt2LZb+QVZpHDDXOg9pPoxNeegK8hQpDG+aTBsA3656w8iHS5qsQr9C26eeOgNDvErbGh0wbrvkhEPaTYkGlpWjVYDQfAprC1oZIonyBFRLL3JJMfdQjT3F01Qw+UOlpEx53OmWn5u0jFH6zsmKVcm6zMoRvm783k3MuSWscCgiHHdTmfE/TnsNhlSAc5nrACzNrkP27oW0g6mxlBVrUprGZWSycDCMrAc0z7Qo8HkQujm7Vz5Jqulp7AiNpOTnyCNEIUnz0oWLjEmAGfKrua3kWWExFRL3OfzFu432+gWrXNQ7HruYBS6wFfO4hPk
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On Mon, Apr 17, 2023 at 2:14=E2=80=AFPM Peter Xu <peterx@redhat.com> wrote:
>
> On Mon, Apr 17, 2023 at 01:29:45PM -0700, Suren Baghdasaryan wrote:
> > On Mon, Apr 17, 2023 at 12:40=E2=80=AFPM Peter Xu <peterx@redhat.com> w=
rote:
> > >
> > > On Fri, Apr 14, 2023 at 05:08:18PM -0700, Suren Baghdasaryan wrote:
> > > > If the page fault handler requests a retry, we will count the fault
> > > > multiple times. This is a relatively harmless problem as the retry =
paths
> > > > are not often requested, and the only user-visible problem is that =
the
> > > > fault counter will be slightly higher than it should be.  Neverthel=
ess,
> > > > userspace only took one fault, and should not see the fact that the
> > > > kernel had to retry the fault multiple times.
> > > > Move page fault accounting into mm_account_fault() and skip incompl=
ete
> > > > faults which will be accounted upon completion.
> > > >
> > > > Fixes: d065bd810b6d ("mm: retry page fault when blocking on disk tr=
ansfer")
> > > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > > ---
> > > >  mm/memory.c | 45 ++++++++++++++++++++++++++-------------------
> > > >  1 file changed, 26 insertions(+), 19 deletions(-)
> > > >
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 01a23ad48a04..c3b709ceeed7 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -5080,24 +5080,30 @@ static vm_fault_t __handle_mm_fault(struct =
vm_area_struct *vma,
> > > >   * updates.  However, note that the handling of PERF_COUNT_SW_PAGE=
_FAULTS should
> > > >   * still be in per-arch page fault handlers at the entry of page f=
ault.
> > > >   */
> > > > -static inline void mm_account_fault(struct pt_regs *regs,
> > > > +static inline void mm_account_fault(struct mm_struct *mm, struct p=
t_regs *regs,
> > > >                                   unsigned long address, unsigned i=
nt flags,
> > > >                                   vm_fault_t ret)
> > > >  {
> > > >       bool major;
> > > >
> > > >       /*
> > > > -      * We don't do accounting for some specific faults:
> > > > -      *
> > > > -      * - Unsuccessful faults (e.g. when the address wasn't valid)=
.  That
> > > > -      *   includes arch_vma_access_permitted() failing before reac=
hing here.
> > > > -      *   So this is not a "this many hardware page faults" counte=
r.  We
> > > > -      *   should use the hw profiling for that.
> > > > -      *
> > > > -      * - Incomplete faults (VM_FAULT_RETRY).  They will only be c=
ounted
> > > > -      *   once they're completed.
> > > > +      * Do not account for incomplete faults (VM_FAULT_RETRY). The=
y will be
> > > > +      * counted upon completion.
> > > >        */
> > > > -     if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
> > > > +     if (ret & VM_FAULT_RETRY)
> > > > +             return;
> > > > +
> > > > +     /* Register both successful and failed faults in PGFAULT coun=
ters. */
>
> [1]
>
> > > > +     count_vm_event(PGFAULT);
> > > > +     count_memcg_event_mm(mm, PGFAULT);
> > >
> > > Is there reason on why vm events accountings need to be explicitly
> > > different from perf events right below on handling ERROR?
> > >
> > > I get the point if this is to make sure ERROR accountings untouched f=
or
> > > these two vm events after this patch. IOW probably the only concern r=
ight
> > > now is having RETRY counted much more than before (perhaps worse with=
 vma
> > > locking applied).
> > >
> > > But since we're on this, I'm wondering whether we should also align t=
he two
> > > events (vm, perf) so they represent in an aligned manner if we'll cha=
nge it
> > > anyway.  Any future reader will be confused on why they account
> > > differently, IMHO, so if we need to differenciate we'd better add a c=
omment
> > > on why.
> > >
> > > I'm wildly guessing the error faults are indeed very rare and probabl=
y not
> > > matter much at all.  I just think the code can be slightly cleaner if
> > > vm/perf accountings match and easier if we treat everything the same.=
 E.g.,
> > > we can also drop the below "goto out"s too.  What do you think?
> >
> > I think the rationale might be that vm accounting should account for
> > *all* events, including failing page faults while for perf, the corner
> > cases which rarely happen would not have tangible effect.
>
> Note that it's not only for perf, but also task_struct.maj_flt|min_flt.
>
> If we check the reasoning of "why ERROR shouldn't be accounted for perf
> events", there're actually something valid in the comment:
>
>          * - Unsuccessful faults (e.g. when the address wasn't valid).  T=
hat
>          *   includes arch_vma_access_permitted() failing before reaching=
 here.
>          *   So this is not a "this many hardware page faults" counter.  =
We
>          *   should use the hw profiling for that.
>
> IMHO it suggests that if someone wants to trap either ERROR or RETRY one
> can use the hardware counters instead.  The same reasoning just sounds
> applicable to vm events too, because vm events are not special in this ca=
se
> to me.
>
> > I don't have a strong position on this issue and kept it as is to
> > avoid changing the current accounting approach. If we are fine with
> > such consolidation which would miss failing faults in vm accounting, I
> > can make the change.
>
> I don't have a strong opinion either.  We used to change this path before
> for perf events and task events and no one complains with the change.  I'=
d
> just bet the same to vm events:
>
> https://lore.kernel.org/all/20200707225021.200906-1-peterx@redhat.com/

Ok, if these rare failures don't change anything then let's
consolidate the code. It should simplify things a bit and will account
faults in a consistent way. I'll post v3 shortly incorporating your
and Matthew's feedbacks. Thanks for the input!

>
> Please feel free to keep it as-is if you still feel unsafe on changing
> ERROR handling.  If so, would you mind slightly modify [1] above explaini=
ng
> why we need ERROR to be accounted for vm accountings with the reasoning?
> Current comment only says "what it does" rather than why.
>
> Thanks,
>
> --
> Peter Xu
>