From: Suren Baghdasaryan
Date: Fri, 14 Apr 2023 15:26:03 -0700
Subject: Re: [PATCH 1/1] mm: do not increment pgfault stats when page fault handler retries
To: Peter Xu
Cc: akpm@linux-foundation.org, willy@infradead.org, hannes@cmpxchg.org,
 mhocko@suse.com, josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
 laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com,
 jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com
References: <20230414175444.1837474-1-surenb@google.com>

On Fri, Apr 14, 2023 at 3:14 PM Suren Baghdasaryan wrote:
>
> On Fri, Apr 14, 2023 at 2:47 PM Peter Xu wrote:
> >
> > Hi, Suren,
>
> Hi Peter,
>
> >
> > On Fri, Apr 14, 2023 at 10:54:44AM -0700, Suren Baghdasaryan wrote:
> > > If the page fault handler requests a retry, we will count the fault
> > > multiple times.
> > > This is a relatively harmless problem as the retry paths
> > > are not often requested, and the only user-visible problem is that the
> > > fault counter will be slightly higher than it should be.  Nevertheless,
> > > userspace only took one fault, and should not see the fact that the
> > > kernel had to retry the fault multiple times.
> > >
> > > Fixes: 6b4c9f446981 ("filemap: drop the mmap_sem for all blocking operations")
> > > Signed-off-by: Suren Baghdasaryan
> > > Reviewed-by: Matthew Wilcox (Oracle)
> > > ---
> > > Patch applies cleanly over linux-next and mm-unstable
> > >
> > >  mm/memory.c | 16 ++++++++++------
> > >  1 file changed, 10 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 1c5b231fe6e3..d88f370eacd1 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -5212,17 +5212,16 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> > >
> > >         __set_current_state(TASK_RUNNING);
> > >
> > > -       count_vm_event(PGFAULT);
> > > -       count_memcg_event_mm(vma->vm_mm, PGFAULT);
> > > -
> > >         ret = sanitize_fault_flags(vma, &flags);
> > >         if (ret)
> > > -               return ret;
> > > +               goto out;
> > >
> > >         if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
> > >                                             flags & FAULT_FLAG_INSTRUCTION,
> > > -                                           flags & FAULT_FLAG_REMOTE))
> > > -               return VM_FAULT_SIGSEGV;
> > > +                                           flags & FAULT_FLAG_REMOTE)) {
> > > +               ret = VM_FAULT_SIGSEGV;
> > > +               goto out;
> > > +       }
> > >
> > >         /*
> > >          * Enable the memcg OOM handling for faults triggered in user
> > > @@ -5253,6 +5252,11 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> > >         }
> > >
> > >         mm_account_fault(regs, address, flags, ret);
> >
> > Here is the mm_account_fault() function taking care of some other
> > accountings.  Perhaps good to put things into it?
>
> That seems appropriate. Let me take a closer look.
>
> >
> > It also already ignores invalid faults:
> >
> >         if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
> >                 return;
>
> Can there be a case of (!VM_FAULT_ERROR && VM_FAULT_RETRY) - basically
> we need to retry but no errors happened? If so then this condition
> would double-count pagefaults in such cases. If such return code is
> impossible then it's the same as checking for VM_FAULT_RETRY.
>
> >
> > I see that you may also want to account for sigbus, however I really don't
> > know why.  Explanations would be great when it would matter.  So far it
> > makes sense to me if we skip both RETRY or ERROR cases.
>
> Accounting in case of a sigbus is not affected by this patch I think.
> We account for sigbus or any other error cases because there was a
> pagefault and we need to account for it. Whether we failed to handle
> it or not should not affect the count. We skip the retry case because
> we know the same fault will be retried. If we don't skip then we will
> double-count this fault.

mm_account_fault() has a nice comment explaining why it skips errors,
and that now makes sense to me. Let me move the accounting there and
see if others agree that's the right place.

>
> >
> > > +out:
> > > +       if (!(ret & VM_FAULT_RETRY)) {
> > > +               count_vm_event(PGFAULT);
> > > +               count_memcg_event_mm(vma->vm_mm, PGFAULT);
> >
> > There is one thing worth noticing is here vma may or may not be valid
> > depending on the retval of the fault.
> >
> > RETRY is exactly one of the cases that accessing vma may be unsafe due to
> > releasing of mmap read lock.  The other one is the recently added
> > VM_FAULT_COMPLETE.  So if we want to move this chunk (or any vma reference)
> > to be later we need to consider a valid vma / mm being there first, or
> > we're prone to accessing a vma that has already been released, I think.
>
> Good catch! I think you are right and I should have stored vma->vm_mm
> in the beginning and used it when calling count_memcg_event_mm().
> I'll prepare a new patch which handles this correctly.
> Thanks,
> Suren.
>
> >
> > > +       }
> > >
> > >         return ret;
> > >  }
> > > --
> > > 2.40.0.634.g4ca3ef3211-goog
> > >
> > >
> >
> > Thanks,
> >
> > --
> > Peter Xu
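
For illustration only, here is a minimal userspace sketch of the two points
raised in this thread: count a fault once even if the handler asks for
retries, and cache the mm pointer before calling a handler that may leave
the vma unsafe to dereference. This is not kernel code. Every name below
(sim_mm, sim_vma, SIM_FAULT_RETRY, sim_handle_mm_fault() and so on) is a
hypothetical stand-in, and clearing vma->mm is only an assumed way to model
"the mmap lock was dropped".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the VM_FAULT_RETRY bit. */
#define SIM_FAULT_RETRY 0x1u

/* Hypothetical stand-ins for mm_struct and vm_area_struct. */
struct sim_mm  { unsigned long pgfault; };   /* per-mm fault counter */
struct sim_vma { struct sim_mm *mm; };

/* Stand-in for the global PGFAULT event counter. */
static unsigned long sim_global_pgfault;

typedef unsigned int (*sim_handler_t)(struct sim_vma *vma, void *arg);

/*
 * Fake fault handler: asks for a retry while *arg is positive. On retry it
 * clears vma->mm to model the vma becoming unsafe to dereference.
 */
static unsigned int sim_retry_n_times(struct sim_vma *vma, void *arg)
{
	int *retries_left = arg;

	if (*retries_left > 0) {
		(*retries_left)--;
		vma->mm = NULL;
		return SIM_FAULT_RETRY;
	}
	return 0;
}

/* Count the fault only when the handler does not request a retry. */
static unsigned int sim_handle_mm_fault(struct sim_vma *vma,
					sim_handler_t handler, void *arg)
{
	/* Cache mm up front; vma must not be touched after the handler. */
	struct sim_mm *mm = vma->mm;
	unsigned int ret;

	assert(mm != NULL);
	ret = handler(vma, arg);

	if (!(ret & SIM_FAULT_RETRY)) {
		sim_global_pgfault++;
		mm->pgfault++;
	}
	return ret;
}

/* Caller loop: look the vma up again and retry until the fault completes. */
static int sim_take_fault(struct sim_mm *mm, int retries)
{
	unsigned int ret;
	int attempts = 0;

	do {
		struct sim_vma vma = { .mm = mm };  /* fresh lookup per attempt */

		ret = sim_handle_mm_fault(&vma, sim_retry_n_times, &retries);
		attempts++;
	} while (ret & SIM_FAULT_RETRY);

	return attempts;
}
```

Calling sim_take_fault(&mm, 2) runs the handler three times but bumps each
counter once. If the counting used vma->mm after the handler returned
instead of the cached pointer, the retry attempts in this model would leave
it NULL, which is the hazard Peter describes above.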