Date: Mon, 17 Apr 2023 15:40:33 -0400
From: Peter Xu
To: Suren Baghdasaryan
Cc: akpm@linux-foundation.org, willy@infradead.org, hannes@cmpxchg.org,
	mhocko@suse.com, josef@toxicpanda.com, jack@suse.cz,
	ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com,
	michel@lespinasse.org, liam.howlett@oracle.com, jglisse@google.com,
	vbabka@suse.cz, minchan@google.com, dave@stgolabs.net,
	punit.agrawal@bytedance.com, lstoakes@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v2 1/1] mm: do not increment pgfault stats when page fault handler retries
References: <20230415000818.1955007-1-surenb@google.com>
In-Reply-To: <20230415000818.1955007-1-surenb@google.com>

On Fri, Apr 14, 2023 at 05:08:18PM -0700, Suren Baghdasaryan wrote:
> If the page fault handler requests a retry, we will count the fault
> multiple times. This is a relatively harmless problem as the retry paths
> are not often requested, and the only user-visible problem is that the
> fault counter will be slightly higher than it should be.
> Nevertheless, userspace only took one fault, and should not see the fact
> that the kernel had to retry the fault multiple times.
> Move page fault accounting into mm_account_fault() and skip incomplete
> faults which will be accounted upon completion.
>
> Fixes: d065bd810b6d ("mm: retry page fault when blocking on disk transfer")
> Signed-off-by: Suren Baghdasaryan
> ---
>  mm/memory.c | 45 ++++++++++++++++++++++++++-------------------
>  1 file changed, 26 insertions(+), 19 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 01a23ad48a04..c3b709ceeed7 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5080,24 +5080,30 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
>   * updates. However, note that the handling of PERF_COUNT_SW_PAGE_FAULTS should
>   * still be in per-arch page fault handlers at the entry of page fault.
>   */
> -static inline void mm_account_fault(struct pt_regs *regs,
> +static inline void mm_account_fault(struct mm_struct *mm, struct pt_regs *regs,
>  				    unsigned long address, unsigned int flags,
>  				    vm_fault_t ret)
>  {
>  	bool major;
>
>  	/*
> -	 * We don't do accounting for some specific faults:
> -	 *
> -	 * - Unsuccessful faults (e.g. when the address wasn't valid). That
> -	 *   includes arch_vma_access_permitted() failing before reaching here.
> -	 *   So this is not a "this many hardware page faults" counter. We
> -	 *   should use the hw profiling for that.
> -	 *
> -	 * - Incomplete faults (VM_FAULT_RETRY). They will only be counted
> -	 *   once they're completed.
> +	 * Do not account for incomplete faults (VM_FAULT_RETRY). They will be
> +	 * counted upon completion.
>  	 */
> -	if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
> +	if (ret & VM_FAULT_RETRY)
> +		return;
> +
> +	/* Register both successful and failed faults in PGFAULT counters. */
> +	count_vm_event(PGFAULT);
> +	count_memcg_event_mm(mm, PGFAULT);

Is there a reason why the vm event accounting needs to be explicitly
different from the perf event accounting right below when handling ERROR?
I get the point if this is to keep the ERROR accounting unchanged for these
two vm events after this patch. IOW, probably the only concern right now is
having RETRY counted much more than before (perhaps worse with vma locking
applied).

But since we're on this, I'm wondering whether we should also align the two
kinds of events (vm, perf) so that they account consistently, if we're going
to change this anyway. Any future reader will be confused about why they
account differently, IMHO, so if we need to differentiate we'd better add a
comment explaining why.

I'm wildly guessing that error faults are indeed very rare and probably do
not matter much at all. I just think the code can be slightly cleaner if the
vm/perf accounting matches, and easier to follow if we treat everything the
same. E.g., we could then also drop the "goto out"s below.

What do you think?

Thanks,

> +
> +	/*
> +	 * Do not account for unsuccessful faults (e.g. when the address wasn't
> +	 * valid). That includes arch_vma_access_permitted() failing before
> +	 * reaching here. So this is not a "this many hardware page faults"
> +	 * counter. We should use the hw profiling for that.
> +	 */
> +	if (ret & VM_FAULT_ERROR)
>  		return;
>
>  	/*
> @@ -5180,21 +5186,22 @@ static vm_fault_t sanitize_fault_flags(struct vm_area_struct *vma,
>  vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>  			   unsigned int flags, struct pt_regs *regs)
>  {
> +	/* Copy vma->vm_mm in case mmap_lock is dropped and vma becomes unstable. */
> +	struct mm_struct *mm = vma->vm_mm;
>  	vm_fault_t ret;
>
>  	__set_current_state(TASK_RUNNING);
>
> -	count_vm_event(PGFAULT);
> -	count_memcg_event_mm(vma->vm_mm, PGFAULT);
> -
>  	ret = sanitize_fault_flags(vma, &flags);
>  	if (ret)
> -		return ret;
> +		goto out;
>
>  	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
>  					    flags & FAULT_FLAG_INSTRUCTION,
> -					    flags & FAULT_FLAG_REMOTE))
> -		return VM_FAULT_SIGSEGV;
> +					    flags & FAULT_FLAG_REMOTE)) {
> +		ret = VM_FAULT_SIGSEGV;
> +		goto out;
> +	}
>
>  	/*
>  	 * Enable the memcg OOM handling for faults triggered in user
> @@ -5223,8 +5230,8 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>  		if (task_in_memcg_oom(current) && !(ret & VM_FAULT_OOM))
>  			mem_cgroup_oom_synchronize(false);
>  	}
> -
> -	mm_account_fault(regs, address, flags, ret);
> +out:
> +	mm_account_fault(mm, regs, address, flags, ret);
>
>  	return ret;
>  }
> --
> 2.40.0.634.g4ca3ef3211-goog
>
>

--
Peter Xu
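
For readers following the thread, the accounting order in the quoted patch
can be modelled in a few lines of userspace C. This is only a sketch under
stated assumptions, not kernel code: the MODEL_* flags, struct
model_counters and model_account_fault() are hypothetical names made up for
illustration, the flag values are not the kernel's, and the single "perf"
counter merely stands in for whatever accounting follows the
VM_FAULT_ERROR check in mm_account_fault().

```c
/* Hypothetical stand-ins for VM_FAULT_ERROR / VM_FAULT_RETRY.
 * These are NOT the kernel's real flag values. */
#define MODEL_FAULT_ERROR 0x1u
#define MODEL_FAULT_RETRY 0x2u

/* Hypothetical counters, for illustration only. */
struct model_counters {
	unsigned long pgfault; /* models the PGFAULT vm/memcg events */
	unsigned long perf;    /* models the accounting after the ERROR check */
};

/* Mirrors the order of checks in the quoted mm_account_fault() hunk. */
static void model_account_fault(struct model_counters *c, unsigned int ret)
{
	/* Incomplete faults are skipped; they get counted on completion. */
	if (ret & MODEL_FAULT_RETRY)
		return;

	/* Both successful and failed faults bump the PGFAULT model counter. */
	c->pgfault++;

	/* Failed faults stop here, so the later accounting never sees them. */
	if (ret & MODEL_FAULT_ERROR)
		return;

	c->perf++;
}
```

With this model, replaying one retried attempt (MODEL_FAULT_RETRY), one
completed fault (0) and one failed fault (MODEL_FAULT_ERROR) leaves pgfault
at 2 and perf at 1. That gap between the two counters is the vm/perf
difference the review comment asks about.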