From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1EB3BC77B75 for ; Tue, 18 Apr 2023 22:58:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3D8DF8E0003; Tue, 18 Apr 2023 18:58:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 387408E0001; Tue, 18 Apr 2023 18:58:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2279A8E0003; Tue, 18 Apr 2023 18:58:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 10C9F8E0001 for ; Tue, 18 Apr 2023 18:58:41 -0400 (EDT) Received: from smtpin21.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id D0A23C047D for ; Tue, 18 Apr 2023 22:58:40 +0000 (UTC) X-FDA: 80696028000.21.D01EEBA Received: from mail-yb1-f179.google.com (mail-yb1-f179.google.com [209.85.219.179]) by imf19.hostedemail.com (Postfix) with ESMTP id BB0CF1A001C for ; Tue, 18 Apr 2023 22:58:38 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=ypRebcfY; spf=pass (imf19.hostedemail.com: domain of surenb@google.com designates 209.85.219.179 as permitted sender) smtp.mailfrom=surenb@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1681858718; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=bvYEot/5uPo2yTiZUwgFRQERBlrKMXbkA+R9nVGIhZs=; b=Iq6lemW2TRtdB4EVZkbmbSS707oyI0TEChjUEqiAvDnZ50BAKjbnwbYB6ZlU1aRABgORhD tyJ9Lrd35eESCvHH5TlnvAG9rqKAkon+lNk6cTayImUwKx3r9xfMoNhIVNkCjCGxGs+B8t hGIUYV2hDbemn49xU6UBOpFIx9b+9ik= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=ypRebcfY; spf=pass (imf19.hostedemail.com: domain of surenb@google.com designates 209.85.219.179 as permitted sender) smtp.mailfrom=surenb@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681858718; a=rsa-sha256; cv=none; b=hP9bbvmd8JolcO9TVbEHLCxIwuESyOOT/ZuTsoH+DIj+FUYQ0zSP3rbxiuwkscansun/sf 5MwX5hlw87H7B2IQ+07fpY6GaY9lnz9BZZm6YmOI18TpOtjgDikmoJufNBu7vU6qGTWcEl g5pNINrVI9T+DUPvG+DtdqQ90wjw0nI= Received: by mail-yb1-f179.google.com with SMTP id a11so8182563ybm.3 for ; Tue, 18 Apr 2023 15:58:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1681858718; x=1684450718; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=bvYEot/5uPo2yTiZUwgFRQERBlrKMXbkA+R9nVGIhZs=; b=ypRebcfY7uVsRRBGCbwjczNtpsGKEYG/LJYUgFn5aRrgwHezDeSgDNDtB1vIJVSL9k whA28RwOITWSFhUQi3/xRtw0wiTblGhbFqok0pv9LyNqlwkIoT+TTKuy5dJin1KnH9HS itySWSaGVjuMod6EWJCLoQRULeKOQREAeCf4a4/JT6iyTSC74XSc0Rbp/RYXQPuE0VHi ODmp65ImNG5zt3jdMvR/vz1LjAtFkzO0bKt2HeMsIG3Hq+FPRcWgMHV07uGFqj4UXlow ZIFSItxszVyEfdAn5IiA4hdehPJuS6o+JOtAvJJKoNVvY4ozZFAV1ArsSlxioKNW+xLZ B8Hw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681858718; x=1684450718; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=bvYEot/5uPo2yTiZUwgFRQERBlrKMXbkA+R9nVGIhZs=; b=jfycwsFv/rYpL2LvqoxPxL4l2fU7tGjxkAkxw3WFzQq1BrwOwp3KhjXmium0Cx+YPe xo+1luuQvbiEhm9NzCL6CoO7e6Ks4sweQuxq7koz2awzaMLiYsELIMpCGDOcsmr6qYg+ dQBwD7Xv6tEPh5RO4jdpE8Zy5T51bO0Tz51U2+n8PQDGKL1FEPhNQai2W5U+6uHdUzRD nEeoO7zc6wxMe2EaGrCBekI1kSQbMUcI+5avF4EgMOO5wE5ZSV5ILY/0cE52A8VOvQsm fg+wVYcnKUli2UcjWkelCR5/ugqUg9eEMvTrpgAtNaD99o3LI0+9qgCCapCUcs+h//eh Fzyw== X-Gm-Message-State: AAQBX9eCXpInq62jyOoDFOdcL2WZdcwUvjtFD1UZWfqebyu5VEfQlBV6 FqcOk0g7t4nNioehb1SoGzaJv645Bb+U6yHBz5BW+A== X-Google-Smtp-Source: AKy350Ydxh/rfN3DOvIJciBOMR/gM/ter7C1iQatB8w0ubPTTGvXRRGsqJwj5QM5lpxB4b2fb6n1juma6aZLrz2vPmg= X-Received: by 2002:a25:bc5:0:b0:b67:54e8:417d with SMTP id 188-20020a250bc5000000b00b6754e8417dmr602902ybl.0.1681858717526; Tue, 18 Apr 2023 15:58:37 -0700 (PDT) MIME-Version: 1.0 References: <20230415000818.1955007-1-surenb@google.com> In-Reply-To: From: Suren Baghdasaryan Date: Tue, 18 Apr 2023 15:58:26 -0700 Message-ID: Subject: Re: [PATCH v2 1/1] mm: do not increment pgfault stats when page fault handler retries To: Peter Xu Cc: Matthew Wilcox , akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@suse.com, josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com, jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net, punit.agrawal@bytedance.com, lstoakes@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: BB0CF1A001C X-Rspam-User: X-Stat-Signature: trorzamsa6gtfejck9akmwwmo9jsp4cg X-HE-Tag: 1681858718-845901 X-HE-Meta: U2FsdGVkX1/+CNpRhHA3AGaydsQYdrwLLKnJJjjh7DdEmrvmpqWYQHuaVKgRCxAaKa/2URG6Ug6XhmZTu20Xz89f1HwytRUOVr6hYA3Cz/wuP97eIg06fgAghkN/28Iwql62b7dONa7wDeGqsvvxK8FgO4mbQvXQAxY9tEjTqi3jjvK1mvHK+kipPkotNfYHwocqTscxZP0FNiplUft1jK37uwNLQtPuwhk8i2ISt3ULFCPSf6gFJRhevGStwzJY+Z58hSZXhYCtMZOJYOTVGV4ENrei9deqReCGW6/TPd2s+zP4wS0dDUnS37Fl/jVMoEwVsfvvB7NDxhZ3RKa69RJjPEoGQ/dji3YFuYjgn67tKQhMDg9sIMXPVBYNDRRDxP0aoBNA6U5EWE4x/HY/st9gEkcjoOOp0BrRy1Xs4fZIEF0eMCgHb4vHR6LRcpxykCnuI0nlKD0frqvIvYnFHUPX0qyJr+TbPN/6qiIatr9wPsT1oU1IMWQlHZ/rdqyHl+iOfr3g8qr20gw8MCNpNua/GKc5zPvOCyZ0d0O6CJxS8a44EmAWs1IfXqynzvexAIB1mqgoFzN4IciumUIL7G2Eg+XetUIgfswUIk6wZMyX0YWXP/aJQP0VmHMOIlJcBIuUqdx+RbTBpidFjVci+YZsEFoXx+CTe9Tp3KjAdq3PXGinEXlFmD5b/sr2Q/jqOofHqzQdCeCcbdL+lR5sNqQYrboIeyZbxWpUKuoqEjCx6F/6hzaBfOXrptuNdU6bLHjgK01Hc6R64Ku4odDWbFLcKHKBRyNh63sjHfQu9JdjDNWmv/XH0gIedid+mhnUqdRpkAiQHlNKRzRhdUcCCBlWnA+iWGTIS3PZSsrAGTvIv+yOQVA1nwV2iTzunaoi0R0tIFct5tYJrP/aWJOB+4RQC5LPIJ39m9HdIolanopnnZmQ4o7aA+UcCv5cpoMzD8QW7Zhpj9WcfIlnLCI 524YDEYG 4LKKy1k97hTObLnc6vUrTtoYzbYpTkeGA4ciSOOObiygEo/nG+awBaXkUGUSul8vhFY5q4EROpGCUIncfK7sJIvHeQshkMh2UYjgbLAsXX6oS9Hof1r+yNV18I4EYXWy02K9dUGjbgHDPcZyy2I1H5VdssmSxhfxMkBgWKCJYHL2QNPonm8yAVKVuCd/r7SBixH8W/rkh4zXhbRjFSBGPNRMG2nyf6S6IiIAmtVRZmVEovW7lKDNTLvFejz54xLbduO2aRjkbkuBBGcDolMSKrlkIrlHHcb+Fg7seHiZ2+AxHOdbuX+I7aa99uD5C4/b80b8L8k4EDRxbK/bGzgBl6bCJ/6sNtlYqa8/suocgzuSJ1TVPuHU4jol5HoXO5bZjjeHxYjNIfgJXSBNrXsBo72gnImSodBmDDWDO X-Bogosity: Ham, tests=bogofilter, spamicity=0.000029, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Apr 18, 2023 at 3:45=E2=80=AFPM Peter Xu wrote: > > On Tue, Apr 18, 2023 at 02:48:58PM -0700, Suren Baghdasaryan wrote: > > On Tue, Apr 18, 2023 at 10:17=E2=80=AFAM Suren Baghdasaryan wrote: > > > > > > On Tue, Apr 18, 2023 at 9:06=E2=80=AFAM Peter Xu = wrote: > > > > > > > > On Tue, Apr 18, 2023 at 04:08:58PM +0100, Matthew Wilcox wrote: > > > > > On Tue, Apr 18, 2023 at 07:54:01AM -0700, Suren Baghdasaryan wrot= e: > > > > > > On Tue, Apr 18, 2023 at 7:25=E2=80=AFAM Matthew Wilcox wrote: > > > > > > > > > > > > > > On Mon, Apr 17, 2023 at 04:17:45PM -0700, Suren Baghdasaryan = wrote: > > > > > > > > On Mon, Apr 17, 2023 at 3:52=E2=80=AFPM Matthew Wilcox wrote: > > > > > > > > > > > > > > > > > > On Mon, Apr 17, 2023 at 03:40:33PM -0400, Peter Xu wrote: > > > > > > > > > > > /* > > > > > > > > > > > - * We don't do accounting for some specific fault= s: > > > > > > > > > > > - * > > > > > > > > > > > - * - Unsuccessful faults (e.g. when the address w= asn't valid). That > > > > > > > > > > > - * includes arch_vma_access_permitted() failing= before reaching here. > > > > > > > > > > > - * So this is not a "this many hardware page fa= ults" counter. We > > > > > > > > > > > - * should use the hw profiling for that. > > > > > > > > > > > - * > > > > > > > > > > > - * - Incomplete faults (VM_FAULT_RETRY). They wi= ll only be counted > > > > > > > > > > > - * once they're completed. > > > > > > > > > > > + * Do not account for incomplete faults (VM_FAULT= _RETRY). They will be > > > > > > > > > > > + * counted upon completion. > > > > > > > > > > > */ > > > > > > > > > > > - if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY)) > > > > > > > > > > > + if (ret & VM_FAULT_RETRY) > > > > > > > > > > > + return; > > > > > > > > > > > + > > > > > > > > > > > + /* Register both successful and failed faults in = PGFAULT counters. */ > > > > > > > > > > > + count_vm_event(PGFAULT); > > > > > > > > > > > + count_memcg_event_mm(mm, PGFAULT); > > > > > > > > > > > > > > > > > > > > Is there reason on why vm events accountings need to be= explicitly > > > > > > > > > > different from perf events right below on handling ERRO= R? > > > > > > > > > > > > > > > > > > I think so. ERROR is quite different from RETRY. If we = are, for > > > > > > > > > example, handling a SIGSEGV (perhaps a GC language?) that= should be > > > > > > > > > accounted. If we can't handle a page fault right now, an= d need to > > > > > > > > > retry within the kernel, that should not be accounted. > > > > > > > > > > > > > > > > IIUC, the question was about the differences in vm vs perf = accounting > > > > > > > > for errors, not the difference between ERROR and RETRY case= s. Matthew, > > > > > > > > are you answering the right question or did I misunderstand= your > > > > > > > > answer? > > > > > > > > > > > > > > Maybe I'm misunderstanding what you're proposing. I thought = the > > > > > > > proposal was to make neither ERROR nor RETRY increment the co= unters, > > > > > > > but if the proposal is to make ERROR increment the perf count= ers > > > > > > > instead, then that's cool with me. > > > > > > > > > > > > Oh, I think now I understand your answer. You were not highligh= ting > > > > > > the difference between the who but objecting to the proposal of= not > > > > > > counting both ERROR and RETRY. Am I on the same page now? > > > > > > > > > > I think so. Let's see your patch and then we can be sure we're t= alking > > > > > about the same thing ;-) > > > > > > > > IMHO if there's no explicit reason to differenciate the events, we = should > > > > always account them the same way for vm,perf,... either with ERROR > > > > accounted or not. > > > > > > > > I am not sure whether accounting ERROR faults would matter for a mp= rotect() > > > > use case, because they shouldn't rely on the counter to work but th= e SIGBUS > > > > itself to be generated on page accesses with the sighandler doing w= ork. > > > > > > For that example with GC, these are valid page faults IIUC and curren= t > > > PGFAULT counters do register them. Do we want to change that and > > > potentially break assumptions about these counters? > > > > > > > > > > > One trivial benefit of keep accounting ERROR is we only need to mod= ify vm > > > > account ABI so both RETRY & ERROR will be adjusted to match perf,ta= sk > > > > counters. OTOH we can also change all to take ERROR into account, = but then > > > > we're modifying ABI for all counters. > > > > > > So, not accounting them in both vm and perf would be problematic for > > > that GC example and similar cases. > > > Are we left with only two viable options?: > > > 1. skip RETRY for vm and skip ERROR for both vm and perf (this patch) > > > 2. skip RETRY for both vm and perf, account ERROR for both > > > > > > #2 would go against the comment in mm_account_fault() saying that we > > > don't account for unsuccessful faults. I guess there must have been > > > some reason we were not accounting for them (such as access to a > > > faulty address is neither major nor minor fault)? > > > > I did some digging in the history and looks like the check for ERROR > > was added after this discussion: > > https://lore.kernel.org/all/20200624203412.GB64004@xz-x1/ and IIUC the > > reason was that previous code also skipped VM_FAULT_ERROR. Peter, is > > that correct? > > I think so. > > It's a few years ago, what I remember is that we did change some of the > counters at least on some archs, and I don't remember anything is broken > for real after that. > > > > > It seems this discussion is becoming longer than it should be. How > > about we keep the behavior of all counters as they are to avoid > > breaking any possible usecases and just fix the double-counting issue > > for RETRY cases? > > Sure, though again I hope we can add some comment explaining, because the > outcome of the code can look a bit weird on handling different counters > differently. > > IMHO it can be as simple as "we did accounting differently on different > types of counters for historical reasons, here just to make them compatib= le > with old kernels", maybe it will still help a bit when we read again on > this chunk of code? Sure. How about replacing my prior "Register both successful and failed faults in PGFAULT counters." comment with "To preserve the behavior of older kernels, PGFAULT counters record both successful and failed faults, as opposed to perf counters which ignore failed cases" ? > > Thanks, > > -- > Peter Xu >