From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A23A5C77B75 for ; Tue, 18 Apr 2023 22:45:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DEDB08E0002; Tue, 18 Apr 2023 18:45:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D770E8E0001; Tue, 18 Apr 2023 18:45:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BF0948E0002; Tue, 18 Apr 2023 18:45:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id A84C98E0001 for ; Tue, 18 Apr 2023 18:45:48 -0400 (EDT) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 66B27C030B for ; Tue, 18 Apr 2023 22:45:48 +0000 (UTC) X-FDA: 80695995576.09.26F883F Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf25.hostedemail.com (Postfix) with ESMTP id 3D7D9A000E for ; Tue, 18 Apr 2023 22:45:46 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Ggx4y8UN; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf25.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1681857946; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=rny7/IdykDsSisA51FPjfejtZJ1cCD5wOnVMPzuiZB0=; b=SnK+WGAo+lbJd2XmouGA85J/HKQatQcdizuWybptv9mm1k4NCMlJlsnyhwN1Cyb7PoDqOZ D3suGe1R4bvLd6NkohK1r54oplKYGoqoo9ojlbBNOGi0807IKSxOUygIMLczJ165I7M5cH IcgdwA1QkNvwHwIqnzhDG6gAN/Mx3wo= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Ggx4y8UN; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf25.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681857946; a=rsa-sha256; cv=none; b=CbDwxessglTRTYAmuZGYqqGdTDJC9JCH6N781KBfvnIo3mK55aMCUXkIUWaPECmomcJfUN nqkqYRzfUcXJM1UG31RTB6jHQVF14w3isbCnTlsxVKxSsWPbpE8nDuQAaEQw6ULJgN+yZg JS1b/sAd1d5NCrKoDIL4iw/1yvN9gZA= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1681857945; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rny7/IdykDsSisA51FPjfejtZJ1cCD5wOnVMPzuiZB0=; b=Ggx4y8UNlV2oLMgxmnrYKrHQh0+ZmhD5tKSw4bb7P7ZjeHWfOopDGI4nhM3ao4CuhidtSp 6dFOKSpvTtPANib/Heg4zXF6MyEfWZttt0kxUVFImvhzco/+NQ950hARajK+FdhrrRBmVw cratrTikyPITj3Pj6eln0uBGsRnzRD0= Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-74-bSGtWIEQPPyrT-J7bNYM7Q-1; Tue, 18 Apr 2023 18:45:42 -0400 X-MC-Unique: bSGtWIEQPPyrT-J7bNYM7Q-1 Received: by mail-qk1-f198.google.com with SMTP id af79cd13be357-74a7c661208so133156285a.1 for ; Tue, 18 Apr 2023 15:45:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681857942; x=1684449942; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=rny7/IdykDsSisA51FPjfejtZJ1cCD5wOnVMPzuiZB0=; b=f0Qu0gZ6SGBv/iJYZ7XY8Mtw+ms8FUwPjqwFIKemD4TLkMuzClGwf0JSu9zGiEZgRY 8dL708U1xX6GFNUdWZKK5j+AWmTfl5BA3DgQQz1Gld9GiSf1Rs2dgFMpuFj4EoSmnUQi NKux3AwR4eAAZF76dZ0CMzVeS2V8jSCT7T2xUjoZzCKlzvuzqiD7gc74svDt+IGAkJ3k my5b+3ffZ+FXIGksZf5jWEaszEC7F+hJqMMFpf5BvdImI1WYG9BFimU2aXQw895gqrua T3L1h8CRDqbfNfULIU5Fhf+sadKukSOMGptLKwFGBJAXgdauXmCqJS3XtA8dwZNfJYtZ o30g== X-Gm-Message-State: AAQBX9cjFZwOpQb3pGw2irGDKF7ArAtLlpMLd3TaQLQg7SNbCxm0G7mu kPAn9YGZ3nRwi/5VDBiDxiXQpCeHfMLdjZ8ydruz9W9/ApiUX6oF3gtVJsMN8rlZ5FHsd5l+rtw vwlVNBUE0Zc4= X-Received: by 2002:a05:6214:4109:b0:5ac:325c:a28f with SMTP id kc9-20020a056214410900b005ac325ca28fmr24452924qvb.0.1681857941990; Tue, 18 Apr 2023 15:45:41 -0700 (PDT) X-Google-Smtp-Source: AKy350YThGSid/DuTDWFRNylgzhR7W4SKoj9l/AH0Htc2g0G7OGNK6dtrt0w6/zjnmshXEL55TUivA== X-Received: by 2002:a05:6214:4109:b0:5ac:325c:a28f with SMTP id kc9-20020a056214410900b005ac325ca28fmr24452900qvb.0.1681857941654; Tue, 18 Apr 2023 15:45:41 -0700 (PDT) Received: from x1n ([70.52.229.124]) by smtp.gmail.com with ESMTPSA id ei18-20020ad45a12000000b005eac706d223sm4004767qvb.124.2023.04.18.15.45.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 18 Apr 2023 15:45:41 -0700 (PDT) Date: Tue, 18 Apr 2023 18:45:35 -0400 From: Peter Xu To: Suren Baghdasaryan Cc: Matthew Wilcox , akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@suse.com, josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, michel@lespinasse.org, liam.howlett@oracle.com, jglisse@google.com, vbabka@suse.cz, minchan@google.com, dave@stgolabs.net, punit.agrawal@bytedance.com, lstoakes@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com Subject: Re: [PATCH v2 1/1] mm: do not increment pgfault stats when page fault handler retries Message-ID: References: <20230415000818.1955007-1-surenb@google.com> MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 3D7D9A000E X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: pu3ib74ikffxm9qihs3xhuciaq1joqbr X-HE-Tag: 1681857946-580266 X-HE-Meta: U2FsdGVkX1/3f2Q+S2TQu6kgh4Np9lpNf+3pgwXIwgOSFLtzjLhHA2LALmcaXfGi9Gxcn3+ru1lH/zW2X/tDH4q1avJlGG+8lRDZbzbvcnjnhkEWUlbi2AXxfTULygq/9LKxSqWZGcTCjSl3ksu7ijb2Nl5/YQAtTwGu9NwN708QeLJz6iXJAIB53X0Je3tb0aGhhbvKDJ5mNggoyGB1rUu1MTWM8yF8KloYOq+K7RB0I/544JJbFNW/+mf4Wca0tvqBnUj5tQ5IIBvXmssl2dQG+fsPjfIfOb6vMfa5HLQyp3H+tCPyzq/VZh42iVZciLj8jySJR4TfMehHka26MoZ4bevaYD+HauC7qrVOuVyImoT0B/njmzROpULvZ+IrkAdKxcl0tN4w7sWWmqphcfnphCB1b/AxOisyX4ILpUXuUqVaU6YrBrKTuJ04iPLv0ljizKTNCtSSW35BTeb+3ZQhluJBWTzEvMPC9vSoRD6wsuzgXGyCWFa6DySxKkDBYjTZo8s6paIeYZ1W5JZokAGeOjsDKNqsMGUQB/NB5UpavFhMy0hCbIHJ7t8hiJbWKzVW1T5TrLTHi7/BjJJ3jXg0nQxK9zQDPysG5C+gRlTpg+41QX5hTAG3bOiAXc4lTdgcHGMk3WbSK5H4UaCgRRvA9px2+3NoLNcyYvo4ZDTpTdKswuYR2izf8d5wgOBtmuc8ZtOlq9KlZTwsNfW/+nF5/UfHuIknWotNK9Hb/medDLVn1rOyIivnel+/OIV0BorKi+7a7a+2G2TZaOfETHuYB4xxpEiIvtlASptKQKGQ/cBGgLl3E7A9NnoSSfPcGYSIt0nl8zYo5quZQ9Ffjz+2FDvJ/YZFwHFW2I3YzhTqXA4gue4GwZEoK8utL14eYmyZTpV6W/YbYxEvPutERNbROmbM2Ndq64gnotepBh+SoZtg60tie+ZMxpYMeyTVbJPIqBz9CvvlUH4Ketm vMJVcCEs lOWLxlirihTxAPqA3DpbfGbpiBWsb6g0BPDrqPXgJDK8r+wmF/gPCChJeRxHxS3BSFA5ubtI4GmsF4dP61QVyxwoUWJ6bmQ8VnWTM5IIac6JKsQ8ziGtDkmbkTQ3ocZLLRayQi9xXrxE/j+H3ig6M/6sU9qYHPkzBO75N/KsHb/kc/NqGQQ2aL0R1gXSyvKaZI68W9xiodagj3gmLyPNvw+wXN1+uLUT+dVfi4jVJRQSinH8xeOcz2/kkO3+tsuspxqptiFdzRgfCNBZHoQWH/JalP14TWp4LGaiw12h2kHGJXbDM2MthfgEtKyvJnIeXSwDEtyN+o2ow3GDOmrNugMjd6EytACl6Pg9FLYMmKRmKKKbOP6ndlbqxtFB2dQ2AfZhkeUMrh9EM/bgz+z4sXP3TBRUA5gyhOadsUPLQI/8zdCaTFtQHK534W/TYe5jJlO1/9YENcS90P/Jr7zWwcoyRhgtuHWqxpb0soPnz8sHXpWS7+tadZ6XpP1oga8W/sWveKkUSv3D5uhS3BJDD5dv3mcf8BQvnc6dJ2e0n+/4FCdA= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Apr 18, 2023 at 02:48:58PM -0700, Suren Baghdasaryan wrote: > On Tue, Apr 18, 2023 at 10:17 AM Suren Baghdasaryan wrote: > > > > On Tue, Apr 18, 2023 at 9:06 AM Peter Xu wrote: > > > > > > On Tue, Apr 18, 2023 at 04:08:58PM +0100, Matthew Wilcox wrote: > > > > On Tue, Apr 18, 2023 at 07:54:01AM -0700, Suren Baghdasaryan wrote: > > > > > On Tue, Apr 18, 2023 at 7:25 AM Matthew Wilcox wrote: > > > > > > > > > > > > On Mon, Apr 17, 2023 at 04:17:45PM -0700, Suren Baghdasaryan wrote: > > > > > > > On Mon, Apr 17, 2023 at 3:52 PM Matthew Wilcox wrote: > > > > > > > > > > > > > > > > On Mon, Apr 17, 2023 at 03:40:33PM -0400, Peter Xu wrote: > > > > > > > > > > /* > > > > > > > > > > - * We don't do accounting for some specific faults: > > > > > > > > > > - * > > > > > > > > > > - * - Unsuccessful faults (e.g. when the address wasn't valid). That > > > > > > > > > > - * includes arch_vma_access_permitted() failing before reaching here. > > > > > > > > > > - * So this is not a "this many hardware page faults" counter. We > > > > > > > > > > - * should use the hw profiling for that. > > > > > > > > > > - * > > > > > > > > > > - * - Incomplete faults (VM_FAULT_RETRY). They will only be counted > > > > > > > > > > - * once they're completed. > > > > > > > > > > + * Do not account for incomplete faults (VM_FAULT_RETRY). They will be > > > > > > > > > > + * counted upon completion. > > > > > > > > > > */ > > > > > > > > > > - if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY)) > > > > > > > > > > + if (ret & VM_FAULT_RETRY) > > > > > > > > > > + return; > > > > > > > > > > + > > > > > > > > > > + /* Register both successful and failed faults in PGFAULT counters. */ > > > > > > > > > > + count_vm_event(PGFAULT); > > > > > > > > > > + count_memcg_event_mm(mm, PGFAULT); > > > > > > > > > > > > > > > > > > Is there reason on why vm events accountings need to be explicitly > > > > > > > > > different from perf events right below on handling ERROR? > > > > > > > > > > > > > > > > I think so. ERROR is quite different from RETRY. If we are, for > > > > > > > > example, handling a SIGSEGV (perhaps a GC language?) that should be > > > > > > > > accounted. If we can't handle a page fault right now, and need to > > > > > > > > retry within the kernel, that should not be accounted. > > > > > > > > > > > > > > IIUC, the question was about the differences in vm vs perf accounting > > > > > > > for errors, not the difference between ERROR and RETRY cases. Matthew, > > > > > > > are you answering the right question or did I misunderstand your > > > > > > > answer? > > > > > > > > > > > > Maybe I'm misunderstanding what you're proposing. I thought the > > > > > > proposal was to make neither ERROR nor RETRY increment the counters, > > > > > > but if the proposal is to make ERROR increment the perf counters > > > > > > instead, then that's cool with me. > > > > > > > > > > Oh, I think now I understand your answer. You were not highlighting > > > > > the difference between the who but objecting to the proposal of not > > > > > counting both ERROR and RETRY. Am I on the same page now? > > > > > > > > I think so. Let's see your patch and then we can be sure we're talking > > > > about the same thing ;-) > > > > > > IMHO if there's no explicit reason to differenciate the events, we should > > > always account them the same way for vm,perf,... either with ERROR > > > accounted or not. > > > > > > I am not sure whether accounting ERROR faults would matter for a mprotect() > > > use case, because they shouldn't rely on the counter to work but the SIGBUS > > > itself to be generated on page accesses with the sighandler doing work. > > > > For that example with GC, these are valid page faults IIUC and current > > PGFAULT counters do register them. Do we want to change that and > > potentially break assumptions about these counters? > > > > > > > > One trivial benefit of keep accounting ERROR is we only need to modify vm > > > account ABI so both RETRY & ERROR will be adjusted to match perf,task > > > counters. OTOH we can also change all to take ERROR into account, but then > > > we're modifying ABI for all counters. > > > > So, not accounting them in both vm and perf would be problematic for > > that GC example and similar cases. > > Are we left with only two viable options?: > > 1. skip RETRY for vm and skip ERROR for both vm and perf (this patch) > > 2. skip RETRY for both vm and perf, account ERROR for both > > > > #2 would go against the comment in mm_account_fault() saying that we > > don't account for unsuccessful faults. I guess there must have been > > some reason we were not accounting for them (such as access to a > > faulty address is neither major nor minor fault)? > > I did some digging in the history and looks like the check for ERROR > was added after this discussion: > https://lore.kernel.org/all/20200624203412.GB64004@xz-x1/ and IIUC the > reason was that previous code also skipped VM_FAULT_ERROR. Peter, is > that correct? I think so. It's a few years ago, what I remember is that we did change some of the counters at least on some archs, and I don't remember anything is broken for real after that. > > It seems this discussion is becoming longer than it should be. How > about we keep the behavior of all counters as they are to avoid > breaking any possible usecases and just fix the double-counting issue > for RETRY cases? Sure, though again I hope we can add some comment explaining, because the outcome of the code can look a bit weird on handling different counters differently. IMHO it can be as simple as "we did accounting differently on different types of counters for historical reasons, here just to make them compatible with old kernels", maybe it will still help a bit when we read again on this chunk of code? Thanks, -- Peter Xu