Date: Fri, 19 Apr 2024 11:17:27 -0400
From: Peter Xu
To: Kefeng Wang
Cc: Andrew Morton, linux-mm@kvack.org
Subject: Re: [PATCH v2] mm: memory: check userfaultfd_wp() in vmf_orig_pte_uffd_wp()
References: <20240418120641.2653165-1-wangkefeng.wang@huawei.com>

On Fri, Apr 19, 2024 at 11:00:46AM +0800, Kefeng Wang wrote:
>
>
> On 2024/4/19 0:32, Peter Xu wrote:
> > Hi, Kefeng,
> >
> > On Thu, Apr 18, 2024 at 08:06:41PM +0800, Kefeng Wang wrote:
> > > Add userfaultfd_wp() check in vmf_orig_pte_uffd_wp() to avoid the
> > > unnecessary pte_marker_entry_uffd_wp() in most pagefault, difference
> > > as shows below from perf data of lat_pagefault, note, the function
> > > vmf_orig_pte_uffd_wp() is not inlined in the two kernel versions.
> > >
> > >   perf report -i perf.data.before | grep vmf
> > >      0.17%     0.13%  lat_pagefault    [kernel.kallsyms]      [k] vmf_orig_pte_uffd_wp.part.0.isra.0
> > >   perf report -i perf.data.after  | grep vmf
> >
> > Any real number to share too besides the perf greps?  I meant, even if perf
> > report will not report such function anymore, it doesn't mean it'll be
> > faster, and how much it improves?
>
> dd if=/dev/zero of=/tmp/XXX bs=512M count=1
> ./lat_pagefault -W 5 -N 5 /tmp/XXX
>
>           before      after
> 1         0.2623      0.2605
> 2         0.2622      0.2598
> 3         0.2621      0.2595
> 4         0.2622      0.2600
> 5         0.2651      0.2598
> 6         0.2624      0.2594
> 7         0.2624      0.2605
> 8         0.2627      0.2608
> average   0.262675    0.2600375   -0.0026375
>
> The lat_pagefault does show some improvement(also I reboot and retest,
> the results are same).

Thanks.  Could you replace the perf report with these real data?  Or just
append to it.
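
For anyone who wants to restate the table as a relative change in the commit
log, here is a minimal sketch that recomputes the two averages from the eight
quoted runs. The array and function names are made up for illustration; only
the sixteen values come from the table above. It works out to a gain of
roughly 1%.

```c
#include <stddef.h>

/* The eight lat_pagefault results quoted above, copied verbatim. */
const double lat_before[8] = {
	0.2623, 0.2622, 0.2621, 0.2622, 0.2651, 0.2624, 0.2624, 0.2627,
};
const double lat_after[8] = {
	0.2605, 0.2598, 0.2595, 0.2600, 0.2598, 0.2594, 0.2605, 0.2608,
};

/* Arithmetic mean of n samples. */
double lat_mean(const double *v, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Fraction by which the "after" mean is lower than the "before" mean. */
double lat_relative_gain(void)
{
	double before = lat_mean(lat_before, 8);
	double after = lat_mean(lat_after, 8);

	return (before - after) / before;
}
```

Multiplying lat_relative_gain() by 100 gives the percentage.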

I had a look at the asm, and indeed the current code generates two jumps
without this patch, and I don't know why..

   0x0000000000006ac4 <+52>:    test   $0x8,%ah                  <---- check FAULT_FLAG_ORIG_PTE_VALID
   0x0000000000006ac7 <+55>:    jne    0x6bcf
   0x0000000000006acd <+61>:    mov    0x18(%rbp),%rsi
   ...
   0x0000000000006bcf <+319>:   mov    0x40(%rdi),%rdi
   0x0000000000006bd3 <+323>:   test   $0xffffffffffffff9f,%rdi  <---- pte_none() check
   0x0000000000006bda <+330>:   je     0x6acd
   0x0000000000006be0 <+336>:   test   $0x101,%edi               <---- pte_present() check
   0x0000000000006be6 <+342>:   jne    0x6acd
   0x0000000000006bec <+348>:   call   0x1c50
   0x0000000000006bf1 <+353>:   mov    0x0(%rip),%rdx        # 0x6bf8
   0x0000000000006bf8 <+360>:   mov    %rax,%r15
   0x0000000000006bfb <+363>:   shr    $0x3a,%rax
   0x0000000000006bff <+367>:   cmp    $0x1f,%rax
   0x0000000000006c03 <+371>:   mov    $0x0,%eax
   0x0000000000006c08 <+376>:   cmovne %rax,%r15
   0x0000000000006c0c <+380>:   mov    0x28(%rbx),%eax
   0x0000000000006c0f <+383>:   and    $0x1,%r15d
   0x0000000000006c13 <+387>:   jmp    0x6acd

I also don't know why the compiler cannot merge the none+present checks into
one shot; I thought it could.  It also surprised me that pte_to_swp_entry()
is a function call.. but that is not involved in this context.

So I think I was right that it should bypass this when it sees pte_none();
however, that path includes two jumps.

And with your patch applied the two jumps are not there:

   0x0000000000006b0c <+124>:   testb  $0x8,0x29(%r14)           <--- FAULT_FLAG_ORIG_PTE_VALID
   0x0000000000006b11 <+129>:   je     0x6b6a
   0x0000000000006b13 <+131>:   mov    (%r14),%rax
   0x0000000000006b16 <+134>:   testb  $0x10,0x21(%rax)          <--- userfaultfd_wp(vmf->vma) check
   0x0000000000006b1a <+138>:   je     0x6b6a

Maybe that's what contributes to that 0.x% extra time of a fault.

So if we do care about this 0.x% and we're doing this anyway, perhaps move
the vma check earlier?  Because afaict FAULT_FLAG_ORIG_PTE_VALID should always
be set in set_pte_range(), so we can avoid two more insts in the common
paths.
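
To make the ordering suggestion concrete, here is a userspace sketch of the
two orderings.  It is a model, not the kernel code: every model_* name is
made up for illustration, the structs are stand-ins for vm_fault and
vm_area_struct, and the decode treats the stored value as an
already-converted swap entry instead of calling pte_to_swp_entry().  The bit
positions are simply read off the disassembly above (0x8 in the second byte
of the fault flags is bit 11, 0x10 in the second byte of vm_flags is bit 12,
and the marker test is the shr $0x3a / cmp $0x1f / and $0x1 sequence).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as seen in the disassembly; assumed, not taken from headers. */
#define MODEL_ORIG_PTE_VALID	(1u << 11)
#define MODEL_VM_UFFD_WP	(1ul << 12)

/* Hypothetical stand-ins for vm_area_struct and vm_fault. */
struct model_vma {
	unsigned long vm_flags;
};

struct model_vmf {
	const struct model_vma *vma;
	unsigned int flags;
	uint64_t orig_entry;	/* modelled as an already-converted swap entry */
};

/* Counts how often the out-of-line marker decode is reached. */
unsigned long model_decode_hits;

/* Toy marker decode mirroring shr $0x3a; cmp $0x1f; and $0x1. */
bool model_marker_uffd_wp(uint64_t entry)
{
	model_decode_hits++;
	if ((entry >> 58) != 0x1f)
		return false;
	return (entry & 0x1) != 0;
}

/* Ordering without the patch: only the fault flag gates the decode. */
bool model_check_before(const struct model_vmf *vmf)
{
	if (!(vmf->flags & MODEL_ORIG_PTE_VALID))
		return false;
	return model_marker_uffd_wp(vmf->orig_entry);
}

/* Suggested ordering: test the vma first, since the flag is nearly always set. */
bool model_check_after(const struct model_vmf *vmf)
{
	if (!(vmf->vma->vm_flags & MODEL_VM_UFFD_WP))
		return false;
	if (!(vmf->flags & MODEL_ORIG_PTE_VALID))
		return false;
	return model_marker_uffd_wp(vmf->orig_entry);
}
```

In this model, a fault on a vma without the uffd-wp bit returns from
model_check_after() after a single test and never bumps model_decode_hits,
while model_check_before() reaches the decode whenever the flag is set.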

I'll also leave it to you whether to mention some of the details above in
the commit log.

Thanks,

--
Peter Xu