Date: Mon, 18 Jul 2022 16:05:22 -0400
From: Peter Xu
To: Nadav Amit
Cc: linux-mm@kvack.org, Andrew Morton, Nadav Amit, Mike Kravetz, Hugh Dickins, Axel Rasmussen, David Hildenbrand, Mike Rapoport
Subject: Re: [PATCH v2 2/5] userfaultfd: introduce access-likely mode for common operations
References: <20220718114748.2623-1-namit@vmware.com> <20220718114748.2623-3-namit@vmware.com>
In-Reply-To: <20220718114748.2623-3-namit@vmware.com>

On Mon, Jul 18, 2022 at 04:47:45AM -0700, Nadav Amit wrote:
> @@ -261,6 +272,7 @@ struct uffdio_copy {
>  struct uffdio_zeropage {
>  	struct uffdio_range range;
>  #define UFFDIO_ZEROPAGE_MODE_DONTWAKE		((__u64)1<<0)
> +#define UFFDIO_ZEROPAGE_MODE_ACCESS_LIKELY	((__u64)1<<1)

Would the access hint help the zeropage use case?  I remember you commented
earlier that it won't help, since we won't reclaim the zero page anyway.

It won't help either if this flag is only there for the follow-up
WRITE_HINT (since then there'll be a CoW): when WRITE_HINT is attached it
doesn't make sense not to have ACCESS_HINT, so it seems to me that
WRITE_HINT by itself would be enough for ZEROPAGE.

[...]

> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 421784d26651..c15679f3eb6a 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -65,6 +65,7 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
>  	bool writable = dst_vma->vm_flags & VM_WRITE;
>  	bool vm_shared = dst_vma->vm_flags & VM_SHARED;
>  	bool page_in_cache = page->mapping;
> +	bool prefault = !(uffd_flags & UFFD_FLAGS_ACCESS_LIKELY);

I think it's okay to name it "prefault" as a temp var, but ideally IMHO we
shouldn't assume what the user app is doing - it is only installing some
uffd pgtables with !ACCESS_LIKELY, and that does not necessarily need to be
a prefault process.

>  	spinlock_t *ptl;
>  	struct inode *inode;
>  	pgoff_t offset, max_off;
> @@ -92,6 +93,11 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
>  		 */
>  		_dst_pte = pte_wrprotect(_dst_pte);
>  
> +	if (prefault && arch_wants_old_prefaulted_pte())
> +		_dst_pte = pte_mkold(_dst_pte);
> +	else
> +		_dst_pte = pte_sw_mkyoung(_dst_pte);

Could you explain why we couldn't unconditionally mkold here even for x86?
It'd be a pity if this feature bit were only useful on arm64 and didn't
cover x86 (which is so far still the majority, I think).

IMHO it's slightly different here compared to kernel prefaults - the user
app may not be aware of kernel prefaults, but here !ACCESS_HINT is
user-aware; it's what the user app explicitly provided.  IMO that's already
stronger proof of a cold page.

The other thing that confused me here is that
arch_wants_old_prefaulted_pte() returns true if arm64 supports hardware AF.
However, for all the other archs (including x86_64 which, afaict, supports
AF too in most models) it always returns false.  Do you know the rationale
behind that?

> +
>  	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
>  
>  	if (vma_is_shmem(dst_vma)) {
> @@ -202,7 +208,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
>  static int mfill_zeropage_pte(struct mm_struct *dst_mm,
>  			      pmd_t *dst_pmd,
>  			      struct vm_area_struct *dst_vma,
> -			      unsigned long dst_addr)
> +			      unsigned long dst_addr,
> +			      uffd_flags_t uffd_flags)
>  {
>  	pte_t _dst_pte, *dst_pte;
>  	spinlock_t *ptl;
> @@ -495,7 +502,7 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
>  						 uffd_flags);
>  		else
>  			err = mfill_zeropage_pte(dst_mm, dst_pmd,
> -						 dst_vma, dst_addr);
> +						 dst_vma, dst_addr, uffd_flags);
>  	} else {
>  		err = shmem_mfill_atomic_pte(dst_mm, dst_pmd, dst_vma,
>  					     dst_addr, src_addr,
> -- 
> 2.25.1
> 

-- 
Peter Xu
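
For reference, the two policies compared in the mkold question above can be
sketched as a small userspace C model.  This is only a toy under stated
assumptions, not kernel code: every toy_* name and TOY_* macro is
hypothetical, a single bit stands in for the accessed ("young") bit, and the
else branch simply sets that bit, which simplifies whatever pte_sw_mkyoung()
does on a real architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy PTE: bit 0 models the accessed ("young") bit.  Not a real arch layout. */
typedef uint64_t toy_pte_t;
#define TOY_PTE_YOUNG		((toy_pte_t)1 << 0)

/* Hypothetical stand-in for UFFD_FLAGS_ACCESS_LIKELY from the quoted patch. */
#define TOY_FLAG_ACCESS_LIKELY	(1u << 0)

static toy_pte_t toy_mkold(toy_pte_t pte)
{
	return pte & ~TOY_PTE_YOUNG;
}

static toy_pte_t toy_mkyoung(toy_pte_t pte)
{
	return pte | TOY_PTE_YOUNG;
}

/*
 * Policy as in the quoted hunk: install an old PTE only when the caller
 * gave no access hint AND the architecture opts in (the bool models the
 * result of arch_wants_old_prefaulted_pte()).
 */
static toy_pte_t install_as_posted(toy_pte_t pte, unsigned int flags,
				   bool arch_wants_old)
{
	bool prefault = !(flags & TOY_FLAG_ACCESS_LIKELY);

	if (prefault && arch_wants_old)
		return toy_mkold(pte);
	return toy_mkyoung(pte);
}

/*
 * Alternative raised in the review: install an old PTE whenever the
 * caller gave no access hint, regardless of architecture.
 */
static toy_pte_t install_unconditional(toy_pte_t pte, unsigned int flags)
{
	if (!(flags & TOY_FLAG_ACCESS_LIKELY))
		return toy_mkold(pte);
	return toy_mkyoung(pte);
}
```

In this model the two functions differ only when no hint is given and the
arch predicate is false (the x86 case asked about): install_as_posted()
leaves the entry young, while install_unconditional() marks it old.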