Subject: Re: [PATCH v2] userfaultfd: provide unmasked address on page-fault
From: Nadav Amit
Date: Tue, 22 Feb 2022 20:58:53 -0800
To: Jan Kara
Cc: Andrew Morton, Linux-MM, David Hildenbrand, Andrea Arcangeli, Mike Rapoport, Peter Xu
In-Reply-To: <20220222090002.24lg2fdhrihquzaj@quack3.lan>
References: <20220218041003.3508-1-namit@vmware.com> <20220222090002.24lg2fdhrihquzaj@quack3.lan>

> On Feb 22, 2022, at 1:00 AM, Jan Kara wrote:
>
> On Fri 18-02-22 04:10:03, Nadav Amit wrote:
>> From: Nadav Amit
>>
>> Userfaultfd is supposed to provide the full address (i.e., unmasked) of
>> the faulting access back to userspace. However, that is not the case for
>> quite some time.
>>
>> Even running "userfaultfd_demo" from the userfaultfd man page provides
>> the wrong output (and contradicts the man page). Notice that
>> "UFFD_EVENT_PAGEFAULT event" shows the masked address (7fc5e30b3000)
>> and not the first read address (0x7fc5e30b300f).
>>
>> Address returned by mmap() = 0x7fc5e30b3000
>>
>> fault_handler_thread():
>>     poll() returns: nready = 1; POLLIN = 1; POLLERR = 0
>>     UFFD_EVENT_PAGEFAULT event: flags = 0; address = 7fc5e30b3000
>>         (uffdio_copy.copy returned 4096)
>> Read address 0x7fc5e30b300f in main(): A
>> Read address 0x7fc5e30b340f in main(): A
>> Read address 0x7fc5e30b380f in main(): A
>> Read address 0x7fc5e30b3c0f in main(): A
>>
>> The exact address is useful for various reasons and specifically for
>> prefetching decisions. If it is known that the memory is populated by
>> certain objects whose size is not page-aligned, then based on the
>> faulting address, the uffd-monitor can decide whether to prefetch and
>> prefault the adjacent page.
>>
>> This bug has been for quite some time in the kernel: since commit
>> 1a29d85eb0f1 ("mm: use vmf->address instead of of vmf->virtual_address"),
>> which dates back to 2016. A concern has been
>> raised that existing userspace application might rely on the old/wrong
>> behavior in which the address is masked. Therefore, it was suggested to
>> provide the masked address unless the user explicitly asks for the exact
>> address.
>>
>> Add a new userfaultfd feature UFFD_FEATURE_EXACT_ADDRESS to direct
>> userfaultfd to provide the exact address. Add a new "real_address" field
>> to vmf to hold the unmasked address. Provide the address to userspace
>> accordingly.
>>
>> Cc: David Hildenbrand
>> Cc: Andrea Arcangeli
>> Cc: Mike Rapoport
>> Cc: Peter Xu
>> Cc: Jan Kara
>> Signed-off-by: Nadav Amit
>
> Yeah, I'm sorry for breaking this :-| The patch looks good except:
>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index c125c4969913..aae53fde13d9 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4622,6 +4622,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
>>  	struct vm_fault vmf = {
>>  		.vma = vma,
>>  		.address = address & PAGE_MASK,
>> +		.real_address = address,
>>  		.flags = flags,
>>  		.pgoff = linear_page_index(vma, address),
>>  		.gfp_mask = __get_fault_gfp_mask(vma),
>
> At least mm/hugetlb.c:hugetlb_handle_userfault() also initializes vmf and
> calls handle_userfault() so it should initialize real_address?
>
> Also there are a few other places that initialize vmf but they use vmf only
> for swapin so probably they don't reach to userfault code. Still it seems a
> bit fragile to not initialize real_address there? Not strong opinion
> there... Ideally we would not misuse vmf in those places but that's a
> larger cleanup.

Thanks for catching it. I will send v3.

So we have:

hugetlb_handle_userfault() - will fix.

unuse_pte_range() - does not appear to be used for any actual page fault.
I will initialize real_address to be on the safe side.

__collapse_huge_page_swapin() - another abuse and real_address is not used,
but to be on the safe side, I would initialize it.

shmem_swapin() - address is zero and not used for any faulting related
activity (although it appears to me that you might have the page located
on the wrong NUMA node, but it is out of the scope of this patch). I will
not change it.
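
For illustration only, here is a minimal userspace sketch of the idea discussed in this thread. It is not the kernel code. The sketch_* names, the struct, and the 4 KiB page size are assumptions made up for the example. It shows an initializer that fills both the masked and the exact address, the choice of which one to report depending on whether the monitor asked for the exact address, and the kind of prefetch decision a monitor could base on the exact address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, matching the demo output quoted in this thread. */
#define SKETCH_PAGE_SIZE UINT64_C(4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical stand-in for the two address fields of struct vm_fault. */
struct sketch_fault {
	uint64_t address;      /* page-aligned (masked) address */
	uint64_t real_address; /* exact faulting address */
};

/* Fill both fields at every initialization site, even where only one is used. */
static struct sketch_fault sketch_fault_init(uint64_t addr)
{
	struct sketch_fault f = {
		.address = addr & SKETCH_PAGE_MASK,
		.real_address = addr,
	};
	return f;
}

/* Report the exact address only when the monitor explicitly opted in. */
static uint64_t sketch_reported_address(const struct sketch_fault *f,
					bool exact_requested)
{
	return exact_requested ? f->real_address : f->address;
}

/*
 * Monitor-side policy (an assumption, not part of the patch): if an object
 * of obj_size bytes starting at the faulting address crosses into the next
 * page, prefetching the adjacent page may be worthwhile.
 */
static bool sketch_should_prefetch_next(uint64_t fault_addr, uint64_t obj_size)
{
	uint64_t offset = fault_addr & ~SKETCH_PAGE_MASK;

	return offset + obj_size > SKETCH_PAGE_SIZE;
}
```

With the addresses from the demo output, a fault at 0x7fc5e30b300f is reported as 0x7fc5e30b3000 unless the exact address was requested. A 1024-byte object starting at 0x7fc5e30b3c0f would spill into the next page, so the monitor could choose to prefault it.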