From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 28 May 2025 10:23:29 +0900
From: Hyesoo Yu
To: Zhaoyang Huang
Cc: jaewon31.kim@samsung.com, David Hildenbrand, John Hubbard,
	"zhaoyang.huang@unisoc.com", "surenb@google.com",
	"Steve.Kang@unisoc.com", Jaewon Kim, "linux-mm@kvack.org",
	Jang-Hyuck Kim
Subject: Re: reply: [RFC] pin_user_pages_fast failure count increased
Message-ID: <20250528012329.GA1545287@tiffany>
MIME-Version: 1.0
Content-Type: multipart/mixed;
	boundary="----BnPAXQDl7pOtB3LPNoK81-MA98f-KJZk6NW2BxnKY8bzeWl0=_d1e27_"
References: <20250522144418epcms1p2a31c1a5c95b1937077bddf1b30495e83@epcms1p2>
	<20250523023709epcms1p236d4f55b79adb9366ec1cf6d5792b06b@epcms1p2>
	<4e2305d6-b067-4963-b16a-367a254d22c1@nvidia.com>
	<20250526074845.GA2848800@tiffany>
	<20250526093258.GA3489925@tiffany>
	<20250526111744epcms1p89d664f5cebd1e690730f32b66c24e3c0@epcms1p8>
Sender: owner-linux-mm@kvack.org

------BnPAXQDl7pOtB3LPNoK81-MA98f-KJZk6NW2BxnKY8bzeWl0=_d1e27_
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline

On Mon, May 26, 2025 at 07:49:57PM +0800, Zhaoyang Huang wrote:
> On Mon, May 26, 2025 at 7:17 PM Jaewon Kim wrote:
> >
> > >On 26.05.25 11:33, Hyesoo Yu wrote:
> > >> On Mon, May 26, 2025 at 04:05:16PM +0800, Zhaoyang Huang wrote:
> > >>> On Mon, May 26, 2025 at 3:50 PM Hyesoo Yu wrote:
> > >>>>
> > >>>> On Thu, May 22, 2025 at 07:52:41PM -0700, John Hubbard wrote:
> > >>>>> On 5/22/25 7:37 PM, 김재원 wrote:
> > >>>>> ...
> > >>>>>> I think this is what you meant, please let me know if you have an idea to make this nicer.
> > >>>>>> We may be able to prepare the patch next week.
> > >>>>>>
> > >>>>>> static long
> > >>>>>> check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs)
> > >>>>>> {
> > >>>>>> +       bool any_unpinnable;
> > >>>>>>         LIST_HEAD(movable_folio_list);
> > >>>>>>
> > >>>>>> -       collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
> > >>>>>> -       if (list_empty(&movable_folio_list))
> > >>>>>> -               return 0;
> > >>>>>> +       any_unpinnable = collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
> > >>>>>> +       if (list_empty(&movable_folio_list)) {
> > >>>>>> +               if (any_unpinnable)
> > >>>>>> +                       pofs_unpin(pofs);
> > >>>>>
> > >>>>> I think this is correct, although as I mentioned in the other thread,
> > >>>>> that implies that commit 1aaf8c122918 (which didn't add nor remove
> > >>>>> any pof unpinning) is probably not the true or only culprit, right?
> > >>>>>
> > >>>>>> +               return any_unpinnable ? -EAGAIN : 0;
> > >>>>>
> > >>>>> Ha, the "?" operator almost always does more harm than good.
> > >>>>>
> > >>>>> Here, for example, it has obscured from you the fact that any_unpinnable
> > >>>>> is being checked twice, when you could have merged those into a single "if".
> > >>>>>
> > >>>>
> > >>>> Hello,
> > >>>>
> > >>>> I was wondering if the original problem - an infinite loop when pages allocated by
> > >>>> cma_alloc() in vm_ops->fault are passed to GUP - still remains unresolved.
> > >>>> (To be honest, I'm not quite sure how such pages end up being pinned via GUP.
> > >>>> Is that the expected behavior, or could it possibly indicate a bug ?)
> > >>> The original problem arises from applying CMA as guestOS's memory
> > >>> slots for kvm which use GUP to setup its 2nd stage mapping(HVA->PFN).
> > >>> You can check KVM code if you are interested.
> > >>>
> > >>
> > >> Thanks for the kind explanation. While I'm not deeply familiar with KVM, my understanding
> > >> is that there are cases where GUP is used on CMA.
> > >>
> > >> So does that mean pinning memory from the CMA was actually intended to succeed ?
> > >
> > >Careful: KVM uses ordinary GUP, not GUP-longterm.
> >
> > Hi. David and Zhaoyang
> >
> > If possible, could you kindly explain the situation where the 1aaf8c122918 was added?
> > If KVM does not use FOLL_LONGTERM, then why the function,
> > collect_longterm_unpinnable_folios, was changed at that time?
> >
> > First of all, I'm not a KVM expert. After reading Zhaoyang's mail,
> > I thought CMA free page was initially allocated then migrated by FOLL_LONGTERM,
> > during the get_user_page for KVM's guest OS. If KVM does not use FOLL_LONGTERM,
> > I am confused.
> >
> > Actually I did not understand the infinite loop situation. I thought few times of -EAGAIN
> > might happen during the gup. But calling lru_add_drain_all by collect_longterm_unpinnable_folios
> > would put the page to LRU. And other cma_alloc context or migration context, I guess,
> > put the pages back to LRU if there was race.
> Actually, it is pkvm which was introduced by google in AOSP. I am
> afraid I can just brief the callstack here for security reasons. The
> pin_user_pages will setup the 2nd stage mapping for the hva by the
> vm_ops->fault which is registered by kvm memfd driver and all PFNs are
> from CMA area. The driver will keep the pages out of the LRU which hit
> the original bug as it is counted but have the movable_page_list be
> empty and lead to infinite loop within __gup_longterm_locked
>
> pkvm_xxx_xxx(equal to user_mem_abort in kvm)
> {
>     unsigned int flags = FOLL_HWPOISON | FOLL_LONGTERM | FOLL_WRITE;
>     ...
>     ret = pin_user_pages(hva, 1, flags, &page);
>         __gup_longterm_locked
>             do {
>                 nr_pinned_pages = __get_user_pages_locked(mm,
>                                         start, nr_pages,
>                                         pages, locked, gup_flags);
>                 rc = check_and_migrate_movable_pages(nr_pinned_pages, pages);
>             } while (rc == -EAGAIN);
> }

Hello, Zhaoyang.

I don't believe commit 1aaf8c122918 was intended only to prevent an infinite loop.
It was introduced to allow pinning CMA memory in pKVM on AOSP, which leads me to
question whether the assumption that CMA can be long-term pinned is actually valid.

In my opinion, it might be more appropriate to revert commit 1aaf8c122918 and
instead ensure that pKVM avoids using CMA for memory that requires long-term
pinning through GUP.

Alternatively, rather than changing the current logic that prevents long-term GUP
from pinning CMA, would it be better to propose a new patch that specifically
addresses the pKVM scenario, for example by adding a new FOLL_ flag?

Thanks,
Regards.

------BnPAXQDl7pOtB3LPNoK81-MA98f-KJZk6NW2BxnKY8bzeWl0=_d1e27_
Content-Type: text/plain; charset="utf-8"


------BnPAXQDl7pOtB3LPNoK81-MA98f-KJZk6NW2BxnKY8bzeWl0=_d1e27_--
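
For readers following the thread, the retry behavior discussed above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: `struct model_folio`, `model_collect`, `model_check_and_migrate`, and `model_pin_longterm` are hypothetical names invented for this illustration, and the retry cap `max_tries` does not exist in the loop quoted above. It models a check that returns -EAGAIN whenever any CMA-backed folio is seen, even when none could be isolated for migration because the folio is kept off the LRU.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; NOT the kernel's struct folio. */
struct model_folio {
	bool in_cma;	/* CMA-backed, so it must not stay long-term pinned */
	bool on_lru;	/* in this model, only LRU folios can be isolated */
};

/*
 * Count folios that may not be long-term pinned. *isolated receives how
 * many of them could be queued for migration (the "movable list" size).
 */
static size_t model_collect(const struct model_folio *f, size_t n,
			    size_t *isolated)
{
	size_t unpinnable = 0;

	*isolated = 0;
	for (size_t i = 0; i < n; i++) {
		if (!f[i].in_cma)
			continue;
		unpinnable++;
		if (f[i].on_lru)
			(*isolated)++;
	}
	return unpinnable;
}

/* Return 0 if every folio may stay pinned, -EAGAIN if the caller must retry. */
static int model_check_and_migrate(struct model_folio *f, size_t n)
{
	size_t isolated;
	size_t unpinnable = model_collect(f, n, &isolated);

	if (unpinnable == 0)
		return 0;

	/* "Migrate" whatever was isolated: those folios leave the CMA area. */
	for (size_t i = 0; i < n; i++)
		if (f[i].in_cma && f[i].on_lru)
			f[i].in_cma = false;

	/* Retry is requested even when nothing was isolated (isolated == 0). */
	return -EAGAIN;
}

/*
 * Bounded version of the do/while retry loop. Returns the number of passes
 * needed, or -1 if max_tries passes all ended in -EAGAIN. The cap exists
 * only so the model terminates.
 */
static int model_pin_longterm(struct model_folio *f, size_t n, int max_tries)
{
	assert(max_tries > 0);
	for (int tries = 1; tries <= max_tries; tries++)
		if (model_check_and_migrate(f, n) != -EAGAIN)
			return tries;
	return -1;
}
```

In this model a CMA folio that is on the LRU is migrated on the first pass and pinned on the second, while a CMA folio held off the LRU never makes progress and would spin forever without the cap. Which of the two options above fits better (keeping such memory out of CMA, or a dedicated flag) is outside what the model can show.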