From: Zhaoyang Huang
Date: Mon, 26 May 2025 19:49:57 +0800
Subject: Re: reply: [RFC] pin_user_pages_fast failure count increased
To: jaewon31.kim@samsung.com
Cc: David Hildenbrand, Hyesoo Yu, John Hubbard, "zhaoyang.huang@unisoc.com", "surenb@google.com", "Steve.Kang@unisoc.com", Jaewon Kim, "linux-mm@kvack.org", Jang-Hyuck Kim
In-Reply-To: <20250526111744epcms1p89d664f5cebd1e690730f32b66c24e3c0@epcms1p8>

On Mon, May 26, 2025
at 7:17 PM Jaewon Kim wrote:
>
> >On 26.05.25 11:33, Hyesoo Yu wrote:
> >> On Mon, May 26, 2025 at 04:05:16PM +0800, Zhaoyang Huang wrote:
> >>> On Mon, May 26, 2025 at 3:50 PM Hyesoo Yu wrote:
> >>>>
> >>>> On Thu, May 22, 2025 at 07:52:41PM -0700, John Hubbard wrote:
> >>>>> On 5/22/25 7:37 PM, 김재원 wrote:
> >>>>> ...
> >>>>>> I think this is what you meant, please let me know if you have an idea to make this nicer.
> >>>>>> We may be able to prepare the patch next week.
> >>>>>>
> >>>>>>  static long
> >>>>>>  check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs)
> >>>>>>  {
> >>>>>> +        bool any_unpinnable;
> >>>>>>          LIST_HEAD(movable_folio_list);
> >>>>>>
> >>>>>> -        collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
> >>>>>> -        if (list_empty(&movable_folio_list))
> >>>>>> -                return 0;
> >>>>>> +        any_unpinnable = collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
> >>>>>> +        if (list_empty(&movable_folio_list)) {
> >>>>>> +                if (any_unpinnable)
> >>>>>> +                        pofs_unpin(pofs);
> >>>>>
> >>>>> I think this is correct, although as I mentioned in the other thread,
> >>>>> that implies that commit 1aaf8c122918 (which didn't add nor remove
> >>>>> any pof unpinning) is probably not the true or only culprit, right?
> >>>>>
> >>>>>> +                return any_unpinnable ? -EAGAIN : 0;
> >>>>>
> >>>>> Ha, the "?" operator almost always does more harm than good.
> >>>>>
> >>>>> Here, for example, it has obscured from you the fact that any_unpinnable
> >>>>> is being checked twice, when you could have merged those into a single "if".
> >>>>>
> >>>>
> >>>> Hello,
> >>>>
> >>>> I was wondering if the original problem - an infinite loop when pages allocated by
> >>>> cma_alloc() in vm_ops->fault are passed to GUP - still remains unresolved.
> >>>> (To be honest, I'm not quite sure how such pages end up being pinned via GUP.
> >>>> Is that the expected behavior, or could it possibly indicate a bug?)
> >>> The original problem arises from applying CMA as guestOS's memory
> >>> slots for KVM, which uses GUP to set up its 2nd stage mapping (HVA->PFN).
> >>> You can check the KVM code if you are interested.
> >>>
> >>
> >> Thanks for the kind explanation. While I'm not deeply familiar with KVM, my understanding
> >> is that there are cases where GUP is used on CMA.
> >>
> >> So does that mean pinning memory from the CMA was actually intended to succeed?
> >
> >Careful: KVM uses ordinary GUP, not GUP-longterm.
>
> Hi David and Zhaoyang,
>
> If possible, could you kindly explain the situation in which 1aaf8c122918 was added?
> If KVM does not use FOLL_LONGTERM, then why was the function
> collect_longterm_unpinnable_folios changed at that time?
>
> First of all, I'm not a KVM expert. After reading Zhaoyang's mail,
> I thought a CMA free page was initially allocated and then migrated by FOLL_LONGTERM
> during the get_user_page for KVM's guest OS. If KVM does not use FOLL_LONGTERM,
> I am confused.
>
> Actually I did not understand the infinite loop situation. I thought a few rounds of -EAGAIN
> might happen during the gup. But calling lru_add_drain_all in collect_longterm_unpinnable_folios
> would put the page on the LRU. And another cma_alloc context or migration context, I guess,
> would put the pages back on the LRU if there was a race.

Actually, it is pKVM, which Google introduced in AOSP. For security
reasons, I am afraid I can only sketch the call stack here.
pin_user_pages() sets up the 2nd stage mapping for the HVA through the
vm_ops->fault handler registered by the KVM memfd driver, and all PFNs
come from the CMA area. The driver keeps the pages off the LRU, which
triggers the original bug: the page is counted as unpinnable, but
movable_page_list stays empty, which leads to an infinite loop within
__gup_longterm_locked.

pkvm_xxx_xxx (equivalent to user_mem_abort in KVM)
{
        unsigned int flags = FOLL_HWPOISON | FOLL_LONGTERM | FOLL_WRITE;
        ...
        ret = pin_user_pages(hva, 1, flags, &page);
          __gup_longterm_locked
            do {
                nr_pinned_pages = __get_user_pages_locked(mm, start, nr_pages,
                                                          pages, locked, gup_flags);
                rc = check_and_migrate_movable_pages(nr_pinned_pages, pages);
            } while (rc == -EAGAIN);
}

>
> BR
> Jaewon Kim
>
> >
> >--
> >Cheers,
> >
> >David / dhildenb
> >
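
To make the loop described above concrete, here is a minimal userspace
sketch in C. It is a toy model only, not kernel code: struct toy_folio,
toy_collect(), toy_check_and_migrate(), toy_gup_longterm() and the
max_passes cap are all hypothetical names introduced for illustration. It
assumes the behaviour stated in this mail: every unpinnable page is
counted, only pages on the LRU can be queued for migration, and a nonzero
count makes the caller retry.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a pinned folio; NOT the kernel's struct folio. */
struct toy_folio {
	bool longterm_pinnable;	/* false models a page in the CMA area */
	bool on_lru;		/* only LRU pages can be isolated/queued */
};

/*
 * Models the collection step as described in the mail: each unpinnable
 * folio is counted, but only those on the LRU are queued for migration.
 * Returns the count; *queued receives the length of the movable list.
 */
static size_t toy_collect(const struct toy_folio *folios, size_t n,
			  size_t *queued)
{
	size_t counted = 0;

	*queued = 0;
	for (size_t i = 0; i < n; i++) {
		if (folios[i].longterm_pinnable)
			continue;
		counted++;
		if (folios[i].on_lru)
			(*queued)++;
	}
	return counted;
}

/*
 * One check-and-migrate pass. Queued folios are "migrated" (marked
 * pinnable). Any counted folio, queued or not, asks the caller to retry.
 */
static int toy_check_and_migrate(struct toy_folio *folios, size_t n)
{
	size_t queued;
	size_t counted = toy_collect(folios, n, &queued);

	if (!counted)
		return 0;
	if (queued) {
		for (size_t i = 0; i < n; i++)
			if (!folios[i].longterm_pinnable && folios[i].on_lru)
				folios[i].longterm_pinnable = true;
	}
	return -EAGAIN;
}

/*
 * Retry loop in the shape of the do/while in the call stack above, with a
 * hypothetical cap so the demonstration terminates. Returns the number of
 * passes used, or -1 if the cap was reached while still seeing -EAGAIN.
 */
static int toy_gup_longterm(struct toy_folio *folios, size_t n,
			    int max_passes)
{
	for (int pass = 1; pass <= max_passes; pass++)
		if (toy_check_and_migrate(folios, n) != -EAGAIN)
			return pass;
	return -1;
}
```

In this model a CMA page that is on the LRU is migrated on the first pass
and the loop exits on the second, while a CMA page kept off the LRU is
counted on every pass and never queued, so only the artificial cap ends
the loop. The real kernel paths involve pinning, unpinning and isolation
details that this sketch deliberately leaves out.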