From: Zhaoyang Huang
Date: Wed, 28 May 2025 18:59:37 +0800
Subject: Re: reply: [RFC] pin_user_pages_fast failure count increased
To: David Hildenbrand
Cc: Hyesoo Yu, jaewon31.kim@samsung.com, John Hubbard,
 "zhaoyang.huang@unisoc.com", "surenb@google.com",
 "Steve.Kang@unisoc.com", Jaewon Kim, "linux-mm@kvack.org",
 Jang-Hyuck Kim
References: <4e2305d6-b067-4963-b16a-367a254d22c1@nvidia.com>
 <20250526074845.GA2848800@tiffany>
 <20250526093258.GA3489925@tiffany>
 <20250526111744epcms1p89d664f5cebd1e690730f32b66c24e3c0@epcms1p8>
 <20250528012329.GA1545287@tiffany>
 <20250528033626.GA1607193@tiffany>

On Wed, May 28, 2025 at 3:55 PM David Hildenbrand wrote:
>
> On 28.05.25 05:36, Hyesoo Yu wrote:
> > On Wed, May 28, 2025 at 10:49:36AM +0800, Zhaoyang Huang
wrote:
> >> On Wed, May 28, 2025 at 9:25 AM Hyesoo Yu wrote:
> >>>
> >>> On Mon, May 26, 2025 at 07:49:57PM +0800, Zhaoyang Huang wrote:
> >>>
> >>> Hello, Zhaoyang.
> >>>
> >>> I don't believe commit 1aaf8c was just intended to prevent an infinite loop.
> >>> The commit was introduced to allow pinning CMA memory in the pKVM on AOSP.
> >>>
> >>> That leads me to question whether the assumption that CMA can be long-term pinned is actually valid.
> >> That depends on the user of CMA, yes for my scenario since it worked
> >> for the guest os. For common scenario such as the file/anon mapping,
> >> the page will be judged as unpinnable for long-term and be migrated
> >> out of CMA area.
> >
> > Your scenario and the common scenarios can not be distinguished from the kernel API's perspective.
> > Even in common cases, the page may be in a non-LRU state temporarily, and in such situations,
> > pinning CMA can lead to bugs - we've encountered multiple issues because of this.
> >
>
> Right. We just disallow long-term pinning CMA pages, because we don't
> know who the real owner is that would be okay with long-term pinning them.
>
> >>>
> >>> In my opinion, it might be more appropriate to revert that commit 1aaf8c and instead ensure
> >>> that pKVM avoids using CMA for memory that requires long-term pinning through GUP ?
> >> It is not a pkvm issue but a defect of applying FOLL_LONGTERM over
> >> non-LRU CMA pages.
> >
> > In include/linux/mm_types.h, the CMA should be migrated when FOLL_LONGTERM.
> >
> > * In the CMA case: long term pins in a CMA region would unnecessarily fragment
> > * that region. And so, CMA attempts to migrate the page before pinning, when
> > * FOLL_LONGTERM is specified.
> >
> > Given this, would it make sense to avoid using FOLL_LONGTERM in this code path ?
>
> If something is unbounded in time, FOLL_LONGTERM is the right thing to use.
>
> >>>
> >>> Alternatively, instead of changing the current logic that prevents longterm GUP from pinning CMA,
> >>> it would be better to propose a new patch that specifically addresses the pKVM scenario like adding new FOLL_flags ?
> >> I don't think so. pin_user_pages is an exported API which can't make
> >> assumptions over the caller.
> >
> > My point is not to base the patch on assumptions about the caller,
> > but to define a clear mechanism that ensures safe behavior in the intended scenario.
> >
> > For example, you can add FOLL_NO_MIGRATION and skip to migrate unpinnable pages.
>
> Not sure which exact semantics you have in mind. But failing if we would
> have to migrate might be ok. Not sure if the caller should worry about
> that, though: the caller should not have to worry about page placement
> in general.

Having gone over the whole thread, I think the root cause is that
collect_longterm_unpinnable_folios() happened to hit the race window
between lru_add_drain_all() and folio_isolate_lru() and returned with
ret=0, which ended up leaving the CMA page pinned, right?

However, I find that the proposed patch below will break the pKVM
scenario (FOLL_LONGTERM set with non-LRU CMA pages) again. Those CMA
pages never go to the LRU, so __gup_longterm_locked() will keep looping
in do { ... } while (ret == -EAGAIN), as it did before 1aaf8c. I think
the key point is to find a way to distinguish temporary (on the way to
the LRU) from permanent non-LRU CMA pages within
collect_longterm_unpinnable_folios().

 static long
 check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs)
 {
+	bool any_unpinnable;
 	LIST_HEAD(movable_folio_list);

-	collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
-	if (list_empty(&movable_folio_list))
-		return 0;
+	any_unpinnable = collect_longterm_unpinnable_folios(&movable_folio_list, pofs);
+	if (list_empty(&movable_folio_list)) {
+		if (any_unpinnable)
+			pofs_unpin(pofs);
+		return any_unpinnable ? -EAGAIN : 0;
+	}

 	return migrate_longterm_unpinnable_folios(&movable_folio_list, pofs);
 }
>
> --
> Cheers,
>
> David / dhildenb
>
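
To make the temporary vs. permanent distinction above concrete, here is
a minimal userspace sketch. It is not kernel code: struct model_folio,
enum lru_state and all model_*() helpers are hypothetical names invented
for illustration, and the retry loop is capped by max_rounds only so the
model terminates (the loop discussed above has no such cap). It also
assumes, as a simplification, that a successful migration reports
-EAGAIN so the caller pins again.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of where a folio sits relative to the LRU. */
enum lru_state {
	ON_LRU,        /* on an LRU list, isolation can succeed */
	IN_PCP_BATCH,  /* temporarily off the LRU, waiting in a per-CPU batch */
	NEVER_ON_LRU,  /* permanently non-LRU, e.g. CMA memory owned by a driver */
};

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct model_folio {
	bool in_cma;
	enum lru_state state;
};

/* Models the effect of an LRU drain: batched folios reach the LRU. */
static void model_drain(struct model_folio *f)
{
	if (f->state == IN_PCP_BATCH)
		f->state = ON_LRU;
}

/*
 * One pass of the check, modelled on the quoted patch:
 * returns 0 when the pin can be kept, -EAGAIN when the caller must retry.
 */
static int model_check_and_migrate(struct model_folio *f)
{
	if (!f->in_cma)
		return 0;               /* nothing unpinnable */

	model_drain(f);
	if (f->state == ON_LRU) {
		f->in_cma = false;      /* isolated and migrated out of CMA */
		return -EAGAIN;         /* caller pins the new location */
	}

	/* Unpinnable but not isolatable: with the patch this is -EAGAIN too. */
	return -EAGAIN;
}

/*
 * Models the do { ... } while (rc == -EAGAIN) retry loop.
 * Returns the number of rounds used, or -1 if max_rounds was exhausted.
 */
static int model_gup_longterm(struct model_folio *f, int max_rounds)
{
	int rounds = 0;
	int rc;

	assert(max_rounds > 0);
	do {
		if (rounds == max_rounds)
			return -1;      /* would spin forever without the cap */
		rounds++;
		rc = model_check_and_migrate(f);
	} while (rc == -EAGAIN);

	return rounds;
}
```

In this model a folio that starts as IN_PCP_BATCH is migrated on the
first round and pinned on the second, while a NEVER_ON_LRU folio in CMA
returns -EAGAIN on every round and only stops because of the cap. That
is the case the check would need to tell apart.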