Message-ID: <66bc7274-ec2a-423a-8094-b8d4cc9438fe@redhat.com>
Date: Mon, 14 Jul 2025 17:52:26 +0200
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH v2 2/2] mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range
To: Zi Yan
Cc: "Pankaj Raghav (Samsung)", Matthew Wilcox, Luis Chamberlain,
 Jinjiang Tu, Oscar Salvador, akpm@linux-foundation.org,
 linmiaohe@huawei.com, mhocko@kernel.org, linux-mm@kvack.org,
 wangkefeng.wang@huawei.com
References: <61325284-d1d6-a973-8aa7-c0f226db95fa@huawei.com>
 <7b2c054b-fc33-4127-aaa9-9edf6a63e142@redhat.com>
 <924d9d25-e53c-f159-6ec0-e1fd4e96d6e2@huawei.com>
 <4c5d4fd5-5582-11d8-9fee-24828ac1913d@huawei.com>
 <8c9719f0-c072-40bb-b7f6-6f2cc41a31dc@redhat.com>
 <1D589FE5-3515-4ED5-B12E-D5CE23BA5D13@nvidia.com>
 <641F5B0B-2B48-46FA-AC58-3A8A4BEB1448@nvidia.com>
 <3702f6b0-27a9-4ca1-adbd-fb1e2985b2d3@redhat.com>
 <345f7ae6-b2d6-44cd-b8b6-2bdd4b33e9d6@redhat.com>

On 14.07.25 17:44, Zi Yan wrote:
> On 14 Jul 2025, at 11:33, David Hildenbrand wrote:
>
>> On 14.07.25 17:28, Zi Yan wrote:
>>> On 14 Jul 2025, at 11:25, Zi Yan wrote:
>>>
>>>> On 14 Jul 2025, at 11:14, David Hildenbrand wrote:
>>>>
>>>>> On 14.07.25 17:09, Pankaj Raghav (Samsung) wrote:
>>>>>>>>>> So we will need to take care
>>>>>>>>>> of madvise cold or pageout case?
>>>>>>>>>>
>>>>>>>>>> Hi Matthew, Pankaj, and Luis,
>>>>>>>>>>
>>>>>>>>>> Is it possible to partially map a min-order folio in a fs with LBS? Based on my
>>>>>>>>>
>>>>>>>>> Typically, FSs match the min order with the blocksize of the filesystem.
>>>>>>>>> As a filesystem block is the smallest unit of data that the filesystem uses
>>>>>>>>> to store file data on the disk, we cannot partially map them.
>>>>>>>>>
>>>>>>>>> So if I understand your question correctly, the answer is no.
>>>>>>>
>>>>>>> I'm confused. Shouldn't this be trivially possible?
>>>>>>>
>>>>>> Hmm, maybe I misunderstood the question?
>>>>>>
>>>>>>> E.g., just mmap() a single page of such a file? Who would make that fail?
>>>>>>>
>>>>>>
>>>>>> My point was, even if you try to mmap a single page of a file, page
>>>>>> cache will read the whole block (that corresponds to min order folio).
>>>>>>
>>>>>> Technically we can mmap a single page of file, but FS will always read
>>>>>> and write **at least** in min folio order chunks.
>>>>>
>>>>> Okay, so it can be partially mapped into page tables :) What happens in
>>>>> the background (page cache management) is a different story
>>>>
>>>> David, thanks for getting to the bottom of this.
>>>>
>>>> OK. So we will see deadlock looping in madvise cold or pageout case.
>>>> I wonder how to proceed with this. Since the folio is seen as a whole
>>>> by fs, it should be marked cold/paged out as a whole. Maybe we should
>>>> skip the partially mapped region?
>>>
>>> Actually, it is skipped, since split_folio() bumps new_order to the min
>>> order, and if the folio order is already at min order, split code return
>>> -EINVAL. This makes the madvise cold or pageout code move to the next
>>> address.
>>
>> But what if the folio order is 2x min_order etc?
>
> If folio order is greater than min_order, the code is suboptimal.
> It will first split the original folio to min_order, then
> loop through after-split folios.
> If the after-split ones are fully
> mapped, madvise ops will be performed. Otherwise, the code will
> try to split after-split ones again, fail, and end up with skipping
> the partially mapped range.

Indeed, thanks for checking, madvise() is fine then!

-- 
Cheers,

David / dhildenb
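
For anyone following the thread, the behaviour Zi Yan describes can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel code: model_split_folio(), model_madvise_walk() and struct walk_result are hypothetical names, pages are plain indices inside one original folio, and "fully mapped" is reduced to "the mapped range [start, end) covers the whole folio". The model raises the split target to min_order, fails with -EINVAL when the folio is already at that order, and on failure skips ahead instead of retrying.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct walk_result {
    unsigned long processed; /* pages the madvise op was applied to */
    unsigned long skipped;   /* pages skipped after a failed split */
    unsigned int splits;     /* number of successful splits */
};

/*
 * Model of a split request: the caller asks for order 0, the target is
 * raised to min_order, and a folio already at (or below) the target
 * cannot be split, so -EINVAL is returned.
 */
static int model_split_folio(unsigned int order, unsigned int min_order,
                             unsigned int *new_order)
{
    unsigned int target = 0;

    if (target < min_order)
        target = min_order;
    if (order <= target)
        return -EINVAL;
    *new_order = target;
    return 0;
}

/*
 * Walk the mapped page range [start, end) inside one original folio that
 * covers pages [0, 1 << order).
 */
static struct walk_result model_madvise_walk(unsigned int order,
                                             unsigned int min_order,
                                             unsigned long start,
                                             unsigned long end)
{
    struct walk_result res = { 0, 0, 0 };
    unsigned int cur_order = order;
    unsigned long addr = start;

    assert(start <= end && end <= (1UL << order));

    while (addr < end) {
        unsigned long nr = 1UL << cur_order;
        unsigned long fstart = addr & ~(nr - 1); /* folio containing addr */
        unsigned long fend = fstart + nr;
        unsigned long stop = fend < end ? fend : end;

        if (start <= fstart && fend <= end) {
            /* Fully mapped: perform the madvise op on the whole folio. */
            res.processed += nr;
            addr = fend;
        } else if (model_split_folio(cur_order, min_order, &cur_order) == 0) {
            /* Split worked: retry the same address with smaller folios. */
            res.splits++;
        } else {
            /* Split failed: move on past the partially mapped part. */
            res.skipped += stop - addr;
            addr = stop;
        }
    }
    return res;
}
```

With order 4, min_order 2 and pages [2, 12) mapped, the model splits once, skips the two mapped pages of the first min-order folio, and processes the two fully mapped min-order folios. With order equal to min_order it never loops on the same address, which is the property the thread was checking.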