From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E39C6C25B50 for ; Tue, 24 Jan 2023 02:32:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 230A46B0078; Mon, 23 Jan 2023 21:32:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 1E1116B007D; Mon, 23 Jan 2023 21:32:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0AA096B007E; Mon, 23 Jan 2023 21:32:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id EC3676B0078 for ; Mon, 23 Jan 2023 21:32:21 -0500 (EST) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id C19CC8058A for ; Tue, 24 Jan 2023 02:32:21 +0000 (UTC) X-FDA: 80388118482.03.9C1610D Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) by imf08.hostedemail.com (Postfix) with ESMTP id B0F71160006 for ; Tue, 24 Jan 2023 02:32:19 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b=RIRW6FzX; spf=none (imf08.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1674527540; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=OzPDZ4GQurQ3Wi5M3nuD8SPGO947lucDtaqwO/v+ugk=; b=XoSRqtfwTDeiYmHIS2WlSaaSnROffn3o2za20iCUuAPUA9/UwGJnNfTGOwrO/2OBRSzjw2 JAvEIqY9xvmnw8id4AFz+L3GhGBO5/pxds6TWPBPB5we8vG1iLbm5b1S7/38kMOVycYQcz WzmoNGSKy+Z1QofthwJOGWyKirJOd6o= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b=RIRW6FzX; spf=none (imf08.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1674527540; a=rsa-sha256; cv=none; b=HD41IX9WBjOTQnEgKce4D1UlmjYOTRE4EgrmfTr79ldbFxirzRswEMdAE5iCkCf83qAXbg npvhB1SixoEZIu8bxxXPR32RVT+Va1Kmvx7SGvKG886vC6h8G9x9ikUgEv9wrCHp5vUoX9 ct5f6xHJhjib2eH5hNQT/y38kBlUNJA= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=OzPDZ4GQurQ3Wi5M3nuD8SPGO947lucDtaqwO/v+ugk=; b=RIRW6FzX1L2Ilb64mAF0WuK3ZC IBKJEySwbhjJJTDy7YouhVBaxSKPoQa37sOqk72zNmcLnk3SLsnWsOYmk+k230wp3vUpUNUt8tFwM IVaNu+mhaCne7fWB7xV5LybzEeLVpWLfqa9G5BABDRyV/JhG1zhgqT98+wrRQJA+29bf52YO7s0Io p0cS3Lf8PwxSQj0V9z7cOM2SF5sbB5gJ3M38hD5vQZwTIA2NHq+30Ocr7GslJdzE2tP/iaKQN8Ize 9UVfUzqtmiZEGM5IAqHrO8HCFrjev+UIC0G+y9rShQ5AMJyxpHd0rUoolIWSws82CeBfh2WiJYIu5 cDWOa1yQ==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1pK96Z-004hsI-Bs; Tue, 24 Jan 2023 02:32:11 +0000 Date: Tue, 24 Jan 2023 02:32:11 +0000 From: Matthew Wilcox To: Sidhartha Kumar Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, david@redhat.com, osalvador@suse.de Subject: Re: [PATCH 1/2] mm/memory_hotplug: remove head page reference in do_migrate_range Message-ID: References: <20230123202347.317065-1-sidhartha.kumar@oracle.com> <5c40ed66-1e51-78fa-193c-0eb0db814b01@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5c40ed66-1e51-78fa-193c-0eb0db814b01@oracle.com> X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: B0F71160006 X-Stat-Signature: 8i7m8f3mcr1mawkdp6qb1rs9koakgden X-Rspam-User: X-HE-Tag: 1674527539-539289 X-HE-Meta: U2FsdGVkX1+tselMd34Lc3rUvY40Obn8iesQ7Uy7ocHT2iJ5p07A4BT+mINPVt+1Z3YmO/qc2nEYgD5Z8Q9WZ2h0lKMgxGHaTLjk8a6+1SmW7QQhwD+vw5YNsonvBTQOci2VY4n7yGq+lKs3u0246EO0zN6p5FGWOGox0r67RY9LFWO99gYaIzapLnToKC0TZBJigq5/9DfiJHiMGfI/kPTQPyXPPdnC8MnL2x6ZHPCiFWPMvLEA5a9wPcIt4QHa6XkiiIgWvBL5yFYtLpf5MXGD6mZOoWFSNRBCNbS3rEcWv4zmyyLYsZCYV9djaBwdB6K1wIUa0x1Tbb8z7tJ9f2nu2U4zOVqRp/U+8DU85J+rj957C37605S94mcVoUJEdrsx/++VaPJ3RehNt6eqNzLrFxvAllJUFkHESPfJrVrZujXynkNQmPKpKcLK052eu3NR6TtX5F0DqQRzPuNYexJYWctU7golxecTwX+1uglgPtqOV3IfKERw96dobvMR6YT7d4J5DnUxKeuvqSwuAJ8Q3UMc4eRA1l+ueopA30ojXH1Y9l0WUrbyYJMPqJ2PmClTcdMwxixCAXAOcYJLX+YuXatN7xwWXQzNzFaK8eZZZR1eJCKKjNAadMqqrCX7r7mZ8mMbJqcV0dLH3FlrLkHatWcn9h1+YHBOlsoUCtiMpM+fgfZekJTtAv+daKThy6qakttI3avibs8Y8eaKDUc49aiQGUj4d2LyxG6SE7dOsyblzd1WLp5LPr+CixRge37jxS9Cuo8VNs9/6Yv6aUsDqZtvy7zwGObRLiE9JZAvPA31urC2SKV9gGukeYBSrvuBPMrcTL90i5g4GBWqU97f1yNQMdNy9uoMxJQfxIzy8Dx4zfCqTH7gsMC3lYWByLyYan8T7FaJElQMHa9nUHTSbfSXLpW8AMNP3oVypNlVkVnp8GNKe4vKp6uSxDWR1XDwqpszxVlDu8qUTQT lSNOSvFg epZ0f3EZjj+48erfLh4NJBqcH1eiaPgA82Ajocw84pq0aGdDhE1dQke13dQXD1du98HoWhLLTjXJbkvEWxUHWElk7EjGbgImELNnslQpvbiwhgA7WSTibPPgfw1kCXzcP489hFHF1QYnA7Omxr0KIwl9UWzwdOF2tFLXd19ZXXSrJvnIXMaW2exARKA5HfqHJTlyEcZcI4hkzZno= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Jan 23, 2023 at 01:08:49PM -0800, Sidhartha Kumar wrote: > On 1/23/23 12:37 PM, Matthew Wilcox wrote: > > On Mon, Jan 23, 2023 at 12:23:46PM -0800, Sidhartha Kumar wrote: > > > @@ -1637,14 +1637,13 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn) > > > continue; > > > page = pfn_to_page(pfn); > > > folio = page_folio(page); > > > - head = &folio->page; > > > - if (PageHuge(page)) { > > > - pfn = page_to_pfn(head) + compound_nr(head) - 1; > > > + if (folio_test_hugetlb(folio)) { > > > + pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1; > > > isolate_hugetlb(folio, &source); > > > continue; > > > - } else if (PageTransHuge(page)) > > > - pfn = page_to_pfn(head) + thp_nr_pages(page) - 1; > > > + } else if (folio_test_transhuge(folio)) > > > + pfn = folio_pfn(folio) + thp_nr_pages(page) - 1; > > > > I'm pretty sure those two lines should be... > > > > } else if (folio_test_large(folio) > pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1; > > > > But, erm ... we're doing this before we have a refcount on the page, > > right? So this is unsafe because the page might change which folio > > it is in. And the folio we found earlier might become a tail page > > of a different folio. (As the comment below explains, HWPoison pages > > won't, so it's not unsafe for them). > > > > Thanks for the explanation of why this is unsafe. Would it be worth to put > this code block inside the > > if (!get_page_unless_zero(page)) > continue; > > put_page(page); > > block found lower? My motivation for this series is the HPageMigratable call > in patch 2 is the last user of the huge page flag test macros so a > conversion would allow for the removal of the macro. I thought I could also > remove the head page references found in this function, but if that would > cause too much churn in a complicated sub-system it can be dropped. I think we just have to be very careful when working without a page ref. Now, specifically to the matter of converting HPageMigratable(), I think that's fine. Your folio_test_hugetlb_##flname macro does not have a VM_BUG_ON_PGFLAGS(PageTail(page), page) in it, unlike folio_flags(). So it looks like even if your folio becomes a tail page in the middle of scan_movable_pages(), you won't hit a BUG(). Now, should we go further? Possibly. But I'm more concerned that we haven't really figured out which functions should be checking this. Maybe we should drop the BUG entirely and rely more on the type system (and people not casting) to prevent errors. We could go a lot further with the type system and define a new type for "might be a folio but we don't have a refcount on it". But we don't do a lot of work with unreferenced folios, so I'm not inclined to go to all that effort. Perhaps we want a folio_maybe_X() series of functions that don't warn if the folio has morphed into not-a-folio. I don't know.