From: Miaohe Lin
To: Jane Chu, Matthew Wilcox
CC: Naoya Horiguchi, Linux-MM
Subject: Re: madvise_inject_error skips pages
Date: Wed, 17 Jul 2024 14:51:40 +0800
Message-ID: <2c12b802-82f6-2eb7-7f05-adf5821902f6@huawei.com>
In-Reply-To: <9e25253e-7141-493c-9846-dc473284d975@oracle.com>
References: <9e25253e-7141-493c-9846-dc473284d975@oracle.com>

On 2024/7/17 5:33, Jane Chu wrote:
> Hi, Matthew,
>
> On 7/16/2024 8:12 AM, Matthew Wilcox wrote:
>> I was going to send this patch:
>>
>> +++ b/mm/madvise.c
>> @@ -1136,9 +1136,10 @@ static int madvise_inject_error(int behavior,
>>                  /*
>>                   * When soft offlining hugepages, after migrating the page
>>                   * we dissolve it, therefore in the second loop "page" will
>> -                * no longer be a compound page.
>> +                * no longer be part of a large folio.
>>                   */
>> -               size = page_size(compound_head(page));
>> +               size = folio_size(page_folio(page));
>> +               start = start & ~(size - 1);
>>
>>                  if (behavior == MADV_SOFT_OFFLINE) {
>>                          pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n",
>>
>> because right now if you start in the middle of (e.g.) an order-4 folio
>> followed by an order-0 folio, you'll skip the order-0 folio immediately
>> following it.
>>
>> But then I realised that we can come to this path in the middle of
>> a large file-backed folio that's mapped misaligned and has a COW page
>> in the middle, and the whole thing is just misguided.  So I gave up.
>> Anyone want to take a crack at fixing & testing this?
>
> Thanks for the report.

Thanks both for your thoughts.

> My understanding is, we should run folio_test_large() upfront, and treat both ends as special case
> unless they're folio_size aligned, and iterate in folio_size for the middle.   I think this should work for
> hugetlb and large folio in general, but I need to confirm this with tests.
>
> Any thoughts?

I might be wrong, but there may be corner cases even when both ends are
folio_size aligned. For example, if pte[0] points to a PTE-mapped THP and
pte[1..n] are COWed pages, those COWed pages might be skipped in the next
loop iteration because folio_size covers them, even though they don't
belong to this THP. Would it be better to decide @size based on the pte
entry?

Thanks.
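
To make the two skipping cases above concrete, here is a minimal user-space
sketch. It is a toy model, not kernel code: struct toy_pte, the toy_build_*()
helpers, walk_by_folio_size(), walk_per_entry() and folios_hit() are all
hypothetical names made up for illustration, and "handling" a folio just means
setting a flag. walk_per_entry() is only one possible reading of "decide @size
based on the pte entry", assuming every entry in the range is a PTE that covers
a single page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NPAGES 32   /* size of the toy address range, in pages */

/* Toy stand-in for one page-table entry; not a kernel structure. */
struct toy_pte {
    int    folio_id;     /* which (hypothetical) folio this entry maps */
    size_t folio_pages;  /* pages in that folio, always a power of two */
};

enum walk_mode { WALK_FOLIO_SIZE, WALK_FOLIO_ALIGNED, WALK_PER_ENTRY };

/* Start with every page mapping its own order-0 folio. */
void toy_init(struct toy_pte *pt)
{
    for (size_t i = 0; i < NPAGES; i++) {
        pt[i].folio_id = (int)i;
        pt[i].folio_pages = 1;
    }
}

/* Case from the report: an order-4 folio at pages 0..15, order-0 after it. */
void toy_build_mid_folio(struct toy_pte *pt)
{
    toy_init(pt);
    for (size_t i = 0; i < 16; i++) {
        pt[i].folio_id = 0;
        pt[i].folio_pages = 16;
    }
}

/* Case from this reply: pte[0] maps a 16-page THP, pte[1..15] were COWed. */
void toy_build_cow(struct toy_pte *pt)
{
    toy_init(pt);
    pt[0].folio_id = 0;
    pt[0].folio_pages = 16;
}

/* Step by the size of the folio found at idx, optionally aligning down. */
void walk_by_folio_size(const struct toy_pte *pt, size_t start, size_t end,
                        int align_down, unsigned char *handled)
{
    size_t idx = start;

    assert(start < end && end <= NPAGES);
    while (idx < end) {
        size_t size = pt[idx].folio_pages;

        handled[pt[idx].folio_id] = 1;   /* "inject" into this folio */
        if (align_down)
            idx &= ~(size - 1);          /* mirrors start & ~(size - 1) */
        idx += size;
    }
}

/* Step one entry at a time, so the step never depends on the folio size. */
void walk_per_entry(const struct toy_pte *pt, size_t start, size_t end,
                    unsigned char *handled)
{
    assert(start < end && end <= NPAGES);
    for (size_t idx = start; idx < end; idx++)
        handled[pt[idx].folio_id] = 1;
}

/* Run one walk on a fresh "handled" map; return how many folios it reached. */
size_t folios_hit(const struct toy_pte *pt, size_t start, size_t end,
                  enum walk_mode mode)
{
    unsigned char handled[NPAGES];
    size_t n = 0;

    memset(handled, 0, sizeof(handled));
    if (mode == WALK_PER_ENTRY)
        walk_per_entry(pt, start, end, handled);
    else
        walk_by_folio_size(pt, start, end, mode == WALK_FOLIO_ALIGNED, handled);
    for (size_t i = 0; i < NPAGES; i++)
        n += handled[i];
    return n;
}
```

In this toy layout, walking pages 8..17 of the toy_build_mid_folio() range by
folio size reaches only the order-4 folio unless start is aligned down, which
is the skip the patch targets. In the toy_build_cow() range, aligning does not
help: the folio-size walk over pages 0..15 still reaches only the THP, while
the per-entry walk reaches all sixteen folios.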