Subject: Re: [PATCH v2 1/3] mm/page_alloc: correct start page when guard page debug is enabled
From: Kemeng Shi <shikemeng@huaweicloud.com>
To: Naoya Horiguchi
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, willy@infradead.org, naoya.horiguchi@nec.com, osalvador@suse.de
Date: Wed, 30 Aug 2023 14:27:33 +0800
In-Reply-To: <20230828152113.GA886794@ik1-406-35019.vs.sakura.ne.jp>
References: <20230826154745.4019371-1-shikemeng@huaweicloud.com> <20230826154745.4019371-2-shikemeng@huaweicloud.com> <20230828152113.GA886794@ik1-406-35019.vs.sakura.ne.jp>
on 8/28/2023 11:21 PM, Naoya Horiguchi wrote:
> On Sat, Aug 26, 2023 at 11:47:43PM +0800, Kemeng Shi wrote:
>> When guard page debug is enabled and set_page_guard returns success, we
>> fail to advance page to the start of the next split range, so we will
>> split unexpectedly in a page range that does not contain the target page.
>> Move the start page update before set_page_guard to fix this.
>>
>> Because we split toward the wrong target page, the split pages cannot
>> merge back to the original order when the target page is put back, and
>> the split pages other than the target page are unusable.
>> To be specific:
>>
>> Consider the target page is the third page in a buddy page of order 2:
>> | buddy-2 | Page | Target | Page |
>>
>> After breaking down to the target page, we will only set the first page
>> to Guard because of the bug:
>> | Guard | Page | Target | Page |
>>
>> When we try put_page_back_buddy with the target page, the buddy page of
>> the target is neither guard nor buddy, so the original order-2 page
>> cannot be reconstructed:
>> | Guard | Page | buddy-0 | Page |
>>
>> All pages except the target page are not in the free list and are
>> unusable.
>>
>> Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
>> Signed-off-by: Kemeng Shi
>
> Thank you for finding the problem and writing patches. I think the patch
> fixes the reported problem, but I wonder whether we really need the guard
> page mechanism in break_down_buddy_pages(), which is only called from
> memory_failure. As stated in Documentation/admin-guide/kernel-parameters.txt,
> this is a debugging feature to detect memory corruption due to buggy kernel
> or driver code. So HW memory failure seems to be out of scope, and I feel
> that we could simply remove it from break_down_buddy_pages().
>
> 	debug_guardpage_minorder=
> 		[KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
> 		parameter allows control of the order of pages that will
> 		be intentionally kept free (and hence protected) by the
> 		buddy allocator. Bigger value increase the probability
> 		of catching random memory corruption, but reduce the
> 		amount of memory for normal system use. The maximum
> 		possible value is MAX_ORDER/2. Setting this parameter
> 		to 1 or 2 should be enough to identify most random
> 		memory corruption problems caused by bugs in kernel or
> 		driver code when a CPU writes to (or reads from) a
> 		random memory location.
> 		Note that there exists a class
> 		of memory corruption problems caused by buggy H/W or
> 		F/W or by drivers badly programming DMA (basically when
> 		memory is written at bus level and the CPU MMU is
> 		bypassed) which are not detectable by
> 		CONFIG_DEBUG_PAGEALLOC, hence this option will not help
> 		tracking down these problems.
>
> If you have any idea about how the guard page mechanism helps
> memory_failure, could you share it?
>
Hi Naoya, thanks for the feedback. Commit c0a32fc5a2e47 ("mm: more intensive
memory corruption debugging") mentioned that we know that, with
CONFIG_DEBUG_PAGEALLOC configured, the CPU will generate an exception on
access (read, write) to an unallocated page, which permits us to catch code
which corrupts memory. The guard page aims to keep more free/protected pages
and to interlace free/protected and allocated pages to increase the
probability of catching corruption. Keeping the guard page around the
failure page looks helpful to catch random accesses. Wish this can help.

> Thanks,
> Naoya Horiguchi
>
>> ---
>>  mm/page_alloc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index fefc4074d9d0..88c5f5aea9b0 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6505,6 +6505,7 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
>>  			next_page = page;
>>  			current_buddy = page + size;
>>  		}
>> +		page = next_page;
>>
>>  		if (set_page_guard(zone, current_buddy, high, migratetype))
>>  			continue;
>> @@ -6512,7 +6513,6 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
>>  		if (current_buddy != target) {
>>  			add_to_free_list(current_buddy, zone, high, migratetype);
>>  			set_buddy_order(current_buddy, high);
>> -			page = next_page;
>>  		}
>>  	}
>>  }
>> --
>> 2.30.0