Subject: Re: [PATCH 4/4] mm/memory-failure: Fall back to vma_address() when ->notify_failure() fails
To: Dan Williams
CC: Shiyang Ruan, Christoph Hellwig, Naoya Horiguchi, Al Viro, Dave Chinner, Goldwyn Rodrigues, Jane Chu, Matthew Wilcox, Ritesh Harjani
References: <166153426798.2758201.15108211981034512993.stgit@dwillia2-xfh.jf.intel.com> <166153429427.2758201.14605968329933175594.stgit@dwillia2-xfh.jf.intel.com>
From: Miaohe Lin
Message-ID: <76fb4464-73eb-256c-60e0-a0c3dc152e78@huawei.com>
Date: Tue, 30 Aug 2022 11:30:21 +0800
In-Reply-To: <166153429427.2758201.14605968329933175594.stgit@dwillia2-xfh.jf.intel.com>

On 2022/8/27 1:18, Dan Williams wrote:
> In the case where a filesystem is polled to take over the memory failure
> and receives -EOPNOTSUPP it indicates that page->index and page->mapping
> are valid for reverse mapping the failure address. Introduce
> FSDAX_INVALID_PGOFF to distinguish when add_to_kill() is being called
> from mf_dax_kill_procs() by a filesystem vs the typical memory_failure()
> path.

Thanks for fixing this. I'm sorry, but I can't find the bug report email.
Do you mean that mf_dax_kill_procs() can pass an invalid pgoff to
add_to_kill()? It seems pgoff is guarded against invalid values by
vma_interval_tree_foreach() in collect_procs_fsdax(), so pgoff should be
a valid value. Or am I missing something?

Thanks,
Miaohe Lin

>
> Otherwise, vma_pgoff_address() is called with an invalid fsdax_pgoff
> which then trips this failing signature:
>
>  kernel BUG at mm/memory-failure.c:319!
>  invalid opcode: 0000 [#1] PREEMPT SMP PTI
>  CPU: 13 PID: 1262 Comm: dax-pmd Tainted: G OE N 6.0.0-rc2+ #62
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>  RIP: 0010:add_to_kill.cold+0x19d/0x209
>  [..]
>  Call Trace:
>
>   collect_procs.part.0+0x2c4/0x460
>   memory_failure+0x71b/0xba0
>   ? _printk+0x58/0x73
>   do_madvise.part.0.cold+0xaf/0xc5
>
> Fixes: c36e20249571 ("mm: introduce mf_dax_kill_procs() for fsdax case")
> Cc: Shiyang Ruan
> Cc: Christoph Hellwig
> Cc: Darrick J. Wong
> Cc: Naoya Horiguchi
> Cc: Al Viro
> Cc: Dave Chinner
> Cc: Goldwyn Rodrigues
> Cc: Jane Chu
> Cc: Matthew Wilcox
> Cc: Miaohe Lin
> Cc: Ritesh Harjani
> Cc: Andrew Morton
> Signed-off-by: Dan Williams
> ---
>  mm/memory-failure.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 8a4294afbfa0..e424a9dac749 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -345,13 +345,17 @@ static unsigned long dev_pagemap_mapping_shift(struct vm_area_struct *vma,
>   * not much we can do. We just print a message and ignore otherwise.
>   */
>
> +#define FSDAX_INVALID_PGOFF ULONG_MAX
> +
>  /*
>   * Schedule a process for later kill.
>   * Uses GFP_ATOMIC allocations to avoid potential recursions in the VM.
>   *
> - * Notice: @fsdax_pgoff is used only when @p is a fsdax page.
> - * In other cases, such as anonymous and file-backend page, the address to be
> - * killed can be caculated by @p itself.
> + * Note: @fsdax_pgoff is used only when @p is a fsdax page and a
> + * filesystem with a memory failure handler has claimed the
> + * memory_failure event. In all other cases, page->index and
> + * page->mapping are sufficient for mapping the page back to its
> + * corresponding user virtual address.
>   */
>  static void add_to_kill(struct task_struct *tsk, struct page *p,
>  			pgoff_t fsdax_pgoff, struct vm_area_struct *vma,
> @@ -367,11 +371,7 @@ static void add_to_kill(struct task_struct *tsk, struct page *p,
>
>  	tk->addr = page_address_in_vma(p, vma);
>  	if (is_zone_device_page(p)) {
> -		/*
> -		 * Since page->mapping is not used for fsdax, we need
> -		 * calculate the address based on the vma.
> -		 */
> -		if (p->pgmap->type == MEMORY_DEVICE_FS_DAX)
> +		if (fsdax_pgoff != FSDAX_INVALID_PGOFF)
>  			tk->addr = vma_pgoff_address(fsdax_pgoff, 1, vma);
>  		tk->size_shift = dev_pagemap_mapping_shift(vma, tk->addr);
>  	} else
> @@ -523,7 +523,8 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
>  			if (!page_mapped_in_vma(page, vma))
>  				continue;
>  			if (vma->vm_mm == t->mm)
> -				add_to_kill(t, page, 0, vma, to_kill);
> +				add_to_kill(t, page, FSDAX_INVALID_PGOFF, vma,
> +					    to_kill);
>  		}
>  	}
>  	read_unlock(&tasklist_lock);
> @@ -559,7 +560,8 @@ static void collect_procs_file(struct page *page, struct list_head *to_kill,
>  			 * to be informed of all such data corruptions.
>  			 */
>  			if (vma->vm_mm == t->mm)
> -				add_to_kill(t, page, 0, vma, to_kill);
> +				add_to_kill(t, page, FSDAX_INVALID_PGOFF, vma,
> +					    to_kill);
>  		}
>  	}
>  	read_unlock(&tasklist_lock);
>
>
> .
>