From: Yang Shi
Date: Mon, 8 Aug 2022 10:49:19 -0700
Subject: Re: [PATCH v2] mm: release private data before split THP
To: Yin Fengwei
Cc: linux-mm@kvack.org, naoya.horiguchi@nec.com, linmiaohe@huawei.com, willy@infradead.org, aaron.lu@intel.com, tony.luck@intel.com, qiuxu.zhuo@intel.com
References: <20220805062844.439152-1-fengwei.yin@intel.com>
In-Reply-To: <20220805062844.439152-1-fengwei.yin@intel.com>

On Thu, Aug 4, 2022 at 11:29 PM Yin Fengwei wrote:
>
> If there is private data attached to THP, the refcount of
> THP will be increased and block the THP split. Release
> private data attached to THP before split it to increase
> the chance of splitting THP successfully.
>
> There was a memory failure issue hit during HW error
> injection testing with 5.18 kernel + xfs as rootfs. Test
> got killed and system reboot was required to re-run the
> test.
>
> The issue was tracked down to THP split failure caused the
> memory failure not being handled.
> The page dump showed:
>
> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
> [ 1785.452408] memcg:ff4247f2d28e9000
> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>
> It was like the error was injected to a large folio for xfs
> with private data attached.
>
> With private data released before split THP, the test case
> could be run successfully many times without reboot system.
>
> Co-developed-by: Qiuxu Zhuo
> Signed-off-by: Qiuxu Zhuo
> Signed-off-by: Yin Fengwei
> Suggested-by: Matthew Wilcox
> Reviewed-by: Aaron Lu
> ---
> Changelog from v1:
>  - Move private release to split_huge_page_to_list
>    to cover wider path per Yang's comment
>  - Update to commit message
>
> Changelog from RFC:
>  - Use new folio API per Mathhew Wilcox's suggestion
>  - Add one line comment before re-get folio of page per
>    Miaohe's comment
>  - Remove RFC tag
>  - Add Co-developed-by of Qiuxu who did a lot of debugging
>    work to locate where the real issue is
>
>  mm/huge_memory.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 15965084816d..edcbc6c2bb3f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2590,6 +2590,12 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>  			goto out;
>  		}
>
> +		if (folio_test_private(folio) &&
> +				!filemap_release_folio(folio, GFP_KERNEL)) {

GFP_KERNEL is fine for most THP split call sites, but not for the
memory reclaim path, which may disallow certain flags to avoid
recursion (for example, nested reclaim or issuing I/O). Most
filesystems clear __GFP_FS.

This should not be a real-life problem right now, since AFAIK only xfs
supports large folios and xfs uses the iomap release_folio() method,
which ignores the gfp flags. Still, it seems safer to follow the gfp
convention used by the xas_split_alloc() call below.

The best approach, IMO, would be to pass the gfp flags in from the
reclaimer, but that seems like overkill at the moment.

> +			ret = -EBUSY;
> +			goto out;
> +		}
> +
>  		xas_split_alloc(&xas, head, compound_order(head),
>  			mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK);
>  		if (xas_error(&xas)) {
>
> base-commit: 31be1d0fbd950395701d9fd47d8fb1f99c996f61
> --
> 2.25.1
>