From: Charan Teja Kalla
Date: Wed, 2 Aug 2023 17:11:47 +0530
Subject: Re: [PATCH v2 1/3] mm-unstable: Multi-gen LRU: Fix per-zone reclaim
To: Kalesh Singh
CC: Lecopzer Chen, Matthias Brugger, AngeloGioacchino Del Regno,
    Suleiman Souhlal, Oleksandr Natalenko,
    "Jan Alexander Steffens (heftig)", Qi Zheng, Steven Barrett,
    Brian Geffon, Barry Song
Message-ID: <447aa786-b7c5-807e-1a6e-fb8369fc8a97@quicinc.com>
In-Reply-To: <20230802025606.346758-1-kaleshsingh@google.com>
References: <20230802025606.346758-1-kaleshsingh@google.com>

Thanks Kalesh for taking this upstream.

On 8/2/2023 8:26 AM, Kalesh Singh wrote:
> MGLRU has a LRU list for each zone for each type (anon/file) in each
> generation:
>
>       long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
>
> The min_seq (oldest generation) can progress independently for each
> type but the max_seq (youngest generation) is shared for both anon and
> file. This is to maintain a common frame of reference.
>
> In order for eviction to advance the min_seq of a type, all the per-zone
> lists in the oldest generation of that type must be empty.
>
> The eviction logic only considers pages from eligible zones for
> eviction or promotion.
>
>     scan_folios() {
>         ...
>         for (zone = sc->reclaim_idx; zone >= 0; zone--) {
>             ...
>             sort_folio();       // Promote
>             ...
>             isolate_folio();    // Evict
>         }
>         ...
>     }
>
> Consider the system has the movable zone configured and default 4
> generations. The current state of the system is as shown below
> (only illustrating one type for simplicity):
>
> Type: ANON
>
>         Zone    DMA32     Normal    Movable    Device
>
>         Gen 0       0          0        4GB         0
>
>         Gen 1       0        1GB        1MB         0
>
>         Gen 2     1MB        4GB        1MB         0
>
>         Gen 3     1MB        1MB        1MB         0
>
> Now consider there is a GFP_KERNEL allocation request (eligible zone
> index <= Normal), evict_folios() will return without doing any work
> since there are no pages to scan in the eligible zones of the oldest
> generation. Reclaim won't make progress until triggered from a ZONE_MOVABLE
> allocation request; which may not happen soon if there is a lot of free
> memory in the movable zone. This can lead to OOM kills, although there
> is 1GB pages in the Normal zone of Gen 1 that we have not yet tried to
> reclaim.
>
> This issue is not seen in the conventional active/inactive LRU since
> there are no per-zone lists.
>
> If there are no (not enough) folios to scan in the eligible zones, move
> folios from ineligible zone (zone_index > reclaim_index) to the next
> generation. This allows for the progression of min_seq and reclaiming
> from the next generation (Gen 1).
>

As discussed offline, I think this can make the system spend too much time
in scan_folios() moving pages from Gen-0 to Gen-1 of the other zone, which
can result in OOM not being triggered when necessary.

> Qualcomm, Mediatek and raspberrypi [1] discovered this issue independently.
>
> [1] https://github.com/raspberrypi/linux/issues/5395
>
> Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
> Cc: stable@vger.kernel.org
> Cc: Yu Zhao
> Cc: Andrew Morton
> Reported-by: Charan Teja Kalla
> Reported-by: Lecopzer Chen
> Signed-off-by: Kalesh Singh

We tested this patch on our systems for a couple of weeks, and the
aggressive OOM that is otherwise easily reproducible was not observed.

Tested-by: Charan Teja Kalla