Date: Tue, 31 Oct 2023 13:23:51 +0530
From: Pavan Kondeti
To: Charan Teja Kalla
CC: , , , , , ,
Subject: Re: [PATCH] mm: page_alloc: unreserve highatomic page blocks before oom
Message-ID:
References: <1698669590-3193-1-git-send-email-quic_charante@quicinc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <1698669590-3193-1-git-send-email-quic_charante@quicinc.com>
On Mon, Oct 30, 2023 at 06:09:50PM +0530, Charan Teja Kalla wrote:
> __alloc_pages_direct_reclaim() is called from the slowpath allocation,
> where high atomic reserves can be unreserved after there is progress in
> reclaim and yet no suitable page is found. Later, should_reclaim_retry()
> gets called from the slowpath allocation to decide if reclaim needs to
> be retried before the OOM kill path is taken.
>
> should_reclaim_retry() checks the available (reclaimable + free) memory
> against the min wmark levels of a zone and returns:
> a) true, if it is above the min wmark, so that the slowpath allocation
>    will do the reclaim retries.
> b) false, so the slowpath allocation takes the oom kill path.
>
> should_reclaim_retry() can also unreserve the high atomic reserves,
> **but only after all the reclaim retries are exhausted.**
>
> In a case where there is almost no reclaimable memory and the free pages
> consist mostly of high atomic reserves, but the allocation context can't
> use these high atomic reserves, the available memory ends up below the
> min wmark levels, hence false is returned from should_reclaim_retry(),
> leading the allocation request to take the OOM kill path. This is an
> early oom kill because the high atomic reserves are holding a lot of
> free memory and unreserving them is not even attempted.
>
> An (early) OOM is encountered on a machine in the below state (excerpt
> from the oom kill logs):
> [ 295.998653] Normal free:7728kB boost:0kB min:804kB low:1004kB
> high:1204kB reserved_highatomic:8192KB active_anon:4kB inactive_anon:0kB
> active_file:24kB inactive_file:24kB unevictable:1220kB writepending:0kB
> present:70732kB managed:49224kB mlocked:0kB bounce:0kB free_pcp:688kB
> local_pcp:492kB free_cma:0kB
> [ 295.998656] lowmem_reserve[]: 0 32
> [ 295.998659] Normal: 508*4kB (UMEH) 241*8kB (UMEH) 143*16kB (UMEH)
> 33*32kB (UH) 7*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
> 0*4096kB = 7752kB
>
> Per the above log, ~7MB of free memory sitting in the high atomic
> reserves is not freed up before falling back to the oom kill path.
>
> This fix unreserves these atomic reserves in the OOM path before going
> for a kill. The side effect of unreserving in the oom kill path is that
> these freed pages are checked against the high wmark. If unreserved from
> should_reclaim_retry()/__alloc_pages_direct_reclaim(), they are checked
> against the min wmark levels.
>
> Signed-off-by: Charan Teja Kalla

Thanks for the detailed commit description. Really helpful in understanding
the problem you are fixing.
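Just to make the arithmetic concrete for myself, plugging the numbers from
your log into a tiny standalone model (illustration only, not kernel code;
the real check lives in should_reclaim_retry()/__zone_watermark_ok() and is
more involved) shows why the retry check gives up:

/*
 * Illustration only: models the numbers from the quoted OOM log to show
 * why the "reclaimable + free vs. min wmark" check fails when most of
 * the free memory sits in the highatomic reserve.
 */
#include <stdbool.h>
#include <stdio.h>

int main(void)
{
	/* Values (in kB) taken from the oom log above. */
	long free_kb = 7728;			/* Normal free:7728kB */
	long highatomic_kb = 8192;		/* reserved_highatomic:8192KB */
	long min_wmark_kb = 804;		/* min:804kB */
	long reclaimable_kb = 4 + 0 + 24 + 24;	/* anon + file LRU pages */

	/*
	 * A regular allocation cannot dip into the highatomic reserve, so
	 * the watermark check effectively discounts it from the free pages.
	 */
	long usable_free_kb = free_kb - highatomic_kb;
	if (usable_free_kb < 0)
		usable_free_kb = 0;

	bool retry = usable_free_kb + reclaimable_kb > min_wmark_kb;

	printf("usable free %ldkB + reclaimable %ldkB vs min %ldkB -> %s\n",
	       usable_free_kb, reclaimable_kb, min_wmark_kb,
	       retry ? "retry reclaim" : "no retry, go to OOM");
	return 0;
}

So with ~8MB parked in the highatomic reserve, the zone looks exhausted to
the retry logic even though ~7.7MB is technically free.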
> ---
>  mm/page_alloc.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 95546f3..2a2536d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3281,6 +3281,8 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
>  		.order = order,
>  	};
>  	struct page *page;
> +	struct zone *zone;
> +	struct zoneref *z;
>
>  	*did_some_progress = 0;
>
> @@ -3295,6 +3297,16 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
>  	}
>
>  	/*
> +	 * If should_reclaim_retry() encounters a state where:
> +	 * reclaimable + free doesn't satisfy the wmark levels,
> +	 * it can directly jump to OOM without even unreserving
> +	 * the highatomic page blocks. Try them for once here
> +	 * before jumping to OOM.
> +	 */
> +retry:
> +	unreserve_highatomic_pageblock(ac, true);
> +

Not possible to fix this in should_reclaim_retry()?

> +	/*
>  	 * Go through the zonelist yet one more time, keep very high watermark
>  	 * here, this is only to catch a parallel oom killing, we must fail if
>  	 * we're still under heavy pressure. But make sure that this reclaim
> @@ -3307,6 +3319,12 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
>  	if (page)
>  		goto out;
>
> +	for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, ac->highest_zoneidx,
> +					ac->nodemask) {
> +		if (zone->nr_reserved_highatomic > 0)
> +			goto retry;
> +	}
> +
>  	/* Coredumps can quickly deplete all memory reserves */
>  	if (current->flags & PF_DUMPCORE)
>  		goto out;
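For the question above about handling this inside should_reclaim_retry()
itself: something like the below is the rough idea I had in mind. This is a
completely untested sketch, and the exact hook point in that function is a
guess on my part; the intent is to give back the highatomic reserves once
the retry logic is about to report failure and claim progress if anything
was actually released:

	/*
	 * Sketch only: at the point where should_reclaim_retry() has
	 * decided that reclaimable + free no longer satisfies the min
	 * wmark (ret == false), try releasing the highatomic reserves
	 * once and report progress if something was freed, so the
	 * slowpath retries with that memory instead of going to OOM.
	 */
	if (!ret && unreserve_highatomic_pageblock(ac, true))
		ret = true;

	return ret;

That would keep the unreserving next to the existing watermark accounting
rather than adding a retry loop to __alloc_pages_may_oom(), but I have not
thought through how it interacts with the MAX_RECLAIM_RETRIES path.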