From: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Date: Mon, 19 Sep 2022 19:24:32 +0800
Subject: Re: [RESEND PATCH] mm:page_alloc.c: lower the order requirement of should_reclaim_retry
To: Michal Hocko
References: <1663556455-30188-1-git-send-email-quic_zhenhuah@quicinc.com>
Content-Type: text/plain; charset="UTF-8"; format=flowed

Thanks Michal for comments!

On 2022/9/19 16:14, Michal Hocko wrote:
> On Mon 19-09-22 11:00:55, Zhenhua Huang wrote:
>> When a driver was continuously allocating order-3 pages, it could very
>> easily hit OOM even though there were lots of reclaimable pages. A test
>> module was used to reproduce this issue; several key ftrace events are
>> shown below:
>>
>> insmod-6968 [005] .... 321.306007: reclaim_retry_zone: node=0 zone=Normal
>> order=3 reclaimable=539988 available=592856 min_wmark=21227 no_progress_loops=0
>> wmark_check=0
>> insmod-6968 [005] .... 321.306009: compact_retry: order=3
>> priority=COMPACT_PRIO_SYNC_LIGHT compaction_result=withdrawn retries=0
>> max_retries=16 should_retry=1
>> insmod-6968 [004] .... 321.308220: mm_compaction_try_to_compact_pages:
>> order=3 gfp_mask=GFP_KERNEL priority=0
>> insmod-6968 [004] .... 321.308964: mm_compaction_end:
>> zone_start=0x80000 migrate_pfn=0xaa800 free_pfn=0x80800 zone_end=0x940000,
>> mode=sync status=complete
>> insmod-6968 [004] .... 321.308971: reclaim_retry_zone: node=0
>> zone=Normal order=3 reclaimable=539830 available=592776 min_wmark=21227
>> no_progress_loops=0 wmark_check=0
>> insmod-6968 [004] .... 321.308973: compact_retry: order=3
>> priority=COMPACT_PRIO_SYNC_FULL compaction_result=failed retries=0
>> max_retries=16 should_retry=0
>>
>> There are ~2GB of reclaimable pages (reclaimable=539988), but the VM decides
>> not to reclaim any more:
>> insmod-6968 [005] .... 321.306007: reclaim_retry_zone: node=0 zone=Normal
>> order=3 reclaimable=539988 available=592856 min_wmark=21227 no_progress_loops=0
>> wmark_check=0
>>
>> From the meminfo at OOM time, there were NO qualified order >= 3 pages
>> (CMA pages do not qualify) that could meet should_reclaim_retry's requirement:
>> Normal : 24671*4kB (UMEC) 13807*8kB (UMEC) 8214*16kB (UEC) 190*32kB (C)
>> 94*64kB (C) 28*128kB (C) 16*256kB (C) 7*512kB (C) 5*1024kB (C) 7*2048kB (C)
>> 46*4096kB (C) = 571796kB
>>
>> should_reclaim_retry aborts early because its check requires pages of the
>> requested order in the free_list, and order-3 blocks are easily fragmented.
>> Since enough free pages are the foundation of compaction, it may not be
>> suitable to stop reclaiming while there is still a lot of page cache.
>> Relax the order by one to fix this issue.
>
> For the higher order request we rely on should_compact_retry which backs
> off based on the compaction feedback. I would recommend looking into why
> the compaction fails.

I think the reason for the compaction failure is that there are not enough
free pages. As the ftrace events show, free pages (including CMA) amounted
to only 592856 - 539988 = 52868 pages (reclaimable=539988 available=592856).
There are also restrictions such as suitable_migration_target() for free
pages and suitable_migration_source() for movable pages, so the eligible
targets are even fewer.

>
> Also this patch doesn't really explain why it should work and honestly
> it doesn't really make much sense to me either.

Sorry, my fault. IMO, the reason it should work is, taking this order-3
allocation case as an example: we can perform direct reclaim more times
because only order-2 pages (the order lowered to by this change) are left
in the free_list (8214*16kB (UEC)). The order requirement I have lowered
is in should_reclaim_retry -> __zone_watermark_ok:

	for (o = order; o < MAX_ORDER; o++) {
		struct free_area *area = &z->free_area[o];
		...
		for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
			if (!free_area_empty(area, mt))
				return true;
		}

The order-2 requirement is much more easily met, hence the VM has more
chance to return true from should_reclaim_retry.

>
>> With the change, the meminfo output at the first OOM shows that page cache
>> was nearly exhausted:
>>
>> Normal free: 462956kB min:8644kB low:44672kB high:50844kB
>> reserved_highatomic:4096KB active_anon:48kB inactive_anon:12kB
>> active_file:508kB inactive_file:552kB unevictable:109016kB writepending:160kB
>> present:7111680kB managed:6175004kB mlocked:107784kB pagetables:78732kB
>> bounce:0kB free_pcp:996kB local_pcp:0kB free_cma:376412kB
>>
>> Signed-off-by: Zhenhua Huang
>> ---
>>  mm/page_alloc.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 36b2021..b4ca6d1 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -4954,8 +4954,11 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
>>  	/*
>>  	 * Would the allocation succeed if we reclaimed all
>>  	 * reclaimable pages?
>> +	 * considering fragmentation, enough free pages are the
>> +	 * fundamental of compaction:
>> +	 * lower the order requirement by one
>>  	 */
>> -	wmark = __zone_watermark_ok(zone, order, min_wmark,
>> +	wmark = __zone_watermark_ok(zone, order ? order - 1 : 0, min_wmark,
>>  			ac->highest_zoneidx, alloc_flags, available);
>>  	trace_reclaim_retry_zone(z, order, reclaimable,
>>  			available, min_wmark, *no_progress_loops, wmark);
>> --
>> 2.7.4
>

BTW, regarding the test case: I used order-3 allocations with the GFP_KERNEL
flag, issued continuously, to reproduce it (a rough sketch of such a module
is appended below). If you have other suggestions/ideas, please let me know.
Thank you.

Thanks,
Zhenhua
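
[Editor's note] For reference, a minimal sketch of the kind of reproducer
module described above. This is an illustration only: the module name, the
init/exit function names, the allocation counter and the cond_resched() call
are assumptions, not the exact module used in the report. The grounded parts
are the order (3), the GFP_KERNEL flag, and the continuous allocation until
failure, which is enough to drive the should_reclaim_retry()/
should_compact_retry() path discussed in this thread.

/*
 * order3_repro.c - hypothetical sketch of an order-3 GFP_KERNEL
 * allocation reproducer (not the exact test module from the report).
 */
#include <linux/module.h>
#include <linux/gfp.h>
#include <linux/sched.h>

static int __init order3_repro_init(void)
{
	unsigned long count = 0;

	/*
	 * Keep requesting order-3 blocks with GFP_KERNEL. The pages are
	 * deliberately not freed, so free memory keeps shrinking and the
	 * reclaim/compaction retry logic is exercised until the allocation
	 * finally fails (or the OOM killer fires first).
	 */
	while (alloc_pages(GFP_KERNEL, 3)) {
		count++;
		cond_resched();
	}

	pr_info("order-3 repro: %lu allocations succeeded before failure\n",
		count);
	return 0;
}

static void __exit order3_repro_exit(void)
{
	/* Test-only module: the leaked pages are only recovered by reboot. */
}

module_init(order3_repro_init);
module_exit(order3_repro_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("order-3 GFP_KERNEL allocation reproducer (sketch)");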