Date: Mon, 10 Feb 2020 08:17:21 -0800
From: Roman Gushchin
To: Gavin Shan
Subject: Re: [RFC PATCH] mm/vmscan: Don't round up scan size for online memory cgroup
Message-ID: <20200210161721.GA167254@tower.DHCP.thefacebook.com>
References: <20200210121445.711819-1-gshan@redhat.com>
In-Reply-To: <20200210121445.711819-1-gshan@redhat.com>
Sender: owner-linux-mm@kvack.org

Hello, Gavin!

On Mon, Feb 10, 2020 at 11:14:45PM +1100, Gavin Shan wrote:
> commit 68600f623d69 ("mm: don't miss the last page because of round-off
> error") makes the scan size round up to @denominator regardless of the
> memory cgroup's state, online or offline. This affects the overall
> reclaiming behavior: The corresponding LRU list is eligible for reclaiming
> only when its size logically right shifted by @sc->priority is bigger than
> zero in the former formula (non-roundup one).

Not sure I fully understand, but wasn't it so before 68600f623d69 too?

> For example, the inactive
> anonymous LRU list should have at least 0x4000 pages to be eligible for
> reclaiming when we have 60/12 for swappiness/priority and without taking
> scan/rotation ratio into account. After the roundup is applied, the
> inactive anonymous LRU list becomes eligible for reclaiming when its
> size is bigger than or equal to 0x1000 in the same condition.
>
>     (0x4000 >> 12) * 60 / (60 + 140 + 1) = 1
>     ((0x1000 >> 12) * 60) + 200) / (60 + 140 + 1) = 1
>
> aarch64 has 512MB huge page size when the base page size is 64KB. The
> memory cgroup that has a huge page is always eligible for reclaiming in
> that case. The reclaiming is likely to stop after the huge page is
> reclaimed, meaning the subsequent @sc->priority and memory cgroups will be
> skipped. It changes the overall reclaiming behavior. This fixes the issue
> by applying the roundup to offlined memory cgroups only, to give more
> preference to reclaim memory from offlined memory cgroup. It sounds
> reasonable as those memory is likely to be useless.

So is the problem that relatively small memory cgroups are now getting
reclaimed at the default priority, whereas before they were skipped?

>
> The issue was found by starting up 8 VMs on a Ampere Mustang machine,
> which has 8 CPUs and 16 GB memory. Each VM is given with 2 vCPUs and 2GB
> memory. 784MB swap space is consumed after these 8 VMs are completely up.
> Note that KSM is disable while THP is enabled in the testing. With this
> applied, the consumed swap space decreased to 60MB.
>
>            total        used        free      shared  buff/cache   available
>    Mem:    16196       10065        2049          16        4081        3749
>    Swap:    8175         784        7391
>            total        used        free      shared  buff/cache   available
>    Mem:    16196       11324        3656          24        1215        2936
>    Swap:    8175          60        8115

Does it lead to any performance regressions? Or is it only about
increased swap usage?

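For reference, the two quoted formulas can be checked in user space. The following is only a sketch under the same simplifications as the quoted text (swappiness 60, priority 12, denominator 60 + 140 + 1 = 201, scan/rotation ratio ignored). The function names are hypothetical and are not kernel helpers; the round-up variant uses the usual (x + d - 1) / d form.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the scan-size arithmetic quoted above.
 * Not kernel code: names and signatures are made up for illustration.
 */

/* Round-down variant: ((lru_size >> priority) * fraction) / denominator */
uint64_t scan_round_down(uint64_t lru_size, unsigned int priority,
                         uint64_t fraction, uint64_t denominator)
{
    assert(denominator != 0);
    return ((lru_size >> priority) * fraction) / denominator;
}

/* Round-up variant: the same product, rounded up to the next integer */
uint64_t scan_round_up(uint64_t lru_size, unsigned int priority,
                       uint64_t fraction, uint64_t denominator)
{
    assert(denominator != 0);
    return ((lru_size >> priority) * fraction + denominator - 1) / denominator;
}
```

With these inputs, scan_round_down(0x4000, 12, 60, 201) and scan_round_up(0x1000, 12, 60, 201) both give 1, matching the two quoted lines, while scan_round_down(0x1000, 12, 60, 201) gives 0.
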
>
> Fixes: 68600f623d69 ("mm: don't miss the last page because of round-off error")
> Cc: # v4.20+
> Signed-off-by: Gavin Shan
> ---
>  mm/vmscan.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c05eb9efec07..876370565455 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2415,10 +2415,13 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>  			/*
>  			 * Scan types proportional to swappiness and
>  			 * their relative recent reclaim efficiency.
> -			 * Make sure we don't miss the last page
> -			 * because of a round-off error.
> +			 * Make sure we don't miss the last page on
> +			 * the offlined memory cgroups because of a
> +			 * round-off error.
>  			 */
> -			scan = DIV64_U64_ROUND_UP(scan * fraction[file],
> +			scan = mem_cgroup_online(memcg) ?
> +			       div64_u64(scan * fraction[file], denominator) :
> +			       DIV64_U64_ROUND_UP(scan * fraction[file],
>  						  denominator);

It looks a bit strange to round up for offline cgroups and basically
round down for everything else. So maybe it's better to return to
something like the very first version of the patch:
https://www.spinics.net/lists/kernel/msg2883146.html ?

For memcg reclaim we only care about the edge case with a few pages.

But overall it's not obvious to me why rounding up is worse than
rounding down. Maybe we should average down but accumulate the
remainder? Creating an implicit bias for small memory cgroups sounds
groundless.

Thank you!

Roman
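
To make the remainder idea a bit more concrete, here is a rough user-space sketch. It assumes the suggestion means rounding down on each pass and carrying the division remainder into the next pass, which is only one possible reading. The struct and function names are hypothetical, and nothing like this per-LRU state exists in the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-LRU state: remainder carried from earlier passes. */
struct scan_carry {
    uint64_t remainder;
};

/*
 * Round down, but keep what the division dropped so that it counts
 * towards the next pass. 'size' stands for lru_size >> priority.
 */
uint64_t scan_with_carry(struct scan_carry *carry, uint64_t size,
                         uint64_t fraction, uint64_t denominator)
{
    uint64_t total;

    assert(denominator != 0);
    total = size * fraction + carry->remainder;
    carry->remainder = total % denominator;
    return total / denominator;
}
```

With size 1, fraction 60 and denominator 201, and the carry starting at zero, the first three calls return 0 and the fourth returns 1, leaving a carry of 39. A small LRU would then be scanned on some passes rather than on every pass or never.
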