From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95A6EC77B70 for ; Wed, 12 Apr 2023 03:15:18 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0B0F16B0074; Tue, 11 Apr 2023 23:15:18 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 061B8900002; Tue, 11 Apr 2023 23:15:18 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E92476B0078; Tue, 11 Apr 2023 23:15:17 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id DA9A56B0074 for ; Tue, 11 Apr 2023 23:15:17 -0400 (EDT) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id A41E4120E25 for ; Wed, 12 Apr 2023 03:15:17 +0000 (UTC) X-FDA: 80671273074.23.99C8A4A Received: from out30-98.freemail.mail.aliyun.com (out30-98.freemail.mail.aliyun.com [115.124.30.98]) by imf06.hostedemail.com (Postfix) with ESMTP id 8AB8F180007 for ; Wed, 12 Apr 2023 03:15:14 +0000 (UTC) Authentication-Results: imf06.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=alibaba.com; spf=pass (imf06.hostedemail.com: domain of baolin.wang@linux.alibaba.com designates 115.124.30.98 as permitted sender) smtp.mailfrom=baolin.wang@linux.alibaba.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1681269316; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fMwtG3kjimCSdpRTY8GTgkdQjXk+vO4Foj6iAaTFla8=; b=5tsUX3TCu1yQ1efX/PnCXU1F4dPSzhytNNOyb3yU6/G+Fguteh3xod3X02abteoyr4tUYO CNKyyMZqsIjefK7NmY0DeO+vCwBP7HgC2yr0LVkYlu/aXeU8Ty912Cb3jXVXKBDrpZKoiM WJI7zGAZ/9l02cP0X7FvJaG9CsoguuQ= ARC-Authentication-Results: i=1; imf06.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=alibaba.com; spf=pass (imf06.hostedemail.com: domain of baolin.wang@linux.alibaba.com designates 115.124.30.98 as permitted sender) smtp.mailfrom=baolin.wang@linux.alibaba.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681269316; a=rsa-sha256; cv=none; b=Y4ENZrLtdGztDjPryfoQzqIeyZALNxOp8IW5HXZBl1TUryl999tjIy7QRtH5LsTMh0Gv01 uPCR+KN1ynFsFxNlnNQsFkoLzWPUXHBxnOa87EEkyTsX1SAnhLnQk9a0C+wA9+j2gsyGbq cyOQdPUfs1+V1JhTLGPtXKSCYbmnJ5s= X-Alimail-AntiSpam:AC=PASS;BC=-1|-1;BR=01201311R181e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045176;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=8;SR=0;TI=SMTPD_---0VfuWQdr_1681269309; Received: from 30.97.48.75(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0VfuWQdr_1681269309) by smtp.aliyun-inc.com; Wed, 12 Apr 2023 11:15:10 +0800 Message-ID: <80c53bfc-891d-af09-9b69-2954d7fe625e@linux.alibaba.com> Date: Wed, 12 Apr 2023 11:15:13 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.9.1 Subject: Re: [PATCH v2 1/2] mm: compaction: consider the number of scanning compound pages in isolate fail path To: Mel Gorman Cc: akpm@linux-foundation.org, osalvador@suse.de, vbabka@suse.cz, william.lam@bytedance.com, mike.kravetz@oracle.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <73d6250a90707649cc010731aedc27f946d722ed.1678962352.git.baolin.wang@linux.alibaba.com> <20230405103141.yu7p53psbvstv6kg@techsingularity.net> From: Baolin Wang In-Reply-To: <20230405103141.yu7p53psbvstv6kg@techsingularity.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 8AB8F180007 X-Stat-Signature: oq9qo9o6zwmb4395purd1w5fhsygcnd3 X-Rspam-User: X-HE-Tag: 1681269314-167611 X-HE-Meta: U2FsdGVkX1+GWCvyk3kPii8ds1M7MAf75XhRZNDhSQoaegLJtqh9RoFoyLwStai5Q2bRkRH3J8ShCYNhEHXR3w+mH+hfYpXA/JLE3Uyk7Reyv9dSnkYUBKlO+6WVWAEWOwd9MAV7+El9/84yxalPYw8VHUnIs7YJwKJ3QMr4wdUpw0pbVoocMb53oua4yU/LkyufKqL7noYYoypfSgdTVo8RtiJ5+HCvv3fOpc2DZHgiE0maq1AIg+oQ0ryKtmqAfWj12qgGxHLZFpd91A1+QZu6WhO44jpiqJ2UhEKFv4iEHP7m6AUvvKT7a0FjX43C5lg9xPIh51mdLf8BNmbbCnsR59LZZwXxLdABXWnIxU4tyxgsfUFpvibzatdhXky95qcPrBWXWfpDKWqGrpVRK6vIWZ8cu+VtSS0dbfBTD+JXPCGhIGA/C/8wxvehRksrqzMtaispC2yIyzVXWV215sVR1W8MbplQm2aXmN62LePKgxkN5hN8ZEJlwOX42mWsFK74Vczz8983DTEFkfxhRP/54JwsQB7/oOfJ68K0Bsz3aYV9PdRrr1QdirGqJc0eVxl/41n7n5ZqlDrz5yptWF0PVQrE0uA0lBi5AoApUUjKEq/wDFqbqRxuGs7DkGWMl1OOfcwdU8YY1snZO0p7uCUJEItnWTV9Iak52617RGFMpHHH+d8f/WUD2b2iCz0iTbRzhMuAA5Q5OHy/7steY2d7kmKOarSnaMLOo2Hzq4UlewCLptmTtFqc6YBQ5RPrDVB9nnSz5V7F8vXO4IJeWevCydHJOaYE0EkrpNxGZWKnkIKIY4bVvjjtSqkU/mSec1sJt7Qf1kDWqXbDRjHJKnCSSqTDA3NYFb6/XtpUJEbH0/OfnMEM+EZGn+esaqm8LoVUCVrclugyHOZEupYxBgdPNHn/kRohxcn0MAH4lpdYuWclibXAgy5EXvkL2mZUkwvdkARQhUJsy9YqtmS OH373ZJ8 3n9ou2xJvpkFhrPZAlAZ3oiyuVouvdfqGZVHADSvx1e6lp3lzhKeH9KAZVZ4UF97a0JbJbVuSfoN0UJp1gpRDBabDbe3dK9VGQ8l6T+rY/MGAgSb0JPciI25o8nIImEkL2D5OUzpzO8DkKZFCr+384AOG3q4VECuhPpYYDkCPZa8W2vA7CGLLti6abA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 4/5/2023 6:31 PM, Mel Gorman wrote: > On Thu, Mar 16, 2023 at 07:06:46PM +0800, Baolin Wang wrote: >> The commit b717d6b93b54 ("mm: compaction: include compound page count >> for scanning in pageblock isolation") had added compound page statistics >> for scanning in pageblock isolation, to make sure the number of scanned >> pages are always larger than the number of isolated pages when isolating >> mirgratable or free pageblock. >> >> However, when failed to isolate the pages when scanning the mirgratable or >> free pageblock, the isolation failure path did not consider the scanning >> statistics of the compound pages, which can show the incorrect number of >> scanned pages in tracepoints or the vmstats to make people confusing about >> the page scanning pressure in memory compaction. >> >> Thus we should take into account the number of scanning pages when failed >> to isolate the compound pages to make the statistics accurate. >> >> Signed-off-by: Baolin Wang > > Acked-by: Mel Gorman Thanks Mel. > > However, the patch highlights weakeness in the tracepoints and how > useful they are. > > Minimally, I think that the change might be misleading when comparing > tracepoints across kernel versions as it'll be necessary to check the exact > meaning of nr_scanned for a given kernel version. That's not a killer problem > as such, just a hazard if using an analysis tool comparing kernel versions. > > As an example, consider this > > if (PageCompound(page)) { > const unsigned int order = compound_order(page); > > if (likely(order < MAX_ORDER)) { > blockpfn += (1UL << order) - 1; > cursor += (1UL << order) - 1; > nr_scanned += compound_nr(page) - 1; <<< patch adds > } > goto isolate_fail; > } > > Only the head page is "scanned", the tail pages are not scanned so > accounting for them as "scanned" is not an accurate reflection of the > amount of work done. Isolation is different because the compound pages > isolated is a prediction of how much work is necessary to migrate that > page as it's obviously more work to copy 2M of data than 4K. The migrated > pages combined with isolation then can measure efficiency of isolation > vs migration although imperfectly as isolation is a span while migration > probably fails at the head page. > > The same applies when skipping buddies, the tail pages are not scanned > so skipping them is not additional work. > > Everything depends on what the tracepoint is being used for. If it's a > measure of work done, then accounting for skipped tail pages over-estimates > the amount of work. However, if the intent is to measure efficiency of > isolation vs migration then the "span" scanned is more useful. Yes, we are more concered about the efficiency of isolation vs migration. > None of this kills the patch, it only notes that the tracepoints as-is > probably cannot answer all relevant questions, most of which are only > relevant when making a modification to compaction in general. The patch > means that an unspecified pressure metric can be derived (maybe interesting > to sysadmins) but loses a metric about time spent on scanning (maybe > interesting to developers writing a patch). Of those concerns, sysadmins > are probably more common so the patch is acceptable but some care will be > need if modifying the tracepoints further if it enables one type of > analysis at the cost of another. I learned, and thanks for your excellent explaination.