From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <owner-linux-mm@kvack.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 91C63C77B61
	for <linux-mm@archiver.kernel.org>; Sun, 16 Apr 2023 19:45:01 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id DFB838E0002; Sun, 16 Apr 2023 15:45:00 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id DABA58E0001; Sun, 16 Apr 2023 15:45:00 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id C4C208E0002; Sun, 16 Apr 2023 15:45:00 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14])
	by kanga.kvack.org (Postfix) with ESMTP id B4DD38E0001
	for <linux-mm@kvack.org>; Sun, 16 Apr 2023 15:45:00 -0400 (EDT)
Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1])
	by unirelay09.hostedemail.com (Postfix) with ESMTP id 700E98023A
	for <linux-mm@kvack.org>; Sun, 16 Apr 2023 19:45:00 +0000 (UTC)
X-FDA: 80688282360.02.3E69A76
Received: from mail-yw1-f173.google.com (mail-yw1-f173.google.com [209.85.128.173])
	by imf13.hostedemail.com (Postfix) with ESMTP id 581BE2000D
	for <linux-mm@kvack.org>; Sun, 16 Apr 2023 19:44:56 +0000 (UTC)
Authentication-Results: imf13.hostedemail.com;
	dkim=pass header.d=google.com header.s=20221208 header.b=pTFh9YoY;
	spf=pass (imf13.hostedemail.com: domain of hughd@google.com designates 209.85.128.173 as permitted sender) smtp.mailfrom=hughd@google.com;
	dmarc=pass (policy=reject) header.from=google.com
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com;
	s=arc-20220608; t=1681674297;
	h=from:from:sender:reply-to:subject:subject:date:date:
	 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
	 content-type:content-type:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references:dkim-signature;
	bh=fE9QU8ssRHfrkp+Qs6Jt2adPN0B10qQBjFxitvUur5w=;
	b=fDwjWTSCU2fIPheE0ejrmVUvc0O9sH9OMWaEF4RTVhmh6MdqJkR+T9a/B87cE6eaew+W2L
	n6/QCMT81i2MXfgrlKGZSCa6UFBGNivr3glc4XO/XJPpFO2k5eEv7hVMxCJjhlARGodCp6
	YxBkcWzn0Vl2o+nEAKnPOQO8Kg0hiZ0=
ARC-Authentication-Results: i=1;
	imf13.hostedemail.com;
	dkim=pass header.d=google.com header.s=20221208 header.b=pTFh9YoY;
	spf=pass (imf13.hostedemail.com: domain of hughd@google.com designates 209.85.128.173 as permitted sender) smtp.mailfrom=hughd@google.com;
	dmarc=pass (policy=reject) header.from=google.com
ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681674297; a=rsa-sha256;
	cv=none;
	b=pI2dNLtHOV4qJxJO2FeVy8yDyzjlBpHLKSsXAJVtxR6TmpJsrOmtZi5CfyV2jEOLXsQnpM
	CoL5NOz6HiGKjJMmL62a8ig+gTQFXUrAKM0fMlc0oC4pAoEun8KyUI1mmFsUXfnyquOXFz
	t6qBCnl+iLee85SIUk77zMOa/b5WGlU=
Received: by mail-yw1-f173.google.com with SMTP id 00721157ae682-54f6a796bd0so283992337b3.12
        for <linux-mm@kvack.org>; Sun, 16 Apr 2023 12:44:56 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20221208; t=1681674296; x=1684266296;
        h=mime-version:references:message-id:in-reply-to:subject:cc:to:from
         :date:from:to:cc:subject:date:message-id:reply-to;
        bh=fE9QU8ssRHfrkp+Qs6Jt2adPN0B10qQBjFxitvUur5w=;
        b=pTFh9YoYONT2cKveKNA6jETiwf92HtqU0CX8Fb+eU6YKNATL4+NSIe9LuB571YVQQ2
         FvSwcvWgTww19tc9Rn0y+4sCK5LZLj6N0yzT1FA5tlasgFfufKUaYPfhYO9DEP9ZDcYL
         HchzUVzvPelw0Q4huckH+4NhHZRPQrzs+FZc7TS9g0u1uCXZ7Gh7Rx0t1vw6xeO3fjB1
         ias0ybZKgaEzEMgV3MEOgwIBoeyu5n0/3eqYzjsLLft3tmdEYqDpNZ5I986azDXD11KD
         q/h6ljYEwHQnadmPQUM7nqloQwRDf9UjRpTnWnfKFqeIYHT89pXApVgoyTjabBqFP5wG
         a3Bw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1681674296; x=1684266296;
        h=mime-version:references:message-id:in-reply-to:subject:cc:to:from
         :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=fE9QU8ssRHfrkp+Qs6Jt2adPN0B10qQBjFxitvUur5w=;
        b=iFuYXIf3EqGbf8cf0j0CMBqnQcFSmxLdvxKdl0YOCjrDSDk4hBnAouDmAQrBfDRIKt
         /B91IoDALt/ACTVpDrcwAQQlVN32ea4IytcCdfWVzRIHxq+9u4hUm8QLkyPprqRTEwPE
         VCKATVSNbiEDrqSKuZYviTmyxW2EeZsHtwzZWcwzt/OxlOrFKDeqttt5ClaiRfe70cMB
         45K6TW+tdvvp4Hs97otrX9wNwzxWkuYt7/yjTVmxBrnSkbr7nmmQz0E2N6HgWH6AJ7uV
         0WA78XfSNlksNB/TSZGHWUR+wK3Ry97ZF5KjoqJ8b9D4pp0z1NvbDAzhIXtSHrUjFOXx
         TyZg==
X-Gm-Message-State: AAQBX9dfFDbJlGoezpuoYDdC7kyPRTL1k6NiOfRGOQCBoZ3A7admwS9U
	9OKy8ylvsqn4YwPhLbFl5sew/A==
X-Google-Smtp-Source: AKy350YDG3M9wmj9++Fuic4VEWPyfPHGcxyaI9iRZK4qaFSHnn38pvTPQQqEXoMJ3JmakygaSbceaA==
X-Received: by 2002:a81:7782:0:b0:543:b06a:19de with SMTP id s124-20020a817782000000b00543b06a19demr12158258ywc.3.1681674295681;
        Sun, 16 Apr 2023 12:44:55 -0700 (PDT)
Received: from ripple.attlocal.net (172-10-233-147.lightspeed.sntcca.sbcglobal.net. [172.10.233.147])
        by smtp.gmail.com with ESMTPSA id b186-20020a811bc3000000b0054eff15530asm2650453ywb.90.2023.04.16.12.44.53
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Sun, 16 Apr 2023 12:44:55 -0700 (PDT)
Date: Sun, 16 Apr 2023 12:44:53 -0700 (PDT)
From: Hugh Dickins <hughd@google.com>
X-X-Sender: hugh@ripple.attlocal.net
To: Zi Yan <ziy@nvidia.com>
cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>, 
    Yang Shi <shy828301@gmail.com>, Yu Zhao <yuzhao@google.com>, 
    linux-mm@kvack.org, 
    "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>, 
    Ryan Roberts <ryan.roberts@arm.com>, 
    Michal Koutný <mkoutny@suse.com>, 
    Roman Gushchin <roman.gushchin@linux.dev>, 
    Zach O'Keefe <zokeefe@google.com>, 
    Andrew Morton <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org, 
    cgroups@vger.kernel.org, linux-fsdevel@vger.kernel.org, 
    linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v3 6/7] mm: truncate: split huge page cache page to a
 non-zero order if possible.
In-Reply-To: <20230403201839.4097845-7-zi.yan@sent.com>
Message-ID: <9dd96da-efa2-5123-20d4-4992136ef3ad@google.com>
References: <20230403201839.4097845-1-zi.yan@sent.com> <20230403201839.4097845-7-zi.yan@sent.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-Rspamd-Server: rspam05
X-Rspamd-Queue-Id: 581BE2000D
X-Stat-Signature: fsjsoe1f19zdzwgtxqky3hg755gi4z38
X-Rspam-User: 
X-HE-Tag: 1681674296-681816
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On Mon, 3 Apr 2023, Zi Yan wrote:

> From: Zi Yan <ziy@nvidia.com>
> 
> To minimize the number of pages after a huge page truncation, we do not
> need to split it all the way down to order-0. The huge page has at most
> three parts, the part before offset, the part to be truncated, the part
> remaining at the end. Find the greatest common divisor of them to
> calculate the new page order from it, so we can split the huge
> page to this order and keep the remaining pages as large and as few as
> possible.
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/truncate.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 86de31ed4d32..817efd5e94b4 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -22,6 +22,7 @@
>  #include <linux/buffer_head.h>	/* grr. try_to_release_page */
>  #include <linux/shmem_fs.h>
>  #include <linux/rmap.h>
> +#include <linux/gcd.h>

Really?

>  #include "internal.h"
>  
>  /*
> @@ -211,7 +212,8 @@ int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
>  bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
>  {
>  	loff_t pos = folio_pos(folio);
> -	unsigned int offset, length;
> +	unsigned int offset, length, remaining;
> +	unsigned int new_order = folio_order(folio);
>  
>  	if (pos < start)
>  		offset = start - pos;
> @@ -222,6 +224,7 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
>  		length = length - offset;
>  	else
>  		length = end + 1 - pos - offset;
> +	remaining = folio_size(folio) - offset - length;
>  
>  	folio_wait_writeback(folio);
>  	if (length == folio_size(folio)) {
> @@ -236,11 +239,25 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
>  	 */
>  	folio_zero_range(folio, offset, length);
>  
> +	/*
> +	 * Use the greatest common divisor of offset, length, and remaining
> +	 * as the smallest page size and compute the new order from it. So we
> +	 * can truncate a subpage as large as possible. Round up gcd to
> +	 * PAGE_SIZE, otherwise ilog2 can give -1 when gcd/PAGE_SIZE is 0.
> +	 */
> +	new_order = ilog2(round_up(gcd(gcd(offset, length), remaining),
> +				   PAGE_SIZE) / PAGE_SIZE);

Gosh.  In mm/readahead.c I can see "order = __ffs(index)",
and I think something along those lines would be more appropriate here.
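
As an illustration of the low-bit idea (a user-space sketch, not kernel
code): offset, length and remaining add up to the folio size, which is a
power of two, so their greatest common divisor is itself a power of two.
The lowest set bit of their bitwise OR therefore yields the same order as
the gcd expression in the patch. The helper names and SKETCH_PAGE_SHIFT
are hypothetical, __builtin_ctz (GCC/Clang) stands in for the kernel's
__ffs, and the patch's order-1 downgrade is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4K pages */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

/* Plain Euclidean gcd, standing in for the kernel's gcd(). */
static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
	while (b) {
		uint32_t t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/* Mirrors the patch: ilog2(round_up(gcd, PAGE_SIZE) / PAGE_SIZE). */
static unsigned int order_from_gcd(uint32_t offset, uint32_t length,
				   uint32_t remaining)
{
	uint32_t g = gcd_u32(gcd_u32(offset, length), remaining);
	uint32_t pages = (g + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	unsigned int order = 0;

	assert(g != 0);		/* the three parts cannot all be empty */
	while (pages >>= 1)	/* floor(log2(pages)) */
		order++;
	return order;
}

/* Same result from the lowest set bit; all arguments are in bytes. */
static unsigned int order_from_low_bit(uint32_t offset, uint32_t length,
				       uint32_t remaining)
{
	uint32_t bits = offset | length | remaining;
	unsigned int shift;

	assert(bits != 0);
	shift = (unsigned int)__builtin_ctz(bits);
	return shift > SKETCH_PAGE_SHIFT ? shift - SKETCH_PAGE_SHIFT : 0;
}
```

For a 2M folio divided as 512K + 512K + 1M, both helpers return order 7;
for 4K + 8K + the rest, both return order 0. The equivalence relies on
the three parts summing to a power of two.
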

But, if there's any value at all to choosing intermediate orders here in
truncation, I don't think choosing a single order is the right approach -
more easily implemented, yes, but is it worth doing?

What you'd actually want (if anything) is to choose the largest orders
possible, with smaller and smaller orders filling in the rest (I expect
there's a technical name for this, but I don't remember - bin packing
is something else, I think).
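
One way to picture "largest orders first, smaller orders filling in the
rest" is a greedy walk that, at each position, takes the biggest
naturally aligned power-of-two block that still fits in the range to be
kept. The following is only a user-space sketch in page units; struct
chunk and split_aligned() are hypothetical names, it ignores the
restriction on order-1 pages, and it says nothing about whether the
kernel's split interface could produce such a mixed-order result.

```c
#include <assert.h>
#include <stddef.h>

/* One output piece: first page index and its order. */
struct chunk {
	unsigned long start;
	unsigned int order;
};

/*
 * Cover the page range [start, end) with naturally aligned power-of-two
 * blocks, choosing the largest order that is aligned at the current
 * position and does not run past end. Returns the number of chunks
 * written to out (at most cap).
 */
static size_t split_aligned(unsigned long start, unsigned long end,
			    unsigned int max_order,
			    struct chunk *out, size_t cap)
{
	size_t n = 0;

	assert(out != NULL);
	while (start < end && n < cap) {
		unsigned int order = max_order;

		/* Shrink until the block is aligned and fits. */
		while (order > 0 &&
		       ((start & ((1UL << order) - 1)) ||
			start + (1UL << order) > end))
			order--;

		out[n].start = start;
		out[n].order = order;
		n++;
		start += 1UL << order;
	}
	return n;
}
```

Keeping pages [0, 257) of a 512-page folio, for example, yields one
order-8 chunk followed by a single order-0 page, rather than 257 equal
pieces of one common order.
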

As this code stands, truncate a 2M huge page at 1M and you get two 1M
pieces (one then discarded) - nice; but truncate it at 1M+1 and you get
lots of order 2 (forced up from 1) pieces.  Seems weird, and not worth
the effort.

Hugh

> +
> +	/* order-1 THP not supported, downgrade to order-0 */
> +	if (new_order == 1)
> +		new_order = 0;
> +
> +
>  	if (folio_has_private(folio))
>  		folio_invalidate(folio, offset, length);
>  	if (!folio_test_large(folio))
>  		return true;
> -	if (split_folio(folio) == 0)
> +	if (split_huge_page_to_list_to_order(&folio->page, NULL, new_order) == 0)
>  		return true;
>  	if (folio_test_dirty(folio))
>  		return false;
> -- 
> 2.39.2