From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6A891D64099 for ; Sat, 9 Nov 2024 01:40:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D31C96B00A8; Fri, 8 Nov 2024 20:40:45 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CE0E16B00A9; Fri, 8 Nov 2024 20:40:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B5B886B00AD; Fri, 8 Nov 2024 20:40:45 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 975686B00A8 for ; Fri, 8 Nov 2024 20:40:45 -0500 (EST) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 575C11607EB for ; Sat, 9 Nov 2024 01:40:45 +0000 (UTC) X-FDA: 82764851022.25.C721CDE Received: from mail-ed1-f44.google.com (mail-ed1-f44.google.com [209.85.208.44]) by imf30.hostedemail.com (Postfix) with ESMTP id C2D2480010 for ; Sat, 9 Nov 2024 01:39:32 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=EWCS74zI; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf30.hostedemail.com: domain of richard.weiyang@gmail.com designates 209.85.208.44 as permitted sender) smtp.mailfrom=richard.weiyang@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731116317; a=rsa-sha256; cv=none; b=VdxVCSyex15gsrlgO1OZtyFIVKVmNPdGSMmTZaCNUMBvNYe1m64v+5nT1HGMWOC1tQslQR 4wJPVJxNoUFV1WM17DdaI2W6rhXOu+OtALhanHhRrzXrl94L7jZfpnsOn0osWbC+tPP3qr /dKBYufsg/YGuaBNTRET289VVgqzfJg= ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=EWCS74zI; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf30.hostedemail.com: domain of richard.weiyang@gmail.com designates 209.85.208.44 as permitted sender) smtp.mailfrom=richard.weiyang@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731116317; h=from:from:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Uh7bGYATB76O0zUipRFaISgH77hzivijO8+5g4niA9o=; b=Oe0QotLvAB2q00FV2OjJMNmPQYlQhJ+Bxc+mpRykkbOq48MSiZ4X/qjj4vQQPJh+5yW1tp m+iQPx3egwAt+Cuq/MRDOIwgR+EGG7D2hgDX6ivt03Nw/NzEu4aZNvBNzDsFwkoWchG2mV q+BPCmU7x8UbgXVPdqL54RwPlHzPBJ0= Received: by mail-ed1-f44.google.com with SMTP id 4fb4d7f45d1cf-5cefc36c5d4so3582152a12.0 for ; Fri, 08 Nov 2024 17:40:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731116442; x=1731721242; darn=kvack.org; h=user-agent:in-reply-to:content-disposition:mime-version:references :reply-to:message-id:subject:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=Uh7bGYATB76O0zUipRFaISgH77hzivijO8+5g4niA9o=; b=EWCS74zIVgTsI2qlSjBlcKMPhKwDjOIa1yZT3CmRyDgpfcunxhrli4LnkUcBw0NG5V wwpd2I0icC73DDtfHG9r+FHLc7olJCfLt3003WROp+dMZy0VED5/LB4xLCWT7+/nGiCi 96HxQuPeEbSvmY2dO6xNuGK3hRbWYmVSCMxNIVRA3MIc5l6K6pkqfOXAiZvB/c72nAjO RhIt8pMjlvn2jSHI/eArq7dlfIQF+egpFiDhtO2OrFBUbL7F8n9UgoaFvE5jy56wFZQA 70jSDf94ZTP0WDj8x1ZWHgxzL6x9A7eVxFV8DDkk87fNCsgnmi1smZlk7Yrd6e2Ygjxz iaNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731116442; x=1731721242; h=user-agent:in-reply-to:content-disposition:mime-version:references :reply-to:message-id:subject:to:from:date:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=Uh7bGYATB76O0zUipRFaISgH77hzivijO8+5g4niA9o=; b=leK1AumuwgimLCR3easDuJSXQ8yD9GdXzrZSg5DQTiK7sJvdmO1R0v5K/UMu9BF/8R DICX3Ooe2hMmVwUuqE6gr/Z0uyaKf+1hog+B3XorTOSlRPOxKn4eXtP6WseCSORY5qTT Wj3NQTUEWP1vfnUI8oZ6hg9QYbd719l+q4zuXpJ3WGhzcbYOtH9z3k9X8cBI+ay7OJEz k3yqXSh/4zvlfMNQT1HcIHgRwBjqVudxgO/Og7TBpGkVoLDp0R4WshyEZCTHHtOfPIQu ydNoqeFLYph+IQ4ZQwrf4Iu3vBjw7PSKddjdJDGORvAE9pvNcLZ0nrGHKqJt/39V733q l2gA== X-Forwarded-Encrypted: i=1; AJvYcCX0mZAJa+/nQCtC/54ucJFuyHHyfSqJs8c5vUQvurO77c9jqh2oxnRWvMedx11KYumJLwT8J2NVbA==@kvack.org X-Gm-Message-State: AOJu0YxucdNa1TfqgNwlZv/gLfG+vAEvfOz1F86IWyW+SK5jMEV+BmXH W+cJYw25xh0GZ5avSOyFZMfh2ZgOh61/sPZSQJE52S6Bx6rgyMcZ X-Google-Smtp-Source: AGHT+IHmJbzmlA5kw3FldkXyicjj16uDefxg2vG3w27UBK7K/9oiu4rESCfmF6gm4BXu0Vml6aTOEg== X-Received: by 2002:a17:906:c14d:b0:a9e:447b:6f6d with SMTP id a640c23a62f3a-a9eeff3a976mr485592966b.35.1731116441434; Fri, 08 Nov 2024 17:40:41 -0800 (PST) Received: from localhost ([185.92.221.13]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9ee0dc49cesm306006266b.126.2024.11.08.17.40.38 (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Fri, 08 Nov 2024 17:40:39 -0800 (PST) Date: Sat, 9 Nov 2024 01:40:38 +0000 From: Wei Yang To: "Liam R. Howlett" , Wei Yang , akpm@linux-foundation.org, maple-tree@lists.infradead.org, linux-mm@kvack.org, Sidhartha Kumar , Lorenzo Stoakes Subject: Re: [PATCH 3/4] maple_tree: use the correct min to calculate split Message-ID: <20241109014038.m3arxtj2pqhqjmoq@master> Reply-To: Wei Yang References: <20241020024628.22469-1-richard.weiyang@gmail.com> <20241020024628.22469-4-richard.weiyang@gmail.com> <20241108025542.whll2orsuj7aru43@master> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: NeoMutt/20170113 (1.7.2) X-Rspamd-Queue-Id: C2D2480010 X-Stat-Signature: e511nfbxs9wnoprisar7kdfxy4thhobh X-Rspam-User: X-Rspamd-Server: rspam05 X-HE-Tag: 1731116372-253067 X-HE-Meta: U2FsdGVkX1+NpkskKJPGqxjtDjgC+QVdjcfTBp7+dpJx0k8AAGtVeKfjl6RAulktvTdPicK4StY0ZmBjJHwDlDJxnXWUrMCQn0rnjIa2AbXGlvtS9DpidMUN+wuV7H5qQFnnjKfy7qffOqXXujQo2zWJJw5rUB5rDlcoOHL9OcQ5A65CqAZHm6TTkv/askLk9lkZRjsQivuQsHsiJehG5ZX5yOdV5qWBD8GyrPxz8JahEoW+aE0N0YzJDSQpWlgHnDgsPPi3qXIKgspPOzvW873CHyh5FjQoL+049OmIvgEwYEoysuFoGybY/EcB1e4PrJ/AUCcJ5i5WHAOwXqlBKmagYQsJ5iDlE4Jpfyn8ePkrqUoQqVwheIC+crtPBQ4YSCGJEXpcdyTtPGbQmXRB8ocOLZYhSCdyg8oGHjZDC6BWQlDI2mNlNS+JIcK2VQVLlFgPmSK/09HRpBFqVrmeN+c2/JVQiIbph25eSeHuGXERRK05CBqSVQ9blKQjhpISYBXxOunIgG+6RTcQwWwd9J/HqxvCOIvY+Ya3M0+y/msoA+0K6PcG6XunZKi6Rek2uyC0KmizcuxnLPW6rNfye6NUR5L1RgWhtX3hHhwq+apM4MxQWP6d2IsHsYDBrFgVNboIFC3ItP0bFVuQpPMTA0S7i9xLPnD2moYmYcsXfdkGAKiKk7+GuqitosI1limDf7XajpY5vNwmzGQVodls+uE2NS6GP1vZRrjH2F3cbfYyASw5JfM6hDmEjPmc9g5zrzCgcbyAEVezTPVgk+Z1i0gWUTCQrU/JsLGaNZi8//xghoCUEbEeEkIyrKVTDoTJJmnyDgnf7PVnmnhtTBGzd80XrKj6Y0KIKkYK3kcznoIR9vM/gGC+gWXhZdLKT4OtKm8VT7wuHAZl1/WuqI4Mh73x5hCI1A0KpEc6EB1WywR59QVTjPHTiC+EYYBvjhpAeqb5aQpzIS7Nk8aycD2 7C9lvhWA wj6zhSep7xwa+X8H+cWmE4HbVwjDg0D5UiAsroMevbB4MES/iDgGe/ht4w689wfrQ0bl1WK/1strnJJYyiO0nfesFJI9tP1J+WFjZTaCOfR9/R28Ou6trleqAv6zK8Dvg0VOCfIO/bZ9TFlqZ5iI+Yu4mGTqEjJnIhsBcQi1eygkRC0+bZxtSD0sr/RHMtjLNtRvJQFkLFFcurvUjFAIV7OSl7cHg7VryCvVkieZg331a5guqSxP9hEeIj7/+gHFSdJcPocUDEHRUJVjR1XuGUi82gahUf6O/H4dsdTYI34uqeLeiP4jHlCC1ybvkaiwusWvwAmsJcBbPmmOESfKRQJparM1NMzFQ6mYp3BY00YCZyxZAH7jybA3ItKAfLDZZ42hqKoPKc0t7O5kTwdKmeHDULu8/WLYP9zEDffyCAUCWdFSxhMK842duq5BNaLP9TJ7ZfWYwZI3mU6ARHcRDzOyiQOq4BTiKYRd5Ut8INcTdDnpc/I+ZNbTTcmJaWB1YerqBgaN6AglLvldsVdVJjQzuOWLNkO/7gHAoSjt3JZJnukavSmMvu6mAiYwR+xKF5h44tvNDQM658TDAenOoKtZzE5Mkjoky0JB7UsjllXDro66uAgVQyDmbj9aGxzHxHOpLWPS/qyLt58DgiZy3avQMbtNfM9k2JrMA67TH3LZD0IcMZAAK/470B4W8PL6fbpKeY09otZt7GeX7XW2KkxnC5g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Nov 07, 2024 at 10:19:37PM -0500, Liam R. Howlett wrote: >* Wei Yang [241107 21:55]: >> On Sun, Oct 20, 2024 at 06:00:35PM -0400, Liam R. Howlett wrote: >> >* Wei Yang [241019 22:46]: >> >> The check in mab_calc_split() is to make sure the gap between >> >> [0, split] won't be too small. But we don't pass the correct min. >> >> >> >> First we may encounter a pivot[split] smaller than min. For example: >> >> >> >> mt_init_flags(mt, 0); >> >> for (count = 0; count <= 240; count++) { >> >> mas_set(&mas, count); >> >> mas_store(&mas, xa_mk_value(count)); >> >> } >> >> >> >> On splitting root for storing 240, the pivot[split] is smaller than min. >> >> This result a huge (pivot[split] - min). >> > >> >This is fine. >> > >> >There is an open work item to make it more accurate at higher levels, >> >but it's not a problem as it is. >> > >> >Each level upwards needs a better 'minimum span', meaning that the node >> >should have at least mas.min - mas.min + level * something. >> >> Don't follow this. >> >> First, I guess it is mas.max - mas.min? >> > >No, the idea is that we could do better if we could keep the data more >densely packed. > >If we have singletons (all range of 1), and we have 10,000 singletons, >then a parent node with 0 - 128 and a sibling of 129-257 makes no sense, >there is nothing that can be inserting into the parents last two slots >- we can't split singletons smaller so we should have done a better job >at splitting. > >If you think of how this grows today, we split then rebalance until the >left node is full and we have to split again. This could be better >optimised to avoid multiple rebalances by doing something to push more >data to the left based on the span of the children and the nodes. But >we'd want to keep enough so some modifications to each node doesn't >cause a rebalance in the opposite direction. > >So a minimum span of parent nodes one level up from the leaves would be >16 * 10 = 160, so when we split we can push more to the left to try and >get to 160 while keeping the right node sufficient. The math of the >span would change based on how far up the tree we go, but would be >easier to maintain. > >I'd probably still want to split it 50/50 the first time, but it might >make sense to push more on the next rebalance. so 10 => 6|5, 6|10 => >9|5, but maybe it's not worth the effort. I'm starting to think so, >considering it's been ineffective to date. There is a chance that the >active area gets moved to the more-full node and ends up splitting >sooner anyways.. > >All of this will probably change when dense nodes come anyways. > >> Then there is sth related to level. But I don't see level is used for >> calculation. Would you mind sharing more your original thought? >> >> > >> >It works today for leaves, somewhat. >> > >> >> To me it works by coincidence. > >Seems like it. > >> >> The condition check is: >> >> (bn->pivot[split] - min) < (slot_count - 1) >> >> while slot_count is a constent, 16 for leaf node. >> >> And we always pass 0 as min, the condition for leaf is: >> >> bn->pivot[split] < 15 >> >> For the mmap world, per my understanding, we won't expect an address is smaller >> than 15. > >There are other users, but I think we should just drop this attempt at >fancy stuff. > You mean "set mid_split = 0 and drop the min argument all together" mentioned in another mail? >> >> >> >> >> Second prev_l_mas.min is not initialized for the first iteration. This >> >> means on splitting leaf node, this value is mostly taking no effect. >> > >> >No, it is set to 0. Not initialized would cause random data loss. >> > >> >See MA_STATE() in the header. >> > >> >> Right, I want to say not initialized to the value we want. >> >> Will be more careful next time. >> >> >> >> >> Since we are now calculating the split of mas->node, we should use the >> >> mas->min instead of prev_l_mas.min. >> > >> >This sounds reasonable, but considering what this number is used for, I >> >don't see how it is working as well as it is today. I will need to look >> >deeper into this. >> > >> >> >> >> Signed-off-by: Wei Yang >> >> CC: Liam R. Howlett >> >> CC: Sidhartha Kumar >> >> CC: Lorenzo Stoakes >> >> --- >> >> lib/maple_tree.c | 2 +- >> >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> >> >> diff --git a/lib/maple_tree.c b/lib/maple_tree.c >> >> index 894dc5e9436e..c2d4b188646c 100644 >> >> --- a/lib/maple_tree.c >> >> +++ b/lib/maple_tree.c >> >> @@ -3357,7 +3357,7 @@ static void mas_split(struct ma_state *mas, struct maple_big_node *b_node) >> >> if (mas_push_data(mas, height, &mast, false)) >> >> break; >> >> >> >> - split = mab_calc_split(mas, b_node, &mid_split, prev_l_mas.min); >> >> + split = mab_calc_split(mas, b_node, &mid_split, mas->min); >> >> mast_split_data(&mast, mas, split); >> >> /* >> >> * Usually correct, mab_mas_cp in the above call overwrites >> >> -- >> >> 2.34.1 >> >> >> >> -- >> Wei Yang >> Help you, Help me -- Wei Yang Help you, Help me