From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peng Zhang
To: Liam.Howlett@oracle.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
	Peng Zhang
Subject: [PATCH v4 3/4] maple_tree: optimize mas_wr_append(), also improve duplicating VMAs
Date: Wed, 28 Jun 2023 15:36:56 +0800
Message-Id: <20230628073657.75314-4-zhangpeng.00@bytedance.com>
In-Reply-To: <20230628073657.75314-1-zhangpeng.00@bytedance.com>
References: <20230628073657.75314-1-zhangpeng.00@bytedance.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

When the new range is completely covered by the original last range
and touches neither of its boundaries, two new entries can be appended
to the end of the node as a fast path. The original last pivot is
updated last, and the two newly appended entries cannot be accessed
before that update, so this is also safe in RCU mode.

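The store order described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions:
struct toy_leaf, toy_lookup() and toy_append_middle() are hypothetical
names that do not exist in lib/maple_tree.c, the model is
single-threaded, and plain stores stand in for rcu_assign_pointer().
It mirrors only the idea that the two trailing slots are filled first
and the old last pivot is shrunk last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8	/* hypothetical capacity, not the kernel's */

/* Toy leaf: slots[i] covers the indices (pivots[i - 1], pivots[i]]. */
struct toy_leaf {
	unsigned long pivots[TOY_SLOTS];
	const char *slots[TOY_SLOTS];
	unsigned long min;	/* first index covered by slots[0] */
	unsigned char end;	/* offset of the last slot in use */
};

/* Reader: return the entry whose range contains index, or NULL. */
static const char *toy_lookup(const struct toy_leaf *leaf, unsigned long index)
{
	if (index < leaf->min)
		return NULL;
	for (unsigned char i = 0; i <= leaf->end; i++)
		if (index <= leaf->pivots[i])
			return leaf->slots[i];
	return NULL;
}

/*
 * Store entry over [index, last] when that range lies strictly inside
 * the last range of the leaf. Returns true if the fast path applied.
 */
static bool toy_append_middle(struct toy_leaf *leaf, unsigned long index,
			      unsigned long last, const char *entry)
{
	unsigned char end = leaf->end;
	unsigned long r_min, r_max;
	const char *content;

	assert(end < TOY_SLOTS);
	r_min = end ? leaf->pivots[end - 1] + 1 : leaf->min;
	r_max = leaf->pivots[end];
	content = leaf->slots[end];

	/* Two free slots are needed, and neither boundary may be touched. */
	if (end + 2 >= TOY_SLOTS)
		return false;
	if (index > last || index <= r_min || last >= r_max)
		return false;

	/*
	 * Fill the two trailing slots first. A reader cannot reach them
	 * yet, because pivots[end] still equals r_max and the scan stops
	 * at offset end for every index in the old range.
	 */
	leaf->pivots[end + 2] = r_max;
	leaf->slots[end + 2] = content;
	leaf->pivots[end + 1] = last;
	leaf->slots[end + 1] = entry;
	leaf->end = end + 2;

	/* Shrink the old last range last; this exposes the new entries. */
	leaf->pivots[end] = index - 1;
	return true;
}
```

For example, a leaf holding "A" over [0, 9] and "B" over [10, 99] that
receives a store of "C" over [20, 29] ends up with four ranges: "A"
[0, 9], "B" [10, 19], "C" [20, 29] and "B" [30, 99]. A store that
starts exactly at the first index of the last range is rejected by
this sketch, because it touches the left boundary.
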
This is useful for sequential insertion, which is what dup_mmap() does.
Enabling BENCH_FORK in test_maple_tree and running only bench_forking()
gives the following timings:

before:               after:
17,874.83 msec        15,738.38 msec

This is about a 12% performance improvement for duplicating VMAs.

Signed-off-by: Peng Zhang
Reviewed-by: Liam R. Howlett
---
 lib/maple_tree.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index bfffbb7cab26..56b9b5be28c8 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4266,10 +4266,10 @@ static inline unsigned char mas_wr_new_end(struct ma_wr_state *wr_mas)
  *
  * Return: True if appended, false otherwise
  */
-static inline bool mas_wr_append(struct ma_wr_state *wr_mas)
+static inline bool mas_wr_append(struct ma_wr_state *wr_mas,
+				 unsigned char new_end)
 {
 	unsigned char end = wr_mas->node_end;
-	unsigned char new_end = end + 1;
 	struct ma_state *mas = wr_mas->mas;
 	unsigned char node_pivots = mt_pivots[wr_mas->type];
 
@@ -4281,16 +4281,27 @@ static inline bool mas_wr_append(struct ma_wr_state *wr_mas)
 		ma_set_meta(wr_mas->node, maple_leaf_64, 0, new_end);
 	}
 
-	if (mas->last == wr_mas->r_max) {
-		/* Append to end of range */
-		rcu_assign_pointer(wr_mas->slots[new_end], wr_mas->entry);
-		wr_mas->pivots[end] = mas->index - 1;
-		mas->offset = new_end;
+	if (new_end == wr_mas->node_end + 1) {
+		if (mas->last == wr_mas->r_max) {
+			/* Append to end of range */
+			rcu_assign_pointer(wr_mas->slots[new_end],
+					   wr_mas->entry);
+			wr_mas->pivots[end] = mas->index - 1;
+			mas->offset = new_end;
+		} else {
+			/* Append to start of range */
+			rcu_assign_pointer(wr_mas->slots[new_end],
+					   wr_mas->content);
+			wr_mas->pivots[end] = mas->last;
+			rcu_assign_pointer(wr_mas->slots[end], wr_mas->entry);
+		}
 	} else {
-		/* Append to start of range */
+		/* Append to the range without touching any boundaries. */
 		rcu_assign_pointer(wr_mas->slots[new_end], wr_mas->content);
-		wr_mas->pivots[end] = mas->last;
-		rcu_assign_pointer(wr_mas->slots[end], wr_mas->entry);
+		wr_mas->pivots[end + 1] = mas->last;
+		rcu_assign_pointer(wr_mas->slots[end + 1], wr_mas->entry);
+		wr_mas->pivots[end] = mas->index - 1;
+		mas->offset = end + 1;
 	}
 
 	if (!wr_mas->content || !wr_mas->entry)
@@ -4337,7 +4348,7 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas)
 		goto slow_path;
 
 	/* Attempt to append */
-	if (new_end == wr_mas->node_end + 1 && mas_wr_append(wr_mas))
+	if (mas_wr_append(wr_mas, new_end))
 		return;
 
 	if (new_end == wr_mas->node_end && mas_wr_slot_store(wr_mas))
-- 
2.20.1