From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi,
	Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org,
	Kairui Song
Subject: [PATCH v5 4/8] mm/shmem, swap: tidy up swap entry splitting
Date: Thu, 10 Jul 2025 11:37:02 +0800
Message-ID: <20250710033706.71042-5-ryncsn@gmail.com>
X-Mailer: git-send-email 2.50.0
In-Reply-To: <20250710033706.71042-1-ryncsn@gmail.com>
References: <20250710033706.71042-1-ryncsn@gmail.com>
Reply-To: Kairui Song
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Kairui Song

Instead of keeping different paths of splitting the entry before the
swapin starts, move the entry splitting to after the swapin has put
the folio in the swap cache (or set the SWAP_HAS_CACHE bit). This way
we only need one place and one unified way to split the large entry.
Whenever swapin brings in a folio smaller than the shmem swap entry,
split the entry and recalculate the entry and index for verification.

This removes duplicated code and function calls and reduces LOC, and
the split is less racy as it is now guarded by the swap cache, so
there is a lower chance of repeated faults due to a raced split. The
compiler is also able to optimize the code further. bloat-o-meter
results with GCC 14:

With DEBUG_SECTION_MISMATCH (-fno-inline-functions-called-once):
./scripts/bloat-o-meter mm/shmem.o.old mm/shmem.o
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-82 (-82)
Function                                     old     new   delta
shmem_swapin_folio                          2361    2279     -82
Total: Before=33151, After=33069, chg -0.25%

With !DEBUG_SECTION_MISMATCH:
./scripts/bloat-o-meter mm/shmem.o.old mm/shmem.o
add/remove: 0/1 grow/shrink: 1/0 up/down: 949/-750 (199)
Function                                     old     new   delta
shmem_swapin_folio                          2878    3827    +949
shmem_split_large_entry.isra                 750       -    -750
Total: Before=33086, After=33285, chg +0.60%

Since shmem_split_large_entry is now only called in one place, the
compiler will either generate more compact code or inline it for
better performance.

Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
Tested-by: Baolin Wang
---
 mm/shmem.c | 56 ++++++++++++++++++++++--------------------------------
 1 file changed, 23 insertions(+), 33 deletions(-)
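
A minimal userspace sketch of the offset recalculation described
above (illustration only, not part of the patch): round_down_ul() and
the sample values are made-up stand-ins for the kernel's round_down()
and the real swp_offset() values, so this only shows the arithmetic,
not the actual shmem code path.

	/* Illustration only: derive the per-page swap slot from a large entry. */
	#include <stdio.h>

	/* Stand-in for the kernel's round_down(): align x down to a power of two. */
	static unsigned long round_down_ul(unsigned long x, unsigned long align)
	{
		return x & ~(align - 1);
	}

	int main(void)
	{
		unsigned long entry_offset = 1024;	/* hypothetical swp_offset() of the large entry */
		unsigned long index = 37;		/* hypothetical faulting page index */
		unsigned int order = 4;			/* the large entry covers 1 << 4 = 16 pages */

		/* Offset of the faulting page inside the (aligned) large entry. */
		unsigned long offset = index - round_down_ul(index, 1UL << order);

		/* Per-page slot to swap in: 1024 + (37 - 32) = 1029. */
		printf("page slot = %lu\n", entry_offset + offset);
		return 0;
	}

With these sample values the faulting index 37 sits 5 pages into the
16-page entry, so the swapin uses slot 1024 + 5.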

diff --git a/mm/shmem.c b/mm/shmem.c
index d8c872ab3570..97db1097f7de 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2266,14 +2266,16 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	struct address_space *mapping = inode->i_mapping;
 	struct mm_struct *fault_mm = vma ? vma->vm_mm : NULL;
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	swp_entry_t swap, index_entry;
 	struct swap_info_struct *si;
 	struct folio *folio = NULL;
 	bool skip_swapcache = false;
-	swp_entry_t swap;
 	int error, nr_pages, order, split_order;
+	pgoff_t offset;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
-	swap = radix_to_swp_entry(*foliop);
+	index_entry = radix_to_swp_entry(*foliop);
+	swap = index_entry;
 	*foliop = NULL;
 
 	if (is_poisoned_swp_entry(swap))
@@ -2321,46 +2323,35 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		}
 
 		/*
-		 * Now swap device can only swap in order 0 folio, then we
-		 * should split the large swap entry stored in the pagecache
-		 * if necessary.
-		 */
-		split_order = shmem_split_large_entry(inode, index, swap, gfp);
-		if (split_order < 0) {
-			error = split_order;
-			goto failed;
-		}
-
-		/*
-		 * If the large swap entry has already been split, it is
+		 * Now swap device can only swap in order 0 folio, it is
 		 * necessary to recalculate the new swap entry based on
-		 * the old order alignment.
+		 * the offset, as the swapin index might be unalgined.
 		 */
-		if (split_order > 0) {
-			pgoff_t offset = index - round_down(index, 1 << split_order);
-
+		if (order) {
+			offset = index - round_down(index, 1 << order);
 			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
 		}
 
-		/* Here we actually start the io */
 		folio = shmem_swapin_cluster(swap, gfp, info, index);
 		if (!folio) {
 			error = -ENOMEM;
 			goto failed;
 		}
-	} else if (order > folio_order(folio)) {
+	}
+alloced:
+	if (order > folio_order(folio)) {
 		/*
-		 * Swap readahead may swap in order 0 folios into swapcache
+		 * Swapin may get smaller folios due to various reasons:
+		 * It may fallback to order 0 due to memory pressure or race,
+		 * swap readahead may swap in order 0 folios into swapcache
 		 * asynchronously, while the shmem mapping can still stores
 		 * large swap entries. In such cases, we should split the
 		 * large swap entry to prevent possible data corruption.
 		 */
-		split_order = shmem_split_large_entry(inode, index, swap, gfp);
+		split_order = shmem_split_large_entry(inode, index, index_entry, gfp);
 		if (split_order < 0) {
-			folio_put(folio);
-			folio = NULL;
 			error = split_order;
-			goto failed;
+			goto failed_nolock;
 		}
 
 		/*
@@ -2369,15 +2360,13 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		 * the old order alignment.
 		 */
 		if (split_order > 0) {
-			pgoff_t offset = index - round_down(index, 1 << split_order);
-
-			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+			offset = index - round_down(index, 1 << split_order);
+			swap = swp_entry(swp_type(swap), swp_offset(index_entry) + offset);
 		}
 	} else if (order < folio_order(folio)) {
 		swap.val = round_down(swap.val, 1 << folio_order(folio));
 	}
 
-alloced:
 	/* We have to do this with folio locked to prevent races */
 	folio_lock(folio);
 	if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
@@ -2434,12 +2423,13 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	shmem_set_folio_swapin_error(inode, index, folio, swap,
 				     skip_swapcache);
 unlock:
-	if (skip_swapcache)
-		swapcache_clear(si, swap, folio_nr_pages(folio));
-	if (folio) {
+	if (folio)
 		folio_unlock(folio);
+failed_nolock:
+	if (skip_swapcache)
+		swapcache_clear(si, folio->swap, folio_nr_pages(folio));
+	if (folio)
 		folio_put(folio);
-	}
 	put_swap_device(si);
 
 	return error;
-- 
2.50.0