From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 3/4] mm/shmem, swap: improve mthp swapin process
To: Kairui Song
Cc: linux-mm@kvack.org, Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org
References: <20250617183503.10527-1-ryncsn@gmail.com> <20250617183503.10527-4-ryncsn@gmail.com> <7e680582-ac35-3d2d-8945-c26410ff4f9b@huaweicloud.com>
From: Kemeng Shi
Message-ID: <7a168a7b-2b4b-4281-777b-96f952322237@huaweicloud.com>
Date: Thu, 19 Jun 2025 09:32:12 +0800
Sender: owner-linux-mm@kvack.org

on 6/18/2025 4:46 PM, Kairui Song wrote:
> On Wed, Jun 18, 2025 at 4:26 PM Kemeng Shi wrote:
>> on 6/18/2025 2:35 AM, Kairui Song wrote:
>>> From: Kairui Song
>>>
>>> Tidy up the mTHP swapin workflow. There should be no feature change, but
>>> consolidates the mTHP related check to one place so they are now all
>>> wrapped by CONFIG_TRANSPARENT_HUGEPAGE, and will be trimmed off by
>>> compiler if not needed.
>>>
>>> Signed-off-by: Kairui Song
>>> ---
>>>  mm/shmem.c | 175 ++++++++++++++++++++++++-----------------------------
>>>  1 file changed, 78 insertions(+), 97 deletions(-)
>>>
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>
>> ...
>>
>> Hello, here is another potential issue if shmem swapin can race with folio
>> split.
>>
>>>  alloced:
>>> +	/*
>>> +	 * We need to split an existing large entry if swapin brought in a
>>> +	 * smaller folio due to various of reasons.
>>> +	 *
>>> +	 * And worth noting there is a special case: if there is a smaller
>>> +	 * cached folio that covers @swap, but not @index (it only covers
>>> +	 * first few sub entries of the large entry, but @index points to
>>> +	 * later parts), the swap cache lookup will still see this folio,
>>> +	 * And we need to split the large entry here. Later checks will fail,
>>> +	 * as it can't satisfy the swap requirement, and we will retry
>>> +	 * the swapin from beginning.
>>> +	 */
>>> +	swap_order = folio_order(folio);
>>> +	if (order > swap_order) {
>>> +		error = shmem_split_swap_entry(inode, index, swap, gfp);
>>> +		if (error)
>>> +			goto failed_nolock;
>>> +	}
>>> +
>>> +	index = round_down(index, 1 << swap_order);
>>> +	swap.val = round_down(swap.val, 1 << swap_order);
>>> +
>> /* suppose folio is split */
>>>  	/* We have to do this with folio locked to prevent races */
>>>  	folio_lock(folio);
>>>  	if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
>>>  	    folio->swap.val != swap.val) {
>>>  		error = -EEXIST;
>>> -		goto unlock;
>>> +		goto failed_unlock;
>>>  	}
>>>  	if (!folio_test_uptodate(folio)) {
>>>  		error = -EIO;
>>> @@ -2407,8 +2386,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>>>  		goto failed;
>>>  	}
>>>
>>> -	error = shmem_add_to_page_cache(folio, mapping,
>>> -					round_down(index, nr_pages),
>>> +	error = shmem_add_to_page_cache(folio, mapping, index,
>>>  					swp_to_radix_entry(swap), gfp);
>>
>> The actual order swapin is less than swap_order and the swap-in folio
>> may not cover index from caller.
>>
>> So we should move the index and swap.val calculation after folio is
>> locked.
>
> Hi, Thanks very much for checking the code carefully!
>
> If I'm not wrong here, holding a reference is enough to stabilize the folio
> order.
> See split_huge_page_to_list_to_order, "Any unexpected folio references
> ... -EAGAIN" and can_split_folio.

Thanks for the feedback, then the change looks good to me.

Reviewed-by: Kemeng Shi

>
> We can add a `swap_order == folio_order(folio)` check after folio lock
> though, as a (sanity) check, just in case.
>
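
For readers following the thread, the alignment step and the proposed
sanity check can be sketched in userspace C. This is only an illustrative
model under stated assumptions, not the kernel code: struct toy_folio,
round_down_pow2() and toy_align_and_check() are hypothetical names made up
for this note, and the real path works on struct folio with the folio
locked.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's round_down().
 * Assumption: align is a non-zero power of two.
 */
static unsigned long round_down_pow2(unsigned long x, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Hypothetical toy folio: only the fields this sketch needs. */
struct toy_folio {
	unsigned int order;	/* folio spans 1 << order pages */
	unsigned long swap_val;	/* swap entry value of its first page */
};

/*
 * Align index and swap value to the order sampled before locking, then
 * re-check that order, as suggested in the thread. Returns false if the
 * order changed or the folio no longer matches the aligned swap value.
 */
static bool toy_align_and_check(const struct toy_folio *folio,
				unsigned int swap_order,
				unsigned long *index,
				unsigned long *swap_val)
{
	unsigned long nr = 1UL << swap_order;

	*index = round_down_pow2(*index, nr);
	*swap_val = round_down_pow2(*swap_val, nr);

	/* Sanity check: the order sampled earlier must still hold. */
	if (swap_order != folio->order)
		return false;

	/* Models the `folio->swap.val != swap.val` test in the quoted hunk. */
	return folio->swap_val == *swap_val;
}
```

With an order-2 folio whose first swap value is 0x104, a caller passing
index 13 and swap value 0x105 ends up with index 12 and swap value 0x104.
If the toy folio's order were changed to 0 in between, the check would
reject it, which is the case the extra comparison is meant to catch.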