Message-ID: <17d23ed0-3b12-42a5-a5de-994f570b1bca@linux.alibaba.com>
Date: Mon, 7 Jul 2025 15:53:31 +0800
Subject: Re: [PATCH v4 5/9] mm/shmem, swap: avoid false positive swap cache lookup
To: Kairui Song, linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Matthew Wilcox, Kemeng Shi, Chris Li,
 Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org
References: <20250704181748.63181-1-ryncsn@gmail.com>
 <20250704181748.63181-6-ryncsn@gmail.com>
From: Baolin Wang
In-Reply-To: <20250704181748.63181-6-ryncsn@gmail.com>

Hi Kairui,

On 2025/7/5 02:17, Kairui Song wrote:
> From: Kairui Song
>
> If a shmem read request's index points
> to the middle of a large swap
> entry, shmem swap in will try the swap cache lookup using the large
> swap entry's starting value (which is the first sub swap entry of this
> large entry). This will lead to false positive lookup results, if only
> the first few swap entries are cached but the actual requested swap
> entry pointed by index is uncached. This is not a rare event as swap
> readahead always try to cache order 0 folios when possible.
>
> Currently, shmem will do a large entry split when it occurs, aborts
> due to a mismatching folio swap value, then retry the swapin from
> the beginning, which is a waste of CPU and adds wrong info to
> the readahead statistics.
>
> This can be optimized easily by doing the lookup using the right
> swap entry value.
>
> Signed-off-by: Kairui Song
> ---
>  mm/shmem.c | 31 +++++++++++++++----------------
>  1 file changed, 15 insertions(+), 16 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 217264315842..2ab214e2771c 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2274,14 +2274,15 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>  	pgoff_t offset;
>
>  	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
> -	swap = index_entry = radix_to_swp_entry(*foliop);
> +	index_entry = radix_to_swp_entry(*foliop);
> +	swap = index_entry;
>  	*foliop = NULL;
>
> -	if (is_poisoned_swp_entry(swap))
> +	if (is_poisoned_swp_entry(index_entry))
>  		return -EIO;
>
> -	si = get_swap_device(swap);
> -	order = shmem_confirm_swap(mapping, index, swap);
> +	si = get_swap_device(index_entry);
> +	order = shmem_confirm_swap(mapping, index, index_entry);
>  	if (unlikely(!si)) {
>  		if (order < 0)
>  			return -EEXIST;
> @@ -2293,6 +2294,12 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>  		return -EEXIST;
>  	}
>
> +	/* index may point to the middle of a large entry, get the sub entry */
> +	if (order) {
> +		offset = index - round_down(index, 1 << order);
> +		swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
> +	}
> +
>  	/* Look it up and read it in.. */
>  	folio = swap_cache_get_folio(swap, NULL, 0);

Please drop this patch, which will cause a swapin fault dead loop.

Assume an order-4 shmem folio has been swapped out, and the swap cache
holds this order-4 folio (assuming index == 0, swap.val == 0x4000).
During swapin, if the index is 1, the recalculation of the swap value
here will result in 'swap.val == 0x4001'. This will cause the subsequent
'folio->swap.val != swap.val' check to fail, continuously triggering a
dead-loop swapin fault, ultimately causing the CPU to hang.
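
To make the arithmetic in the scenario above concrete, here is a minimal
userspace sketch, not kernel code. The helper names round_down_pow2(),
sub_entry_val() and folio_matches() are hypothetical, and plain integers
stand in for swp_entry_t.val and pgoff_t, on the assumption that adding
to the swap offset is the same as adding to the raw value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: integers stand in for swp_entry_t.val and pgoff_t. */

/* Round index down to the start of its naturally aligned 2^order range. */
static uint64_t round_down_pow2(uint64_t index, unsigned int order)
{
	assert(order < 64);
	return index & ~((UINT64_C(1) << order) - 1);
}

/* Value the patched code would look up for index inside a large entry. */
static uint64_t sub_entry_val(uint64_t entry_val, uint64_t index,
			      unsigned int order)
{
	uint64_t offset = index - round_down_pow2(index, order);

	return entry_val + offset;
}

/*
 * Models the later 'folio->swap.val != swap.val' comparison.
 * Returns true when the cached folio is accepted, false when the
 * swapin would be retried.
 */
static bool folio_matches(uint64_t folio_swap_val, uint64_t entry_val,
			  uint64_t index, unsigned int order)
{
	return folio_swap_val == sub_entry_val(entry_val, index, order);
}
```

With the numbers from the example, sub_entry_val(0x4000, 1, 4) gives
0x4001, so folio_matches(0x4000, 0x4000, 1, 4) is false: the cached
order-4 folio still carries 0x4000, and in this model every retry
recomputes the same mismatching value.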