Date: Mon, 6 Jan 2025 14:29:45 +0800
Subject: Re: [PATCH v2] mm: shmem: skip swapcache for swapin of synchronous swap device
From: Baolin Wang
To: Matthew Wilcox
Cc: akpm@linux-foundation.org, hughd@google.com, david@redhat.com,
 wangkefeng.wang@huawei.com, kasong@tencent.com, ying.huang@linux.alibaba.com,
 21cnbao@gmail.com, ryan.roberts@arm.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <04997e54c276eff40a6119a90d36a4e71aade89c.1735806921.git.baolin.wang@linux.alibaba.com>
 <8344980d-4c22-4694-9a76-2e5a7ada50cb@linux.alibaba.com>

On 2025/1/6 12:59, Baolin Wang wrote:
>
>
> On 2025/1/6 12:07, Matthew Wilcox wrote:
>> On Mon, Jan 06, 2025 at 11:46:04AM +0800, Baolin Wang wrote:
>>> On 2025/1/2 21:10, Matthew Wilcox wrote:
>>>> On Thu, Jan 02, 2025 at 04:40:17PM +0800, Baolin Wang wrote:
>>>>> With fast swap devices (such as zram), swapin latency is crucial to
>>>>> applications. For shmem swapin, similar to anonymous memory swapin,
>>>>> we can skip the swapcache operation to improve swapin latency.
>>>>
>>>> OK, but now we have more complexity.  Why can't we always skip the
>>>> swapcache on swapin?
>>>
>>> Skipping swapcache is used to swap-in shmem large folios, avoiding the
>>> large folios being split. Meanwhile, since the IO latency of syncing
>>> swap devices is relatively small, it won't cause the IO latency
>>> amplification issue.
>>>
>>> But for async swap devices, if we swap-in the large folio one-time, I am
>>> afraid the IO latency can be amplified. And I remember we still haven't
>>> reached an agreement here[1], so let's step by step and start with the
>>> sync swap devices first.
>>
>> Regardless of whether we choose to swap-in an order-0 or a large folio,
>> my point is that we should always do it to the pagecache rather than the
>> swap cache.
>
> IMO, this would miss the swap readahead algorithm in the swap case,
> which can benefit the order-0 swap-in. We need more work to ensure that
> skipping swapcache is helpful for all cases, which is why I'm starting
> with sync swap devices first.

BTW, I used the SSD swap device to test the performance of skipping
swapcache with the following hack changes, and I found that the
performance of order-0 sequential swap-in shows a significant regression.

Without the following changes:
1G order-0 shmem swap-in: 8056 ms

With the following changes (skip swapcache):
1G order-0 shmem swap-in: 38536 ms

diff --git a/mm/page_io.c b/mm/page_io.c
index 9b983de351f9..1e22dedcd584 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -620,7 +620,6 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 	unsigned long pflags;
 	bool in_thrashing;

-	VM_BUG_ON_FOLIO(!folio_test_swapcache(folio) && !synchronous, folio);
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_uptodate(folio), folio);

diff --git a/mm/shmem.c b/mm/shmem.c
index e82ef1ef1c68..2902d3477520 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2295,7 +2295,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			fallback_order0 = true;

 		/* Skip swapcache for synchronous device. */
-		if (!fallback_order0 && data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
+		if (!fallback_order0) {
 			folio = shmem_swap_alloc_folio(inode, vma, index, swap, order, gfp);
 			if (!IS_ERR(folio)) {
 				skip_swapcache = true;
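
For reference, the two timings above work out to roughly a 4.8x slowdown (38536 ms / 8056 ms). For anyone wanting to take a comparable measurement, here is a minimal userspace sketch. It is not the test program used for the numbers in this mail, which was not posted. It assumes that a MAP_SHARED | MAP_ANONYMOUS mapping is shmem-backed, that madvise(MADV_PAGEOUT) is an acceptable way to push the pages out to swap, and that touching one byte per page sequentially approximates "sequential swap-in". The helper names slowdown_factor() and time_shmem_swapin_ms() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* value from the Linux UAPI headers (Linux 5.4+) */
#endif

/* Hypothetical helper: how many times slower the patched run is. */
double slowdown_factor(double baseline_ms, double patched_ms)
{
	assert(baseline_ms > 0.0);
	return patched_ms / baseline_ms;
}

/*
 * Hypothetical helper: populate `len` bytes of shared anonymous memory
 * (assumed shmem-backed), ask the kernel to page it out, then time a
 * sequential read of one byte per page.
 * Returns the elapsed time in milliseconds, or -1 on error.
 */
long time_shmem_swapin_ms(size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	struct timespec t0, t1;
	volatile unsigned char sink = 0;
	unsigned char *buf;

	if (page <= 0 || len == 0)
		return -1;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Fault every page in so there is something to swap out. */
	memset(buf, 0x5a, len);

	/* Best effort: without swap configured this may reclaim nothing. */
	(void)madvise(buf, len, MADV_PAGEOUT);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (size_t off = 0; off < len; off += (size_t)page)
		sink ^= buf[off];
	clock_gettime(CLOCK_MONOTONIC, &t1);
	(void)sink;

	munmap(buf, len);
	return (long)((t1.tv_sec - t0.tv_sec) * 1000 +
		      (t1.tv_nsec - t0.tv_nsec) / 1000000);
}
```

Calling time_shmem_swapin_ms((size_t)1 << 30) on a machine with an SSD swap device would approximate the 1G case. Whether the folios are actually order-0 depends on the shmem huge page settings, and MADV_PAGEOUT is only a hint, so it is worth checking the swap counters before trusting a number.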