From: Barry Song <21cnbao@gmail.com>
Date: Mon, 17 Jun 2024 21:40:08 +1200
Subject: Re: [PATCH] nfs: fix nfs_swap_rw for large-folio swap
To: Ryan Roberts, Christoph Hellwig
Cc: Andrew Morton, Trond Myklebust, Anna Schumaker, Steve French,
 linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-mm@kvack.org,
 Barry Song
In-Reply-To: <9ef638fc-5606-45da-a237-2e09ee05bbeb@arm.com>
References: <20240614100329.1203579-1-hch@lst.de>
 <20240614100329.1203579-2-hch@lst.de>
 <20240614112148.cd1961e84b736060c54bdf26@linux-foundation.org>
 <20240616085436.GA28058@lst.de>
 <9ef638fc-5606-45da-a237-2e09ee05bbeb@arm.com>

On Mon, Jun 17, 2024 at 8:03 PM Ryan Roberts wrote:
>
> On 16/06/2024 11:23, Barry Song wrote:
> > On Sun, Jun 16, 2024 at 4:54 PM Christoph Hellwig wrote:
> >>
> >> On Sun, Jun 16, 2024 at 12:16:10PM +1200, Barry Song wrote:
> >>> As I understand it, this isn't happening because we don't support
> >>> mTHP swapping out to a swapfile, whether it's on NFS or any
> >>> other filesystem.
> >>
> >> It does happen. The reason why I sent this patch is because I observed
> >> the BUG_ON trigger on a trivial swap generation workload (usemem.c from
> >> xfstests).
> >
> > This is quite unusual. Could you share your setup and backtrace? I'd
> > like to reproduce the issue, as the mm code only supports mTHP
> > swapout on block devices. What is your swap device or swap file?
> > Additionally, on what kind of filesystem is the executable file built
> > from usemem.c located?
>
> Yes, I'm also confused by this, since as Barry says, the swap-out changes to
> support mTHP are only intended to be activated when the swap device is a
> non-rotating block device - swap files on file systems are explicitly not
> supported and all swapping should be done page-by-page in that case. This
> constraint is exactly the same as for the pre-existing PMD-size THP swap-out
> support. So if you are seeing large folios being written after the mTHP swap-out
> change, you should also be seeing large folios before this change.
>
> Hopefully the stack trace will tell us what's going on here.

Hi Ryan, Christoph,

I am able to reproduce the issue now. I am debugging it and will share
the root cause with you this week.

Initial investigation shows the issue might *not* be related to THP_SWPOUT.
I can even reproduce it with THP and mTHP disabled, using only small folios:

[  215.925069] folio_alloc_swap folio nr:1 anon:1 swapbacked:1
[  215.926383] vmscan: shrink_folio_list folio nr:1 anon:1 swapbacked:1
[  215.927008] folio_alloc_swap folio nr:1 anon:1 swapbacked:1
[  215.929368] ------------[ cut here ]------------
[  215.929824] kernel BUG at fs/nfs/direct.c:144!
[  215.930403] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
[  215.931264] Modules linked in:
[  215.932328] CPU: 3 PID: 214 Comm: mthp_swpout_tes Not tainted 6.10.0-rc3-ga12328d9fb85-dirty #292
[  215.932953] Hardware name: linux,dummy-virt (DT)
[  215.933461] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[  215.934030] pc : nfs_swap_rw+0x60/0x70
[  215.935079] lr : swap_write_unplug+0x64/0xb0
[  215.935559] sp : ffff800087363280
[  215.935958] x29: ffff800087363280 x28: ffff0000c3241800 x27: fffffdffc323a4c0
[  215.937012] x26: fffffdffc323a4c8 x25: ffff0001b4a51500 x24: ffff80008250f670
[  215.937893] x23: 0000000000000001 x22: ffff0000c0b2da00 x21: 0000000000020000
[  215.938734] x20: ffff0000c46a8bd8 x19: ffff0000c154f800 x18: ffffffffffffffff
[  215.939594] x17: 0000000000000000 x16: 0000000000000000 x15: ffff800107363097
[  215.940591] x14: 0000000000000000 x13: 313a64656b636162 x12: 7061777320313a6e
[  215.941621] x11: 6f6e6120313a726e x10: ffff800083e86318 x9 : ffff8000803e9ad4
[  215.942673] x8 : ffff800087363168 x7 : 0000000000000000 x6 : ffff0001adbfa4c6
[  215.943674] x5 : 0000000000000002 x4 : 0000000000020000 x3 : 0000000000020000
[  215.944673] x2 : ffff8000806015e8 x1 : ffff8000873632a0 x0 : ffff0000c154f800
[  215.945568] Call trace:
[  215.945906]  nfs_swap_rw+0x60/0x70
[  215.946351]  __swap_writepage+0x2e8/0x328
[  215.946775]  swap_writepage+0x68/0xd0
[  215.947184]  pageout+0xe4/0x430
[  215.947587]  shrink_folio_list+0x9bc/0xf60
[  215.947992]  reclaim_folio_list+0x8c/0x168
[  215.948454]  reclaim_pages+0xfc/0x178
[  215.948843]  madvise_cold_or_pageout_pte_range+0x8d8/0xf28
[  215.949285]  walk_pgd_range+0x390/0x808
[  215.949660]  __walk_page_range+0x1e0/0x1f0
[  215.950040]  walk_page_range+0x1f0/0x2c8
[  215.950458]  madvise_pageout+0xf8/0x280
[  215.950905]  madvise_vma_behavior+0x314/0xa20
[  215.951361]  madvise_walk_vmas+0xc0/0x128
[  215.951807]  do_madvise.part.0+0x110/0x558
[  215.952298]  __arm64_sys_madvise+0x68/0x88
[  215.952723]  invoke_syscall+0x50/0x128
[  215.953148]  el0_svc_common.constprop.0+0x48/0xf8
[  215.953592]  do_el0_svc+0x28/0x40
[  215.954036]  el0_svc+0x50/0x150
[  215.954610]  el0t_64_sync_handler+0x13c/0x158
[  215.955070]  el0t_64_sync+0x1a4/0x1a8
[  215.955685] Code: a8c17bfd d50323bf 9a9fd000 d65f03c0 (d4210000)
[  215.956510] ---[ end trace 0000000000000000 ]---

>
> (Sorry for my slow responses/lack of engagement over the last month; it's been a
> combination of paternity leave/lack of sleep/working on other things. I'm hoping
> to get properly back into this stuff within the next couple of weeks).
>
> Thanks,
> Ryan
>
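
For reference, the constraint Ryan describes in the quoted text (large folios are swapped out as a unit only when the swap device is a non-rotating block device, and everything else goes page-by-page) can be written down as a small userspace model. This is only a sketch of the stated rule, not the kernel's implementation: `struct swap_backend` and `swapout_unit_pages()` are hypothetical names invented for illustration and do not exist in mm/.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a swap backend; NOT a kernel structure. */
struct swap_backend {
	bool is_block_device;	/* false for a swap file on a filesystem, e.g. NFS */
	bool is_rotational;	/* true for spinning disks */
};

/*
 * Number of pages expected to be written out as one unit under the rule
 * quoted above: a large folio stays whole only on a non-rotating block
 * device; in every other case swapping is done page-by-page.
 */
static size_t swapout_unit_pages(const struct swap_backend *sb,
				 size_t folio_nr_pages)
{
	if (sb->is_block_device && !sb->is_rotational)
		return folio_nr_pages;
	return 1;
}
```

Under this model, a swap file on NFS (is_block_device = false) always yields 1, which is why a write larger than one page reaching nfs_swap_rw() is unexpected regardless of the THP settings.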