References: <20240503005023.174597-1-21cnbao@gmail.com>
 <20240503005023.174597-7-21cnbao@gmail.com>
 <0226a6f7-26ac-48b0-932d-1b7201cde1d7@redhat.com>
In-Reply-To: <0226a6f7-26ac-48b0-932d-1b7201cde1d7@redhat.com>
From: Barry Song <21cnbao@gmail.com>
Date: Tue, 7 May 2024 00:27:02 +1200
Subject: Re: [PATCH v3 6/6] mm: swap: entirely map large folios found in swapcache
To: David Hildenbrand
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
 baolin.wang@linux.alibaba.com, chrisl@kernel.org, hanchuanhua@oppo.com,
 hannes@cmpxchg.org, hughd@google.com, kasong@tencent.com,
 linux-kernel@vger.kernel.org, ryan.roberts@arm.com, surenb@google.com,
 v-songbaohua@oppo.com, willy@infradead.org, xiang@kernel.org,
 ying.huang@intel.com, yosryahmed@google.com, yuzhao@google.com,
 ziy@nvidia.com

On Tue, May 7, 2024 at 12:05 AM David Hildenbrand wrote:
>
> On 03.05.24 02:50, Barry Song wrote:
> > From: Chuanhua Han
> >
> > When a large folio is found in the swapcache, the current implementation
> > requires calling do_swap_page() nr_pages times, resulting in nr_pages
> > page faults.
> > This patch opts to map the entire large folio at once to
> > minimize page faults. Additionally, redundant checks and early exits
> > for ARM64 MTE restoring are removed.
> >
> > Signed-off-by: Chuanhua Han
> > Co-developed-by: Barry Song
> > Signed-off-by: Barry Song
> > ---
> >  mm/memory.c | 60 ++++++++++++++++++++++++++++++++++++++++++-----------
> >  1 file changed, 48 insertions(+), 12 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 22e7c33cc747..940fdbe69fa1 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3968,6 +3968,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >  	pte_t pte;
> >  	vm_fault_t ret = 0;
> >  	void *shadow = NULL;
> > +	int nr_pages = 1;
> > +	unsigned long page_idx = 0;
> > +	unsigned long address = vmf->address;
> > +	pte_t *ptep;
> >
> >  	if (!pte_unmap_same(vmf))
> >  		goto out;
> > @@ -4166,6 +4170,36 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >  		goto out_nomap;
> >  	}
> >
> > +	ptep = vmf->pte;
> > +	if (folio_test_large(folio) && folio_test_swapcache(folio)) {
> > +		int nr = folio_nr_pages(folio);
> > +		unsigned long idx = folio_page_idx(folio, page);
> > +		unsigned long folio_start = vmf->address - idx * PAGE_SIZE;
> > +		unsigned long folio_end = folio_start + nr * PAGE_SIZE;
> > +		pte_t *folio_ptep;
> > +		pte_t folio_pte;
> > +
> > +		if (unlikely(folio_start < max(vmf->address & PMD_MASK, vma->vm_start)))
> > +			goto check_folio;
> > +		if (unlikely(folio_end > pmd_addr_end(vmf->address, vma->vm_end)))
> > +			goto check_folio;
> > +
> > +		folio_ptep = vmf->pte - idx;
> > +		folio_pte = ptep_get(folio_ptep);
> > +		if (!pte_same(folio_pte, pte_move_swp_offset(vmf->orig_pte, -idx)) ||
> > +		    swap_pte_batch(folio_ptep, nr, folio_pte) != nr)
> > +			goto check_folio;
> > +
> > +		page_idx = idx;
> > +		address = folio_start;
> > +		ptep = folio_ptep;
> > +		nr_pages = nr;
> > +		entry = folio->swap;
> > +		page = &folio->page;
> > +	}
> > +
> > +check_folio:
> > +
> >  	/*
> >  	 * PG_anon_exclusive reuses PG_mappedtodisk for anon pages. A swap pte
> >  	 * must never point at an anonymous page in the swapcache that is
> > @@ -4225,12 +4259,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >  	 * We're already holding a reference on the page but haven't mapped it
> >  	 * yet.
> >  	 */
> > -	swap_free_nr(entry, 1);
> > +	swap_free_nr(entry, nr_pages);
> >  	if (should_try_to_free_swap(folio, vma, vmf->flags))
> >  		folio_free_swap(folio);
> >
> > -	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
> > -	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
> > +	folio_ref_add(folio, nr_pages - 1);
> > +	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
> > +	add_mm_counter(vma->vm_mm, MM_SWAPENTS, -nr_pages);
> >  	pte = mk_pte(page, vma->vm_page_prot);
> >
> >  	/*
> > @@ -4240,34 +4275,35 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >  	 * exclusivity.
> >  	 */
> >  	if (!folio_test_ksm(folio) &&
> > -	    (exclusive || folio_ref_count(folio) == 1)) {
> > +	    (exclusive || (folio_ref_count(folio) == nr_pages &&
> > +	     folio_nr_pages(folio) == nr_pages))) {
> >  		if (vmf->flags & FAULT_FLAG_WRITE) {
> >  			pte = maybe_mkwrite(pte_mkdirty(pte), vma);
> >  			vmf->flags &= ~FAULT_FLAG_WRITE;
>
> I fail to convince myself that this change is correct, and if it is
> correct, it's confusing (I think there is a dependency on
> folio_free_swap() having been called and succeeding, such that we don't
> have a folio that is in the swapcache at this point).
>
> Why can't we move the folio_ref_add() after this check and just leave
> the check as it is?
>
> "folio_ref_count(folio) == 1" is as clear as it gets: we hold the single
> reference, so we can do with this thing whatever we want: it's certainly
> exclusive. No swapcache, no other people mapping it.

Right. I believe the code works correctly, but it is a bit confusing. As you
said, we could move folio_ref_add() after the folio_ref_count(folio) == 1
check.

>
>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry
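
For illustration only, the two orderings discussed above can be compared
in a standalone userspace sketch. This is not kernel code: struct
toy_folio and both helper functions are hypothetical names invented for
this note, and the model tracks nothing but a reference count and a page
count. It only shows the arithmetic: testing "refcount == nr_pages" after
adding nr_pages - 1 references gives the same answer as testing
"refcount == 1" before adding them, as long as the whole folio is mapped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio; only what this sketch needs. */
struct toy_folio {
	int refcount;	/* models folio_ref_count() */
	int nr_pages;	/* models folio_nr_pages() */
};

/*
 * Ordering in the quoted patch: take the extra references first, then
 * compare the count against the number of pages being mapped.
 */
static bool exclusive_check_after_ref_add(struct toy_folio *f, int nr_mapped)
{
	f->refcount += nr_mapped - 1;	/* models folio_ref_add() */
	return f->refcount == nr_mapped && f->nr_pages == nr_mapped;
}

/*
 * Suggested ordering: test for the single reference first, then take
 * the extra references.
 */
static bool exclusive_check_before_ref_add(struct toy_folio *f, int nr_mapped)
{
	bool exclusive = (f->refcount == 1);

	f->refcount += nr_mapped - 1;	/* models folio_ref_add() */
	return exclusive;
}
```

In this toy model both helpers leave the same final reference count. They
can disagree only when fewer pages are mapped than the folio holds, where
the extra folio_nr_pages() comparison in the first helper makes it answer
"not exclusive".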