From: Yu Zhao <yuzhao@google.com>
Date: Fri, 7 Jul 2023 22:45:53 -0600
Subject: Re: [RFC PATCH 0/3] support large folio for mlock
To: Yin Fengwei
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ryan.roberts@arm.com, shy828301@gmail.com, akpm@linux-foundation.org, willy@infradead.org, david@redhat.com
In-Reply-To: <20230707165221.4076590-1-fengwei.yin@intel.com>

On Fri, Jul 7, 2023 at 10:52 AM Yin Fengwei wrote:
>
> Yu mentioned at [1] that mlock() can't be applied to large folios.
>
> I studied the related code and here is my understanding:
>
> - For RLIMIT_MEMLOCK, there is no problem, because the RLIMIT_MEMLOCK
>   statistics are not tied to the underlying pages. That means mlocking
>   or munlocking the underlying pages doesn't affect the RLIMIT_MEMLOCK
>   accounting, which stays correct.
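Right -- the RLIMIT_MEMLOCK check in do_mlock() only consults the
requested length and mm->locked_vm, roughly along these lines (a
simplified sketch of mm/mlock.c, not the exact code):

        lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
        locked = (len >> PAGE_SHIFT) + current->mm->locked_vm;

        /* VMA-range accounting only; per-folio mlock state never enters here */
        if ((locked <= lock_limit) || capable(CAP_IPC_LOCK))
                error = apply_vma_lock_flags(start, len, flags);

so per-page (or per-folio) mlock state indeed never affects the limit.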
>
> - For keeping the page in RAM, there is no problem either. At least,
>   during try_to_unmap_one(), once it detects that the VMA has the
>   VM_LOCKED bit set in vm_flags, the folio will be kept regardless of
>   whether the folio itself is mlocked or not.
>
> So mlock already works functionally for large folios, but it is not
> optimal, because page reclaim still needs to scan these large folios
> and may split them.
>
> This series classifies large folios for mlock into two types:
> - large folios fully within a VM_LOCKED VMA range
> - large folios crossing a VM_LOCKED VMA boundary
>
> For the first type, we mlock the large folio so page reclaim will skip
> it. For the second type, we don't mlock the large folio; it is allowed
> to be picked by page reclaim and be split, so the pages outside the
> VM_LOCKED VMA range can still be reclaimed/released.

This is a sound design, which is also what I have in mind. I see the
rationales are being spelled out in this thread, and hopefully everyone
can be convinced.

> patch1 introduces an API to check whether a large folio is within a
>   VMA range.
> patch2 makes page reclaim/mlock_vma_folio/munlock_vma_folio support
>   large folio mlock/munlock.
> patch3 makes the mlock/munlock syscalls support large folios.

Could you tidy up the last patch a little bit? E.g., saying "mlock the
4K folio" is obviously not the best idea. And if it's possible, make
the loop just look like before, i.e.,

        if (!can_mlock_entire_folio())
                continue;

        if (vma->vm_flags & VM_LOCKED)
                mlock_folio_range();
        else
                munlock_folio_range();

Thanks.
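For illustration only -- can_mlock_entire_folio(), mlock_folio_range()
and munlock_folio_range() are placeholder names and their argument
lists are guesses -- the per-pte loop in mlock_pte_range() could then
end up looking roughly like this:

        for (pte = start_pte; addr != end; pte++, addr += PAGE_SIZE) {
                ptent = ptep_get(pte);
                if (!pte_present(ptent))
                        continue;
                folio = vm_normal_folio(vma, addr, ptent);
                if (!folio || folio_is_zone_device(folio))
                        continue;

                /* leave boundary-crossing large folios to reclaim */
                if (!can_mlock_entire_folio(folio, vma, addr))
                        continue;

                if (vma->vm_flags & VM_LOCKED)
                        mlock_folio_range(folio, vma, addr);
                else
                        munlock_folio_range(folio, vma, addr);
        }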