From: Barry Song <21cnbao@gmail.com>
Date: Tue, 27 Feb 2024 15:17:30 +1300
Subject: Re: [PATCH 1/1] mm/madvise: enhance lazyfreeing with mTHP in madvise_free
To: Yin Fengwei
Cc: Ryan Roberts, Lance Yang, David Hildenbrand, akpm@linux-foundation.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, mhocko@suse.com,
    minchan@kernel.org, peterx@redhat.com, shy828301@gmail.com,
    songmuchun@bytedance.com, wangkefeng.wang@huawei.com, zokeefe@google.com
References: <20240226083714.26187-1-ioworker0@gmail.com>
    <9bcf5141-7376-441e-bbe3-779956ef28b9@redhat.com>
    <318be511-06de-423e-8216-af869f27f849@arm.com>
    <19758162-be5f-4dc4-b316-77b0115d12ce@intel.com>
In-Reply-To: <19758162-be5f-4dc4-b316-77b0115d12ce@intel.com>

On Tue, Feb 27, 2024 at 2:51 PM Yin Fengwei wrote:
>
>
> On 2/27/24 04:49, Barry Song wrote:
> > On Tue, Feb 27, 2024 at 2:04 AM Ryan Roberts wrote:
> >>
> >> On 26/02/2024 08:55, Lance Yang wrote:
> >>> Hey David,
> >>>
> >>> Thanks for your suggestion!
> >>>
> >>> On Mon, Feb 26, 2024 at 4:41 PM David Hildenbrand wrote:
> >>>>
> >>> [...]
> >>>>> On Mon, Feb 26, 2024 at 12:00 PM Barry Song <21cnbao@gmail.com> wrote:
> >>>>> [...]
> >>>>>> On Mon, Feb 26, 2024 at 1:33 AM Lance Yang wrote:
> >>>>> [...]
> >>> [...]
> >>>>> +static inline bool pte_range_cont_mapped(pte_t *pte, unsigned long nr)
> >>>>> +{
> >>>>> +       pte_t pte_val;
> >>>>> +       unsigned long pfn = pte_pfn(pte);
> >>>>> +       for (int i = 0; i < nr; i++) {
> >>>>> +               pte_val = ptep_get(pte + i);
> >>>>> +               if (pte_none(pte_val) || pte_pfn(pte_val) != (pfn + i))
> >>>>> +                       return false;
> >>>>> +       }
> >>>>> +       return true;
> >>>>> +}
> >>>>
> >>>> I dislike the "cont mapped" terminology.
> >>>>
> >>>> Maybe folio_pte_batch() does what you want?
> >>>
> >>> folio_pte_batch() is a good choice. Appreciate it!
> >>
> >> Agreed, folio_pte_batch() is likely to be widely useful for this change and
> >> others, so suggest exporting it from memory.c and reusing as is if possible.
> >
> > I actually missed folio_pte_batch() in cont-pte series and re-invented
> > a function to check if a large folio is entirely mapped in
> > MADV_PAGEOUT[1]. exporting folio_pte_batch() will also benefit that case.
> > The problem space is same.
> >
> > [1] https://lore.kernel.org/linux-mm/20240118111036.72641-7-21cnbao@gmail.com/
> I am wondering whether we can delay large folio split till page reclaim phase
> for madvise cases.
>
> Like if we hit folio which is partially mapped to the range, don't split it but
> just unmap the mapping part from the range. Let page reclaim decide whether
> split the large folio or not (If it's not mapped to any other range,it will be
> freed as whole large folio. If part of it still mapped to other range,page reclaim
> can decide whether to split it or ignore it for current reclaim cycle).

Yes, we can. But we still have to play the PTE check game to avoid adding
the same folio to the reclaim list multiple times.

I don't see much difference between splitting in madvise and splitting in
vmscan, as our real purpose is to avoid splitting entirely mapped large
folios.
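
To make the PTE check game concrete, here is a rough user-space sketch in
plain C. It is not kernel code. struct toy_pte, struct toy_folio,
toy_pte_batch() and toy_queue_count() are made-up names for illustration
only; toy_pte_batch() loosely models the kind of batching discussed above and
does not reproduce the real folio_pte_batch() signature. The sketch assumes
that pfn 0 means "not present" and that the folio starts at a non-zero pfn.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a PTE: pfn == 0 means "not present" (model assumption). */
struct toy_pte {
	unsigned long pfn;
};

/* Toy large folio covering pfns [first_pfn, first_pfn + nr_pages). */
struct toy_folio {
	unsigned long first_pfn;	/* assumed non-zero in this model */
	unsigned long nr_pages;
};

/*
 * Count how many entries, starting at ptes[0], map consecutive pages of
 * the folio. At most max_nr entries are inspected. Returns 0 if ptes[0]
 * does not map a page of the folio.
 */
static size_t toy_pte_batch(const struct toy_pte *ptes, size_t max_nr,
			    const struct toy_folio *folio)
{
	assert(max_nr > 0);

	unsigned long pfn = ptes[0].pfn;	/* first entry, read by value */
	unsigned long end = folio->first_pfn + folio->nr_pages;
	size_t nr = 0;

	if (pfn < folio->first_pfn || pfn >= end)
		return 0;

	while (nr < max_nr && pfn + nr < end && ptes[nr].pfn == pfn + nr)
		nr++;
	return nr;
}

/*
 * Walk nr_ptes entries without splitting the folio and count how many
 * times it would be queued for reclaim. last_queued remembers the folio
 * queued most recently, so later batches of the same folio are skipped.
 * With several folios this would need a pfn-to-folio lookup instead.
 */
static size_t toy_queue_count(const struct toy_pte *ptes, size_t nr_ptes,
			      const struct toy_folio *folio)
{
	const struct toy_folio *last_queued = NULL;
	size_t queued = 0;
	size_t i = 0;

	while (i < nr_ptes) {
		size_t nr = toy_pte_batch(&ptes[i], nr_ptes - i, folio);

		if (nr == 0) {		/* hole or unrelated page: skip it */
			i++;
			continue;
		}
		if (last_queued != folio) {
			last_queued = folio;
			queued++;
		}
		i += nr;		/* jump over the whole batch */
	}
	return queued;
}
```

In this model a folio mapped with a hole in the middle produces two separate
batches, which is the case where remembering the previously queued folio
matters if we do not split.
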
For partially mapped large folios, if we split in madvise, we don't need to
play the game of skipping folios while iterating PTEs. If we don't split in
madvise, we have to make sure the large folio is added to the reclaim list
only once, by checking whether each PTE belongs to the previously added
folio.

>
> Splitting does work here. But it just drops all the benefits of large folio.
>
>
> Regards
> Yin, Fengwei
>
> >
> >>
> >>>
> >>> Best,
> >>> Lance
> >>>
> >>>>
> >>>> --
> >>>> Cheers,
> >>>>
> >>>> David / dhildenb
>

Thanks
Barry