From: Hillf Danton
To: Alexander Graf
Cc: Alexander Duyck, kvm@vger.kernel.org, mst@redhat.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, mgorman@techsingularity.net,
	Hillf Danton, Minchan Kim, vbabka@suse.cz
Subject: Re: [PATCH v16.1 0/9] mm / virtio: Provide support for free page reporting
Date: Fri, 24 Jan 2020 21:23:52 +0800
Message-Id: <20200124132352.12824-1-hdanton@sina.com>
In-Reply-To: <20200122173040.6142.39116.stgit@localhost.localdomain>
References: <20200122173040.6142.39116.stgit@localhost.localdomain>

On Thu, 23 Jan 2020 11:20:07 +0100 Alexander Graf wrote:
>
> The big problem I see is that what I really want from a user's point of
> view is a tuneable that says "Automatically free clean page cache pages
> that were not accessed in the last X minutes".

The diff below is made on top of 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
and has not been tested in any form. It is meant to go in line with the
tunable above, except that "X minutes" is not taken into account.

[BTW, please take a look at

	Content-Type: text/plain; charset="utf-8"; format="flowed"
	Content-Transfer-Encoding: base64

and ensure pure text message.]

--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -69,6 +69,7 @@
 
 #define MADV_COLD	20		/* deactivate these pages */
 #define MADV_PAGEOUT	21		/* reclaim these pages */
+#define MADV_CCPC	22		/* reclaim cold & clean page cache pages */
 
 /* compatibility flags */
 #define MAP_FILE	0
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -35,6 +35,7 @@
 struct madvise_walk_private {
 	struct mmu_gather *tlb;
 	bool pageout;
+	int behavior;
 };
 
 /*
@@ -50,6 +51,7 @@ static int madvise_need_mmap_write(int b
 	case MADV_DONTNEED:
 	case MADV_COLD:
 	case MADV_PAGEOUT:
+	case MADV_CCPC:
 	case MADV_FREE:
 		return 0;
 	default:
@@ -304,6 +306,7 @@ static int madvise_cold_or_pageout_pte_r
 	struct madvise_walk_private *private = walk->private;
 	struct mmu_gather *tlb = private->tlb;
 	bool pageout = private->pageout;
+	bool ccpc = private->behavior == MADV_CCPC;
 	struct mm_struct *mm = tlb->mm;
 	struct vm_area_struct *vma = walk->vma;
 	pte_t *orig_pte, *pte, ptent;
@@ -429,6 +432,8 @@ regular_page:
 		VM_BUG_ON_PAGE(PageTransCompound(page), page);
 
 		if (pte_young(ptent)) {
+			if (ccpc)
+				continue;
 			ptent = ptep_get_and_clear_full(mm, addr, pte,
 							tlb->fullmm);
 			ptent = pte_mkold(ptent);
@@ -436,6 +441,10 @@ regular_page:
 			tlb_remove_tlb_entry(tlb, pte, addr);
 		}
 
+		if (ccpc)
+			if (PageDirty(page))
+				continue;
+
 		/*
 		 * We are deactivating a page for accelerating reclaiming.
 		 * VM couldn't reclaim the page unless we clear PG_young.
@@ -502,12 +511,13 @@ static long madvise_cold(struct vm_area_
 }
 
 static void madvise_pageout_page_range(struct mmu_gather *tlb,
-			     struct vm_area_struct *vma,
+			     struct vm_area_struct *vma, int behavior,
 			     unsigned long addr, unsigned long end)
 {
 	struct madvise_walk_private walk_private = {
 		.pageout = true,
 		.tlb = tlb,
+		.behavior = behavior,
 	};
 
 	tlb_start_vma(tlb, vma);
@@ -515,10 +525,10 @@ static void madvise_pageout_page_range(s
 	tlb_end_vma(tlb, vma);
 }
 
-static inline bool can_do_pageout(struct vm_area_struct *vma)
+static inline bool can_do_pageout(struct vm_area_struct *vma, int behavior)
 {
 	if (vma_is_anonymous(vma))
-		return true;
+		return behavior != MADV_CCPC;
 	if (!vma->vm_file)
 		return false;
 	/*
@@ -531,7 +541,7 @@ static inline bool can_do_pageout(struct
 		inode_permission(file_inode(vma->vm_file), MAY_WRITE) == 0;
 }
 
-static long madvise_pageout(struct vm_area_struct *vma,
+static long madvise_pageout(struct vm_area_struct *vma, int behavior,
 			struct vm_area_struct **prev,
 			unsigned long start_addr, unsigned long end_addr)
 {
@@ -542,12 +552,12 @@ static long madvise_pageout(struct vm_ar
 	if (!can_madv_lru_vma(vma))
 		return -EINVAL;
 
-	if (!can_do_pageout(vma))
+	if (!can_do_pageout(vma, behavior))
 		return 0;
 
 	lru_add_drain();
 	tlb_gather_mmu(&tlb, mm, start_addr, end_addr);
-	madvise_pageout_page_range(&tlb, vma, start_addr, end_addr);
+	madvise_pageout_page_range(&tlb, vma, behavior, start_addr, end_addr);
 	tlb_finish_mmu(&tlb, start_addr, end_addr);
 
 	return 0;
@@ -936,7 +946,8 @@ madvise_vma(struct vm_area_struct *vma,
 	case MADV_COLD:
 		return madvise_cold(vma, prev, start, end);
 	case MADV_PAGEOUT:
-		return madvise_pageout(vma, prev, start, end);
+	case MADV_CCPC:
+		return madvise_pageout(vma, behavior, prev, start, end);
 	case MADV_FREE:
 	case MADV_DONTNEED:
 		return madvise_dontneed_free(vma, prev, start, end, behavior);
@@ -960,6 +971,7 @@ madvise_behavior_valid(int behavior)
 	case MADV_FREE:
 	case MADV_COLD:
 	case MADV_PAGEOUT:
+	case MADV_CCPC:
 #ifdef CONFIG_KSM
 	case MADV_MERGEABLE:
 	case MADV_UNMERGEABLE:
--
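
For readers who want the effect of the MADV_CCPC checks at a glance, here is
a minimal user-space sketch of the decision the diff appears to add. It is a
model, not kernel code. The struct, the function names and the MODEL_
prefixes are invented for illustration; the file-backed permission test in
can_do_pageout() is collapsed into a single flag; every other check in the
page-table walker is ignored. The values 21 and 22 are the ones used in the
diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as they appear in the diff; MADV_CCPC is only proposed there. */
#define MODEL_MADV_PAGEOUT 21
#define MODEL_MADV_CCPC    22

/* Hypothetical stand-in for the state the kernel code looks at. */
struct model_page {
	bool vma_anonymous;  /* models vma_is_anonymous(vma) */
	bool file_access_ok; /* models the vm_file and inode checks, collapsed */
	bool pte_young;      /* models pte_young(ptent) */
	bool page_dirty;     /* models PageDirty(page) */
};

/* Models can_do_pageout(vma, behavior) from the diff. */
static bool model_can_do_pageout(const struct model_page *p, int behavior)
{
	if (p->vma_anonymous)
		return behavior != MODEL_MADV_CCPC; /* CCPC skips anonymous memory */
	return p->file_access_ok;
}

/*
 * Returns true if the page would get past the checks touched by the diff.
 * With MADV_PAGEOUT a young PTE is aged and the page goes on; with
 * MADV_CCPC a young PTE or a dirty page makes the walker skip the page.
 */
static bool model_reclaim_candidate(const struct model_page *p, int behavior)
{
	assert(behavior == MODEL_MADV_PAGEOUT || behavior == MODEL_MADV_CCPC);

	if (!model_can_do_pageout(p, behavior))
		return false;
	if (behavior == MODEL_MADV_CCPC && (p->pte_young || p->page_dirty))
		return false;
	return true;
}
```

In this model a clean page cache page whose PTE is not young is the only
kind that passes with MODEL_MADV_CCPC, which is the "cold & clean page
cache" case named in the new define's comment.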