Subject: Re: [RFC 1/6] mm, thp: introduce thp zero subpages reclaim
To: Matthew Wilcox
Cc: linux-mm@kvack.org, Andrew Morton, Johannes Weiner, Michal Hocko,
 Vladimir Davydov, Yu Zhao
References: <1635422215-99394-1-git-send-email-ningzhang@linux.alibaba.com>
 <1635422215-99394-2-git-send-email-ningzhang@linux.alibaba.com>
From: Ning Zhang <ningzhang@linux.alibaba.com>
Message-ID: <0ee9c9f6-4d56-a8dd-c922-0d4fc456cbf0@linux.alibaba.com>
Date: Fri, 29 Oct 2021 20:16:00 +0800

On 2021/10/28 8:53 PM, Matthew Wilcox wrote:
> On Thu, Oct 28, 2021 at 07:56:50PM +0800, Ning Zhang wrote:
>> +++ b/include/linux/huge_mm.h
>> @@ -185,6 +185,15 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>>  void free_transhuge_page(struct page *page);
>>  bool is_transparent_hugepage(struct page *page);
>>
>> +#ifdef CONFIG_MEMCG
>> +int zsr_get_hpage(struct hpage_reclaim *hr_queue, struct page **reclaim_page);
>> +unsigned long zsr_reclaim_hpage(struct lruvec *lruvec, struct page *page);
>> +static inline struct list_head *hpage_reclaim_list(struct page *page)
>> +{
>> +	return &page[3].hpage_reclaim_list;
>> +}
>> +#endif
> I don't think any of this needs to be under an ifdef. That goes for a
> lot of your other additions to header files.
>
>> @@ -1110,6 +1121,10 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
>>  					    gfp_t gfp_mask,
>>  					    unsigned long *total_scanned);
>>
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +void del_hpage_from_queue(struct page *page);
>> +#endif
> That name is too generic. Also, to avoid ifdefs in code, it should be:
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> void del_hpage_from_queue(struct page *page);
> #else
> static inline void del_hpage_from_queue(struct page *page) { }
> #endif
>
>> @@ -159,6 +159,12 @@ struct page {
>>  			/* For both global and memcg */
>>  			struct list_head deferred_list;
>>  		};
>> +		struct {	/* Third tail page of compound page */
>> +			unsigned long _compound_pad_2;
>> +			unsigned long _compound_pad_3;
>> +			/* For zero subpages reclaim */
>> +			struct list_head hpage_reclaim_list;
> Why do you need _compound_pad_3 here?
>
>> +++ b/include/linux/mmzone.h
>> @@ -787,6 +787,12 @@ struct deferred_split {
>>  	struct list_head split_queue;
>>  	unsigned long split_queue_len;
>>  };
>> +
>> +struct hpage_reclaim {
>> +	spinlock_t reclaim_queue_lock;
>> +	struct list_head reclaim_queue;
>> +	unsigned long reclaim_queue_len;
>> +};
> Have you considered using an XArray instead of a linked list?
>
>> +static bool hpage_estimate_zero(struct page *page)
>> +{
>> +	unsigned int i, maybe_zero_pages = 0, offset = 0;
>> +	void *addr;
>> +
>> +#define BYTES_PER_LONG (BITS_PER_LONG / BITS_PER_BYTE)
> BYTES_PER_LONG is simply sizeof(long).
> Also, I'd check the entire cacheline rather than just one word; it's
> essentially free.
>
>> +#ifdef CONFIG_MMU
>> +#define ZSR_PG_MLOCK(flag)	(1UL << flag)
>> +#else
>> +#define ZSR_PG_MLOCK(flag)	0
>> +#endif
> Or use __PG_MLOCKED ?
>
>> +#ifdef CONFIG_ARCH_USES_PG_UNCACHED
>> +#define ZSR_PG_UNCACHED(flag)	(1UL << flag)
>> +#else
>> +#define ZSR_PG_UNCACHED(flag)	0
>> +#endif
> Define __PG_UNCACHED in page-flags.h?
>
>> +#ifdef CONFIG_MEMORY_FAILURE
>> +#define ZSR_PG_HWPOISON(flag)	(1UL << flag)
>> +#else
>> +#define ZSR_PG_HWPOISON(flag)	0
>> +#endif
> __PG_HWPOISON
>
>> +#define hr_queue_list_to_page(head) \
>> +	compound_head(list_entry((head)->prev, struct page, \
>> +				 hpage_reclaim_list))
> I think you're better off subtracting 3*sizeof(struct page) than
> loading from compound_head.
>
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +/* Need the page lock if the page is not a newly allocated page. */
>> +static void add_hpage_to_queue(struct page *page, struct mem_cgroup *memcg)
>> +{
>> +	struct hpage_reclaim *hr_queue;
>> +	unsigned long flags;
>> +
>> +	if (READ_ONCE(memcg->thp_reclaim) == THP_RECLAIM_DISABLE)
>> +		return;
>> +
>> +	page = compound_head(page);
> Why do you think the caller might be passing in a tail page here?

Thanks for the comments! I will modify it.
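To make the "check the entire cacheline" point above concrete, here is a
minimal userspace sketch; it is not the patch's code, and PAGE_SIZE,
CACHELINE_SIZE, and cacheline_is_zero() are assumed values and a
hypothetical helper for illustration only. It shows how a whole-line check
catches non-zero bytes that a single-word probe would miss:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_SIZE	4096UL	/* assumed 4K pages */
#define CACHELINE_SIZE	64UL	/* assumed 64-byte cachelines */

/* True if the first cacheline of the buffer is entirely zero. */
static bool cacheline_is_zero(const void *buf)
{
	const unsigned long *word = buf;
	size_t i;

	/*
	 * The first load pulls the whole line into cache, so
	 * comparing the remaining words costs almost nothing extra.
	 */
	for (i = 0; i < CACHELINE_SIZE / sizeof(unsigned long); i++)
		if (word[i])
			return false;
	return true;
}

int main(void)
{
	static unsigned char page[PAGE_SIZE];	/* zero-initialized */

	printf("all zero: %d\n", cacheline_is_zero(page));	/* 1 */
	page[8] = 1;	/* dirty the second word on a 64-bit build */
	printf("all zero: %d\n", cacheline_is_zero(page));	/* 0 */
	return 0;
}

The second call reports non-zero even though the first long of the page is
still zero, a case a one-word probe would misclassify; that is why sampling
a full line per subpage improves the estimate at essentially the same cost.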