From: Huaisheng HS1 Ye <yehs1@lenovo.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"willy@infradead.org" <willy@infradead.org>,
"vbabka@suse.cz" <vbabka@suse.cz>,
"mgorman@techsingularity.net" <mgorman@techsingularity.net>,
"kstewart@linuxfoundation.org" <kstewart@linuxfoundation.org>,
"alexander.levin@verizon.com" <alexander.levin@verizon.com>,
"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
"colyli@suse.de" <colyli@suse.de>,
NingTing Cheng <chengnt@lenovo.com>,
Ocean HY1 He <hehy1@lenovo.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"iommu@lists.linux-foundation.org"
<iommu@lists.linux-foundation.org>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
Christoph Hellwig <hch@lst.de>
Subject: RE: [External] Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
Date: Fri, 25 May 2018 09:43:09 +0000
Message-ID: <HK2PR03MB1684ED6EC6859A88A196DC0C92690@HK2PR03MB1684.apcprd03.prod.outlook.com>
In-Reply-To: <20180524121853.GG20441@dhcp22.suse.cz>
From: Michal Hocko [mailto:mhocko@kernel.org]
Sent: Thursday, May 24, 2018 8:19 PM
> > Let me try to answer your questions.
> > Exactly, GFP_ZONE_TABLE is too complicated. I think this series of patches
> > has two advantages.
> >
> > 1. The XOR operation is simple and efficient. GFP_ZONE_TABLE/BAD needs two
> > shift operations: the first gets the zone_type, and the second checks whether
> > the type to be returned is valid. With these patches, a single XOR is
> > enough. Because the bottom 3 bits of the GFP bitmask are used to represent
> > the encoded zone number, there can be no bad zone number as long as all
> > callers use it correctly. Of course, the zone type returned by gfp_zone()
> > must be no greater than ZONE_MOVABLE.
>
> But you are losing the ability to check for wrong usage. And it seems
> that the sad reality is that the existing code does screw up.
In my opinion, there should never have been so many wrong combinations of these bottom 3 bits. Every user, whether a driver or a filesystem, should decide which zone it prefers. Matthew's idea is great because it forces the user to supply an unambiguous value for the GFP zone bits.
Ideally, before modifying the address zone modifier, a user should first clear it and then OR in the GFP zone flag for the zone they prefer.
With these patches, we can state loudly that the bottom 3 bits of the zone mask must not be ORed together.
A combination like __GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM is illegal. The current GFP_ZONE_TABLE scheme is precisely the root of this problem, because __GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 are separate bits (0x1, 0x2 and 0x4) that callers can OR together.
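To make the "clear first, then OR" rule concrete, here is a minimal user-space sketch. All names (toy_zone, TOY_ZONE_MASK, toy_zone_encode, toy_gfp_set_zone) are hypothetical and exist only for illustration, and the four-zone layout and the "zone number XOR ZONE_NORMAL" encoding are assumptions about how the bottom 3 bits could be filled. It is not the code from the patches.

```c
#include <assert.h>

/* Hypothetical zone numbering for illustration; not the kernel's enum. */
enum toy_zone { TOY_ZONE_DMA, TOY_ZONE_DMA32, TOY_ZONE_NORMAL, TOY_ZONE_MOVABLE };

/* Assumption: the bottom 3 bits of the mask hold the encoded zone number. */
#define TOY_ZONE_MASK 0x7u

/* Encode a zone so that an all-zero zone field still means TOY_ZONE_NORMAL. */
static inline unsigned int toy_zone_encode(enum toy_zone z)
{
	assert(z <= TOY_ZONE_MOVABLE);
	return ((unsigned int)z ^ TOY_ZONE_NORMAL) & TOY_ZONE_MASK;
}

/* Replace the zone field: clear the 3 zone bits first, then OR in one zone. */
static inline unsigned int toy_gfp_set_zone(unsigned int gfp, enum toy_zone z)
{
	return (gfp & ~TOY_ZONE_MASK) | toy_zone_encode(z);
}
```

Because the helper always clears the field before writing it, a caller cannot accumulate two zone requests in the same mask, which is the failure mode of ORing one-hot flags.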
>
> > 2. GFP_ZONE_TABLE limits the number of zone types. The current GFP_ZONE_TABLE
> > is 32 bits. In general, most x86_64 platforms have 4 zone types:
> > ZONE_DMA, ZONE_DMA32, ZONE_NORMAL and ZONE_MOVABLE. If we want to expand the
> > number of zone types beyond 4, the zone shift has to be 3.
>
> But we do not want to expand the number of zones IMHO. The existing zoo
> is quite a maint. pain.
>
> That being said. I am not saying that I am in love with GFP_ZONE_TABLE.
> It always makes my head explode when I look there but it seems to work
> with the current code and it is optimized for it. If you want to change
> this then you should make sure you describe reasons _why_ this is an
> improvement. And I would argue that "we can have more zones" is a
> relevant one.
Yes, GFP_ZONE_TABLE is too complicated. The patches have the following 4 advantages.
* The address zone modifiers get a new usage model: the user first decides which zone is preferred, then puts the encoded zone number into the bottom 3 bits of the GFP mask. That is much more direct and clear than before.
* No bad zone combinations, because the user always chooses exactly one address zone modifier.
* Better performance and efficiency: the current gfp_zone() needs two shift operations, one for GFP_ZONE_TABLE and one for GFP_ZONE_BAD. With these patches, gfp_zone() needs just one XOR.
* Up to 8 zones can be used. At least that isn't a disadvantage, right?
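As a rough illustration of the "one XOR" point in the list above, the decode side could look like the sketch below. The names (sketch_zone, SK_ZONE_MASK, sketch_gfp_zone) are hypothetical, and the four-zone layout and the XOR-with-ZONE_NORMAL encoding are assumptions made for this example. The real gfp_zone() in the patches may differ in detail.

```c
#include <assert.h>

/* Hypothetical zone numbering for illustration; not the kernel's enum. */
enum sketch_zone { SK_ZONE_DMA, SK_ZONE_DMA32, SK_ZONE_NORMAL, SK_ZONE_MOVABLE };

/* Assumption: the bottom 3 bits of the mask hold the encoded zone number. */
#define SK_ZONE_MASK 0x7u

/*
 * Decode the zone with one mask and one XOR. No table lookup and no
 * second shift for a "bad combination" bitmap are needed; the only
 * check left is that the result does not exceed the highest zone.
 */
static inline enum sketch_zone sketch_gfp_zone(unsigned int gfp)
{
	unsigned int z = (gfp & SK_ZONE_MASK) ^ SK_ZONE_NORMAL;

	assert(z <= SK_ZONE_MOVABLE);
	return (enum sketch_zone)z;
}
```

With this encoding, a mask whose zone field is zero decodes to the normal zone, so callers that never touch the zone bits keep their current behaviour.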
Sincerely,
Huaisheng Ye