From: Wei Yang <richard.weiyang@gmail.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@gmail.com>,
linux-mm@kvack.org, akpm@linux-foundation.org, osalvador@suse.de,
david@redhat.com
Subject: Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
Date: Mon, 17 Dec 2018 15:08:19 +0000 [thread overview]
Message-ID: <20181217150819.zqz7u5kjswkvqoqu@master> (raw)
In-Reply-To: <20181217122523.GI30879@dhcp22.suse.cz>
On Mon, Dec 17, 2018 at 01:25:23PM +0100, Michal Hocko wrote:
>On Fri 14-12-18 10:39:12, Wei Yang wrote:
>> Below is a brief call flow for __offline_pages() and
>> alloc_contig_range():
>>
>> __offline_pages()/alloc_contig_range()
>> start_isolate_page_range()
>> set_migratetype_isolate()
>> drain_all_pages()
>> drain_all_pages()
>>
>> Since set_migratetype_isolate() is only used in
>> start_isolate_page_range(), which is just used in __offline_pages() and
>> alloc_contig_range(). And both of them call drain_all_pages() if every
>> check looks good. This means it is not necessary call drain_all_pages()
>> in each iteration of set_migratetype_isolate().
>>
>> By doing so, the logic seems a little bit clearer.
>> set_migratetype_isolate() handles pages in Buddy, while
>> drain_all_pages() takes care of pages in pcp.
>
>I have to confess I am not sure about the purpose of the draining here.
>I suspect it is to make sure that pages in the pcp lists really get
>isolated and if that is the case then it makes sense.
>
>In any case I strongly suggest not touching this code without a very
>good explanation on why this is not needed. Callers do XYZ is not a
>proper explanation because assumes that all callers will know that this
>has to be done. So either we really need to drain and then it is better
>to make it here or we don't but that requires some explanation.
>
Yep, let me try to explain what is trying to do.
Based on my understanding, online_pages do two things
* adjust zone/pgdat status
* put pages into Buddy
Generally, offline_pages do the reverse
* take pages out of Buddy
* adjust zone/pgdat status
While it is not that easy to take pages out of Buddy, since pages are
* pcp list
* slub
* other usage
This means before taking a page out of Buddy, we need to return it first
to Buddy.
Current implementation is interesting by introducing migrate type. By
setting migrate type to MIGRATE_ISOLATE, this range of pages will never
be allocated from Buddy. And every page returned in this range will
never be touched by Buddy.
Function start_isolate_page_range() just do this.
Then let's focus on the pcp list. This is a little bit different
than other allocated pages. These are actually "partially" allocated
pages. They are not counted in Buddy Free pages, either no real use. So
we have two choice to get back those pages:
* wait until it is allocated to a real user and wait for return
* or drain them directly
Current implementation take 2nd approach.
Then we can see there are also two way to drain them:
* drain them range by range
* drain them in a whole range
Both looks good, but not necessary to do them both. Because after we set
a pageblock migrate type to MIGRATE_ISOLATE, pages in this range will
never be allocated nor be put on pcp list. So after we drain one
particular range, it is not necessary to drain this range again.
The reason why I choose to drain them in a whole range is current
drain_all_pages() just carry zone information. For example, a zone may
have 1G while a pageblock is 128M. The pageblock is 1/8 of this zone.
This means in case there are 8 pages on pcp list, only 1 page drained by
drain_all_pages belongs to this pageblock. But we drain other 7 healthy
pages.
CPU1 pcp list CPU2 pcp list
+---------------+ +---------------+
|A1 B3 C8 F6 | |E1 G3 D8 B6 |
+---------------+ +---------------+
A B C D E F G
+---------+---------+---------+---------+---------+---------+---------+
|012345678| | | | | | |
+---------+---------+---------+---------+---------+---------+---------+
|<-pgblk->|
|<- Zone ->|
This is a chart for illustration. In case we want to isolate pgblk D,
while zone pcp list has 8 pages and only one belongs to this pgblk D.
This means the drain on pgblk base has much side effect. And with one
drain on each pgblk, this may increase the contention on this zone.
Well, another approach is to enable drain_all_pages() with exact range
information. But neither approach needs to do them both.
--
Wei Yang
Help you, Help me
next prev parent reply other threads:[~2018-12-17 15:08 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-14 2:39 Wei Yang
2018-12-14 3:57 ` Andrew Morton
2018-12-14 7:01 ` Wei Yang
2018-12-14 15:17 ` Wei Yang
2018-12-17 12:21 ` Michal Hocko
2018-12-18 20:48 ` Wei Yang
2018-12-17 12:25 ` Michal Hocko
2018-12-17 15:08 ` Wei Yang [this message]
2018-12-17 15:48 ` Michal Hocko
2018-12-18 14:44 ` Wei Yang
2018-12-18 20:46 ` [PATCH v2] " Wei Yang
2018-12-18 21:14 ` David Hildenbrand
2018-12-18 21:49 ` Wei Yang
2018-12-18 22:18 ` David Hildenbrand
2018-12-18 23:29 ` Oscar Salvador
2018-12-19 9:51 ` Michal Hocko
2018-12-19 9:57 ` Oscar Salvador
2018-12-19 13:53 ` Wei Yang
2018-12-19 14:13 ` Michal Hocko
2018-12-19 14:33 ` Wei Yang
2018-12-19 14:39 ` Michal Hocko
2018-12-20 15:58 ` Wei Yang
2018-12-20 16:23 ` Michal Hocko
2018-12-21 3:37 ` Wei Yang
2018-12-19 13:29 ` Wei Yang
2018-12-19 13:40 ` Michal Hocko
2018-12-19 13:56 ` Wei Yang
2018-12-19 14:12 ` Michal Hocko
2018-12-19 14:41 ` Wei Yang
2018-12-19 10:05 ` Michal Hocko
2018-12-21 17:02 ` [PATCH v3] mm: remove extra drain pages on pcp list Wei Yang
2018-12-21 17:02 ` Wei Yang
2019-01-03 13:56 ` Michal Hocko
2019-01-05 23:27 ` Wei Yang
2019-01-05 23:31 ` [PATCH v4] " Wei Yang
2019-01-05 23:31 ` Wei Yang
2019-01-07 11:34 ` David Hildenbrand
2019-01-08 9:10 ` Oscar Salvador
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181217150819.zqz7u5kjswkvqoqu@master \
--to=richard.weiyang@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=osalvador@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox