linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Wei Yang <richard.weiyang@gmail.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@gmail.com>,
	linux-mm@kvack.org, akpm@linux-foundation.org, osalvador@suse.de,
	david@redhat.com
Subject: Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
Date: Tue, 18 Dec 2018 14:44:10 +0000	[thread overview]
Message-ID: <20181218144410.mrm45xthqhqvw5zw@master> (raw)
In-Reply-To: <20181217154812.GU30879@dhcp22.suse.cz>

On Mon, Dec 17, 2018 at 04:48:12PM +0100, Michal Hocko wrote:
>On Mon 17-12-18 15:08:19, Wei Yang wrote:
>> On Mon, Dec 17, 2018 at 01:25:23PM +0100, Michal Hocko wrote:
>> >On Fri 14-12-18 10:39:12, Wei Yang wrote:
>> >> Below is a brief call flow for __offline_pages() and
>> >> alloc_contig_range():
>> >> 
>> >>   __offline_pages()/alloc_contig_range()
>> >>       start_isolate_page_range()
>> >>           set_migratetype_isolate()
>> >>               drain_all_pages()
>> >>       drain_all_pages()
>> >> 
>> >> Since set_migratetype_isolate() is only used in
>> >> start_isolate_page_range(), which is just used in __offline_pages() and
>> >> alloc_contig_range(). And both of them call drain_all_pages() if every
>> >> check looks good. This means it is not necessary call drain_all_pages()
>> >> in each iteration of set_migratetype_isolate().
>> >> 
>> >> By doing so, the logic seems a little bit clearer.
>> >> set_migratetype_isolate() handles pages in Buddy, while
>> >> drain_all_pages() takes care of pages in pcp.
>> >
>> >I have to confess I am not sure about the purpose of the draining here.
>> >I suspect it is to make sure that pages in the pcp lists really get
>> >isolated and if that is the case then it makes sense.
>> >
>> >In any case I strongly suggest not touching this code without a very
>> >good explanation on why this is not needed. Callers do XYZ is not a
>> >proper explanation because assumes that all callers will know that this
>> >has to be done. So either we really need to drain and then it is better
>> >to make it here or we don't but that requires some explanation.
>> >
>> 
>> Yep, let me try to explain what is trying to do.
>> 
>> Based on my understanding, online_pages do two things
>> 
>>     * adjust zone/pgdat status
>>     * put pages into Buddy
>> 
>> Generally, offline_pages do the reverse
>> 
>>     * take pages out of Buddy
>>     * adjust zone/pgdat status
>> 
>> While it is not that easy to take pages out of Buddy, since pages are
>> 
>>     * pcp list
>>     * slub
>>     * other usage
>> 
>> This means before taking a page out of Buddy, we need to return it first
>> to Buddy.
>> 
>> Current implementation is interesting by introducing migrate type. By
>> setting migrate type to MIGRATE_ISOLATE, this range of pages will never
>> be allocated from Buddy. And every page returned in this range will
>> never be touched by Buddy.
>> 
>> Function start_isolate_page_range() just do this.
>> 
>> Then let's focus on the pcp list. This is a little bit different
>> than other allocated pages. These are actually "partially" allocated
>> pages. They are not counted in Buddy Free pages, either no real use. So
>> we have two choice to get back those pages:
>> 
>>     * wait until it is allocated to a real user and wait for return
>>     * or drain them directly
>> 
>> Current implementation take 2nd approach.
>> 
>> Then we can see there are also two way to drain them:
>> 
>>     * drain them range by range
>>     * drain them in a whole range
>> 
>> Both looks good, but not necessary to do them both. Because after we set
>> a pageblock migrate type to MIGRATE_ISOLATE, pages in this range will
>> never be allocated nor be put on pcp list. So after we drain one
>> particular range, it is not necessary to drain this range again.
>
>OK, this is an important point and actually the argument that i am
>wrong. I have missed that free_unref_page_commit skips pcp lists for
>MIGRATE_ISOLATE (resp. all migrate types above MIGRATE_PCPTYPES).
>Then you are right that we are OK to drain the zone only once _after_ we
>have isolated the full range.
>
>So please send a new patch with this clarification in the changelog and
>I will ack it.
> 
>> The reason why I choose to drain them in a whole range is current
>> drain_all_pages() just carry zone information. For example, a zone may
>> have 1G while a pageblock is 128M. The pageblock is 1/8 of this zone.
>> This means in case there are 8 pages on pcp list, only 1 page drained by
>> drain_all_pages belongs to this pageblock. But we drain other 7 healthy
>> pages.
>> 
>>  CPU1 pcp list                            CPU2 pcp list
>> 
>>  +---------------+                        +---------------+ 
>>  |A1  B3  C8  F6 |                        |E1  G3  D8  B6 |
>>  +---------------+                        +---------------+
>> 
>> 
>>    A         B         C         D         E         F         G
>>   +---------+---------+---------+---------+---------+---------+---------+
>>   |012345678|         |         |         |         |         |         |
>>   +---------+---------+---------+---------+---------+---------+---------+
>>                                 |<-pgblk->|
>>   |<-                              Zone                               ->|
>> 
>> 
>> This is a chart for illustration. In case we want to isolate pgblk D,
>> while zone pcp list has 8 pages and only one belongs to this pgblk D.
>> This means the drain on pgblk base has much side effect. And with one
>> drain on each pgblk, this may increase the contention on this zone.
>> 
>> Well, another approach is to enable drain_all_pages() with exact range
>> information. But neither approach needs to do them both.
>
>Is this actually worth the additional complexity? Have you seen an
>actual workload that would benefit from that?

No, I just mention the possible approach in my mind. While currently
drain pcp list once is enough.

I will prepare v2 with more detailed changelog with migratetype thing.

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

  reply	other threads:[~2018-12-18 14:44 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-14  2:39 Wei Yang
2018-12-14  3:57 ` Andrew Morton
2018-12-14  7:01   ` Wei Yang
2018-12-14 15:17   ` Wei Yang
2018-12-17 12:21     ` Michal Hocko
2018-12-18 20:48       ` Wei Yang
2018-12-17 12:25 ` Michal Hocko
2018-12-17 15:08   ` Wei Yang
2018-12-17 15:48     ` Michal Hocko
2018-12-18 14:44       ` Wei Yang [this message]
2018-12-18 20:46 ` [PATCH v2] " Wei Yang
2018-12-18 21:14   ` David Hildenbrand
2018-12-18 21:49     ` Wei Yang
2018-12-18 22:18       ` David Hildenbrand
2018-12-18 23:29   ` Oscar Salvador
2018-12-19  9:51   ` Michal Hocko
2018-12-19  9:57     ` Oscar Salvador
2018-12-19 13:53       ` Wei Yang
2018-12-19 14:13         ` Michal Hocko
2018-12-19 14:33           ` Wei Yang
2018-12-19 14:39             ` Michal Hocko
2018-12-20 15:58               ` Wei Yang
2018-12-20 16:23                 ` Michal Hocko
2018-12-21  3:37                   ` Wei Yang
2018-12-19 13:29     ` Wei Yang
2018-12-19 13:40       ` Michal Hocko
2018-12-19 13:56         ` Wei Yang
2018-12-19 14:12           ` Michal Hocko
2018-12-19 14:41             ` Wei Yang
2018-12-19 10:05   ` Michal Hocko
2018-12-21 17:02   ` [PATCH v3] mm: remove extra drain pages on pcp list Wei Yang
2018-12-21 17:02     ` Wei Yang
2019-01-03 13:56     ` Michal Hocko
2019-01-05 23:27       ` Wei Yang
2019-01-05 23:31     ` [PATCH v4] " Wei Yang
2019-01-05 23:31       ` Wei Yang
2019-01-07 11:34       ` David Hildenbrand
2019-01-08  9:10       ` Oscar Salvador

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181218144410.mrm45xthqhqvw5zw@master \
    --to=richard.weiyang@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox