From: NeilBrown <neilb@suse.com>
To: Vlastimil Babka <vbabka@suse.cz>,
Michal Hocko <mhocko@kernel.org>,
linux-mm@kvack.org
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Mel Gorman <mgorman@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
NeilBrown <neilb@suse.de>, Jonathan Corbet <corbet@lwn.net>,
Paolo Bonzini <pbonzini@redhat.com>,
"Eric W. Biederman" <ebiederm@xmission.com>
Subject: Re: [RFC PATCH 0/4 v2] mm: give __GFP_REPEAT a better semantic
Date: Wed, 24 May 2017 11:06:04 +1000
Message-ID: <87shjvhxmr.fsf@notabene.neil.brown.name>
In-Reply-To: <77fdc6db-5cc1-297f-e049-0d6f824e688c@suse.cz>
On Tue, May 23 2017, Vlastimil Babka wrote:
> On 05/16/2017 11:10 AM, Michal Hocko wrote:
>> So, is there some interest in this? I am not going to push this if there
>> is a general consensus that we do not need to do anything about the
>> current situation or need a different approach.
>
> After the recent LWN article [1] I think that we should really support
> marking allocations as failable, without making them too easily failable
> via __GFP_NORETRY. The __GFP_RETRY_MAYFAIL flag sounds like a good way
> to do that without introducing a new __GFP_MAYFAIL. We could also
> introduce a wrapper such as GFP_KERNEL_MAYFAIL.
>
> [1] https://lwn.net/Articles/723317/
Yes please!!!
I particularly like:
> - GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior and
> all allocation requests fail early rather than cause disruptive
> reclaim (one round of reclaim in this implementation). The OOM killer
> is not invoked.
> - GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator behavior
> and all allocation requests try really hard. The request will fail if the
> reclaim cannot make any progress. The OOM killer won't be triggered.
> - GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior
> and all allocation requests will loop endlessly until they
> succeed. This might be really dangerous especially for larger orders.
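To make the three behaviours quoted above concrete, here is a rough userspace toy model. It is only a sketch under stated assumptions: struct toy_mm, toy_reclaim(), toy_oom_kill(), toy_alloc() and the POLICY_* names are all invented for illustration and are not kernel interfaces, and the real page allocator's retry logic is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three gfp modifiers discussed above. */
enum retry_policy { POLICY_NORETRY, POLICY_RETRY_MAYFAIL, POLICY_NOFAIL };

/* Toy memory state; every field is invented for this model. */
struct toy_mm {
	int free_pages;        /* pages available right now */
	int reclaimable_pages; /* pages that reclaim can still free */
	int oom_freeable;      /* pages the OOM killer would free */
	int oom_kills;         /* number of times the OOM killer ran */
};

/* One round of reclaim: frees at most one page and reports progress. */
static bool toy_reclaim(struct toy_mm *mm)
{
	if (mm->reclaimable_pages == 0)
		return false;
	mm->reclaimable_pages--;
	mm->free_pages++;
	return true;
}

static void toy_oom_kill(struct toy_mm *mm)
{
	mm->oom_kills++;
	mm->free_pages += mm->oom_freeable;
	mm->oom_freeable = 0;
}

/* Returns true if the request for 'pages' pages was satisfied. */
static bool toy_alloc(struct toy_mm *mm, int pages, enum retry_policy policy)
{
	bool reclaimed_once = false;

	assert(pages > 0);
	for (;;) {
		if (mm->free_pages >= pages) {
			mm->free_pages -= pages;
			return true;
		}
		if (policy == POLICY_NORETRY) {
			/* Fail early: a single round of reclaim, no OOM killer. */
			if (reclaimed_once)
				return false;
			reclaimed_once = true;
			toy_reclaim(mm);
			continue;
		}
		/* Try hard: keep going while reclaim makes progress. */
		if (toy_reclaim(mm))
			continue;
		/* Reclaim is stuck: MAYFAIL gives up without the OOM killer. */
		if (policy == POLICY_RETRY_MAYFAIL)
			return false;
		/*
		 * NOFAIL: invoke the OOM killer and loop again. If nothing
		 * can ever be freed, this loops forever, which is the danger
		 * the quoted description warns about.
		 */
		toy_oom_kill(mm);
	}
}
```

With one reclaimable page and a two-page request, the MAYFAIL policy returns failure with oom_kills still zero, while the NOFAIL policy only succeeds after the toy OOM killer has run.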
There seems to be a good range here, and the two end points are good
choices.
I like that only __GFP_NOFAIL triggers the OOM killer.
I would like the middle option to be the default. I think that is what
many people thought the default was. I appreciate that making the
transition might be awkward.
Maybe create GFP_DEFAULT which matches the middle option and encourage
that in new code??
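A minimal sketch of what such wrappers might look like follows. The TOY_ prefix marks everything here as hypothetical: the bit values are invented and do not match the kernel's, and neither GFP_DEFAULT nor GFP_KERNEL_MAYFAIL exists in the tree under discussion.

```c
#include <stdbool.h>

typedef unsigned int toy_gfp_t;

/* Invented bit values for this sketch only; the kernel's differ. */
#define TOY_GFP_NORETRY        0x01u
#define TOY_GFP_RETRY_MAYFAIL  0x02u
#define TOY_GFP_NOFAIL         0x04u
#define TOY_GFP_KERNEL         0x100u /* stand-in for GFP_KERNEL */

/* The wrapper suggested in the quoted reply (hypothetical name). */
#define TOY_GFP_KERNEL_MAYFAIL (TOY_GFP_KERNEL | TOY_GFP_RETRY_MAYFAIL)

/* The "middle option" as an encouraged default (hypothetical name). */
#define TOY_GFP_DEFAULT        TOY_GFP_KERNEL_MAYFAIL

/* Per the quoted descriptions, both retry modifiers suppress the OOM killer. */
static inline bool toy_gfp_may_oom(toy_gfp_t flags)
{
	return !(flags & (TOY_GFP_NORETRY | TOY_GFP_RETRY_MAYFAIL));
}

/* A caller must handle NULL unless it asked for the no-fail behaviour. */
static inline bool toy_gfp_must_check_result(toy_gfp_t flags)
{
	return !(flags & TOY_GFP_NOFAIL);
}
```

New code would then pass the default and always check the result, keeping the no-fail variant for the rare cases covered by guidelines.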
We would probably want guidelines on when __GFP_NOFAIL is acceptable.
I assume:
- no locks held
- small allocations OK, large allocations need clear justification.
- error would be exposed to a system call
???
I think it is important to give kernel developers clear options and make
it easy for them to choose the best option. This helps to do that.
Thanks,
NeilBrown
>
>> On Tue 07-03-17 16:48:39, Michal Hocko wrote:
>>> Hi,
>>> this is a follow up for __GFP_REPEAT clean up merged in 4.7. The previous
>>> version of this patch series was posted as an RFC
>>> http://lkml.kernel.org/r/1465212736-14637-1-git-send-email-mhocko@kernel.org
>>> Since then I have reconsidered the semantic and made the flag a counterpart
>>> to __GFP_NORETRY, at the other extreme end of the retry
>>> logic. Neither flag invokes the OOM killer, so both are suitable
>>> for allocation paths with a fallback. Also, a new potential user has
>>> emerged (kvmalloc - see patch 4). I have also renamed the flag from
>>> __GFP_RETRY_HARD to __GFP_RETRY_MAYFAIL as this should be clearer.
>>>
>>> I have kept the RFC status because of the semantic change. Patch 1
>>> is an exception because it should be merged regardless of the rest.
>>>
>>> The main motivation for the change is that the current implementation of
>>> __GFP_REPEAT is not very useful.
>>>
>>> The documentation says:
>>> * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
>>> * _might_ fail. This depends upon the particular VM implementation.
>>>
>>> It just fails to mention that this is true only for large (costly) high-order
>>> allocations, which has been the case since the flag was introduced. A similar
>>> semantic would be really helpful for small orders as well, though,
>>> because we have places where a failure with a specific fallback error
>>> handling is preferred to a potential endless loop inside the page
>>> allocator.
>>>
>>> The earlier cleanup dropped __GFP_REPEAT usage for low (!costly) order
>>> users, so only those which might use larger orders have stayed. One user
>>> which slipped through the cracks is addressed in patch 1.
>>>
>>> Let's rename the flag to something more verbose and use it for existing
>>> users. The semantic for those will not change. Then implement a failure
>>> path for low (!costly) orders, which is hit when the page allocator is
>>> about to invoke the OOM killer. Now we have a good counterpart for
>>> __GFP_NORETRY and can finally tell the allocator to try as hard as
>>> possible without the OOM killer.
>>>
>>> XFS code already has an annotation for allocations which are
>>> allowed to fail, and we can trivially map it to the new gfp flag
>>> because it will provide the semantic KM_MAYFAIL wants.
>>>
>>> kvmalloc will also allow !costly high-order allocations to retry hard
>>> before falling back to vmalloc.
>>>
>>> The patchset is based on the current linux-next.
>>>
>>> Shortlog
>>> Michal Hocko (4):
>>> s390: get rid of superfluous __GFP_REPEAT
>>> mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic
>>> xfs: map KM_MAYFAIL to __GFP_RETRY_MAYFAIL
>>> mm: kvmalloc support __GFP_RETRY_MAYFAIL for all sizes
>>>
>>> Diffstat
>>> Documentation/DMA-ISA-LPC.txt | 2 +-
>>> arch/powerpc/include/asm/book3s/64/pgalloc.h | 2 +-
>>> arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
>>> arch/s390/mm/pgalloc.c | 2 +-
>>> drivers/mmc/host/wbsd.c | 2 +-
>>> drivers/s390/char/vmcp.c | 2 +-
>>> drivers/target/target_core_transport.c | 2 +-
>>> drivers/vhost/net.c | 2 +-
>>> drivers/vhost/scsi.c | 2 +-
>>> drivers/vhost/vsock.c | 2 +-
>>> fs/btrfs/check-integrity.c | 2 +-
>>> fs/btrfs/raid56.c | 2 +-
>>> fs/xfs/kmem.h | 10 +++++++++
>>> include/linux/gfp.h | 32 +++++++++++++++++++---------
>>> include/linux/slab.h | 3 ++-
>>> include/trace/events/mmflags.h | 2 +-
>>> mm/hugetlb.c | 4 ++--
>>> mm/internal.h | 2 +-
>>> mm/page_alloc.c | 14 +++++++++---
>>> mm/sparse-vmemmap.c | 4 ++--
>>> mm/util.c | 14 ++++--------
>>> mm/vmalloc.c | 2 +-
>>> mm/vmscan.c | 8 +++----
>>> net/core/dev.c | 6 +++---
>>> net/core/skbuff.c | 2 +-
>>> net/sched/sch_fq.c | 2 +-
>>> tools/perf/builtin-kmem.c | 2 +-
>>> 27 files changed, 78 insertions(+), 53 deletions(-)
>>>
>>