From: 陈宗志 <baotiao@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Vlastimil Babka <vbabka@suse.cz>, linux-mm@kvack.org
Subject: Re: why the kmalloc return fail when there is free physical address but return success after dropping page caches
Date: Wed, 25 May 2016 17:25:05 +0800
Message-ID: <CAGbZs7j=c=eRYFGvpv5NRhKs16Vq-cQTcbTZTKa4xKP4QGRuzQ@mail.gmail.com>
In-Reply-To: <20160518144148.GD21200@dastard>
Hi Dave

> > >> The machine's status is described below:
> > >>
> > >> The machine has 96G of physical memory. The memory actually in use
> > >> is about 64G, and the page cache uses about 32G. We also use the
> > >> swap area; at that time it held about 10G (we set the swap max size
> > >> to 32G). At that moment, we found XFS reporting
> > >>
> > >> |Apr 29 21:54:31 w-openstack86 kernel: XFS: possible memory allocation
> > >> deadlock in kmem_alloc (mode:0x250) |
>
> Pretty sure that's a GFP_NOFS allocation context.

You are right, it is a GFP_NOFS allocation from XFS. XFS uses the GFP_NOFS
flag to avoid recursing back into the filesystem.
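
To see why mode:0x250 reads as GFP_NOFS, the mask can be decoded bit by
bit. The sketch below assumes the flag values from include/linux/gfp.h in
upstream 3.10; the RHEL7 tree is assumed (not verified) to use the same
numbering, and later upstream kernels renumbered several bits. decode_gfp
is a hypothetical helper name, not a kernel or library function.

```python
# Bit values as in include/linux/gfp.h of upstream 3.10 (assumption: the
# RHEL7 3.10.0-327 kernel uses the same numbering).
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x100: "__GFP_COLD",
    0x200: "__GFP_NOWARN",
    0x400: "__GFP_REPEAT",
    0x800: "__GFP_NOFAIL",
    0x1000: "__GFP_NORETRY",
}


def decode_gfp(mode):
    """Return the names of the known flag bits set in mode, lowest bit first.

    Bits that are not in GFP_BITS are reported together as one hex string.
    """
    names = [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]
    known = 0
    for bit in GFP_BITS:
        known |= bit
    leftover = mode & ~known
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    print(" | ".join(decode_gfp(0x250)))
```

Under these assumptions 0x250 decodes to __GFP_WAIT | __GFP_IO |
__GFP_NOWARN: that is GFP_NOFS (__GFP_WAIT | __GFP_IO in 3.10) plus
__GFP_NOWARN, with __GFP_FS cleared.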

> > > Just once, or many times?
> >
> > The message appears many times.
> > From the code, I know that XFS will try the kmalloc() call 100 times.
>
> The current upstream kernels report much more information - process,
> size of allocation, etc.
>
> In general, the cause of such problems is memory fragmentation
> preventing a large contiguous allocation from taking place (e.g.
> when you try to read a file with millions of extents).

> > >> in the system. But there is still 32G page cache.
> > >>
> > >> So I run
> > >>
> > >> |echo 3 > /proc/sys/vm/drop_caches |
> > >>
> > >> to drop the page cache.
> > >>
> > >> Then the system is fine.
> > >
> > > Are you saying that the error message was repeated infinitely until
> > > you did the drop_caches?
> >
> > No. The error message doesn't appear after I drop the caches.
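>
> Of course - freeing memory will cause contiguous free space to
> reform. Then the allocation will succeed.

Yes, you are right. Before I ran echo 3 > /proc/sys/vm/drop_caches,
/proc/buddyinfo was as listed below: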

Node 0, zone      DMA      0      0      0      1      2      1      1      0      1      1      3
Node 0, zone    DMA32   2983   2230   1037    290    121     63     47     61     16      0      0
Node 0, zone   Normal  13707   1126    285    268    291    160     64     21     11      0      0
Node 1, zone   Normal  10678   5041   1167    705    316    158     61     22      0      0      0

After the operation, /proc/buddyinfo is as listed below:

Node 0, zone      DMA      0      0      0      1      2      1      1      0      1      1      3
Node 0, zone    DMA32  61091  22791   3659    348    169     81     89     63     16      0      0
Node 0, zone   Normal 781723 532596 246195  57076   9853   4061   1922    799    217     19      0
Node 1, zone   Normal 334903 138984  49608   6929   2770   1603    843    447    232      2      0

We can see that after the operation we get many more large (high-order)
free blocks.
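
The comparison between the two snapshots can be made concrete by counting
free blocks at or above a given order. This is a minimal sketch:
parse_buddyinfo and blocks_at_or_above are hypothetical helper names, the
threshold of order 4 is an arbitrary choice for illustration, and the two
sample strings are the Node 0 Normal rows from the listings above.

```python
def parse_buddyinfo(text):
    """Parse /proc/buddyinfo-style text.

    Returns {(node, zone): [free block count for order 0, 1, 2, ...]}.
    """
    zones = {}
    for line in text.splitlines():
        parts = line.replace(",", " ").split()
        # Expected shape: Node <n> zone <name> <count> <count> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        zones[(int(parts[1]), parts[3])] = [int(c) for c in parts[4:]]
    return zones


def blocks_at_or_above(counts, min_order):
    """Number of free blocks whose order is at least min_order."""
    return sum(counts[min_order:])


# Node 0, zone Normal rows copied from the message.
BEFORE = "Node 0, zone Normal 13707 1126 285 268 291 160 64 21 11 0 0"
AFTER = ("Node 0, zone Normal 781723 532596 246195 57076 9853 4061 1922 "
         "799 217 19 0")

if __name__ == "__main__":
    for label, snapshot in (("before", BEFORE), ("after", AFTER)):
        counts = parse_buddyinfo(snapshot)[(0, "Normal")]
        print(label, "free blocks of order >= 4:",
              blocks_at_or_above(counts, 4))
```

For the Node 0 Normal zone this gives 547 free blocks of order 4 or higher
before the drop and 16871 after it. On a live system the same functions
can be fed the contents of /proc/buddyinfo directly.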

Besides /proc/buddyinfo, is there any other command to get memory
fragmentation info?

And besides drop_caches, is there any other command that can avoid
the memory fragmentation?
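
On kernels built with CONFIG_COMPACTION, writing 1 to
/proc/sys/vm/compact_memory asks the kernel to compact all zones, which is
one way to rebuild high-order free blocks without dropping the page cache.
The sketch below assumes root privileges and that the RHEL7 kernel in
question exposes this file (not verified here); trigger_compaction is a
hypothetical helper name. For inspection, /proc/pagetypeinfo breaks the
same free lists down by migrate type.

```python
COMPACT_MEMORY = "/proc/sys/vm/compact_memory"


def trigger_compaction(path=COMPACT_MEMORY):
    """Write 1 to the compact_memory sysctl file.

    Needs root and a kernel with CONFIG_COMPACTION; raises OSError if the
    file is missing or not writable. The path parameter exists only so the
    function can be exercised against an ordinary file.
    """
    with open(path, "w") as f:
        f.write("1\n")


# Typical use, as root:
#     trigger_compaction()
# then compare /proc/buddyinfo before and after.
```

Unlike drop_caches, this does not free page cache; it only migrates
movable pages so that free pages coalesce, so it may help less when most
memory is held by unmovable allocations.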

> IIRC, the reason the system can't recover itself is that memory
> compaction is not triggered from GFP_NOFS allocation context, which
> means memory reclaim won't try to create contiguous regions by
> moving things around and hence the allocation will not succeed until
> a significant amount of memory is freed by some other trigger....

So a GFP_NOFS allocation will not trigger memory compaction. Where can I
find that logic in the kernel source code?
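
In upstream 3.10 the check is believed to sit in try_to_compact_pages() in
mm/compaction.c, reached from __alloc_pages_direct_compact() in
mm/page_alloc.c: compaction is skipped when the request is order 0 or when
the mask lacks __GFP_FS or __GFP_IO. The Python model below restates that
condition from memory of the upstream source, so treat it as an assumption
to be checked against the actual tree; the RHEL7 kernel may differ, and
the bit values are the upstream 3.10 ones.

```python
# Assumed upstream 3.10 bit values from include/linux/gfp.h.
GFP_IO_BIT = 0x40   # __GFP_IO
GFP_FS_BIT = 0x80   # __GFP_FS


def compaction_allowed(order, gfp_mask):
    """Model of the gate recalled from 3.10's try_to_compact_pages().

    Direct compaction is attempted only for order > 0 requests whose mask
    permits both filesystem recursion and I/O.
    """
    may_enter_fs = bool(gfp_mask & GFP_FS_BIT)
    may_perform_io = bool(gfp_mask & GFP_IO_BIT)
    return order > 0 and may_enter_fs and may_perform_io


if __name__ == "__main__":
    # 0x250 is the mode from the XFS message; 0x2d0 is the same mask with
    # __GFP_FS set (GFP_KERNEL plus __GFP_NOWARN).
    print("mode 0x250, order 4:", compaction_allowed(4, 0x250))
    print("mode 0x2d0, order 4:", compaction_allowed(4, 0x2d0))
```

Under this model compaction_allowed(order, 0x250) is False for every
order, because __GFP_FS is clear in the mask XFS passed.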

Thank you.

On Wed, May 18, 2016 at 10:41 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Wed, May 18, 2016 at 04:58:31PM +0800, baotiao wrote:
> > Thanks for your reply
> >
> > >> Hello everyone, I have run into an interesting kernel memory problem.
> > >> Can anyone help me explain what happens inside the kernel?
> > >
> > > Which kernel version is that?
> >
> > The kernel version is 3.10.0-327.4.5.el7.x86_64
>
> RHEL7 kernel. Best you report the problem to your RH support
> contact - the RHEL7 kernels are far different to upstream kernels..
>
> > >> The machine's status is described below:
> > >>
> > >> The machine has 96G of physical memory. The memory actually in use
> > >> is about 64G, and the page cache uses about 32G. We also use the
> > >> swap area; at that time it held about 10G (we set the swap max size
> > >> to 32G). At that moment, we found XFS reporting
> > >>
> > >> |Apr 29 21:54:31 w-openstack86 kernel: XFS: possible memory allocation
> > >> deadlock in kmem_alloc (mode:0x250) |
>
> Pretty sure that's a GFP_NOFS allocation context.
>
> > > Just once, or many times?
> >
> > The message appears many times.
> > From the code, I know that XFS will try the kmalloc() call 100 times.
>
> The current upstream kernels report much more information - process,
> size of allocation, etc.
>
> In general, the cause of such problems is memory fragmentation
> preventing a large contiguous allocation from taking place (e.g.
> when you try to read a file with millions of extents).
>
> > >> in the system. But there is still 32G page cache.
> > >>
> > >> So I run
> > >>
> > >> |echo 3 > /proc/sys/vm/drop_caches |
> > >>
> > >> to drop the page cache.
> > >>
> > >> Then the system is fine.
> > >
> > > Are you saying that the error message was repeated infinitely until
> > > you did the drop_caches?
> >
> >
> > No. The error message doesn't appear after I drop the caches.
>
> Of course - freeing memory will cause contiguous free space to
> reform. Then the allocation will succeed.
>
> IIRC, the reason the system can't recover itself is that memory
> compaction is not triggered from GFP_NOFS allocation context, which
> means memory reclaim won't try to create contiguous regions by
> moving things around and hence the allocation will not succeed until
> a significant amount of memory is freed by some other trigger....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
--
---
Blog: http://www.chenzongzhi.info
Twitter: https://twitter.com/baotiao
Git: https://github.com/baotiao