From: Motohiro Kosaki <Motohiro.Kosaki@us.fujitsu.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
"m.mizuma@jp.fujitsu.com" <m.mizuma@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"iamjoonsoo.kim@lge.com" <iamjoonsoo.kim@lge.com>,
"mhocko@suse.cz" <mhocko@suse.cz>,
"liwanp@linux.vnet.ibm.com" <liwanp@linux.vnet.ibm.com>,
"aneesh.kumar@linux.vnet.ibm.com"
<aneesh.kumar@linux.vnet.ibm.com>,
Motohiro Kosaki JP <kosaki.motohiro@jp.fujitsu.com>
Subject: RE: [PATCH v2 1/1] mm: hugetlb: fix stalling when a large number of hugepages are freed
Date: Mon, 7 Apr 2014 08:22:12 -0700 [thread overview]
Message-ID: <6B2BA408B38BA1478B473C31C3D2074E3097FD1003@SV-EXCHANGE1.Corp.FC.LOCAL> (raw)
In-Reply-To: <1396876864-vnrouoxp@n-horiguchi@ah.jp.nec.com>
> -----Original Message-----
> From: Naoya Horiguchi [mailto:n-horiguchi@ah.jp.nec.com]
> Sent: Monday, April 07, 2014 9:21 AM
> To: m.mizuma@jp.fujitsu.com
> Cc: linux-mm@kvack.org; akpm@linux-foundation.org; iamjoonsoo.kim@lge.com; mhocko@suse.cz; liwanp@linux.vnet.ibm.com;
> aneesh.kumar@linux.vnet.ibm.com; Motohiro Kosaki JP
> Subject: Re: [PATCH v2 1/1] mm: hugetlb: fix stalling when a large number of hugepages are freed
>
> On Mon, Apr 07, 2014 at 05:24:03PM +0900, Masayoshi Mizuma wrote:
> > When I decrease the value of nr_hugepages in procfs by a large amount, a
> > long stall occurs, because there is no chance for a context switch during
> > the freeing process.
> >
> > On the other hand, when I allocate a large number of hugepages, there is
> > some chance of a context switch, so the long stall does not happen there.
> > It is therefore necessary to add a context-switch point to the freeing
> > process, just as in the allocating process, to avoid the long stall.
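For context, a minimal sketch of the kind of change being described, not
necessarily the exact upstream diff: set_max_huge_pages() and
free_pool_huge_page() appear in the call trace below, and the freeing loop
runs with hugetlb_lock held, so a plain cond_resched() cannot be used there;
cond_resched_lock() drops the spinlock, yields the CPU if a reschedule is
due, and reacquires the lock. The surrounding code in mm/hugetlb.c varies
between kernel versions.

	/*
	 * Freeing loop in set_max_huge_pages(), entered with hugetlb_lock
	 * held.  cond_resched_lock() briefly releases the lock and yields
	 * the CPU when rescheduling is needed, so the soft-lockup watchdog
	 * is no longer starved while millions of pages are freed.
	 */
	while (min_count < persistent_huge_pages(h)) {
		if (!free_pool_huge_page(h, nodes_allowed, 0))
			break;
		cond_resched_lock(&hugetlb_lock);
	}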
> >
> > When I freed 12 TB of hugepages with kernel-2.6.32-358.el6, the freeing
> > process occupied a CPU for over 150 seconds and the following soft lockup
> > message appeared twice or more.
> >
> > --
> > $ echo 6000000 > /proc/sys/vm/nr_hugepages
> > $ cat /proc/sys/vm/nr_hugepages
> > 6000000
> > $ grep ^Huge /proc/meminfo
> > HugePages_Total: 6000000
> > HugePages_Free: 6000000
> > HugePages_Rsvd: 0
> > HugePages_Surp: 0
> > Hugepagesize: 2048 kB
> > $ echo 0 > /proc/sys/vm/nr_hugepages
> >
> > BUG: soft lockup - CPU#16 stuck for 67s! [sh:12883]
> > ...
> > Pid: 12883, comm: sh Not tainted 2.6.32-358.el6.x86_64 #1
> > Call Trace:
> > [<ffffffff8115a438>] ? free_pool_huge_page+0xb8/0xd0
> > [<ffffffff8115a578>] ? set_max_huge_pages+0x128/0x190
> > [<ffffffff8115c663>] ? hugetlb_sysctl_handler_common+0x113/0x140
> > [<ffffffff8115c6de>] ? hugetlb_sysctl_handler+0x1e/0x20
> > [<ffffffff811f3097>] ? proc_sys_call_handler+0x97/0xd0
> > [<ffffffff811f30e4>] ? proc_sys_write+0x14/0x20
> > [<ffffffff81180f98>] ? vfs_write+0xb8/0x1a0
> > [<ffffffff81181891>] ? sys_write+0x51/0x90
> > [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290
> > [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
> > --
> > I have not confirmed this problem with upstream kernels because I am
> > currently unable to prepare a machine equipped with 12 TB of memory.
> > However, I confirmed that the required time is directly proportional
> > to the number of hugepages being freed.
> >
> > I measured the required times on a smaller machine; it freed 130-145
> > hugepages per millisecond.
> >
> > Amount of decreasing      Required time    Decreasing rate
> > hugepages                      (msec)        (pages/msec)
> > ------------------------------------------------------------
> > 10,000 pages == 20GB          70 -  74           135-142
> > 30,000 pages == 60GB         208 - 229           131-144
> >
> > At this rate, freeing 6 TB of hugepages would trigger a long stall of
> > about 20 seconds.
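For reference, the arithmetic behind that estimate: 6 TB of 2 MB hugepages
is roughly 3 million pages, and at 130-145 pages per millisecond that works
out to roughly 21-24 seconds of continuous freeing without a reschedule.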
> >
> > * Changes in v2
> > - Adding cond_resched_lock() in return_unused_surplus_pages()
> >   Because the same problem happens when freeing a large number of surplus pages.
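The same pattern would apply to the surplus-freeing loop; a minimal sketch,
assuming the loop shape of return_unused_surplus_pages() in kernels of this
era (the exact surrounding code may differ by version):

	/*
	 * return_unused_surplus_pages() also frees pages one at a time
	 * while holding hugetlb_lock, so it gets the same reschedule point.
	 */
	while (nr_pages--) {
		if (!free_pool_huge_page(h, &node_states[N_MEMORY], 1))
			break;
		cond_resched_lock(&hugetlb_lock);
	}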
> >
> > Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > Cc: Michal Hocko <mhocko@suse.cz>
> > Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
> > Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
> > Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>
> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>
> Thanks,
> Naoya Horiguchi
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org