From: Paul Gortmaker <paul.gortmaker@windriver.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, mgorman@suse.de,
kamezawa.hiroyu@jp.fujitsu.com, dhillf@gmail.com,
aarcange@redhat.com, mhocko@suse.cz, akpm@linux-foundation.org,
hannes@cmpxchg.org, linux-kernel@vger.kernel.org,
cgroups@vger.kernel.org, linux-next@vger.kernel.org
Subject: Re: [PATCH -V6 07/14] memcg: Add HugeTLB extension
Date: Tue, 1 May 2012 20:20:42 -0400
Message-ID: <CAP=VYLqgaCabQGDVgUXnCwKCZHtz0nWxpm_a6Cgz_ciMzGe9gQ@mail.gmail.com>
In-Reply-To: <1334573091-18602-8-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
On Mon, Apr 16, 2012 at 6:44 AM, Aneesh Kumar K.V
<aneesh.kumar@linux.vnet.ibm.com> wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> This patch implements a memcg extension that allows us to control HugeTLB
> allocations via memory controller. The extension allows to limit the
Hi Aneesh,
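This breaks linux-next on some architectures because HUGE_MAX_HSTATE is
not in scope for them with the current #ifdef layout, while the patch
uses it unconditionally in mm/memcontrol.c (the hugepage[] array in
struct mem_cgroup and the init loops in mem_cgroup_create()).
The breakage shows up on sh4, m68k, s390, and possibly others.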
http://kisskb.ellerman.id.au/kisskb/buildresult/6228689/
http://kisskb.ellerman.id.au/kisskb/buildresult/6228670/
http://kisskb.ellerman.id.au/kisskb/buildresult/6228484/
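For illustration only, here is a stand-alone sketch of the scoping
problem and one possible way around it. It is not kernel code:
TOY_HUGETLB_PAGE, struct toy_memcg and toy_nr_counters() are made-up
names, and the fallback #define is just an assumption about how the
macro could be kept visible. Guarding the array itself with the new
config option would be another way to go.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the hugetlb header: the macro only exists
 * when the (made-up) TOY_HUGETLB_PAGE option is set, mirroring a
 * definition that lives inside one branch of an #ifdef.
 */
#ifdef TOY_HUGETLB_PAGE
#define HUGE_MAX_HSTATE 2
#endif

/*
 * Assumed fallback: without it, the array declaration below does not
 * compile when TOY_HUGETLB_PAGE is unset, because the macro was never
 * defined on that path.
 */
#ifndef HUGE_MAX_HSTATE
#define HUGE_MAX_HSTATE 1
#endif

/* Unconditional user of the macro, like the array added to the memcg. */
struct toy_memcg {
	unsigned long hugepage[HUGE_MAX_HSTATE];
};

static_assert(HUGE_MAX_HSTATE >= 1, "need at least one counter slot");

/* Number of per-hstate slots the structure ends up with. */
static unsigned long toy_nr_counters(void)
{
	return sizeof(((struct toy_memcg *)0)->hugepage) /
	       sizeof(unsigned long);
}
```

Built without TOY_HUGETLB_PAGE this still compiles and the structure
carries a single slot; removing the fallback block reproduces the kind
of "undeclared" error described above.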
This is a commit in akpm's mmotm queue, which used to be here:
http://userweb.kernel.org/~akpm/mmotm
Of course the above is invalid since userweb.kernel.org is dead.
I don't have a post-kernel.org break-in link handy and a quick
search didn't give me one, but I'm sure you'll recognize the change.
Thanks,
Paul.
--
> HugeTLB usage per control group and enforces the controller limit during
> page fault. Since HugeTLB doesn't support page reclaim, enforcing the limit
> at page fault time implies that the application will get a SIGBUS signal if
> it tries to access HugeTLB pages beyond its limit. This requires the
> application to know beforehand how many HugeTLB pages it will need.
>
> The charge/uncharge calls will be added to the HugeTLB code in a later patch.
> Support for memcg removal will be added in later patches.
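To make the charge-at-fault behaviour described above concrete, here is
a rough user-space sketch. It is not the kernel's res_counter code:
struct toy_counter and the toy_counter_*() helpers are hypothetical
names invented for illustration. The only assumption is that a charge
is checked against the group and each of its ancestors and fails with
-ENOMEM when any limit would be exceeded, which is the point where the
faulting task would end up with SIGBUS.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a hierarchical usage counter. */
struct toy_counter {
	unsigned long long usage;	/* bytes currently charged */
	unsigned long long limit;	/* maximum bytes allowed */
	struct toy_counter *parent;	/* NULL for the root group */
};

static void toy_counter_init(struct toy_counter *c,
			     unsigned long long limit,
			     struct toy_counter *parent)
{
	c->usage = 0;
	c->limit = limit;
	c->parent = parent;
}

/*
 * Charge val bytes to c and to every ancestor. If any level would go
 * over its limit, undo the levels already charged and fail.
 */
static int toy_counter_charge(struct toy_counter *c, unsigned long long val)
{
	struct toy_counter *pos, *undo;

	for (pos = c; pos; pos = pos->parent) {
		/* usage never exceeds limit, so this cannot underflow */
		if (val > pos->limit - pos->usage)
			goto rollback;
		pos->usage += val;
	}
	return 0;

rollback:
	for (undo = c; undo != pos; undo = undo->parent)
		undo->usage -= val;
	return -ENOMEM;
}

/* Release val bytes from c and from every ancestor. */
static void toy_counter_uncharge(struct toy_counter *c,
				 unsigned long long val)
{
	for (; c; c = c->parent)
		c->usage -= val;
}
```

Since there is nothing to reclaim, a failed charge here has no retry
path, which is why the application has to size its limit up front.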
>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
> include/linux/hugetlb.h | 1 +
> include/linux/memcontrol.h | 42 ++++++++++++++
> init/Kconfig | 8 +++
> mm/hugetlb.c | 2 +-
> mm/memcontrol.c | 132 ++++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 184 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 46c6cbd..995c238 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -226,6 +226,7 @@ struct hstate *size_to_hstate(unsigned long size);
> #define HUGE_MAX_HSTATE 1
> #endif
>
> +extern int hugetlb_max_hstate;
> extern struct hstate hstates[HUGE_MAX_HSTATE];
> extern unsigned int default_hstate_idx;
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index f94efd2..1d07e14 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -448,5 +448,47 @@ static inline void sock_release_memcg(struct sock *sk)
> {
> }
> #endif /* CONFIG_CGROUP_MEM_RES_CTLR_KMEM */
> +
> +#ifdef CONFIG_MEM_RES_CTLR_HUGETLB
> +extern int mem_cgroup_hugetlb_charge_page(int idx, unsigned long nr_pages,
> + struct mem_cgroup **ptr);
> +extern void mem_cgroup_hugetlb_commit_charge(int idx, unsigned long nr_pages,
> + struct mem_cgroup *memcg,
> + struct page *page);
> +extern void mem_cgroup_hugetlb_uncharge_page(int idx, unsigned long nr_pages,
> + struct page *page);
> +extern void mem_cgroup_hugetlb_uncharge_memcg(int idx, unsigned long nr_pages,
> + struct mem_cgroup *memcg);
> +
> +#else
> +static inline int
> +mem_cgroup_hugetlb_charge_page(int idx, unsigned long nr_pages,
> + struct mem_cgroup **ptr)
> +{
> + return 0;
> +}
> +
> +static inline void
> +mem_cgroup_hugetlb_commit_charge(int idx, unsigned long nr_pages,
> + struct mem_cgroup *memcg,
> + struct page *page)
> +{
> + return;
> +}
> +
> +static inline void
> +mem_cgroup_hugetlb_uncharge_page(int idx, unsigned long nr_pages,
> + struct page *page)
> +{
> + return;
> +}
> +
> +static inline void
> +mem_cgroup_hugetlb_uncharge_memcg(int idx, unsigned long nr_pages,
> + struct mem_cgroup *memcg)
> +{
> + return;
> +}
> +#endif /* CONFIG_MEM_RES_CTLR_HUGETLB */
> #endif /* _LINUX_MEMCONTROL_H */
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 72f33fa..a3b5665 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -716,6 +716,14 @@ config CGROUP_PERF
>
> Say N if unsure.
>
> +config MEM_RES_CTLR_HUGETLB
> + bool "Memory Resource Controller HugeTLB Extension (EXPERIMENTAL)"
> + depends on CGROUP_MEM_RES_CTLR && HUGETLB_PAGE && EXPERIMENTAL
> + default n
> + help
> + Add HugeTLB management to memory resource controller. When you
> + enable this, you can put a per cgroup limit on HugeTLB usage.
> +
> menuconfig CGROUP_SCHED
> bool "Group CPU scheduler"
> default n
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a3ac624..8cd89b4 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -35,7 +35,7 @@ const unsigned long hugetlb_zero = 0, hugetlb_infinity = ~0UL;
> static gfp_t htlb_alloc_mask = GFP_HIGHUSER;
> unsigned long hugepages_treat_as_movable;
>
> -static int hugetlb_max_hstate;
> +int hugetlb_max_hstate;
> unsigned int default_hstate_idx;
> struct hstate hstates[HUGE_MAX_HSTATE];
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 901bb03..884f479 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -252,6 +252,10 @@ struct mem_cgroup {
> };
>
> /*
> + * the counter to account for hugepages from hugetlb.
> + */
> + struct res_counter hugepage[HUGE_MAX_HSTATE];
> + /*
> * Per cgroup active and inactive list, similar to the
> * per zone LRU lists.
> */
> @@ -3213,6 +3217,114 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
> }
> #endif
>
> +#ifdef CONFIG_MEM_RES_CTLR_HUGETLB
> +static bool mem_cgroup_have_hugetlb_usage(struct mem_cgroup *memcg)
> +{
> + int idx;
> + for (idx = 0; idx < hugetlb_max_hstate; idx++) {
> + if ((res_counter_read_u64(&memcg->hugepage[idx], RES_USAGE)) > 0)
> + return 1;
> + }
> + return 0;
> +}
> +
> +int mem_cgroup_hugetlb_charge_page(int idx, unsigned long nr_pages,
> + struct mem_cgroup **ptr)
> +{
> + int ret = 0;
> + struct mem_cgroup *memcg = NULL;
> + struct res_counter *fail_res;
> + unsigned long csize = nr_pages * PAGE_SIZE;
> +
> + if (mem_cgroup_disabled())
> + goto done;
> +again:
> + rcu_read_lock();
> + memcg = mem_cgroup_from_task(current);
> + if (!memcg)
> + memcg = root_mem_cgroup;
> +
> + if (!css_tryget(&memcg->css)) {
> + rcu_read_unlock();
> + goto again;
> + }
> + rcu_read_unlock();
> +
> + ret = res_counter_charge(&memcg->hugepage[idx], csize, &fail_res);
> + css_put(&memcg->css);
> +done:
> + *ptr = memcg;
> + return ret;
> +}
> +
> +void mem_cgroup_hugetlb_commit_charge(int idx, unsigned long nr_pages,
> + struct mem_cgroup *memcg,
> + struct page *page)
> +{
> + struct page_cgroup *pc;
> +
> + if (mem_cgroup_disabled())
> + return;
> +
> + pc = lookup_page_cgroup(page);
> + lock_page_cgroup(pc);
> + if (unlikely(PageCgroupUsed(pc))) {
> + unlock_page_cgroup(pc);
> + mem_cgroup_hugetlb_uncharge_memcg(idx, nr_pages, memcg);
> + return;
> + }
> + pc->mem_cgroup = memcg;
> + SetPageCgroupUsed(pc);
> + unlock_page_cgroup(pc);
> + return;
> +}
> +
> +void mem_cgroup_hugetlb_uncharge_page(int idx, unsigned long nr_pages,
> + struct page *page)
> +{
> + struct page_cgroup *pc;
> + struct mem_cgroup *memcg;
> + unsigned long csize = nr_pages * PAGE_SIZE;
> +
> + if (mem_cgroup_disabled())
> + return;
> +
> + pc = lookup_page_cgroup(page);
> + if (unlikely(!PageCgroupUsed(pc)))
> + return;
> +
> + lock_page_cgroup(pc);
> + if (!PageCgroupUsed(pc)) {
> + unlock_page_cgroup(pc);
> + return;
> + }
> + memcg = pc->mem_cgroup;
> + pc->mem_cgroup = root_mem_cgroup;
> + ClearPageCgroupUsed(pc);
> + unlock_page_cgroup(pc);
> +
> + res_counter_uncharge(&memcg->hugepage[idx], csize);
> + return;
> +}
> +
> +void mem_cgroup_hugetlb_uncharge_memcg(int idx, unsigned long nr_pages,
> + struct mem_cgroup *memcg)
> +{
> + unsigned long csize = nr_pages * PAGE_SIZE;
> +
> + if (mem_cgroup_disabled())
> + return;
> +
> + res_counter_uncharge(&memcg->hugepage[idx], csize);
> + return;
> +}
> +#else
> +static bool mem_cgroup_have_hugetlb_usage(struct mem_cgroup *memcg)
> +{
> + return 0;
> +}
> +#endif /* CONFIG_MEM_RES_CTLR_HUGETLB */
> +
> /*
> * Before starting migration, account PAGE_SIZE to mem_cgroup that the old
> * page belongs to.
> @@ -4955,6 +5067,7 @@ err_cleanup:
> static struct cgroup_subsys_state * __ref
> mem_cgroup_create(struct cgroup *cont)
> {
> + int idx;
> struct mem_cgroup *memcg, *parent;
> long error = -ENOMEM;
> int node;
> @@ -4997,9 +5110,22 @@ mem_cgroup_create(struct cgroup *cont)
> * mem_cgroup(see mem_cgroup_put).
> */
> mem_cgroup_get(parent);
> + /*
> + * We could get called before hugetlb init is called.
> + * Use HUGE_MAX_HSTATE as the max index.
> + */
> + for (idx = 0; idx < HUGE_MAX_HSTATE; idx++)
> + res_counter_init(&memcg->hugepage[idx],
> + &parent->hugepage[idx]);
> } else {
> res_counter_init(&memcg->res, NULL);
> res_counter_init(&memcg->memsw, NULL);
> + /*
> + * We could get called before hugetlb init is called.
> + * Use HUGE_MAX_HSTATE as the max index.
> + */
> + for (idx = 0; idx < HUGE_MAX_HSTATE; idx++)
> + res_counter_init(&memcg->hugepage[idx], NULL);
> }
> memcg->last_scanned_node = MAX_NUMNODES;
> INIT_LIST_HEAD(&memcg->oom_notify);
> @@ -5030,6 +5156,12 @@ free_out:
> static int mem_cgroup_pre_destroy(struct cgroup *cont)
> {
> struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> + /*
> + * Don't allow memcg removal if we have HugeTLB resource
> + * usage.
> + */
> + if (mem_cgroup_have_hugetlb_usage(memcg))
> + return -EBUSY;
>
> return mem_cgroup_force_empty(memcg, false);
> }
> --
> 1.7.10
>