From: Ed Tomlinson <tomlins@cam.org>
To: linux-mm@kvack.org
Subject: Re: [RFC][PATCH] cache shrinking via page age
Date: Sun, 12 May 2002 09:49:12 -0400
Message-ID: <200205120949.13081.tomlins@cam.org>
In-Reply-To: <200205111614.29698.tomlins@cam.org>
On May 11, 2002 04:14 pm, Ed Tomlinson wrote:
One additional comment. I have tried modifying kmem_cache_shrink_nr to
free only the number of pages seen by refill_inactive_zone. This scheme
revives the original problem. I think the issue is that, in essence, the
dentry/inode caches often work in read-once mode (that is, each object in
a slab is used once...). Without the more aggressive shrink in this patch
the 'read once' slab pages upset the vm balance.
A data point: comparing this patch to my previous one, the inode/dentry
caches stabilize at about twice the size here.
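
For anyone who wants to experiment with the interface, a per-cache callback
boils down to something like the sketch below (an illustration only, against
the shrinker_t signature in the patch - my_fs_shrink and prune_my_fs_objects
are made-up names; kmem_cache_shrink_nr comes from the patch and returns the
number of pages freed):

static int my_fs_shrink(kmem_cache_t *cachep, int priority, int gfp_mask)
{
	/* free some of the cache's own objects first (the cache-specific part) */
	prune_my_fs_objects(priority);
	/* then let the slab allocator hand back the now-empty slabs;
	   the return value is the number of pages freed */
	return kmem_cache_shrink_nr(cachep);
}
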
Ed Tomlinson
> When running under low vm pressure rmap does not shrink caches. This
> happens since we only call do_try_to_free_pages when we have a shortage.
> On my box the combination of background_aging calling refill_inactive_zone
> is able to supply the pages needed. The end result is that the box acts
> sluggish, with about half my memory used by slab pages (dcache/icache).
> This does correct itself under pressure but it should never get into this
> state in the first place.
>
> Ideally we want all pages to be about the same age. Having half the pages
> in the system 'cold' in the slab cache is not good - it implies the other
> pages are 'hotter' than they need to be.
>
> To fix the situation I move reapable slab pages into the active list. When
> aging moves a page into the inactive dirty list I watch for slab pages and
> record the caches with old pages. After refill_inactive/background_aging
> ends I call a new function, kmem_call_shrinkers. This scans the list of
> slab caches and, via a callback, shrinks caches with old pages. Note that
> we never swap out slab pages; they just cycle through the active and
> inactive dirty lists.
>
> The end result is that slab caches are shrunk selectively when they have
> old 'cold' pages. I avoid adding any magic numbers to the vm and create a
> generic interface to allow creators of slab caches to supply the vm with a
> unique method to shrink their caches.
>
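> To make that concrete, a cache creator registers its callback once at init
> time, roughly like this (a sketch only - my_cachep, struct my_object and
> my_fs_shrink are hypothetical; kmem_cache_create is the existing slab API
> and kmem_set_shrinker comes from this patch):
>
> 	my_cachep = kmem_cache_create("my_objects", sizeof(struct my_object),
> 				      0, SLAB_HWCACHE_ALIGN, NULL, NULL);
> 	if (!my_cachep)
> 		panic("cannot create my_objects cache");
> 	kmem_set_shrinker(my_cachep, my_fs_shrink);
>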
> When testing this there is one side effect to remember. Using cat
> /proc/slabinfo references pages - this will tend to keep the slab pages
> warmer than they should be. Like in quantum theory, watching (too often)
> can change results.
>
> I have tested on UP only - I think the locking is ok, though...
>
> Patch is against 2.4.19-pre7-ac2
>
> Comments?
> Ed Tomlinson
>
> ------------
> # This is a BitKeeper generated patch for the following project:
> # Project Name: Linux kernel tree
> # This patch format is intended for GNU patch command version 2.5 or higher.
> # This patch includes the following deltas:
> # ChangeSet 1.422 -> 1.428
> # fs/dcache.c 1.18 -> 1.20
> # mm/vmscan.c 1.60 -> 1.65
> # include/linux/slab.h 1.9 -> 1.11
> # mm/slab.c 1.16 -> 1.19
> # fs/inode.c 1.35 -> 1.37
> #
> # The following is the BitKeeper ChangeSet Log
> # --------------------------------------------
> # 02/05/10 ed@oscar.et.ca 1.423
> # Use the vm's page aging to tell us when we need to shrink the caches.
> # The vm uses callbacks to tell the slab caches it's time to shrink.
> # --------------------------------------------
> # 02/05/10 ed@oscar.et.ca 1.424
> # Change the way process_shrinks is called so refill_inactive does not
> # need to be changed.
> # --------------------------------------------
> # 02/05/10 ed@oscar.et.ca 1.425
> # Remove debugging stuff
> # --------------------------------------------
> # 02/05/11 ed@oscar.et.ca 1.426
> # Simplify the scheme. Use per cache callbacks instead of per family.
> # This lets us target specific caches instead of being generic. We
> # still include a generic call (kmem_cache_reap) as a failsafe
> # before ooming.
> # --------------------------------------------
> # 02/05/11 ed@oscar.et.ca 1.427
> # Remove debugging printk
> # --------------------------------------------
> # 02/05/11 ed@oscar.et.ca 1.428
> # Change factoring, removing changes from background_aging and putting
> # the kmem_call_shrinkers call in kswapd.
> # --------------------------------------------
> #
> diff -Nru a/fs/dcache.c b/fs/dcache.c
> --- a/fs/dcache.c Sat May 11 15:31:40 2002
> +++ b/fs/dcache.c Sat May 11 15:31:40 2002
> @@ -1186,6 +1186,8 @@
> if (!dentry_cache)
> panic("Cannot create dentry cache");
>
> + kmem_set_shrinker(dentry_cache, (shrinker_t)kmem_shrink_dcache);
> +
> #if PAGE_SHIFT < 13
> mempages >>= (13 - PAGE_SHIFT);
> #endif
> @@ -1278,6 +1280,9 @@
> SLAB_HWCACHE_ALIGN, NULL, NULL);
> if (!dquot_cachep)
> panic("Cannot create dquot SLAB cache");
> +
> + kmem_set_shrinker(dquot_cachep, (shrinker_t)kmem_shrink_dquota);
> +
> #endif
>
> dcache_init(mempages);
> diff -Nru a/fs/inode.c b/fs/inode.c
> --- a/fs/inode.c Sat May 11 15:31:40 2002
> +++ b/fs/inode.c Sat May 11 15:31:40 2002
> @@ -1173,6 +1173,8 @@
> if (!inode_cachep)
> panic("cannot create inode slab cache");
>
> + kmem_set_shrinker(inode_cachep, (shrinker_t)kmem_shrink_icache);
> +
> unused_inodes_flush_task.routine = try_to_sync_unused_inodes;
> }
>
> diff -Nru a/include/linux/slab.h b/include/linux/slab.h
> --- a/include/linux/slab.h Sat May 11 15:31:40 2002
> +++ b/include/linux/slab.h Sat May 11 15:31:40 2002
> @@ -55,6 +55,19 @@
> void (*)(void *, kmem_cache_t *, unsigned long));
> extern int kmem_cache_destroy(kmem_cache_t *);
> extern int kmem_cache_shrink(kmem_cache_t *);
> +
> +typedef int (*shrinker_t)(kmem_cache_t *, int, int);
> +
> +extern void kmem_set_shrinker(kmem_cache_t *, shrinker_t);
> +extern int kmem_call_shrinkers(int, int);
> +extern void kmem_count_page(struct page *);
> +
> +/* shrink drivers */
> +extern int kmem_shrink_default(kmem_cache_t *, int, int);
> +extern int kmem_shrink_dcache(kmem_cache_t *, int, int);
> +extern int kmem_shrink_icache(kmem_cache_t *, int, int);
> +extern int kmem_shrink_dquota(kmem_cache_t *, int, int);
> +
> extern int kmem_cache_shrink_nr(kmem_cache_t *);
> extern void *kmem_cache_alloc(kmem_cache_t *, int);
> extern void kmem_cache_free(kmem_cache_t *, void *);
> diff -Nru a/mm/slab.c b/mm/slab.c
> --- a/mm/slab.c Sat May 11 15:31:40 2002
> +++ b/mm/slab.c Sat May 11 15:31:40 2002
> @@ -213,6 +213,8 @@
> kmem_cache_t *slabp_cache;
> unsigned int growing;
> unsigned int dflags; /* dynamic flags */
> + shrinker_t shrinker; /* shrink callback */
> + int count; /* count used to trigger shrink */
>
> /* constructor func */
> void (*ctor)(void *, kmem_cache_t *, unsigned long);
> @@ -382,6 +384,69 @@
> static void enable_cpucache (kmem_cache_t *cachep);
> static void enable_all_cpucaches (void);
> #endif
> +
> +/* set the per-cache shrink callback */
> +void kmem_set_shrinker(kmem_cache_t * cachep, shrinker_t theshrinker)
> +{
> + cachep->shrinker = theshrinker;
> +}
> +
> +/* used by refill_inactive_zone to determine caches that need shrinking */
> +void kmem_count_page(struct page *page)
> +{
> + kmem_cache_t *cachep = GET_PAGE_CACHE(page);
> + cachep->count++;
> +}
> +
> +/* call the shrink callbacks for caches holding old pages */
> +int kmem_call_shrinkers(int priority, int gfp_mask)
> +{
> + int ret = 0;
> + struct list_head *p;
> +
> + if (gfp_mask & __GFP_WAIT)
> + down(&cache_chain_sem);
> + else
> + if (down_trylock(&cache_chain_sem))
> + return 0;
> +
> + list_for_each(p,&cache_chain) {
> + kmem_cache_t *cachep = list_entry(p, kmem_cache_t, next);
> + if (cachep->count > 0) {
> + if (cachep->shrinker == NULL)
> + BUG();
> + ret += (*cachep->shrinker)(cachep, priority, gfp_mask);
> + cachep->count = 0;
> + }
> + }
> + up(&cache_chain_sem);
> + return ret;
> +}
> +
> +/* shrink methods */
> +int kmem_shrink_default(kmem_cache_t * cachep, int priority, int gfp_mask)
> +{
> + return kmem_cache_shrink_nr(cachep);
> +}
> +
> +int kmem_shrink_dcache(kmem_cache_t * cachep, int priority, int gfp_mask)
> +{
> + return shrink_dcache_memory(priority, gfp_mask);
> +}
> +
> +int kmem_shrink_icache(kmem_cache_t * cachep, int priority, int gfp_mask)
> +{
> + return shrink_icache_memory(priority, gfp_mask);
> +}
> +
> +#if defined (CONFIG_QUOTA)
> +
> +int kmem_shrink_dquota(kmem_cache_t * cachep, int priority, int gfp_mask)
> +{
> + return shrink_dqcache_memory(priority, gfp_mask);
> +}
> +
> +#endif
>
> /* Cal the num objs, wastage, and bytes left over for a given slab size. */
> static void kmem_cache_estimate (unsigned long gfporder, size_t size,
> @@ -514,6 +579,8 @@
> * vm_scan(). Shouldn't be a worry.
> */
> while (i--) {
> + if (!(cachep->flags & SLAB_NO_REAP))
> + lru_cache_del(page);
> PageClearSlab(page);
> page++;
> }
> @@ -781,6 +848,8 @@
> flags |= CFLGS_OPTIMIZE;
>
> cachep->flags = flags;
> + cachep->shrinker = ( shrinker_t)(kmem_shrink_default);
> + cachep->count = 0;
> cachep->gfpflags = 0;
> if (flags & SLAB_CACHE_DMA)
> cachep->gfpflags |= GFP_DMA;
> @@ -1184,6 +1253,8 @@
> SET_PAGE_CACHE(page, cachep);
> SET_PAGE_SLAB(page, slabp);
> PageSetSlab(page);
> + if (!(cachep->flags & SLAB_NO_REAP))
> + lru_cache_add(page);
> page++;
> } while (--i);
>
> @@ -1903,6 +1974,7 @@
> unsigned long num_objs;
> unsigned long active_slabs = 0;
> unsigned long num_slabs;
> + int ref;
> cachep = list_entry(p, kmem_cache_t, next);
>
> spin_lock_irq(&cachep->spinlock);
> diff -Nru a/mm/vmscan.c b/mm/vmscan.c
> --- a/mm/vmscan.c Sat May 11 15:31:40 2002
> +++ b/mm/vmscan.c Sat May 11 15:31:40 2002
> @@ -102,6 +102,9 @@
> continue;
> }
>
> + if (PageSlab(page))
> + BUG();
> +
> /* Page is being freed */
> if (unlikely(page_count(page)) == 0) {
> list_del(page_lru);
> @@ -244,7 +247,8 @@
> * The page is in active use or really unfreeable. Move to
> * the active list and adjust the page age if needed.
> */
> - if (page_referenced(page) && page_mapping_inuse(page) &&
> + if (page_referenced(page) &&
> + (page_mapping_inuse(page) || PageSlab(page)) &&
> !page_over_rsslimit(page)) {
> del_page_from_inactive_dirty_list(page);
> add_page_to_active_list(page);
> @@ -253,6 +257,12 @@
> }
>
> /*
> + * SlabPages get shrunk in refill_inactive_zone
> + */
> + if (PageSlab(page))
> + continue;
> +
> + /*
> * Page is being freed, don't worry about it.
> */
> if (unlikely(page_count(page)) == 0)
> @@ -446,6 +456,7 @@
> * This function will scan a portion of the active list of a zone to find
> * unused pages, those pages will then be moved to the inactive list.
> */
> +
> int refill_inactive_zone(struct zone_struct * zone, int priority)
> {
> int maxscan = zone->active_pages >> priority;
> @@ -473,7 +484,7 @@
> * bother with page aging. If the page is touched again
> * while on the inactive_clean list it'll be reactivated.
> */
> - if (!page_mapping_inuse(page)) {
> + if (!page_mapping_inuse(page) && !PageSlab(page)) {
> drop_page(page);
> continue;
> }
> @@ -497,8 +508,12 @@
> list_add(page_lru, &zone->active_list);
> } else {
> deactivate_page_nolock(page);
> - if (++nr_deactivated > target)
> + if (PageSlab(page))
> + kmem_count_page(page);
> + else {
> + if (++nr_deactivated > target)
> break;
> + }
> }
>
> /* Low latency reschedule point */
> @@ -513,6 +528,7 @@
> return nr_deactivated;
> }
>
> +
> /**
> * refill_inactive - checks all zones and refills the inactive list as needed
> *
> @@ -577,24 +593,15 @@
>
> /*
> * Eat memory from filesystem page cache, buffer cache,
> - * dentry, inode and filesystem quota caches.
> */
> ret += page_launder(gfp_mask);
> - ret += shrink_dcache_memory(DEF_PRIORITY, gfp_mask);
> - ret += shrink_icache_memory(1, gfp_mask);
> -#ifdef CONFIG_QUOTA
> - ret += shrink_dqcache_memory(DEF_PRIORITY, gfp_mask);
> -#endif
>
> /*
> - * Move pages from the active list to the inactive list.
> + * Move pages from the active list to the inactive list and
> + * shrink the caches, returning the pages gained by the shrink
> */
> refill_inactive();
> -
> - /*
> - * Reclaim unused slab cache memory.
> - */
> - ret += kmem_cache_reap(gfp_mask);
> + ret += kmem_call_shrinkers(DEF_PRIORITY, gfp_mask);
>
> refill_freelist();
>
> @@ -603,11 +610,14 @@
> run_task_queue(&tq_disk);
>
> /*
> - * Hmm.. Cache shrink failed - time to kill something?
> + * Hmm.. - time to kill something?
> * Mhwahahhaha! This is the part I really like. Giggle.
> */
> - if (!ret && free_min(ANY_ZONE) > 0)
> - out_of_memory();
> + if (!ret && free_min(ANY_ZONE) > 0) {
> + ret += kmem_cache_reap(gfp_mask);
> + if (!ret)
> + out_of_memory();
> + }
>
> return ret;
> }
> @@ -700,6 +710,7 @@
>
> /* Do background page aging. */
> background_aging(DEF_PRIORITY);
> + kmem_call_shrinkers(DEF_PRIORITY, GFP_KSWAPD);
> }
>
> wakeup_memwaiters();
> ------------
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/