linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] swap: send callback when swap slot is freed
@ 2009-08-12 14:37 Nitin Gupta
  0 siblings, 0 replies; 8+ messages in thread
From: Nitin Gupta @ 2009-08-12 14:37 UTC (permalink / raw)
  To: mingo; +Cc: linux-kernel

Currently, we have "swap discard" mechanism which sends a discard bio request
when we find a free cluster during scan_swap_map(). This callback can come a
long time after swap slots are actually freed.

This delay in callback is a great problem when (compressed) RAM [1] is used
as a swap device. So, this change adds a callback which is called as
soon as a swap slot becomes free. For above mentioned case of swapping
over compressed RAM device, this is very useful since we can immediately
free memory allocated for this swap page.

This callback does not replace swap discard support. It is called with
swap_lock held, so it is meant to trigger action that finishes quickly.
However, swap discard is an I/O request and can be used for taking longer
actions.

Links:
[1] http://code.google.com/p/compcache/

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---

 include/linux/swap.h |    5 +++++
 mm/swapfile.c        |   16 ++++++++++++++++
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7c15334..4cbe3c4 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -8,6 +8,7 @@
 #include <linux/memcontrol.h>
 #include <linux/sched.h>
 #include <linux/node.h>
+#include <linux/blkdev.h>
 
 #include <asm/atomic.h>
 #include <asm/page.h>
@@ -20,6 +21,8 @@ struct bio;
 #define SWAP_FLAG_PRIO_MASK	0x7fff
 #define SWAP_FLAG_PRIO_SHIFT	0
 
+typedef void (swap_free_notify_fn) (struct block_device *, unsigned long);
+
 static inline int current_is_kswapd(void)
 {
 	return current->flags & PF_KSWAPD;
@@ -155,6 +158,7 @@ struct swap_info_struct {
 	unsigned int max;
 	unsigned int inuse_pages;
 	unsigned int old_block_size;
+	swap_free_notify_fn *swap_free_notify_fn;
 };
 
 struct swap_list_t {
@@ -295,6 +299,7 @@ extern sector_t swapdev_block(int, pgoff_t);
 extern struct swap_info_struct *get_swap_info_struct(unsigned);
 extern int reuse_swap_page(struct page *);
 extern int try_to_free_swap(struct page *);
+extern void set_swap_free_notify(unsigned, swap_free_notify_fn *);
 struct backing_dev_info;
 
 /* linux/mm/thrash.c */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 8ffdc0d..aa95fc7 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -552,6 +552,20 @@ out:
 	return NULL;
 }
 
+/*
+ * Sets callback for event when swap_map[offset] == 0
+ * i.e. page at this swap offset is no longer used.
+ */
+void set_swap_free_notify(unsigned type, swap_free_notify_fn *notify_fn)
+{
+	struct swap_info_struct *sis;
+	sis = get_swap_info_struct(type);
+	BUG_ON(!sis);
+	sis->swap_free_notify_fn = notify_fn;
+	return;
+}
+EXPORT_SYMBOL(set_swap_free_notify);
+
 static int swap_entry_free(struct swap_info_struct *p,
 			   swp_entry_t ent, int cache)
 {
@@ -583,6 +597,8 @@ static int swap_entry_free(struct swap_info_struct *p,
 			swap_list.next = p - swap_info;
 		nr_swap_pages++;
 		p->inuse_pages--;
+		if (p->swap_free_notify_fn)
+			p->swap_free_notify_fn(p->bdev, offset);
 	}
 	if (!swap_count(count))
 		mem_cgroup_uncharge_swap(ent);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] swap: send callback when swap slot is freed
  2009-08-13  2:30   ` Nitin Gupta
  2009-08-13  6:53     ` Peter Zijlstra
@ 2009-08-13 17:45     ` Hugh Dickins
  1 sibling, 0 replies; 8+ messages in thread
From: Hugh Dickins @ 2009-08-13 17:45 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Matthew Wilcox, Ingo Molnar, Peter Zijlstra, linux-kernel, linux-mm

On Thu, 13 Aug 2009, Nitin Gupta wrote:
> On 08/13/2009 04:18 AM, Hugh Dickins wrote:
> 
> > But fundamentally, though I can see how this cutdown communication
> > path is useful to compcache, I'd much rather deal with it by the more
> > general discard route if we can.
> 
> I tried this too -- make discard bio request as soon as a swap slot becomes
> free (I can send details if you want). However, I could not get it to work.

I'll send you an updated version of what I experimented with eight
months ago: but changes in the swap_map count handling since then
mean that it might need some subtle adjustments - I'll need to go
over it carefully and retest before sending you.

(But that won't be a waste of my time: I shall soon need to try
that experiment again myself, and I do need to examine those
intervening swap_map count changes more closely.)

> Also, allocating bio to issue discard I/O request looks like a complete
> artifact in compcache case.

Yes, I do understand that feeling.

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] swap: send callback when swap slot is freed
  2009-08-13  6:53     ` Peter Zijlstra
@ 2009-08-13 14:44       ` Nitin Gupta
  0 siblings, 0 replies; 8+ messages in thread
From: Nitin Gupta @ 2009-08-13 14:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Hugh Dickins, Matthew Wilcox, Ingo Molnar, linux-kernel, linux-mm

(resending in plain text)

On 08/13/2009 12:23 PM, Peter Zijlstra wrote:
> On Thu, 2009-08-13 at 08:00 +0530, Nitin Gupta wrote:
>>> I don't share Peter's view that it should be using a more general
>>> notifier interface (but I certainly agree with his EXPORT_SYMBOL_GPL).
>> Considering that the callback is made under swap_lock, we should not
>> have an array of callbacks to do. But what if this callback finds other
>> users too? I think we should leave it in its current state till it finds
>> more users and probably add BUG() to make sure callback is not already set.
>>
>> I will make it EXPORT_SYMBOL_GPL.
>
> If its such a tightly coupled system, then why is compcache a module?
>

Keeping everything as separate kernel modules has been the goal of this 
project. However, this callback is the only thing which I could not do 
without this small patching.

Nitin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] swap: send callback when swap slot is freed
  2009-08-13  2:30   ` Nitin Gupta
@ 2009-08-13  6:53     ` Peter Zijlstra
  2009-08-13 14:44       ` Nitin Gupta
  2009-08-13 17:45     ` Hugh Dickins
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2009-08-13  6:53 UTC (permalink / raw)
  To: ngupta; +Cc: Hugh Dickins, Matthew Wilcox, Ingo Molnar, linux-kernel, linux-mm

On Thu, 2009-08-13 at 08:00 +0530, Nitin Gupta wrote:
> > I don't share Peter's view that it should be using a more general
> > notifier interface (but I certainly agree with his EXPORT_SYMBOL_GPL).
> 
> Considering that the callback is made under swap_lock, we should not 
> have an array of callbacks to do. But what if this callback finds other 
> users too? I think we should leave it in its current state till it finds 
> more users and probably add BUG() to make sure callback is not already set.
> 
> I will make it EXPORT_SYMBOL_GPL.

If its such a tightly coupled system, then why is compcache a module?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] swap: send callback when swap slot is freed
  2009-08-12 22:48 ` Hugh Dickins
  2009-08-13  2:30   ` Nitin Gupta
@ 2009-08-13  2:41   ` Nitin Gupta
  1 sibling, 0 replies; 8+ messages in thread
From: Nitin Gupta @ 2009-08-13  2:41 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Matthew Wilcox, Ingo Molnar, Peter Zijlstra, linux-kernel, linux-mm

On 08/13/2009 04:18 AM, Hugh Dickins wrote:
> On Wed, 12 Aug 2009, Nitin Gupta wrote:
>
>> Currently, we have "swap discard" mechanism which sends a discard bio request
>> when we find a free cluster during scan_swap_map(). This callback can come a
>> long time after swap slots are actually freed.
>>
>> This delay in callback is a great problem when (compressed) RAM [1] is used
>> as a swap device. So, this change adds a callback which is called as
>> soon as a swap slot becomes free. For above mentioned case of swapping
>> over compressed RAM device, this is very useful since we can immediately
>> free memory allocated for this swap page.
>>
>> This callback does not replace swap discard support. It is called with
>> swap_lock held, so it is meant to trigger action that finishes quickly.
>> However, swap discard is an I/O request and can be used for taking longer
>> actions.
>>
>> Links:
>> [1] http://code.google.com/p/compcache/
>
> Please keep this with compcache for the moment (it has no other users).
>

Oh, I missed this one.

This small patch can be considered as first step for merging compcache 
to mainline :)   Actually, it requires callbacks for swapon, swapoff too 
but that, I think, should be done in a separate patches.

BTW, last time compcache was not accepted due to lack of performance 
numbers. Now the project has lot more data for various cases:
http://code.google.com/p/compcache/wiki/Performance
Still need to collect data for worst-case behaviors and such...


Thanks,
Nitin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] swap: send callback when swap slot is freed
  2009-08-12 22:48 ` Hugh Dickins
@ 2009-08-13  2:30   ` Nitin Gupta
  2009-08-13  6:53     ` Peter Zijlstra
  2009-08-13 17:45     ` Hugh Dickins
  2009-08-13  2:41   ` Nitin Gupta
  1 sibling, 2 replies; 8+ messages in thread
From: Nitin Gupta @ 2009-08-13  2:30 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Matthew Wilcox, Ingo Molnar, Peter Zijlstra, linux-kernel, linux-mm

On 08/13/2009 04:18 AM, Hugh Dickins wrote:
> On Wed, 12 Aug 2009, Nitin Gupta wrote:
>
>> Currently, we have "swap discard" mechanism which sends a discard bio request
>> when we find a free cluster during scan_swap_map(). This callback can come a
>> long time after swap slots are actually freed.
>>
>> This delay in callback is a great problem when (compressed) RAM [1] is used
>> as a swap device. So, this change adds a callback which is called as
>> soon as a swap slot becomes free. For above mentioned case of swapping
>> over compressed RAM device, this is very useful since we can immediately
>> free memory allocated for this swap page.
>>
>> This callback does not replace swap discard support. It is called with
>> swap_lock held, so it is meant to trigger action that finishes quickly.
>> However, swap discard is an I/O request and can be used for taking longer
>> actions.
>>
>> Links:
>> [1] http://code.google.com/p/compcache/
>
> Please keep this with compcache for the moment (it has no other users).
>
> I don't share Peter's view that it should be using a more general
> notifier interface (but I certainly agree with his EXPORT_SYMBOL_GPL).

Considering that the callback is made under swap_lock, we should not 
have an array of callbacks to do. But what if this callback finds other 
users too? I think we should leave it in its current state till it finds 
more users and probably add BUG() to make sure callback is not already set.

I will make it EXPORT_SYMBOL_GPL.

> There better not be others hooking in here at the same time (a BUG_ON
> could check that): in fact I don't even want you hooking in here where
> swap_lock is held.  Glancing at compcache, I don't see you violating
> lock hierarchy by that, but it is a worry.
>

I tried an approach that allows releasing swap_lock and 'lazily' make
the callback but this turned out to be pretty messy. So, I think just
adding a note that the callback is done under swap_lock should be better.


> The interface to set the notifier, you currently have it by swap type:
> that would better be by bdev, wouldn't it?  with a search for the right
> slot.  There's nowhere else in ramzswap.c that you rely on swp_entry_t
> and page_private(page), let's keep such details out of compcache.
>

Use of bdev instead of swap_entry_t looks better. I will make this change.


> But fundamentally, though I can see how this cutdown communication
> path is useful to compcache, I'd much rather deal with it by the more
> general discard route if we can.  (I'm one of those still puzzled by
> the way swap is mixed up with block device in compcache: probably
> because I never found time to pay attention when you explained.)
>

I tried this too -- make discard bio request as soon as a swap slot 
becomes free (I can send details if you want). However, I could not get 
it to work. Also, allocating bio to issue discard I/O request looks like 
a complete artifact in compcache case.


> You're right to question the utility of the current swap discard
> placement.  That code is almost a year old, written from a position
> of great ignorance, yet only now do we appear to be on the threshold
> of having an SSD which really supports TRIM (ah, the Linux ATA TRIM
> support seems to have gone missing now, but perhaps it's been
> waiting for a reality to check against too - Willy?).
>

> I won't be surprised if we find that we need to move swap discard
> support much closer to swap_free (though I know from trying before
> that it's much messier there): in which case, even if we decided to
> keep your hotline to compcache (to avoid allocating bios etc.), it
> would be better placed alongside.
>

This new callback and discard can actually co-exist: Use callback to 
trigger small actions and discard for longer actions. Depending on use 
case, you might need both or either one of these.


I am not very sure how willing you are to accept this patch but let me 
send another revision with all the suggestions from you all.


Thanks for looking into this.
Nitin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] swap: send callback when swap slot is freed
  2009-08-12 14:37 Nitin Gupta
@ 2009-08-12 22:48 ` Hugh Dickins
  2009-08-13  2:30   ` Nitin Gupta
  2009-08-13  2:41   ` Nitin Gupta
  0 siblings, 2 replies; 8+ messages in thread
From: Hugh Dickins @ 2009-08-12 22:48 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Matthew Wilcox, Ingo Molnar, Peter Zijlstra, linux-kernel, linux-mm

On Wed, 12 Aug 2009, Nitin Gupta wrote:

> Currently, we have "swap discard" mechanism which sends a discard bio request
> when we find a free cluster during scan_swap_map(). This callback can come a
> long time after swap slots are actually freed.
> 
> This delay in callback is a great problem when (compressed) RAM [1] is used
> as a swap device. So, this change adds a callback which is called as
> soon as a swap slot becomes free. For above mentioned case of swapping
> over compressed RAM device, this is very useful since we can immediately
> free memory allocated for this swap page.
> 
> This callback does not replace swap discard support. It is called with
> swap_lock held, so it is meant to trigger action that finishes quickly.
> However, swap discard is an I/O request and can be used for taking longer
> actions.
> 
> Links:
> [1] http://code.google.com/p/compcache/

Please keep this with compcache for the moment (it has no other users).

I don't share Peter's view that it should be using a more general
notifier interface (but I certainly agree with his EXPORT_SYMBOL_GPL).
There better not be others hooking in here at the same time (a BUG_ON
could check that): in fact I don't even want you hooking in here where
swap_lock is held.  Glancing at compcache, I don't see you violating
lock hierarchy by that, but it is a worry.

The interface to set the notifier, you currently have it by swap type:
that would better be by bdev, wouldn't it?  with a search for the right
slot.  There's nowhere else in ramzswap.c that you rely on swp_entry_t
and page_private(page), let's keep such details out of compcache.

But fundamentally, though I can see how this cutdown communication
path is useful to compcache, I'd much rather deal with it by the more
general discard route if we can.  (I'm one of those still puzzled by
the way swap is mixed up with block device in compcache: probably
because I never found time to pay attention when you explained.)

You're right to question the utility of the current swap discard
placement.  That code is almost a year old, written from a position
of great ignorance, yet only now do we appear to be on the threshold
of having an SSD which really supports TRIM (ah, the Linux ATA TRIM
support seems to have gone missing now, but perhaps it's been
waiting for a reality to check against too - Willy?).

I won't be surprised if we find that we need to move swap discard
support much closer to swap_free (though I know from trying before
that it's much messier there): in which case, even if we decided to
keep your hotline to compcache (to avoid allocating bios etc.), it
would be better placed alongside.

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] swap: send callback when swap slot is freed
@ 2009-08-12 14:37 Nitin Gupta
  2009-08-12 22:48 ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Nitin Gupta @ 2009-08-12 14:37 UTC (permalink / raw)
  To: mingo; +Cc: linux-kernel

Currently, we have "swap discard" mechanism which sends a discard bio request
when we find a free cluster during scan_swap_map(). This callback can come a
long time after swap slots are actually freed.

This delay in callback is a great problem when (compressed) RAM [1] is used
as a swap device. So, this change adds a callback which is called as
soon as a swap slot becomes free. For above mentioned case of swapping
over compressed RAM device, this is very useful since we can immediately
free memory allocated for this swap page.

This callback does not replace swap discard support. It is called with
swap_lock held, so it is meant to trigger action that finishes quickly.
However, swap discard is an I/O request and can be used for taking longer
actions.

Links:
[1] http://code.google.com/p/compcache/

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---

 include/linux/swap.h |    5 +++++
 mm/swapfile.c        |   16 ++++++++++++++++
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7c15334..4cbe3c4 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -8,6 +8,7 @@
 #include <linux/memcontrol.h>
 #include <linux/sched.h>
 #include <linux/node.h>
+#include <linux/blkdev.h>
 
 #include <asm/atomic.h>
 #include <asm/page.h>
@@ -20,6 +21,8 @@ struct bio;
 #define SWAP_FLAG_PRIO_MASK	0x7fff
 #define SWAP_FLAG_PRIO_SHIFT	0
 
+typedef void (swap_free_notify_fn) (struct block_device *, unsigned long);
+
 static inline int current_is_kswapd(void)
 {
 	return current->flags & PF_KSWAPD;
@@ -155,6 +158,7 @@ struct swap_info_struct {
 	unsigned int max;
 	unsigned int inuse_pages;
 	unsigned int old_block_size;
+	swap_free_notify_fn *swap_free_notify_fn;
 };
 
 struct swap_list_t {
@@ -295,6 +299,7 @@ extern sector_t swapdev_block(int, pgoff_t);
 extern struct swap_info_struct *get_swap_info_struct(unsigned);
 extern int reuse_swap_page(struct page *);
 extern int try_to_free_swap(struct page *);
+extern void set_swap_free_notify(unsigned, swap_free_notify_fn *);
 struct backing_dev_info;
 
 /* linux/mm/thrash.c */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 8ffdc0d..aa95fc7 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -552,6 +552,20 @@ out:
 	return NULL;
 }
 
+/*
+ * Sets callback for event when swap_map[offset] == 0
+ * i.e. page at this swap offset is no longer used.
+ */
+void set_swap_free_notify(unsigned type, swap_free_notify_fn *notify_fn)
+{
+	struct swap_info_struct *sis;
+	sis = get_swap_info_struct(type);
+	BUG_ON(!sis);
+	sis->swap_free_notify_fn = notify_fn;
+	return;
+}
+EXPORT_SYMBOL(set_swap_free_notify);
+
 static int swap_entry_free(struct swap_info_struct *p,
 			   swp_entry_t ent, int cache)
 {
@@ -583,6 +597,8 @@ static int swap_entry_free(struct swap_info_struct *p,
 			swap_list.next = p - swap_info;
 		nr_swap_pages++;
 		p->inuse_pages--;
+		if (p->swap_free_notify_fn)
+			p->swap_free_notify_fn(p->bdev, offset);
 	}
 	if (!swap_count(count))
 		mem_cgroup_uncharge_swap(ent);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-08-13 17:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-12 14:37 [PATCH] swap: send callback when swap slot is freed Nitin Gupta
2009-08-12 14:37 Nitin Gupta
2009-08-12 22:48 ` Hugh Dickins
2009-08-13  2:30   ` Nitin Gupta
2009-08-13  6:53     ` Peter Zijlstra
2009-08-13 14:44       ` Nitin Gupta
2009-08-13 17:45     ` Hugh Dickins
2009-08-13  2:41   ` Nitin Gupta

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox