* possible brw_page optimization
@ 2000-01-21 20:21 Chuck Lever
2000-01-26 13:01 ` Stephen C. Tweedie
0 siblings, 1 reply; 6+ messages in thread
From: Chuck Lever @ 2000-01-21 20:21 UTC (permalink / raw)
To: linux-mm
i've been exploring swap compaction and encryption, and found that
brw_page wants to break pages into buffer-sized pieces in order to
schedule I/O. the logic wants to eliminate unnecessary I/O requests, so
it checks each buffer to see if it is up to date; it doesn't schedule
reads for buffers that are already up to date. all buffers are scheduled
unconditionally during a write request.
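As a rough userspace illustration of the selection logic described above (not the kernel code itself), the sketch below models a page as a circular list of buffers and collects the ones that would be scheduled. Every name here (`model_bh`, `model_select_buffers`, `model_init_page`, the `MODEL_*` constants) is hypothetical and exists only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_READ  0
#define MODEL_WRITE 1
#define MODEL_MAX_BUF_PER_PAGE 8

/* Hypothetical stand-in for a buffer_head: only what this sketch needs. */
struct model_bh {
	bool uptodate;              /* contents already valid in memory */
	bool dirty;                 /* contents must reach the disk */
	struct model_bh *this_page; /* circular list of buffers for one page */
};

/* Link n buffers into a circular list and clear their state. */
static void model_init_page(struct model_bh *bufs, int n)
{
	assert(n > 0 && n <= MODEL_MAX_BUF_PER_PAGE);
	for (int i = 0; i < n; i++) {
		bufs[i].uptodate = false;
		bufs[i].dirty = false;
		bufs[i].this_page = &bufs[(i + 1) % n];
	}
}

/*
 * Walk the circular list starting at head and collect the buffers that
 * would be handed to the block layer.  Returns how many were collected.
 */
static int model_select_buffers(int rw, struct model_bh *head,
				struct model_bh *arr[MODEL_MAX_BUF_PER_PAGE])
{
	struct model_bh *bh = head;
	int nr = 0;

	assert(head != NULL);
	do {
		if (rw == MODEL_READ) {
			/* Reads skip buffers whose contents are already valid. */
			if (!bh->uptodate) {
				assert(nr < MODEL_MAX_BUF_PER_PAGE);
				arr[nr++] = bh;
			}
		} else {
			/* Writes mark every buffer valid and dirty, then queue it. */
			bh->uptodate = true;
			bh->dirty = true;
			assert(nr < MODEL_MAX_BUF_PER_PAGE);
			arr[nr++] = bh;
		}
		bh = bh->this_page;
	} while (bh != head);
	return nr;
}
```

On a freshly initialized page a read selects every buffer, so the per-buffer check only changes the result when some buffer has already been marked up to date.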
for compaction or encryption, all buffers must be read in order to get the
whole page and decrypt or decompress it, so i'd like to make
brw_page(READ) read all buffers for a page unconditionally, just like
brw_page(WRITE). at first, i thought a simple flag could request this
change in behavior.
however, looking at brw_page's callers, brw_page(READ) in 2.3.39+ is only
invoked on fresh pages, so i can't see where it's possible to not read all
the buffers for a page in brw_page. seems like the following is a
potential common case optimization of brw_page, with no loss of
performance.
what issues am i missing?
int brw_page(int rw, struct page *page, kdev_t dev, int b[], int size)
{
	struct buffer_head *head, *bh, *arr[MAX_BUF_PER_PAGE];
	int block, nr = 0;

	/*
	 * We pretty much rely on the page lock for this, because
	 * create_page_buffers() might sleep.
	 */
	if (!page->buffers)
		create_page_buffers(rw, page, dev, b, size);
	head = page->buffers;
	bh = head;
	do {
		arr[nr++] = bh;
		atomic_inc(&bh->b_count);
		if (rw == WRITE) {
			block = *(b++);
			if (!bh->b_blocknr)
				bh->b_blocknr = block;
			set_bit(BH_Uptodate, &bh->b_state);
			set_bit(BH_Dirty, &bh->b_state);
		}
		bh = bh->b_this_page;
	} while (bh != head);
	ll_rw_block(rw, nr, arr);
	if (rw == READ)
		++current->maj_flt;
	return 0;
}
- Chuck Lever
--
corporate: <chuckl@netscape.com>
personal: <chucklever@netscape.net> or <cel@monkey.org>
The Linux Scalability project:
http://www.citi.umich.edu/projects/linux-scalability/
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/
* Re: possible brw_page optimization
2000-01-21 20:21 possible brw_page optimization Chuck Lever
@ 2000-01-26 13:01 ` Stephen C. Tweedie
2000-01-26 16:02 ` Chuck Lever
0 siblings, 1 reply; 6+ messages in thread
From: Stephen C. Tweedie @ 2000-01-26 13:01 UTC (permalink / raw)
To: Chuck Lever; +Cc: linux-mm
Hi,
On Fri, 21 Jan 2000 15:21:33 -0500 (EST), Chuck Lever <cel@monkey.org>
said:
> i've been exploring swap compaction and encryption, and found that
> brw_page wants to break pages into buffer-sized pieces in order to
> schedule I/O.
brw_page is there explicitly to perform physical block IO to disk. If
you want to do compression or encryption, I'd have thought you want to
do that at a higher level. The clean way to do this would be to provide
a virtual file to swap over, and to allow rw_swap_page_base() to pass
the page read or write to that file's inode's read_/write_page methods.
Then you can do any munging you want on the virtual swap file without
polluting the underlying swap IO code.
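A minimal userspace sketch of that layering, assuming a much simplified model: each "inode" carries a table of page methods, and a virtual inode transforms the page before delegating to the inode beneath it. All names (`model_inode`, `model_inode_ops`, `model_rw_swap_page`, `model_munge`) are hypothetical, and the XOR in `model_munge` is only a placeholder for a real cipher or compressor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64 /* tiny page so the sketch stays readable */
#define MODEL_NR_SLOTS  4

struct model_inode;

/* Hypothetical per-inode page methods. */
struct model_inode_ops {
	int (*readpage)(struct model_inode *inode, size_t slot, unsigned char *page);
	int (*writepage)(struct model_inode *inode, size_t slot, const unsigned char *page);
};

struct model_inode {
	const struct model_inode_ops *ops;
	struct model_inode *lower; /* backing inode for a stacked file, else NULL */
	unsigned char store[MODEL_NR_SLOTS][MODEL_PAGE_SIZE]; /* raw file only */
};

static int raw_readpage(struct model_inode *inode, size_t slot, unsigned char *page)
{
	if (slot >= MODEL_NR_SLOTS)
		return -1;
	memcpy(page, inode->store[slot], MODEL_PAGE_SIZE);
	return 0;
}

static int raw_writepage(struct model_inode *inode, size_t slot, const unsigned char *page)
{
	if (slot >= MODEL_NR_SLOTS)
		return -1;
	memcpy(inode->store[slot], page, MODEL_PAGE_SIZE);
	return 0;
}

static const struct model_inode_ops raw_ops = { raw_readpage, raw_writepage };

/* Placeholder transform (its own inverse); NOT real encryption. */
static void model_munge(unsigned char *dst, const unsigned char *src)
{
	for (size_t i = 0; i < MODEL_PAGE_SIZE; i++)
		dst[i] = src[i] ^ 0x5a;
}

static int virt_readpage(struct model_inode *inode, size_t slot, unsigned char *page)
{
	unsigned char tmp[MODEL_PAGE_SIZE];
	int err;

	assert(inode->lower != NULL);
	err = inode->lower->ops->readpage(inode->lower, slot, tmp);
	if (err)
		return err;
	model_munge(page, tmp); /* undo the transform on the way in */
	return 0;
}

static int virt_writepage(struct model_inode *inode, size_t slot, const unsigned char *page)
{
	unsigned char tmp[MODEL_PAGE_SIZE];

	assert(inode->lower != NULL);
	model_munge(tmp, page); /* caller's page is left untouched */
	return inode->lower->ops->writepage(inode->lower, slot, tmp);
}

static const struct model_inode_ops virt_ops = { virt_readpage, virt_writepage };

/* Stand-in for the swap path: it dispatches and knows nothing of munging. */
static int model_rw_swap_page(int write, struct model_inode *inode,
			      size_t slot, unsigned char *page)
{
	return write ? inode->ops->writepage(inode, slot, page)
		     : inode->ops->readpage(inode, slot, page);
}
```

In this model the dispatch function is identical for raw and virtual files; only the method table differs, which is the separation being argued for.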
--Stephen
* Re: possible brw_page optimization
2000-01-26 13:01 ` Stephen C. Tweedie
@ 2000-01-26 16:02 ` Chuck Lever
2000-01-26 23:24 ` Stephen C. Tweedie
0 siblings, 1 reply; 6+ messages in thread
From: Chuck Lever @ 2000-01-26 16:02 UTC (permalink / raw)
To: Stephen C. Tweedie; +Cc: linux-mm
On Wed, 26 Jan 2000, Stephen C. Tweedie wrote:
> On Fri, 21 Jan 2000 15:21:33 -0500 (EST), Chuck Lever <cel@monkey.org>
> said:
> > i've been exploring swap compaction and encryption, and found that
> > brw_page wants to break pages into buffer-sized pieces in order to
> > schedule I/O.
>
> brw_page is there explicitly to perform physical block IO to disk. If
> you want to do compression or encryption, I'd have thought you want to
> do that at a higher level.
yes, i want to make the policy decisions and do the encryption at the
rw_swap_page_base() level. the decryption/decompression would be handled
by the exit routine.
however, somehow i'd have to guarantee that all buffers associated with a
page that is to be compressed/encrypted are read/written at once. using a
bounce page to handle the ciphertext/compressed page might be enough to do
that, since it would have no buffers already associated with it.
however, i was wondering if the optimization i did was of general use. as
i mentioned, i don't see any place that invokes brw_page() in such a way
as to trigger the logic to read only some of the buffers.
> The clean way to do this would be to provide
> a virtual file to swap over, and to allow rw_swap_page_base() to pass
> the page read or write to that file's inode's read_/write_page methods.
> Then you can do any munging you want on the virtual swap file without
> polluting the underlying swap IO code.
using a unique swap file/device makes it easy to tell when you need to
decrypt a page. :)
- Chuck Lever
* Re: possible brw_page optimization
2000-01-26 16:02 ` Chuck Lever
@ 2000-01-26 23:24 ` Stephen C. Tweedie
2000-01-27 18:50 ` Chuck Lever
0 siblings, 1 reply; 6+ messages in thread
From: Stephen C. Tweedie @ 2000-01-26 23:24 UTC (permalink / raw)
To: Chuck Lever; +Cc: Stephen C. Tweedie, linux-mm
Hi,
On Wed, 26 Jan 2000 11:02:37 -0500 (EST), Chuck Lever <cel@monkey.org>
said:
> however, somehow i'd have to guarantee that all buffers associated with a
> page that is to be compressed/encrypted are read/written at once.
Why? The swapper already does per-page IO locking, so you are protected
against any conflicts while a page is being written out.
>> The clean way to do this would be to provide a virtual file to swap
>> over, and to allow rw_swap_page_base() to pass the page read or write
>> to that file's inode's read_/write_page methods. Then you can do any
>> munging you want on the virtual swap file without polluting the
>> underlying swap IO code.
> using a unique swap file/device makes it easy to tell when you need to
> decrypt a page. :)
Sure, but the inode already gives you such an abstraction --- why invent
a new one?
--Stephen
* Re: possible brw_page optimization
2000-01-26 23:24 ` Stephen C. Tweedie
@ 2000-01-27 18:50 ` Chuck Lever
2000-01-27 19:52 ` Stephen C. Tweedie
0 siblings, 1 reply; 6+ messages in thread
From: Chuck Lever @ 2000-01-27 18:50 UTC (permalink / raw)
To: Stephen C. Tweedie; +Cc: linux-mm
On Wed, 26 Jan 2000, Stephen C. Tweedie wrote:
> On Wed, 26 Jan 2000 11:02:37 -0500 (EST), Chuck Lever <cel@monkey.org>
> said:
> > however, somehow i'd have to guarantee that all buffers associated with a
> > page that is to be compressed/encrypted are read/written at once.
>
> Why? The swapper already does per-page IO locking, so you are protected
> against any conflicts while a page is being written out.
it's not a locking issue. the encryption algorithm is a block cipher on
the whole page. in order to decrypt a page, you need to be sure you have
all the pieces. you can't read parts of the page and decrypt them.
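One way to picture the requirement, as a userspace sketch only: each buffer's read completes independently, and the whole-page decode runs only when the last outstanding buffer arrives. The names (`model_page`, `model_start_read`, `model_end_buffer_read`) and the chained-XOR transform are hypothetical placeholders, standing in for a real cipher that treats the page as one unit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_BUF_SIZE      16
#define MODEL_BUFS_PER_PAGE 4
#define MODEL_PAGE_SIZE     (MODEL_BUF_SIZE * MODEL_BUFS_PER_PAGE)
#define MODEL_KEY           0xa7

/* Hypothetical page whose buffers complete I/O independently. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	int pending;  /* buffer reads still in flight */
	bool decoded; /* set once the whole-page transform has run */
};

/* Placeholder encode: c[0] = p[0] ^ key, c[i] = p[i] ^ c[i-1]. */
static void model_encode_page(unsigned char *data)
{
	data[0] ^= MODEL_KEY;
	for (size_t i = 1; i < MODEL_PAGE_SIZE; i++)
		data[i] ^= data[i - 1];
}

/* Inverse of the above; walks backwards so c[i-1] is still ciphertext. */
static void model_decode_page(struct model_page *page)
{
	for (size_t i = MODEL_PAGE_SIZE - 1; i > 0; i--)
		page->data[i] ^= page->data[i - 1];
	page->data[0] ^= MODEL_KEY;
	page->decoded = true;
}

/* Begin a read of every buffer in the page. */
static void model_start_read(struct model_page *page)
{
	memset(page->data, 0, sizeof page->data);
	page->pending = MODEL_BUFS_PER_PAGE;
	page->decoded = false;
}

/*
 * Completion handler for one buffer.  Copies the buffer's bytes into
 * place and decodes the page only when no reads remain outstanding.
 * Returns true if this call triggered the decode.
 */
static bool model_end_buffer_read(struct model_page *page, int idx,
				  const unsigned char *src)
{
	assert(idx >= 0 && idx < MODEL_BUFS_PER_PAGE);
	assert(page->pending > 0);
	memcpy(page->data + (size_t)idx * MODEL_BUF_SIZE, src, MODEL_BUF_SIZE);
	if (--page->pending > 0)
		return false;
	model_decode_page(page);
	return true;
}
```

The sketch assumes every buffer is read exactly once per page read; if some buffers were skipped as already up to date, `pending` would have to start at the number actually scheduled, and the skipped buffers would need to still hold ciphertext.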
forgetting about encryption for a moment, you don't think the optimization
is useful in the general case? it's hardly ever used, if at all; plus it
seems to introduce some bugs. that code would be a lot cleaner without
all the bother. the "common case," by far, is to read/write the whole
page.
- Chuck Lever
* Re: possible brw_page optimization
2000-01-27 18:50 ` Chuck Lever
@ 2000-01-27 19:52 ` Stephen C. Tweedie
0 siblings, 0 replies; 6+ messages in thread
From: Stephen C. Tweedie @ 2000-01-27 19:52 UTC (permalink / raw)
To: Chuck Lever; +Cc: Stephen C. Tweedie, linux-mm
Hi,
On Thu, 27 Jan 2000 13:50:23 -0500 (EST), Chuck Lever <cel@monkey.org>
said:
> forgetting about encryption for a moment, you don't think the optimization
> is useful in the general case? it's hardly ever used, if at all; plus it
> seems to introduce some bugs. that code would be a lot cleaner without
> all the bother. the "common case," by far, is to read/write the whole
> page.
Fine, if you can clean up the rw-page code, show us patches, certainly.
It's just not the place to be doing encryption: that's a separate
layering issue. Doing it via a dedicated swap inode would be far
better.
--Stephen