linux-mm.kvack.org archive mirror
From: Nitin Gupta <ngupta@vflare.org>
To: Avi Kivity <avi@redhat.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	jeremy@goop.org, hugh.dickins@tiscali.co.uk, JBeulich@novell.com,
	chris.mason@oracle.com, kurt.hackel@oracle.com,
	dave.mccracken@oracle.com, npiggin@suse.de,
	akpm@linux-foundation.org, riel@redhat.com
Subject: Re: Frontswap [PATCH 0/4] (was Transcendent Memory): overview
Date: Sun, 25 Apr 2010 21:35:34 +0530
Message-ID: <4BD4684E.9040802@vflare.org>
In-Reply-To: <4BD4329A.9010509@redhat.com>

On 04/25/2010 05:46 PM, Avi Kivity wrote:
> On 04/25/2010 06:11 AM, Nitin Gupta wrote:
>> On 04/24/2010 11:57 PM, Avi Kivity wrote:
>>   
>>> On 04/24/2010 04:49 AM, Nitin Gupta wrote:
>>>     
>>>>       
>>>>> I see.  So why not implement this as an ordinary swap device, with a
>>>>> higher priority than the disk device?  This way we reuse an API and
>>>>> keep things asynchronous, instead of introducing a special purpose API.
>>>>>
>>>> ramzswap is exactly this: an ordinary swap device which stores every
>>>> page in (compressed) memory, and it's enabled as the highest priority
>>>> swap. Currently, it stores these compressed chunks in guest memory
>>>> itself, but it is not very difficult to send these chunks out to the
>>>> host/hypervisor using virtio.
>>>>
>>>> However, it suffers from unnecessary block I/O layer overhead and
>>>> requires weird hooks in swap code, say to get notification when a swap
>>>> slot is freed.
>>>>
>>> Isn't that TRIM?
>>>      
>> No: trim or discard is not useful. The problem is that we require a
>> callback _as soon as_ a page (swap slot) is freed. Otherwise, stale data
>> quickly accumulates in memory, defeating the whole purpose of in-memory
>> compressed swap devices (like ramzswap).
>>
> 
> Doesn't flash have similar requirements?  The earlier you discard, the
> likelier you are to reuse an erase block (or reduce the amount of copying).
> 

No. We do not want to issue a discard for every page as soon as it is freed.
I'm not a flash expert, but I guess an erase is just too expensive to be
issued so frequently. OTOH, ramzswap needs a callback for every page, as
soon as it is freed.
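
To make the difference concrete, here is a minimal userspace sketch of
the two free paths. It is only a model under stated assumptions: the
struct and every sim_* name are hypothetical and are not part of
ramzswap, frontswap, or the kernel. It counts "stale" pages, meaning
pages the backend still stores although the swap slot has been freed.

```c
#include <assert.h>

#define SIM_SLOTS 16

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct sim {
	int in_use[SIM_SLOTS];	/* 1 while the swap slot is still referenced */
	int stored[SIM_SLOTS];	/* 1 while the backend holds a compressed copy */
};

/* Swap a page out: the slot becomes used and the backend keeps a copy. */
static void sim_store(struct sim *s, int slot)
{
	assert(slot >= 0 && slot < SIM_SLOTS);
	s->in_use[slot] = 1;
	s->stored[slot] = 1;
}

/* Discard-style free: the backend is not told; its copy is now stale. */
static void sim_free_deferred(struct sim *s, int slot)
{
	assert(slot >= 0 && slot < SIM_SLOTS);
	s->in_use[slot] = 0;
}

/* Callback-style free: the backend drops its copy in the same call. */
static void sim_free_immediate(struct sim *s, int slot)
{
	assert(slot >= 0 && slot < SIM_SLOTS);
	s->in_use[slot] = 0;
	s->stored[slot] = 0;
}

/* Count pages the backend still holds for slots that are already free. */
static int sim_stale(const struct sim *s)
{
	int n = 0;

	for (int i = 0; i < SIM_SLOTS; i++)
		if (!s->in_use[i] && s->stored[i])
			n++;
	return n;
}

/* Periodic scan, standing in for batched discard; returns pages dropped. */
static int sim_scan_and_discard(struct sim *s)
{
	int dropped = 0;

	for (int i = 0; i < SIM_SLOTS; i++) {
		if (!s->in_use[i] && s->stored[i]) {
			s->stored[i] = 0;
			dropped++;
		}
	}
	return dropped;
}
```

In this model, storing and then freeing eight slots with
sim_free_deferred() leaves eight stale pages until the next
sim_scan_and_discard(), while sim_free_immediate() never leaves any.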


>> Increasing the frequency of discards is also not an option:
>>   - Creating discard bio requests itself needs memory, and these swap
>>     devices come into the picture only under low memory conditions.
>>
> 
> That's fine, swap works under low memory conditions by using reserves.
> 

Ok, but still all this bio allocation and block layer overhead seems
unnecessary and is easily avoidable. I think the frontswap code needs
cleanup, but at least it avoids all this bio overhead.

>>   - We need to regularly scan swap_map to issue these discards.
>>     Increasing discard frequency also means more frequent scanning
>>     (which will still not be fast enough for ramzswap needs).
>>
> 
> How does frontswap do this?  Does it maintain its own data structures?
> 

frontswap simply calls frontswap_flush_page() in swap_entry_free(), i.e. as
soon as a swap slot is freed. No bio allocation etc.
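
Roughly, the shape of that hook can be sketched in userspace as below.
This is not the frontswap patch itself: only the idea of calling a
flush routine from the slot-free path comes from the description
above, and the ops struct, the demo_* names, and the tiny swap_map
array are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS 8

/* Hypothetical backend interface: one synchronous call per freed slot. */
struct demo_backend_ops {
	void (*flush_page)(unsigned int type, unsigned long offset);
};

static unsigned char demo_swap_map[DEMO_SLOTS];	/* per-slot use count */
static int demo_backend_copy[DEMO_SLOTS];	/* 1 if backend holds the page */
static const struct demo_backend_ops *demo_backend;

/* Backend side: drop the stored page right away, no request queued. */
static void demo_flush_page(unsigned int type, unsigned long offset)
{
	(void)type;
	demo_backend_copy[offset] = 0;
}

static const struct demo_backend_ops demo_ops = {
	.flush_page = demo_flush_page,
};

static void demo_register(const struct demo_backend_ops *ops)
{
	demo_backend = ops;
}

/* Swap-out path: take a reference on the slot and store the page. */
static void demo_swap_out(unsigned long offset)
{
	assert(offset < DEMO_SLOTS);
	demo_swap_map[offset]++;
	demo_backend_copy[offset] = 1;
}

/* Free path: when the last reference goes, call the backend inline. */
static void demo_swap_entry_free(unsigned long offset)
{
	assert(offset < DEMO_SLOTS && demo_swap_map[offset] > 0);
	if (--demo_swap_map[offset] == 0 && demo_backend != NULL)
		demo_backend->flush_page(0, offset);
}
```

After demo_register(&demo_ops), a demo_swap_out(3) followed by
demo_swap_entry_free(3) leaves demo_backend_copy[3] at 0 by the time
the free call returns, without any scan of the map.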

>>> Maybe we should optimize these overheads instead.  Swap used to always
>>> be to slow devices, but swap-to-flash has the potential to make swap act
>>> like an extension of RAM.
>>>
>>>      
>> Spending a lot of effort optimizing an overhead which can be completely
>> avoided is probably not worth it.
>>
> 
> I'm not sure.  Swap-to-flash will soon be everywhere.   If it's slow,
> people will feel it a lot more than ramzswap slowness.
> 

Optimizing swap-to-flash is surely desirable, but this problem is separate
from ramzswap or frontswap optimization. For the latter, I think dealing
with bios and going through the block layer is plain overhead.

>> Also, I think the choice of a synchronous style API for frontswap and
>> cleancache is justified as they want to send pages to host *RAM*. If you
>> want to use other devices like SSDs, then these should just be added as
>> another swap device as we do currently -- these should not be used as
>> frontswap storage directly.
>>
> 
> Even for copying to RAM an async API is wanted, so you can dma it
> instead of copying.
>

Maybe incremental development is better? Stabilize and refine the existing
code and gradually move to an async API, if required in the future?

Thanks,
Nitin

