From: david@lang.hm
To: Greg Freemyer <greg.freemyer@gmail.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>,
Matthew Wilcox <willy@linux.intel.com>,
Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Nitin Gupta <ngupta@vflare.org>, Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-scsi@vger.kernel.org, linux-ide@vger.kernel.org,
Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: Discard support (was Re: [PATCH] swap: send callback when swap slot is freed)
Date: Thu, 13 Aug 2009 13:44:11 -0700 (PDT)
Message-ID: <alpine.DEB.1.10.0908131342460.28013@asgard.lang.hm>
In-Reply-To: <87f94c370908131115r680a7523w3cdbc78b9e82373c@mail.gmail.com>
On Thu, 13 Aug 2009, Greg Freemyer wrote:
> On Thu, Aug 13, 2009 at 12:33 PM, <david@lang.hm> wrote:
>> On Thu, 13 Aug 2009, Markus Trippelsdorf wrote:
>>
>>> On Thu, Aug 13, 2009 at 08:13:12AM -0700, Matthew Wilcox wrote:
>>>>
>>>> I am planning a complete overhaul of the discard work. Users can send
>>>> down discard requests as frequently as they like. The block layer will
>>>> cache them, and invalidate them if writes come through. Periodically,
>>>> the block layer will send down a TRIM or an UNMAP (depending on the
>>>> underlying device) and get rid of the blocks that have remained unwanted
>>>> in the interim.
>>>
>>> That is a very good idea. I've tested your original TRIM implementation on
>>> my Vertex yesterday and it was awful ;-). The SSD needs hundreds of
>>> milliseconds to digest a single TRIM command. And since your implementation
>>> sends a TRIM for each extent of each deleted file, the whole system is
>>> unusable after a short while.
>>> An optimal solution would be to consolidate the discard requests, bundle
>>> them and send them to the drive as infrequently as possible.
>>
>> or queue them up and send them when the drive is idle (you would need to
>> keep track to make sure the space isn't re-used)
>>
>> as an example, if you would consider spinning down a drive you don't hurt
>> performance by sending accumulated trim commands.
>>
>> David Lang
>
> An alternate approach is for the block layer to maintain its own bitmap of
> used / unused sectors / blocks. Unmap commands from the filesystem just
> cause the bitmap to be updated. No other effect.

how does the block layer know what blocks are unused by the filesystem?

or would it be a case of the filesystem generating discard/trim requests
to the block layer so that it can maintain its bitmap, and then the block
layer generating the requests to the drive below it?

David Lang

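To make the second reading concrete, here is a rough sketch of that flow as a toy model in Python. It is not kernel code: the class `DiscardCache` and its methods `discard`, `write` and `flush` are hypothetical names made up for illustration. The filesystem reports freed blocks, a later write to the same blocks cancels the pending discard, and a periodic flush merges what is left into the largest contiguous ranges that would go down to the drive.

```python
class DiscardCache:
    """Toy model of a block-layer discard cache (hypothetical, not a kernel API)."""

    def __init__(self):
        # Block numbers the filesystem has reported as unused and that
        # have not been rewritten since.
        self.pending = set()

    def discard(self, start, count):
        """Filesystem tells the block layer that these blocks are unused."""
        self.pending.update(range(start, start + count))

    def write(self, start, count):
        """A write re-uses the blocks, so any pending discard is cancelled."""
        self.pending.difference_update(range(start, start + count))

    def flush(self):
        """Merge pending blocks into (start, count) extents and clear the cache.

        In a real system each extent would become one TRIM/UNMAP request.
        """
        extents = []
        for block in sorted(self.pending):
            if extents and extents[-1][0] + extents[-1][1] == block:
                # Block is adjacent to the previous extent: grow it.
                extents[-1] = (extents[-1][0], extents[-1][1] + 1)
            else:
                extents.append((block, 1))
        self.pending.clear()
        return extents
```

With this model, two adjacent discards followed by a small write in the middle leave two merged ranges to send, rather than one request per original discard.
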
> (Big unknown: Where will the bitmap live between reboots? Require DM
> volumes so we can have a dedicated bitmap volume in the mix to store
> the bitmap to? Maybe on mount, the filesystem has to be scanned to
> initially populate the bitmap? Other options?)
>
> Assuming we have a persistent bitmap in place, have a background
> scanner that kicks in when the cpu / disk is idle. It just
> continuously scans the bitmap looking for contiguous blocks of unused
> sectors. Each time it finds one, it sends the largest possible unmap
> down the block stack and eventually to the device.
>
> When normal cpu / disk activity kicks in, this process goes to sleep.
>
> That way much of the smarts are concentrated in the block layer, not
> in the filesystem code. And it is being done when the disk is
> otherwise idle, so you don't have the ncq interference.
>
> Even laptop users should have enough idle cpu available to manage
> this. Enterprise would get the large discards it wants, and
> unmentioned in the previous discussion, mdraid gets the large discards
> it also wants.
>
> ie. If a mdraid raid5/raid6 volume is built of SSDs, it will only be
> able to discard a full stripe at a time. Otherwise the P=D1 ^ D2 logic
> is lost.
>
> Another benefit of the above is the code should be extremely safe and testable.
>
> Greg
>
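As a rough illustration of the background scanner and the full-stripe constraint described in the quoted proposal, the following Python sketch walks a bitmap for contiguous unused runs and then shrinks each run to whole stripes. It is only a sketch under assumptions: `free_runs`, `clip_to_stripes` and the `stripe_blocks` parameter (data blocks per stripe) are hypothetical names, and the bitmap is a plain list of booleans rather than anything persistent.

```python
def free_runs(bitmap):
    """Yield (start, length) for each maximal run of unused blocks.

    bitmap[i] is True when block i is in use and False when it is unused.
    """
    start = None
    for i, used in enumerate(bitmap):
        if not used and start is None:
            start = i                      # a new unused run begins here
        elif used and start is not None:
            yield (start, i - start)       # run ended at the previous block
            start = None
    if start is not None:
        yield (start, len(bitmap) - start)  # run extends to the end


def clip_to_stripes(start, length, stripe_blocks):
    """Shrink a run to whole stripes; return None if no full stripe fits.

    Discarding only part of a stripe would leave the parity inconsistent
    with the data, so partial stripes at either end are kept.
    """
    first = -(-start // stripe_blocks) * stripe_blocks        # round start up
    last = (start + length) // stripe_blocks * stripe_blocks  # round end down
    if last <= first:
        return None
    return (first, last - first)
```

For a plain device the runs from `free_runs` could be sent as they are; for a raid5/raid6 array each run would first pass through `clip_to_stripes`, which is why small scattered free ranges may yield nothing to discard.
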