From: jim owens <jowens@hp.com>
To: James Bottomley <James.Bottomley@suse.de>
Cc: Mark Lord <liml@rtr.ca>, Chris Worley <worleys@gmail.com>,
Matthew Wilcox <matthew@wil.cx>, Bryan Donlan <bdonlan@gmail.com>,
david@lang.hm, Greg Freemyer <greg.freemyer@gmail.com>,
Markus Trippelsdorf <markus@trippelsdorf.de>,
Matthew Wilcox <willy@linux.intel.com>,
Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Nitin Gupta <ngupta@vflare.org>, Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-scsi@vger.kernel.org, linux-ide@vger.kernel.org,
Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: Discard support (was Re: [PATCH] swap: send callback when swap slot is freed)
Date: Sat, 15 Aug 2009 13:39:45 -0400
Message-ID: <4A86F2E1.8080002@hp.com>
In-Reply-To: <1250344518.4159.4.camel@mulgrave.site>
James Bottomley wrote:
>
> That's not really what the enterprise is saying about flush barriers.
> True, not all the performance problems are NCQ queue drain, but for a
> steady workload they are significant.
OK, we now know that SSDs designed only to the letter of the ATA
spec will suck doing discards if we send them down as we are
doing today.
Having finally caught up with this thread, I'm going to add some
comments that James already knows, but that were not stated and
that some of the others apparently don't know:
- The current filesystem/blockdev behavior with discard TRIM was
argued and added quickly because this design was what the
Intel SSD architect told us was "the right thing" in Sept 08.
- In the same workshop, Linus said "I'm tired of hardware
vendors telling me to fix it because they are cheap and lazy",
or something close to that, my memory gets bit-errors :)
- We decided not to track and coalesce the discards in the block
or filesystem layer because of the high memory/performance cost.
There is no cheap way to do this; all of the space management
in filesystems accepts some cost for some user benefit.
- Many people who live in filesystems (like me) are unconvinced
that discard to SSD or an array will help in real world use,
but the current discard design didn't seem to hurt us either.
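To make the tracking cost concrete, here is a rough userspace sketch
(not kernel code) of what coalescing freed extents would involve.
The names struct pending, pending_add() and MAX_PENDING are all
hypothetical, and the fixed-size sorted array is only an assumption
for illustration. Every slot is memory held by the host, and every
free pays for a search and possibly a memmove.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PENDING 64  /* assumed cap; each slot is host memory */

struct range { uint64_t start, end; };  /* half-open [start, end), in sectors */

struct pending {
    struct range r[MAX_PENDING];
    size_t n;  /* kept sorted by start, non-overlapping, non-adjacent */
};

/* Record a freed extent, merging it with any range it overlaps or touches.
 * Returns 0 on success, -1 if the table is full (caller would have to flush). */
static int pending_add(struct pending *p, uint64_t start, uint64_t end)
{
    size_t i = 0, j;

    /* skip ranges that end strictly before the new extent starts */
    while (i < p->n && p->r[i].end < start)
        i++;

    /* absorb every range that overlaps or is adjacent to the new extent */
    j = i;
    while (j < p->n && p->r[j].start <= end) {
        if (p->r[j].start < start)
            start = p->r[j].start;
        if (p->r[j].end > end)
            end = p->r[j].end;
        j++;
    }

    if (j == i) {
        /* nothing merged: open a new slot at position i */
        if (p->n == MAX_PENDING)
            return -1;
        memmove(&p->r[i + 1], &p->r[i], (p->n - i) * sizeof(p->r[0]));
        p->n++;
    } else if (j > i + 1) {
        /* merged several ranges into one: close the gap */
        memmove(&p->r[i + 1], &p->r[j], (p->n - j) * sizeof(p->r[0]));
        p->n -= j - i - 1;
    }

    p->r[i].start = start;
    p->r[i].end = end;
    return 0;
}
```

Even this toy version has to decide what happens when the table
fills up, which is where the memory/performance tradeoff bites.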
***begin rant***
I have not seen any analysis of the benefit and cost to the
end user of TRIM or array UNMAP. We now see that TRIM
as implemented by some (all?) SSDs will come at high cost.
The cost is all borne by the host. Do we get any benefit, or
is it all for the device vendor? And when we subtract the cost
from the benefit, does the user actually benefit, and how?
I'm tired of working around shit storage products and broken
device protocols from the "T" committees. I suggest we just
add a "white list" of devices that handle the discard fast
and without us needing NCQ queue drain. Then only send TRIM
to devices that are on the white list and throw the others
away in the block device layer.
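A minimal sketch of that white list idea, in plain userspace C
rather than real block layer code. The table entry is a placeholder
and not a real device, and discard_allowed() and filter_discard()
are hypothetical names. Matching on a model-string prefix is just
one assumed way to identify a device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of model-string prefixes for devices known to
 * handle discard fast. The entry below is a placeholder only. */
static const char *const discard_whitelist[] = {
    "EXAMPLE-FAST-SSD",
};

/* Return true if the model string starts with a white-listed prefix. */
static bool discard_allowed(const char *model)
{
    size_t count = sizeof(discard_whitelist) / sizeof(discard_whitelist[0]);
    size_t i;

    for (i = 0; i < count; i++) {
        size_t len = strlen(discard_whitelist[i]);

        if (strncmp(model, discard_whitelist[i], len) == 0)
            return true;
    }
    return false;
}

enum discard_action { DISCARD_SEND, DISCARD_DROP };

/* Hypothetical filter: forward the discard only to white-listed
 * devices, and silently drop it for everything else. */
static enum discard_action filter_discard(const char *model)
{
    return discard_allowed(model) ? DISCARD_SEND : DISCARD_DROP;
}
```

Dropping is safe here only because discard is a hint. A device
that never sees it just keeps treating the blocks as in use.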
I do enterprise systems and the cost of RAM in those systems
is awful. And the databases and applications are always big
memory pigs. Our customers always complain about the kernel
using too much memory and they will go ballistic if we take
1GB from their 512GB system unless we can really show them
significant benefit in their production. And so far all
we have is "this is all good stuff" from array vendors.
[and yes, our hardware guys always give me the most pain]
If continuous discard is going to be a PITA for us, then
I say don't do it. Just let a user-space tool do it when
the admin wants. IMO it is no different than defragment,
where my experience with a kernel continuous defragment
was that it made a great sales gimmick, but in real production
most people saw no benefit and some had to shut it off
because it actually hurt them. It is all about workload.
jim
P.S. Matthew, that SSD architect told me personally
that the trim of each 512 byte block before rewrite
will be a performance benefit, so if Intel SSDs are
not on the white list, please slap him for me.