* [Ksummit-discuss] [topic] Richer internal block API
From: Daniel Phillips @ 2014-05-29 17:49 UTC
To: ksummit-discuss, NeilBrown

Hi Neil,

This will be my annual proposal to open a general discussion about improving
the internal block API, to be capable of doing all the things that the ZFS
crowd claim are impossible without rampantly violating filesystem/raid
layering. Attacking this in a storage-specific venue would also be good,
however I view this issue as being at least as central as a number of topics
already raised for general consideration.

Full disclosure dept: I have an agenda. I want to add the equivalent of
Raidz etc to Tux3 without reimplementing a logical volume manager in the
filesystem.

Regards,

Daniel

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: Greg KH @ 2014-05-29 18:13 UTC
To: Daniel Phillips; +Cc: ksummit-discuss

On Thu, May 29, 2014 at 10:49:13AM -0700, Daniel Phillips wrote:
> Hi Neil,
>
> This will be my annual proposal to open a general discussion about improving
> the internal block API, to be capable of doing all the things that the ZFS
> crowd claim are impossible without rampantly violating filesystem/raid
> layering. Attacking this in a storage-specific venue would also be good,
> however I view this issue as being at least as central as a number of topics
> already raised for general consideration.

Why didn't you bring this up at the filesystem summit a few months ago?
That's the best place for it, not at the kernel summit.

> Full disclosure dept: I have an agenda. I want to add the equivalent of
> Raidz etc to Tux3 without reimplementing a logical volume manager in the
> filesystem.

Like btrfs is doing? :)

greg k-h

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: Daniel Phillips @ 2014-05-29 18:13 UTC
To: Greg KH; +Cc: ksummit-discuss

On 05/29/2014 11:13 AM, Greg KH wrote:
> On Thu, May 29, 2014 at 10:49:13AM -0700, Daniel Phillips wrote:
>> Hi Neil,
>>
>> This will be my annual proposal to open a general discussion about improving
>> the internal block API, to be capable of doing all the things that the ZFS
>> crowd claim are impossible without rampantly violating filesystem/raid
>> layering. Attacking this in a storage-specific venue would also be good,
>> however I view this issue as being at least as central as a number of topics
>> already raised for general consideration.
> Why didn't you bring this up at the filesystem summit a few months ago?
> That's the best place for it, not at the kernel summit.

Sorry, I did not have time to participate this year. I wonder though, why
power management is regarded as a summit-worthy topic, but core
functionality of the block layer is not.

>
>> Full disclosure dept: I have an agenda. I want to add the equivalent of
>> Raidz etc to Tux3 without reimplementing a logical volume manager in the
>> filesystem.
> Like btrfs is doing? :)
>
> greg k-h

Not like btrfs is doing, the opposite really.

Regards,

Daniel

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: Greg KH @ 2014-05-29 18:23 UTC
To: Daniel Phillips; +Cc: ksummit-discuss

On Thu, May 29, 2014 at 11:13:25AM -0700, Daniel Phillips wrote:
> On 05/29/2014 11:13 AM, Greg KH wrote:
> >On Thu, May 29, 2014 at 10:49:13AM -0700, Daniel Phillips wrote:
> >>Hi Neil,
> >>
> >>This will be my annual proposal to open a general discussion about improving
> >>the internal block API, to be capable of doing all the things that the ZFS
> >>crowd claim are impossible without rampantly violating filesystem/raid
> >>layering. Attacking this in a storage-specific venue would also be good,
> >>however I view this issue as being at least as central as a number of topics
> >>already raised for general consideration.
> >Why didn't you bring this up at the filesystem summit a few months ago?
> >That's the best place for it, not at the kernel summit.
> Sorry, I did not have time to participate this year. I wonder though, why
> power management is regarded as a summit-worthy topic, but core
> functionality of the block layer is not.

power management covers the whole tree, the block layer is "just" the
block layer.

> >>Full disclosure dept: I have an agenda. I want to add the equivalent of
> >>Raidz etc to Tux3 without reimplementing a logical volume manager in the
> >>filesystem.
> >Like btrfs is doing? :)
> >
> >greg k-h
> Not like btrfs is doing, the opposite really.

Good, post patches then :)

greg k-h

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: Daniel Phillips @ 2014-05-29 18:43 UTC
To: Greg KH; +Cc: ksummit-discuss

On 05/29/2014 11:23 AM, Greg KH wrote:
> On Thu, May 29, 2014 at 11:13:25AM -0700, Daniel Phillips wrote:
>> ...I wonder though, why
>> power management is regarded as a summit-worthy topic, but core
>> functionality of the block layer is not.
> power management covers the whole tree, the block layer is "just" the
> block layer.

Power management does not cover more of the tree than the block layer
plus memory management plus filesystem plus vfs do, all of which are
impacted, and all of which raise user visible API questions.

>
>>>> Full disclosure dept: I have an agenda. I want to add the equivalent of
>>>> Raidz etc to Tux3 without reimplementing a logical volume manager in the
>>>> filesystem.
>>> Like btrfs is doing? :)
>>>
>>> greg k-h
>> Not like btrfs is doing, the opposite really.
> Good, post patches then :)
>
> greg k-h

Is that a recommendation to develop a core API extension in a vacuum?

Regards,

Daniel

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: Greg KH @ 2014-05-29 23:43 UTC
To: Daniel Phillips; +Cc: ksummit-discuss

On Thu, May 29, 2014 at 11:43:09AM -0700, Daniel Phillips wrote:
> >> Not like btrfs is doing, the opposite really.
> > Good, post patches then :)
> >
> Is that a recommendation to develop a core API extension in a vacuum?

No, do it like any other core api changes, post patches that explain
what you want to do, and people will review them.

Come on, you know how this all works, we don't have to have meetings in
order to do design decisions that are "large".

greg k-h

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: Daniel Phillips @ 2014-05-31 22:44 UTC
To: Greg KH; +Cc: ksummit-discuss

On 05/29/2014 04:43 PM, Greg KH wrote:
> ...you know how this all works, we don't have to have meetings in
> order to do design decisions that are "large".

Perhaps there is something wrong with that approach. Certainly in
regards to how to bridge the gap between what we now have for logical
volume support, and what we should have, or what BSD has, that approach
is demonstrably a perennial failure. After all these years, we still
have dm and md as separate islands, no usable snapshotting block device,
and roughly zero interaction between filesystems and volume managers.

The larger issue would be, why is there no design process in Linux for
large design issues? Maybe that is the core topic that is really missing.

Regards,

Daniel

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: Greg KH @ 2014-06-01 2:34 UTC
To: Daniel Phillips; +Cc: ksummit-discuss

On Sat, May 31, 2014 at 03:44:52PM -0700, Daniel Phillips wrote:
> On 05/29/2014 04:43 PM, Greg KH wrote:
> >...you know how this all works, we don't have to have meetings in order to
> >do design decisions that are "large".
> >
> Perhaps there is something wrong with that approach. Certainly in regards to
> how to bridge the gap between what we now have for logical volume support,
> and what we should have, or what BSD has, that approach is demonstrably a
> perennial failure. After all these years, we still have dm and md as
> separate islands, no usable snapshotting block device, and roughly zero
> interaction between filesystems and volume managers.

People have talked about this for a very long time. I've seen Neil
give numerous presentations about this for what, a decade now?

It must not be important enough for anyone to actually do the work. Or,
more likely, no one has been able to convince a company to sponsor the
work. So perhaps it isn't that major a thing that needs to be done?

> The larger issue would be, why is there no design process in Linux for
> large design issues?

There is: an "evolutionary" process. If you take a look at a 4 or 5 year
old kernel, major things have happened. It's just that if you are in
the middle of it all, it doesn't look like "large" things have changed.

thanks,

greg k-h

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: Martin K. Petersen @ 2014-06-02 17:33 UTC
To: Greg KH; +Cc: ksummit-discuss

>>>>> "Greg" == Greg KH <greg@kroah.com> writes:

>> Perhaps there is something wrong with that approach. Certainly in
>> regards to how to bridge the gap between what we now have for logical
>> volume support, and what we should have, or what BSD has, that
>> approach is demonstrably a perennial failure. After all these years,
>> we still have dm and md as separate islands, no usable snapshotting
>> block device, and roughly zero interaction between filesystems and
>> volume managers.

Greg> People have talked about this for a very long time.

Lots of talking, indeed. But I think the main problem is that there's
nothing (or very little) to see here. Move along :)

Either you let the filesystem explicitly manage RAID and snapshots (like
btrfs) or you let DM or MD do it behind the filesystem's back. What's
the point of introducing a new interface to do something that we already
have?

That doesn't mean that there isn't merit to the "given this cookie, do
you happen to have another copy?" call we have discussed in the past.
Somebody just needs to do it. But I honestly think that btrfs is a much
better approach to that whole thing...

-- 
Martin K. Petersen	Oracle Linux Engineering

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: James Bottomley @ 2014-06-02 18:10 UTC
To: Martin K. Petersen; +Cc: ksummit-discuss

On Mon, 2014-06-02 at 13:33 -0400, Martin K. Petersen wrote:
> >>>>> "Greg" == Greg KH <greg@kroah.com> writes:
>
> >> Perhaps there is something wrong with that approach. Certainly in
> >> regards to how to bridge the gap between what we now have for logical
> >> volume support, and what we should have, or what BSD has, that
> >> approach is demonstrably a perennial failure. After all these years,
> >> we still have dm and md as separate islands, no usable snapshotting
> >> block device, and roughly zero interaction between filesystems and
> >> volume managers.
>
> Greg> People have talked about this for a very long time.

Agreed; KS would never be the right venue for this, it's an LSF topic.

> Lots of talking, indeed. But I think the main problem is that there's
> nothing (or very little) to see here. Move along :)
>
> Either you let the filesystem explicitly manage RAID and snapshots (like
> btrfs) or you let DM or MD do it behind the filesystem's back. What's
> the point of introducing a new interface to do something that we already
> have?
>
> That doesn't mean that there isn't merit to the "given this cookie, do
> you happen to have another copy?" call we have discussed in the past.
> Somebody just needs to do it. But I honestly think that btrfs is a much
> better approach to that whole thing...

We actually tried RAID unification between btrfs and dm and md a long
time ago. We did make some progress with dm and md, but the use
paradigm of btrfs is just a bit too different and it couldn't be made to
work without making a huge mess.

What's happening now is that we're looking at the token and descriptor
APIs (mostly for copy offload) and if we find a good one we could
revisit the issue and see if there are other things it might support.

When I was a kid, I used to love architecture (in the software sense)
because it looked like blueprinting the perfect edifice in advance and
then just putting the bricks in. Now that I'm older, I far prefer
having a set of abstractions that make an outline and being guided by
how the pieces fit together, because that leaves you open to things the
perfect architecture approach forces you to ignore and it fits well with
the Linux code and use case requirements.

I'm sure when the use case finally arrives we'll be able to refactor
around it, but I don't think it's quite here yet.

James

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: NeilBrown @ 2014-06-01 4:31 UTC
To: Daniel Phillips; +Cc: ksummit-discuss

On Sat, 31 May 2014 15:44:52 -0700 Daniel Phillips
<d.phillips@partner.samsung.com> wrote:

> On 05/29/2014 04:43 PM, Greg KH wrote:
> > ...you know how this all works, we don't have to have meetings in
> > order to do design decisions that are "large".
> >
> Perhaps there is something wrong with that approach. Certainly in
> regards to how to bridge the gap between what we now have for logical
> volume support, and what we should have, or what BSD has, that approach
> is demonstrably a perennial failure. After all these years, we still
> have dm and md as separate islands, no usable snapshotting block device,
> and roughly zero interaction between filesystems and volume managers.

dm-raid.c is a bridge between those islands.
Does dm-thin.c not provide usable snapshots? I admit I haven't looked in
detail.

> The larger issue would be, why is there no design process in Linux for
> large design issues? Maybe that is the core topic that is really missing.

What sort of "design process" do you imagine? Something like IETF? While
it certainly has had some successes, I don't see its process as
conducive to quality.

The design rule for Linux is simple: show me the code. If it passes
review, it goes in. If it doesn't, you should know why and can try again.

You can certainly start with a design proposal if you like, and you might
get valuable feedback from that. The more concrete your design, the easier
it is to respond to, so the quality of the responses you get will be higher.

But there is no way to escape the fact that, for a "big design" which
affects multiple subsystems, you will probably need to develop several
prototypes before you find something that works well. Be ready to discard
and try again.

Like Greg said - it is "evolutionary" and evolution isn't just "survival of
the fittest", it is also "death to the weak".

NeilBrown

* Re: [Ksummit-discuss] [topic] Richer internal block API
From: Lukáš Czerner @ 2014-05-30 9:56 UTC
To: Greg KH; +Cc: ksummit-discuss

On Thu, 29 May 2014, Greg KH wrote:

> Date: Thu, 29 May 2014 11:13:19 -0700
> From: Greg KH <greg@kroah.com>
> To: Daniel Phillips <d.phillips@partner.samsung.com>
> Cc: ksummit-discuss@lists.linuxfoundation.org
> Subject: Re: [Ksummit-discuss] [topic] Richer internal block API
>
> On Thu, May 29, 2014 at 10:49:13AM -0700, Daniel Phillips wrote:
> > Hi Neil,
> >
> > This will be my annual proposal to open a general discussion about improving
> > the internal block API, to be capable of doing all the things that the ZFS
> > crowd claim are impossible without rampantly violating filesystem/raid
> > layering. Attacking this in a storage-specific venue would also be good,
> > however I view this issue as being at least as central as a number of topics
> > already raised for general consideration.
>
> Why didn't you bring this up at the filesystem summit a few months ago?
> That's the best place for it, not at the kernel summit.

Actually we've sort-of started the discussion about this topic at
LSF. Dave Chinner was the one who brought this up; the only problem
was that his idea was at a really early stage, and I suppose it still
is because I have not heard about it since then.

But I agree that this kind of discussion is more suited for LSF
rather than kernel summit since it's much more targeted to block vs.
file system interactions.

>
> > Full disclosure dept: I have an agenda. I want to add the equivalent of
> > Raidz etc to Tux3 without reimplementing a logical volume manager in the
> > filesystem.
>
> Like btrfs is doing? :)

Well, we want exactly what btrfs is _not_ doing :)

-Lukas

>
> greg k-h

end of thread, other threads:[~2014-06-02 18:10 UTC | newest]

Thread overview: 12+ messages
2014-05-29 17:49 [Ksummit-discuss] [topic] Richer internal block API Daniel Phillips
2014-05-29 18:13 ` Greg KH
2014-05-29 18:13   ` Daniel Phillips
2014-05-29 18:23     ` Greg KH
2014-05-29 18:43       ` Daniel Phillips
2014-05-29 23:43         ` Greg KH
2014-05-31 22:44           ` Daniel Phillips
2014-06-01  2:34             ` Greg KH
2014-06-02 17:33               ` Martin K. Petersen
2014-06-02 18:10                 ` James Bottomley
2014-06-01  4:31             ` NeilBrown
2014-05-30  9:56   ` Lukáš Czerner