linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>,
	Dan Schatzberg <schatzberg.dan@gmail.com>,
	Jens Axboe <axboe@kernel.dk>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Amir Goldstein <amir73il@gmail.com>,
	Li Zefan <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Roman Gushchin <guro@fb.com>,
	Shakeel Butt <shakeelb@google.com>,
	Chris Down <chris@chrisdown.name>,
	Yang Shi <yang.shi@linux.alibaba.com>,
	Ingo Molnar <mingo@kernel.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"open list:BLOCK LAYER" <linux-block@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>,
	"open list:FILESYSTEMS (VFS and infrastructure)"
	<linux-fsdevel@vger.kernel.org>,
	"open list:CONTROL GROUP (CGROUP)" <cgroups@vger.kernel.org>,
	"open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)"
	<linux-mm@kvack.org>
Subject: Re: [PATCH v5 0/4] Charge loop device i/o to issuing cgroup
Date: Tue, 5 May 2020 11:38:34 -0400	[thread overview]
Message-ID: <20200505153834.GA12217@mtj.thefacebook.com> (raw)
In-Reply-To: <20200505064114.GI2005@dread.disaster.area>

Hello, Dave.

On Tue, May 05, 2020 at 04:41:14PM +1000, Dave Chinner wrote:
> > OTOH I don't have a great idea how the generic infrastructure should look
> > like...
> 
> I haven't given it any thought - it's not something I have any
> bandwidth to spend time on.  I'll happily review a unified
> generic cgroup-aware kthread-based IO dispatch mechanism, but I
> don't have the time to design and implement that myself....
> 
> OTOH, I will make time to stop people screwing up filesystems and
> block devices with questionable complexity and unique, storage
> device dependent userspace visible error behaviour. This sort of
> change is objectively worse for users than not supporting the
> functionality in the first place.

That probably is too strong a position to hold without spending at least
some thoughts on a subject, whatever the subject may be, and it doesn't seem
like your understanding of userspace implications is accurate.

I don't necessarily disagree that it'd be nice to have a common
infrastructure and there may be some part which can actually be factored
out. However, there isn't gonna be a magic bullet which magically makes
every IO thing in the kernel cgroup aware automatically. Please consider the
followings.

* Avoding IO priority inversions requires splitting IO channels according to
  cgroups and working around (e.g. with backcharging) when they can't be.
  It's a substantial feature which may require substantial changes. Each IO
  subsystem has different constraints and existing structures and many of
  them would require their own solutions. It's not different from different
  filesystems needing their own solutions for similar problems.

* Because different filesystems and IO stacking layers already have their
  own internal infrastructure, the right way to add cgroup support is
  adapting to and modifying the existing infrastructure rather than trying
  to restructure them to use the same cgroup mechanism, which I don't think
  would be possible in many cases.

* Among the three IO stacking / redirecting mechanisms - md/dm, loop and
  fuse - the requirements and what's possible vary quite a bit. md/dm
  definitely need to support full-on IO channel splitting cgroup support.
  loop can go either way, but given existing uses, full splitting makes a
  sense. fuse, as it currently stands, can't support that because the
  priority inversions extend all the way to userspace and the kernel API
  isn't built for that. If it wants to support cgroup containment, each
  instance would have to be assigned to a cgroup.

Between dm/md and loop, it's maybe possible that some of the sub-threading
code can be reused, but I don't see a point in blocking loop updates given
that it clearly fixes userspace visible malfunctions, is not that much code
and how the shared code should look is unclear yet. We'll be able to answer
the sharing question when we actually get to dm/md conversion.

Thanks.

-- 
tejun


  reply	other threads:[~2020-05-05 15:38 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-28 16:13 Dan Schatzberg
2020-04-28 16:13 ` [PATCH v5 1/4] loop: Use worker per cgroup instead of kworker Dan Schatzberg
2020-04-28 16:13 ` [PATCH v5 2/4] mm: support nesting memalloc_use_memcg() Dan Schatzberg
2020-04-28 16:13 ` [PATCH v5 3/4] mm: Charge active memcg when no mm is set Dan Schatzberg
2020-04-28 16:13 ` [PATCH v5 4/4] loop: Charge i/o to mem and blk cg Dan Schatzberg
2020-04-28 21:47 ` [PATCH v5 0/4] Charge loop device i/o to issuing cgroup Dave Chinner
2020-04-29  2:27   ` Johannes Weiner
2020-05-05  6:29     ` Dave Chinner
2020-05-05 13:55       ` Shakeel Butt
2020-05-05 15:02       ` Johannes Weiner
2020-04-29 10:25   ` Jan Kara
2020-04-29 14:22     ` Tejun Heo
2020-04-29 16:21       ` Jan Kara
2020-05-05  6:41     ` Dave Chinner
2020-05-05 15:38       ` Tejun Heo [this message]
2020-04-29 14:03   ` Dan Schatzberg
2020-05-12 13:25 ` Dan Schatzberg
2020-05-12 13:35   ` Christoph Hellwig
2020-05-26 14:28     ` Dan Schatzberg
2020-05-27  5:09       ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200505153834.GA12217@mtj.thefacebook.com \
    --to=tj@kernel.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=amir73il@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=chris@chrisdown.name \
    --cc=david@fromorbit.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizefan@huawei.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhocko@kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=schatzberg.dan@gmail.com \
    --cc=shakeelb@google.com \
    --cc=tglx@linutronix.de \
    --cc=vdavydov.dev@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yang.shi@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox