Re: [PATCH] binder_alloc: Move alloc_page() out of mmap_rwsem to reduce the lock duration

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	"Barry Song" <v-songbaohua@oppo.com>,
	"Arve Hjønnevåg" <arve@android.com>,
	"Todd Kjos" <tkjos@android.com>,
	"Martijn Coenen" <maco@android.com>,
	"Joel Fernandes" <joel@joelfernandes.org>,
	"Christian Brauner" <brauner@kernel.org>,
	"Carlos Llamas" <cmllamas@google.com>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Tangquan Zheng" <zhengtangquan@oppo.com>
Subject: Re: [PATCH] binder_alloc: Move alloc_page() out of mmap_rwsem to reduce the lock duration
Date: Tue, 3 Sep 2024 12:04:57 +0200
Message-ID: <2024090331-rewire-ransack-e73b@gregkh>
In-Reply-To: <CAGsJ_4ybZRudJ+p7pxgb1xH7HP0rKcWW1Dtr_kvb7EUwnqxsQQ@mail.gmail.com>

On Tue, Sep 03, 2024 at 05:07:23PM +0800, Barry Song wrote:
> On Tue, Sep 3, 2024 at 4:57 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Sep 03, 2024 at 10:50:09AM +1200, Barry Song wrote:
> > > From: Barry Song <v-songbaohua@oppo.com>
> > >
> > > The mmap_write_lock() can block all access to the VMAs, for example page
> > > faults. Performing memory allocation while holding this lock may trigger
> > > direct reclamation, leading to others being queued in the rwsem for an
> > > extended period.
> > > We've observed that the allocation can sometimes take more than 300ms,
> > > significantly blocking other threads. The user interface sometimes
> > > becomes less responsive as a result. To prevent this, let's move the
> > > allocation outside of the write lock.
> > > A potential side effect could be an extra alloc_page() for the second
> > > thread executing binder_install_single_page() while the first thread
> > > has done it earlier. However, according to Tangquan's 48-hour profiling
> > > using monkey, the likelihood of this occurring is minimal, with a ratio
> > > of only 1 in 2400. Compared to the significantly costly rwsem, this is
> > > negligible.
> > > On the other hand, holding a write lock without making any VMA
> > > modifications appears questionable and likely incorrect. While this
> > > patch focuses on reducing the lock duration, future updates may aim
> > > to eliminate the write lock entirely.
> > >
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Cc: "Arve Hjønnevåg" <arve@android.com>
> > > Cc: Todd Kjos <tkjos@android.com>
> > > Cc: Martijn Coenen <maco@android.com>
> > > Cc: Joel Fernandes <joel@joelfernandes.org>
> > > Cc: Christian Brauner <brauner@kernel.org>
> > > Cc: Carlos Llamas <cmllamas@google.com>
> > > Cc: Suren Baghdasaryan <surenb@google.com>
> > > Tested-by: Tangquan Zheng <zhengtangquan@oppo.com>
> > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > > ---
> > >  drivers/android/binder_alloc.c | 18 ++++++++++++++----
> > >  1 file changed, 14 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> > > index b3acbc4174fb..f20074e23a7c 100644
> > > --- a/drivers/android/binder_alloc.c
> > > +++ b/drivers/android/binder_alloc.c
> > > @@ -227,13 +227,23 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> > >       if (!mmget_not_zero(alloc->mm))
> > >               return -ESRCH;
> > >
> > > +     /*
> > > +      * Don't allocate page in mmap_write_lock, this can block
> > > +      * mmap_rwsem for a long time; Meanwhile, allocation failure
> > > +      * doesn't necessarily need to return -ENOMEM, if lru_page
> > > +      * has been installed, we can still return 0(success).
> > > +      */
> > > +     page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);
> >
> > But now you are allocating new pages even if binder_get_installed_page()
> > is an error, right?  Doesn't that slow things down?
> 
> Very unlikely: the ratio is only 1/2400, while the write lock blocks
> everyone 100% of the time.

Ok, but:

> > How was this benchmarked?
> 
> I actually included Tangquan's profiling result:
> "
> However, according to Tangquan's 48-hour profiling
>  using monkey, the likelihood of this occurring is minimal, with a ratio
>  of only 1 in 2400. Compared to the significantly costly rwsem, this is
>  negligible."

That's not a benchmark, nor does it give any real numbers showing how
much time this saves overall.

> > >       /*
> > >        * Protected with mmap_sem in write mode as multiple tasks
> > >        * might race to install the same page.
> > >        */
> > >       mmap_write_lock(alloc->mm);
> > > -     if (binder_get_installed_page(lru_page))
> > > +     if (binder_get_installed_page(lru_page)) {
> > > +             ret = 1;
> >
> > That is not a valid error value :(
> >
> > >               goto out;
> > > +     }
> > >
> > >       if (!alloc->vma) {
> > >               pr_err("%d: %s failed, no vma\n", alloc->pid, __func__);
> > > @@ -241,7 +251,6 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> > >               goto out;
> > >       }
> > >
> > > -     page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);
> > >       if (!page) {
> > >               pr_err("%d: failed to allocate page\n", alloc->pid);
> > >               ret = -ENOMEM;
> > > @@ -252,7 +261,6 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> > >       if (ret) {
> > >               pr_err("%d: %s failed to insert page at offset %lx with %d\n",
> > >                      alloc->pid, __func__, addr - alloc->buffer, ret);
> > > -             __free_page(page);
> > >               ret = -ENOMEM;
> > >               goto out;
> > >       }
> > > @@ -262,7 +270,9 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> > >  out:
> > >       mmap_write_unlock(alloc->mm);
> > >       mmput_async(alloc->mm);
> > > -     return ret;
> > > +     if (ret && page)
> > > +             __free_page(page);
> > > +     return ret < 0 ? ret : 0;
> >
> > Please only use ? : for when you have to, otherwise please spell it out
> > with a normal if statement:
> >         if (ret < 0)
> >                 return ret;
> >         return 0;
> >
> > But, this is abusing the fact that you set "ret = 1" above, which is
> > going to trip someone up in the future as that is NOT a normal coding
> > pattern we have in the kernel, sorry.
> >
> > If you insist on this change, please rework it to not have that type of
> > "positive means one thing, 0 means another, and negative means yet
> > something else" please.
> 
> I was trying to consolidate all free_page() calls into one place. Otherwise,
> we would need multiple free_page() calls. I'm perfectly fine with having more
> free_page() calls in both the ret = 1 and ret < 0 paths. In that case,
> the ret = 1 path can be removed if you prefer.

Remember, we write code for people first, and compilers second.  You
have to maintain this code for the next 10+ years, so make it _VERY_
obvious what is happening and how it works.  You will be coming back to
it later without remembering why anything was done the way it was.
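
One way to keep a single free call without the "ret = 1" state is to
clear the local pointer once ownership of the page has moved.  The
following is only a userspace sketch under stated assumptions, not the
binder code: struct page_slot, install_single_page() and
MODEL_PAGE_SIZE are hypothetical names, a pthread mutex stands in for
mmap_write_lock(), and calloc()/free() stand in for
alloc_page()/__free_page().

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096	/* hypothetical, stands in for PAGE_SIZE */

/* Hypothetical stand-in for one binder LRU page slot. */
struct page_slot {
	pthread_mutex_t lock;	/* models mmap_write_lock(alloc->mm) */
	void *page;		/* NULL until a page has been installed */
};

/*
 * Returns 0 if the slot holds a page on return (installed by us or by
 * a racing caller), or -ENOMEM if nothing could be installed.
 */
static int install_single_page(struct page_slot *slot)
{
	void *page;
	int ret = 0;

	/* Allocate before taking the lock so a slow allocation blocks nobody. */
	page = calloc(1, MODEL_PAGE_SIZE);

	pthread_mutex_lock(&slot->lock);
	if (slot->page)
		goto out_unlock;	/* lost the race, still a success */

	if (!page) {
		ret = -ENOMEM;
		goto out_unlock;
	}

	slot->page = page;
	page = NULL;			/* ownership moved to the slot */

out_unlock:
	pthread_mutex_unlock(&slot->lock);
	free(page);			/* no-op when page is NULL */
	return ret;
}
```

Here the return value is only ever 0 or a negative errno, and the
"do we still own the page?" question is answered by the pointer itself.
A kernel version would presumably need an explicit NULL check before
__free_page(), since free(NULL) being a no-op is a property of the C
library, not something this sketch shows for the kernel allocator.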

thanks,

greg k-h
