From: Barry Song <21cnbao@gmail.com>
Date: Wed, 4 Sep 2024 10:43:29 +1200
Subject: Re: [PATCH] binder_alloc: Move alloc_page() out of mmap_rwsem to reduce the lock duration
To: Carlos Llamas
Cc: Hillf Danton, Greg Kroah-Hartman, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Barry Song, Suren Baghdasaryan, Michal Hocko, Tangquan Zheng
References: <20240902225009.34576-1-21cnbao@gmail.com> <20240903110109.1696-1-hdanton@sina.com>
On Wed, Sep 4, 2024 at 10:23 AM Carlos Llamas wrote:
>
> On Wed, Sep 04, 2024 at 09:10:20AM +1200, Barry Song wrote:
> > However, I'm not entirely convinced that this is a problem :-) Concurrent
> > allocations like this can occur in many places, especially in PFs. Reclamation
> > is not useless because it helps free up memory for others; it's not
> > without value.
>
> Yeah, for binder though there is a high chance of multiple callers
> trying to allocate the _same_ page concurrently. What I observed is that
> pages being reclaimed "in vain" are often in the same vma and this means
> subsequent callers will need to allocate these pages.
>
> But even if this wasn't an issue, the correct solution would still be to
> support concurrent faults. So, in reality it doesn't matter and we still
> need to refactor the call to be non-exclusive.
>
> > I also don't believe binder is one of the largest users executing concurrent
> > allocations.
>
> Oh, I have no statistics on this.
>
> > Awesome! I'm eager to see your patch, and we're ready to help with testing.
> > I strongly recommend dropping the write lock entirely. Using
> > `mmap_write_lock()` isn't just a binder-specific concern; it has the
> > potential to affect the entire Android system.
> >
> > In patch 3, I experimented with using `per_vma_lock` as well. I'm _not_
> > proposing it for merging since you're already working on it, but I wanted
> > to share the idea. (just like patch 2, it has only passed build-test)
>
> Yeap, I'm using the per-vma-lock too per Suren's suggestion.
>
> >
> > [PATCH] binder_alloc: Further move to per_vma_lock from mmap_read_lock
> >
> > To further reduce the read lock duration, let's try using per_vma_lock
> > first. If that fails, we can take the read lock, similar to how page
> > fault handlers operate.
> >
> > Signed-off-by: Barry Song
> > ---
> >  drivers/android/binder_alloc.c | 18 ++++++++++++++----
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> > index a2281dfacbbc..b40a5dd650c8 100644
> > --- a/drivers/android/binder_alloc.c
> > +++ b/drivers/android/binder_alloc.c
> > @@ -221,6 +221,8 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> >                                        struct binder_lru_page *lru_page,
> >                                        unsigned long addr)
> >  {
> > +       struct vm_area_struct *vma;
> > +       bool per_vma_lock = true;
> >         struct page *page;
> >         int ret = 0;
> >
> > @@ -235,10 +237,15 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> >          */
> >         page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);
> >
> > -       mmap_read_lock(alloc->mm);
> > +       vma = lock_vma_under_rcu(alloc->mm, addr);
> > +       if (!vma) {
> > +               per_vma_lock = false;
> > +               mmap_read_lock(alloc->mm);
> > +               vma = find_vma(alloc->mm, addr);
> > +       }
> >
> > -       /* vma might have been dropped or deattached */
> > -       if (!alloc->vma || !find_vma(alloc->mm, addr)) {
> > +       /* vma might have been dropped, deattached or changed to new one */
> > +       if (!alloc->vma || !vma || vma != alloc->vma) {
> >                 pr_err("%d: %s failed, no vma\n", alloc->pid, __func__);
> >                 ret = -ESRCH;
> >                 goto out;
> > @@ -270,7 +277,10 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> >         binder_set_installed_page(lru_page, page);
> >         spin_unlock(&alloc->lock);
> >  out:
> > -       mmap_read_unlock(alloc->mm);
> > +       if (per_vma_lock)
> > +               vma_end_read(vma);
> > +       else
> > +               mmap_read_unlock(alloc->mm);
> >         mmput_async(alloc->mm);
> >         if (ret && page)
> >                 __free_page(page);
> > --
> > 2.39.3 (Apple Git-146)
>
> This looks fairly similar to my patch.
> However, you would need to handle
> the case where vm_insert_page() returns -EBUSY (success that raced) and
> also sync with the shrinker callbacks in binder_alloc_free_page(), which
> is the biggest concern.
>
> Let's not duplicate efforts though. Can we please wait for my patch?
> I'll add you as Co-Developed-by, since you've posted this already?

Yes, I'd be more than happy to wait for your patch, as I believe you
have much more experience with binder.

> Regards,
> Carlos Llamas

Thanks
Barry