References: <20240805082106.65847-1-jasowang@redhat.com>
In-Reply-To: <20240805082106.65847-1-jasowang@redhat.com>
From: Barry Song <21cnbao@gmail.com>
Date: Tue, 6 Aug 2024 14:30:18 +1200
Subject: Re: [PATCH] vduse: avoid using __GFP_NOFAIL
To: Jason Wang
Cc: mst@redhat.com, xuanzhuo@linux.alibaba.com, eperezma@redhat.com,
 maxime.coquelin@redhat.com, xieyongji@bytedance.com,
 virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
 penguin-kernel@i-love.sakura.ne.jp, linux-mm@kvack.org,
 akpm@linux-foundation.org
Sender: owner-linux-mm@kvack.org

On Mon, Aug 5, 2024 at 8:21 PM Jason Wang wrote:
>
> Barry said [1]:
>
> """
> mm doesn't support non-blockable __GFP_NOFAIL allocation. Because
> __GFP_NOFAIL without direct reclamation may just result in a busy
> loop within non-sleepable contexts.
> """

The current code returns a NULL pointer in this case; it does not busy-loop.

static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
						struct alloc_context *ac)
{
	...
	/*
	 * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
	 * we always retry
	 */
	if (gfp_mask & __GFP_NOFAIL) {
		/*
		 * All existing users of the __GFP_NOFAIL are blockable, so warn
		 * of any new users that actually require GFP_NOWAIT
		 */
		if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
			goto fail;
		...
	}
	...
fail:
	warn_alloc(gfp_mask, ac->nodemask,
			"page allocation failure: order:%u", order);
got_pg:
	return page;
}

We have two choices to address the issue:
1. busy-loop
2. BUG_ON

The patch below chose option 2:
https://lore.kernel.org/linux-mm/20240731000155.109583-5-21cnbao@gmail.com/

>
> Unfortunately, we do that under read lock. A possible way to fix that
> is to move the pages allocation out of the lock into the caller, but
> having to allocate a huge number of pages and auxiliary page array
> seems to be problematic as well per Tetsuo [2]:
>
> """
> You should implement proper error handling instead of using
> __GFP_NOFAIL if count can become large.
> """
>
> So I choose another way, which does not release kernel bounce pages
> when user tries to register userspace bounce pages. Then we don't need
> to do allocation in the path which is not expected to fail (e.g. in
> the release). We pay for this with more memory usage, but further
> optimizations could be done on top.
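
To make the idea in the paragraph above concrete, here is a minimal
userspace sketch, not kernel code. It assumes every map already has a
kernel page, and it uses plain byte buffers in place of struct page. All
names (bounce_map_model, model_add_user_page, model_remove_user_page,
model_active_page, MODEL_PAGE_SIZE) are hypothetical and exist only for
illustration. The point it tries to show is that once the kernel page is
retained, the remove path only copies and never allocates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096 /* assumption: stand-in for PAGE_SIZE */

/* Hypothetical userspace stand-in for struct vduse_bounce_map. */
struct bounce_map_model {
	unsigned char *kernel_page; /* retained for the lifetime of the map */
	unsigned char *user_page;   /* set only while user pages are registered */
	bool in_use;                /* models orig_phys != INVALID_PHYS_ADDR */
};

/* Choose the page that bounce copies should go through. */
static unsigned char *model_active_page(const struct bounce_map_model *map,
					bool user_pages_registered)
{
	return user_pages_registered ? map->user_page : map->kernel_page;
}

/* Register a user page: migrate live data, but keep the kernel page. */
static void model_add_user_page(struct bounce_map_model *map,
				unsigned char *user_page)
{
	if (map->in_use)
		memcpy(user_page, map->kernel_page, MODEL_PAGE_SIZE);
	map->user_page = user_page;
}

/* Remove the user page: copy back into the retained kernel page. */
static void model_remove_user_page(struct bounce_map_model *map)
{
	assert(map->user_page != NULL);
	if (map->in_use)
		memcpy(map->kernel_page, map->user_page, MODEL_PAGE_SIZE);
	map->user_page = NULL; /* no allocation happened on this path */
}
```

The cost is the one already named in the commit message: both pages stay
resident while user pages are registered.
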
>
> [1] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m3caef86a66ea6318ef94f9976ddb3a0ccfe6fcf8
> [2] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m7ad10eaba48ade5abf2d572f24e185d9fb146480
>
> Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> Signed-off-by: Jason Wang
> ---
>  drivers/vdpa/vdpa_user/iova_domain.c | 18 ++++++++++--------
>  drivers/vdpa/vdpa_user/iova_domain.h |  1 +
>  2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index 791d38d6284c..933d2f7cd49a 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
>  				enum dma_data_direction dir)
>  {
>  	struct vduse_bounce_map *map;
> +	struct page *page;
>  	unsigned int offset;
>  	void *addr;
>  	size_t sz;
> @@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
>  			    map->orig_phys == INVALID_PHYS_ADDR))
>  			return;
>
> -		addr = kmap_local_page(map->bounce_page);
> +		page = domain->user_bounce_pages ?
> +			map->user_bounce_page : map->bounce_page;
> +
> +		addr = kmap_local_page(page);
>  		do_bounce(map->orig_phys + offset, addr + offset, sz, dir);
>  		kunmap_local(addr);
>  		size -= sz;
> @@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain,
>  				memcpy_to_page(pages[i], 0,
>  					       page_address(map->bounce_page),
>  					       PAGE_SIZE);
> -			__free_page(map->bounce_page);
>  		}
> -		map->bounce_page = pages[i];
> +		map->user_bounce_page = pages[i];
>  		get_page(pages[i]);
>  	}
>  	domain->user_bounce_pages = true;
> @@ -297,17 +300,16 @@ void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
>  		struct page *page = NULL;
>
>  		map = &domain->bounce_maps[i];
> -		if (WARN_ON(!map->bounce_page))
> +		if (WARN_ON(!map->user_bounce_page))
>  			continue;
>
>  		/* Copy user page to kernel page if it's in use */
>  		if (map->orig_phys != INVALID_PHYS_ADDR) {
> -			page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
> +			page = map->bounce_page;
>  			memcpy_from_page(page_address(page),
> -					 map->bounce_page, 0, PAGE_SIZE);
> +					 map->user_bounce_page, 0, PAGE_SIZE);
>  		}
> -		put_page(map->bounce_page);
> -		map->bounce_page = page;
> +		put_page(map->user_bounce_page);
>  	}
>  	domain->user_bounce_pages = false;
>  out:
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.h b/drivers/vdpa/vdpa_user/iova_domain.h
> index f92f22a7267d..7f3f0928ec78 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.h
> +++ b/drivers/vdpa/vdpa_user/iova_domain.h
> @@ -21,6 +21,7 @@
>
>  struct vduse_bounce_map {
>  	struct page *bounce_page;
> +	struct page *user_bounce_page;
>  	u64 orig_phys;
>  };
>
> --
> 2.31.1
>
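
Coming back to the __alloc_pages_slowpath() excerpt at the top of this
reply, the three cases it distinguishes can be written as a small
userspace model. This is only a sketch: the flag values and every name
below (MODEL_GFP_NOFAIL, MODEL_GFP_DIRECT_RECLAIM, model_outcome,
model_slowpath_outcome) are made up for illustration and are not the
kernel's definitions. It assumes, as in the kernel, that GFP_ATOMIC does
not include direct reclaim, so the GFP_ATOMIC | __GFP_NOFAIL request
removed by this patch is modelled as NOFAIL without the reclaim bit.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real values live in the kernel's gfp headers. */
#define MODEL_GFP_NOFAIL         0x1u
#define MODEL_GFP_DIRECT_RECLAIM 0x2u

enum model_outcome {
	MODEL_MAY_FAIL,      /* ordinary request: the caller must handle NULL */
	MODEL_RETRY_FOREVER, /* blockable __GFP_NOFAIL: the allocator keeps retrying */
	MODEL_WARN_AND_FAIL, /* non-blockable __GFP_NOFAIL: warn once, return NULL */
};

/* Classify a request the way the quoted slowpath check does today. */
static enum model_outcome model_slowpath_outcome(unsigned int gfp_mask)
{
	bool can_direct_reclaim = (gfp_mask & MODEL_GFP_DIRECT_RECLAIM) != 0;

	if (!(gfp_mask & MODEL_GFP_NOFAIL))
		return MODEL_MAY_FAIL;
	if (!can_direct_reclaim)
		return MODEL_WARN_AND_FAIL; /* the "goto fail" branch */
	return MODEL_RETRY_FOREVER;
}
```

In this model a caller passing only MODEL_GFP_NOFAIL gets
MODEL_WARN_AND_FAIL, which is the NULL-return case described above rather
than a busy loop.
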