From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 393AAC52D7F for ; Fri, 16 Aug 2024 00:17:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A79C66B00CC; Thu, 15 Aug 2024 20:17:29 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A29146B00CF; Thu, 15 Aug 2024 20:17:29 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 917D76B010E; Thu, 15 Aug 2024 20:17:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 757AC6B00CC for ; Thu, 15 Aug 2024 20:17:29 -0400 (EDT) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 2651B1218B5 for ; Fri, 16 Aug 2024 00:17:29 +0000 (UTC) X-FDA: 82456194618.08.CBEF524 Received: from mail-vs1-f48.google.com (mail-vs1-f48.google.com [209.85.217.48]) by imf27.hostedemail.com (Postfix) with ESMTP id 5F83A4000A for ; Fri, 16 Aug 2024 00:17:27 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=JFZK8Oqn; spf=pass (imf27.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.217.48 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723767434; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=d2YrHTGtF+jpWM5UfiAgi4NfEMMQPWNtG43MOc6x070=; b=EL5O8/GB0pc5cjBWuwzVmka8t6KpiKyeRinPlNC4BU4Vb3KqlrXdb5HDV2oHO/O40rAv8W eNMWlFMBMNe4+jchq7QI2w+W1ZAtiOqs1EGRjYfHS2u0Iv5uG1mk28UmhjF8dvVP6eE+lO IDMT8KsRVm1a+/7G7VK90abW3aJq7c8= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=JFZK8Oqn; spf=pass (imf27.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.217.48 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723767434; a=rsa-sha256; cv=none; b=Zm45AX4okK7vRUTyeAK6jYgB36Ew/rJG1dq9dTtvk1Yb6ZcKV58k2DDUMKVAp1qqdpu2Dv TZReK1AT3+GOJRDKYgTF+RFcwwRQYy5OtApjCd5AV+jcRyyq7FIqlBMkwFRbCZJkrbpzAY QzeM56LMGH7XJFIRNw4Nrmxldvfohfc= Received: by mail-vs1-f48.google.com with SMTP id ada2fe7eead31-492a01bce97so559244137.2 for ; Thu, 15 Aug 2024 17:17:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1723767446; x=1724372246; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=d2YrHTGtF+jpWM5UfiAgi4NfEMMQPWNtG43MOc6x070=; b=JFZK8OqnKnLdL95W+ADWguxZFMLg/Guvq4Va52lB05L1Pf/np1ARaEdZDEwwyO0nNi NvCMZEv/JstSDexrXdRcEWImpVXNNh1pktYt6QU+8JKwE5fialJidA8U3QFhGmGeD2rb RC0DFcgq+8OTKWhwLg99zPzG53ckSK0hT61HaQhTETp6QQi7J/ohwK8hR8/KOlzVx/wa EJYPB7Q8Cg6/4puKSN5JcIOsgUxspPBDSb1DE+0OE93jGX1E8FusEiN/LNrB3Ulyf4X+ tFBhuLTpC1+OoSgiKk9KSttamTI75BXOjsyHHenRrQO4fAG5/MwRzVy+td5AGddq4qQF Qx6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723767446; x=1724372246; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=d2YrHTGtF+jpWM5UfiAgi4NfEMMQPWNtG43MOc6x070=; b=SoFGALfUh8gx2euy5AMDVwsosrfoJEYTKk5fRGBZYVs9SEPdG03uwIqFEdhIYcNgQa E9/sARP538OHxhzEQTc/wTRXEd6YVFDU827QxN3tKjxTd0ITHczItDSCRemaLj7D0w3r lPGNS+x/hvhsCxFyK+w5vqEBlujSu0ydJVZrEIRc8i5lp8yf41aAKkRtomx/RdJSVG34 /oUD8OGpp7tGzmo0kMjJhBwWr1L/9PFgMnpDZwFeNXSzZ300MzQZ5wprNHSS1DuLgLdG EVhwos+kbt6c+8dmGd1tzCfvJx8I/Lgx4pSrYoP2xmmf/maSXkAMGk1GVMHmy97ba1RD 3hFQ== X-Forwarded-Encrypted: i=1; AJvYcCVS1YozWywVHZhoC/fPu6HLSA14aaTWSMPz8SSrCLYnngLVa0V+rysvk4JZxlBgol6BXTaag7UtjXhA3YBqxf5lEwo= X-Gm-Message-State: AOJu0YxismLoDeVGQZFVN2RJQJwiqwVRgZ+77G/iQU9MoLwOHNyqICvU ahOqcwot3jKj7Qm4o6x+7pd9d5s3Fcp4Kz3P8hvyqtOAg89l9sP+uk0rSSb85gFERB/qjUeKVgF lIPw2FEosKVXzlFalSkqPvuP5ZTM= X-Google-Smtp-Source: AGHT+IG2EoyBilLVQRqv96CAY/4lhYl5dk+AsKpgy8jtfXimSLSsr1+QAjSQU2FxDWwQSJPkVV+riCGX4oxoxLLt55Q= X-Received: by 2002:a05:6102:6cd:b0:48f:e9d9:3ad4 with SMTP id ada2fe7eead31-4977997657fmr1965982137.19.1723767446049; Thu, 15 Aug 2024 17:17:26 -0700 (PDT) MIME-Version: 1.0 References: <20240808054320.10017-1-jasowang@redhat.com> In-Reply-To: <20240808054320.10017-1-jasowang@redhat.com> From: Barry Song <21cnbao@gmail.com> Date: Fri, 16 Aug 2024 12:17:14 +1200 Message-ID: Subject: Re: [PATCH v2] vduse: avoid using __GFP_NOFAIL To: Jason Wang Cc: mst@redhat.com, xuanzhuo@linux.alibaba.com, eperezma@redhat.com, maxime.coquelin@redhat.com, xieyongji@bytedance.com, virtualization@lists.linux.dev, linux-kernel@vger.kernel.org, penguin-kernel@i-love.sakura.ne.jp, linux-mm@kvack.org, akpm@linux-foundation.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Stat-Signature: hunfapib3fojppx78nwkc5fek4iw7kwp X-Rspamd-Queue-Id: 5F83A4000A X-Rspamd-Server: rspam11 X-HE-Tag: 1723767447-976907 X-HE-Meta: U2FsdGVkX1+eqvUtiDUZIp113vh6cDa2+F17fe4O8l2Psdh5atkyuECa7RbJtT0fTXQrGWErVDo0pg/1POzj1nH7KrzeG8enxtz303RxDAKRFtJgp7mKEkxuX/vuZjRh2rTNBoSVuJ9DG1CyhJw3hjfyziXVZTiX1d/BstYiGpgkkmorjuvEzx4WwM5ojli5XXtZE+KmGrMaMtyBZ6XCm/6H92awpKFzfeJLU4UvDHcUc5ecBL9p2wohDhYzFCfXUvLp9LSdUoGDTGH7TG+0HufgXUsKPkM8q8nUuQsyIUAGaBJnDqFW+fAlyf7hTMDk2N2wBGN1L24z9i+PYHEDkKQF4TxhQM4nV0Bz+/wDlO/zScE2+rRny9+BnpTvUqyFJPIuaDqS4HEKtUz2g3FBRWcS48itfPt1BC5NFmlpmxx1eBoxDzIBL+dRiVWcaanqFTdEHwVVno2R04G+mL9au+9eZ8ZAYOYAZqXgCBBQAFIgRMcI2nOa1ZYGESW9N8tP/T7ZmvseGyJuNOGwGClL7N6yHc77/b/gjhETj9m0+zNMA5h7rSLdC6Fe1NX8WuCQAfc0wsMZi0EM3WZibQ7E7U+zsLwga2LqUM0nMi6/dbe+BuA0CgkVyzHatOTH1VXP5mW1yEHxzmijZVpHn9uYEiXyWdxzBPvYGME2LyPcY3c0G0penUEKxa4wC6V0Ts3w0761Wzv0Xsa2D5tsaLaGVYtl6/xiI9RlM6o/cJGPXSBdXEfQ3SlzZ7gyDCT5s8gbdAk9NDRSjSfmIahUJ+/P4lfr/7D+kXZ/3U5q0ZseDK+2D60AzmPPOMIrMMJlLeauyRICT+rgsUj9Vq34ykpKpBjAFYm+WmyXLWQLdAvXKhXElZd1+AwFpwYrT0cy3OEMNNH/uTC6ZT0kawXHvYbyYMvyva0tCU6Xai8fHSzjIS4O1KbepTjVrCToT0Gztfz2zhvxPIMFU92B2oCkPaV nL/jrWZd o2HbrxkSvsTt9sITet6GyVXPuZjr6B7suNydVdXKG8/e5yh5lViVrfmieBdBvZXB7XH+Ox25XLs5Ii0hGRmwAyY+cjYTfMHkHLpaeVWQ0V8AcPUFB8mxwMYgrqhZPQKGJVBdgW2K5XWPcHsHbf32qhqVLutoCfoBPUzq58K3iiOf+2GmxSVJIMBPQceu3rMw5Ure5f9szt1XQ32MoKI3oMJ+CGaZlApMU0MNGgVnZCiOxyb1FRHdfwC66y8VQ4sXbxGBgAwYevpKuvOsZtJCiIMOinRUg/5pZgtTDrxielVnhqZD5jHv54Yof+MXj1b2g8rENczZ9SzNrU0E3NiQaLaKYJuyRJ9uDD0tPoXbMpWOh50gARNgHR0WTkPZuacj7P/hR22NehW2WpXjdlmwfBRcHHBRoeRV33lvnjEqivV05L1lbOFEhbqPzvq/0V2sQiyTdXdf7l2UFE3CbrebMOiEgs/EsWNDnArP0UC9DvIp208BlbWqyLlggHe2uCSF402nA X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Aug 8, 2024 at 5:43=E2=80=AFPM Jason Wang wro= te: > > Barry said [1]: > > """ > mm doesn't support non-blockable __GFP_NOFAIL allocation. Because > __GFP_NOFAIL without direct reclamation may just result in a busy > loop within non-sleepable contexts. > > The current code will result in returning a NULL pointer but > not a busy-loop. > > static inline struct page * > __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order, > struct alloc_context *ac) > { > ... > /* > * Make sure that __GFP_NOFAIL request doesn't leak out and make = sure > * we always retry > */ > if (gfp_mask & __GFP_NOFAIL) { > /* > * All existing users of the __GFP_NOFAIL are blockable, = so warn > * of any new users that actually require GFP_NOWAIT > */ > if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask)) > goto fail; > ... > } > ... > fail: > warn_alloc(gfp_mask, ac->nodemask, > "page allocation failure: order:%u", order); > got_pg: > return page; > } > > We have two choices to address the issue: > 1. busy-loop > 2. BUG_ON > > the below patch chose 2: > https://lore.kernel.org/linux-mm/20240731000155.109583-5-21cnbao@gmail.co= m/ > ""=E2=80=9C > > Unfortuantely, we do that under read lock. A possible way to fix that > is to move the pages allocation out of the lock into the caller, but > having to allocate a huge number of pages and auxiliary page array > seems to be problematic as well per Tetsuon [2]: > > """ > You should implement proper error handling instead of using > __GFP_NOFAIL if count can become large. > """ > > So I choose another way, which does not release kernel bounce pages > when user tries to register usersapce bounce pages. Then we don't need > to do allocation in the path which is not expected to be fail (e.g in > the release). We pay this for more memory usage as we don't release > kernel bounce pages but further optimizations could be done on top. > > [1] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2sa= LnUdfOGOLg@mail.gmail.com/T/#m3caef86a66ea6318ef94f9976ddb3a0ccfe6fcf8 > [2] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2sa= LnUdfOGOLg@mail.gmail.com/T/#m7ad10eaba48ade5abf2d572f24e185d9fb146480 > > Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buff= er") > Reviewed-by: Xie Yongji > Tested-by: Xie Yongji > Signed-off-by: Jason Wang Hi Jason, Is this the final version to fix __GFP_NOFAIL issue, or will you have a new version? Do you prefer it to go through the mm tree or the drivers subsystem tree? If it's the former, I can send a new version that includes this one in my series[1]. [1] https://lore.kernel.org/linux-mm/20240731000155.109583-1-21cnbao@gmail.= com/ > --- > Changes since V1: > - Tweak the commit log > - Assign map->user_bounce_page to NULL for safety > --- > drivers/vdpa/vdpa_user/iova_domain.c | 19 +++++++++++-------- > drivers/vdpa/vdpa_user/iova_domain.h | 1 + > 2 files changed, 12 insertions(+), 8 deletions(-) > > diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_use= r/iova_domain.c > index 791d38d6284c..58116f89d8da 100644 > --- a/drivers/vdpa/vdpa_user/iova_domain.c > +++ b/drivers/vdpa/vdpa_user/iova_domain.c > @@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_dom= ain *domain, > enum dma_data_direction dir) > { > struct vduse_bounce_map *map; > + struct page *page; > unsigned int offset; > void *addr; > size_t sz; > @@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_do= main *domain, > map->orig_phys =3D=3D INVALID_PHYS_ADDR)) > return; > > - addr =3D kmap_local_page(map->bounce_page); > + page =3D domain->user_bounce_pages ? > + map->user_bounce_page : map->bounce_page; > + > + addr =3D kmap_local_page(page); > do_bounce(map->orig_phys + offset, addr + offset, sz, dir= ); > kunmap_local(addr); > size -=3D sz; > @@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_i= ova_domain *domain, > memcpy_to_page(pages[i], 0, > page_address(map->bounce_p= age), > PAGE_SIZE); > - __free_page(map->bounce_page); > } > - map->bounce_page =3D pages[i]; > + map->user_bounce_page =3D pages[i]; > get_page(pages[i]); > } > domain->user_bounce_pages =3D true; > @@ -297,17 +300,17 @@ void vduse_domain_remove_user_bounce_pages(struct v= duse_iova_domain *domain) > struct page *page =3D NULL; > > map =3D &domain->bounce_maps[i]; > - if (WARN_ON(!map->bounce_page)) > + if (WARN_ON(!map->user_bounce_page)) > continue; > > /* Copy user page to kernel page if it's in use */ > if (map->orig_phys !=3D INVALID_PHYS_ADDR) { > - page =3D alloc_page(GFP_ATOMIC | __GFP_NOFAIL); > + page =3D map->bounce_page; > memcpy_from_page(page_address(page), > - map->bounce_page, 0, PAGE_SIZE); > + map->user_bounce_page, 0, PAGE_S= IZE); > } > - put_page(map->bounce_page); > - map->bounce_page =3D page; > + put_page(map->user_bounce_page); > + map->user_bounce_page =3D NULL; > } > domain->user_bounce_pages =3D false; > out: > diff --git a/drivers/vdpa/vdpa_user/iova_domain.h b/drivers/vdpa/vdpa_use= r/iova_domain.h > index f92f22a7267d..7f3f0928ec78 100644 > --- a/drivers/vdpa/vdpa_user/iova_domain.h > +++ b/drivers/vdpa/vdpa_user/iova_domain.h > @@ -21,6 +21,7 @@ > > struct vduse_bounce_map { > struct page *bounce_page; > + struct page *user_bounce_page; > u64 orig_phys; > }; > > -- > 2.31.1 > Thanks Barry