Date: Mon, 5 Aug 2024 04:25:52 -0400
From: "Michael S. Tsirkin"
To: Jason Wang
Cc: xuanzhuo@linux.alibaba.com, eperezma@redhat.com,
	maxime.coquelin@redhat.com, xieyongji@bytedance.com,
	virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
	21cnbao@gmail.com, penguin-kernel@i-love.sakura.ne.jp,
	linux-mm@kvack.org, akpm@linux-foundation.org
Subject: Re: [PATCH] vduse: avoid using __GFP_NOFAIL
Message-ID: <20240805042421-mutt-send-email-mst@kernel.org>
References: <20240805082106.65847-1-jasowang@redhat.com>
In-Reply-To: <20240805082106.65847-1-jasowang@redhat.com>

On Mon, Aug 05, 2024 at 04:21:06PM +0800, Jason Wang wrote:
> Barry said [1]:
>
> """
> mm doesn't support non-blockable __GFP_NOFAIL allocation. Because
> __GFP_NOFAIL without direct reclamation may just result in a busy
> loop within non-sleepable contexts.
> """
>
> Unfortuantely, we do that under read lock. A possible way to fix that
> is to move the pages allocation out of the lock into the caller, but
> having to allocate a huge number of pages and auxiliary page array
> seems to be problematic as well per Tetsuon [2]:
>
> """
> You should implement proper error handling instead of using
> __GFP_NOFAIL if count can become large.
> """
>
> So I choose another way, which does not release kernel bounce pages
> when user tries to register usersapce bounce pages. Then we don't need

userspace

> to do allocation in the path which is not expected to be fail (e.g in
> the release). We pay this for more memory usage but further

what does "we pay this for more memory usage" mean?
Do you mean "we pay for this by using more memory"?
How much more?

> optimizations could be done on top.
>
> [1] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m3caef86a66ea6318ef94f9976ddb3a0ccfe6fcf8
> [2] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m7ad10eaba48ade5abf2d572f24e185d9fb146480
>
> Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> Signed-off-by: Jason Wang
> ---
>  drivers/vdpa/vdpa_user/iova_domain.c | 18 ++++++++++--------
>  drivers/vdpa/vdpa_user/iova_domain.h |  1 +
>  2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index 791d38d6284c..933d2f7cd49a 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
>  				enum dma_data_direction dir)
>  {
>  	struct vduse_bounce_map *map;
> +	struct page *page;
>  	unsigned int offset;
>  	void *addr;
>  	size_t sz;
> @@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
>  			    map->orig_phys == INVALID_PHYS_ADDR))
>  			return;
>
> -		addr = kmap_local_page(map->bounce_page);
> +		page = domain->user_bounce_pages ?
> +		       map->user_bounce_page : map->bounce_page;
> +
> +		addr = kmap_local_page(page);
>  		do_bounce(map->orig_phys + offset, addr + offset, sz, dir);
>  		kunmap_local(addr);
>  		size -= sz;
> @@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain,
>  				memcpy_to_page(pages[i], 0,
>  					       page_address(map->bounce_page),
>  					       PAGE_SIZE);
> -			__free_page(map->bounce_page);
>  		}
> -		map->bounce_page = pages[i];
> +		map->user_bounce_page = pages[i];
>  		get_page(pages[i]);
>  	}
>  	domain->user_bounce_pages = true;
> @@ -297,17 +300,16 @@ void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
>  		struct page *page = NULL;
>
>  		map = &domain->bounce_maps[i];
> -		if (WARN_ON(!map->bounce_page))
> +		if (WARN_ON(!map->user_bounce_page))
>  			continue;
>
>  		/* Copy user page to kernel page if it's in use */
>  		if (map->orig_phys != INVALID_PHYS_ADDR) {
> -			page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
> +			page = map->bounce_page;
>  			memcpy_from_page(page_address(page),
> -					 map->bounce_page, 0, PAGE_SIZE);
> +					 map->user_bounce_page, 0, PAGE_SIZE);
>  		}
> -		put_page(map->bounce_page);
> -		map->bounce_page = page;
> +		put_page(map->user_bounce_page);
>  	}
>  	domain->user_bounce_pages = false;
>  out:
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.h b/drivers/vdpa/vdpa_user/iova_domain.h
> index f92f22a7267d..7f3f0928ec78 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.h
> +++ b/drivers/vdpa/vdpa_user/iova_domain.h
> @@ -21,6 +21,7 @@
>
>  struct vduse_bounce_map {
>  	struct page *bounce_page;
> +	struct page *user_bounce_page;
>  	u64 orig_phys;
>  };
>
> --
> 2.31.1
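
For reference on the "how much more?" question, here is a rough userspace model of the scheme in the quoted diff. It is a sketch under stated assumptions, not the driver code. The model_* names, MODEL_PAGE_SIZE and the in_use flag (standing in for orig_phys != INVALID_PHYS_ADDR) are made up for illustration, plain byte buffers stand in for struct page, and the NULL check before the copy-back is an extra guard that the patch itself does not have. Reading the diff this way, the extra memory while user pages are registered would be one page for every kernel bounce page that had already been allocated at registration time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size, illustration only */

/* Hypothetical stand-in for struct vduse_bounce_map after the patch. */
struct model_map {
	unsigned char *bounce_page;      /* kernel page, kept across registration */
	unsigned char *user_bounce_page; /* user page, NULL when none registered */
	int in_use;                      /* models orig_phys != INVALID_PHYS_ADDR */
};

struct model_domain {
	struct model_map *maps;
	size_t count;
	int user_bounce_pages;
};

/* Page the bounce path would touch; mirrors the ternary added by the patch. */
static unsigned char *model_active_page(const struct model_domain *d, size_t i)
{
	return d->user_bounce_pages ? d->maps[i].user_bounce_page
				    : d->maps[i].bounce_page;
}

/* Register user pages: copy live data over, but keep the kernel page. */
static void model_add_user_pages(struct model_domain *d, unsigned char **pages)
{
	for (size_t i = 0; i < d->count; i++) {
		struct model_map *m = &d->maps[i];

		assert(pages[i] != NULL);
		if (m->bounce_page && m->in_use)
			memcpy(pages[i], m->bounce_page, MODEL_PAGE_SIZE);
		m->user_bounce_page = pages[i];
	}
	d->user_bounce_pages = 1;
}

/* Unregister: copy back into the retained kernel page, no allocation here. */
static void model_remove_user_pages(struct model_domain *d)
{
	for (size_t i = 0; i < d->count; i++) {
		struct model_map *m = &d->maps[i];

		if (!m->user_bounce_page)
			continue;
		/* The NULL check is this sketch's own guard, not part of the patch. */
		if (m->in_use && m->bounce_page)
			memcpy(m->bounce_page, m->user_bounce_page, MODEL_PAGE_SIZE);
		m->user_bounce_page = NULL;
	}
	d->user_bounce_pages = 0;
}

/* Bytes of kernel bounce pages held idle while user pages are registered. */
static size_t model_retained_bytes(const struct model_domain *d)
{
	size_t pages = 0;

	if (!d->user_bounce_pages)
		return 0;
	for (size_t i = 0; i < d->count; i++)
		if (d->maps[i].bounce_page)
			pages++;
	return pages * MODEL_PAGE_SIZE;
}
```

In this model the removal path only copies and drops a pointer, which is the property the patch is after. The price is whatever model_retained_bytes() reports between registration and removal.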