From: Yongji Xie
Date: Mon, 12 Aug 2024 15:21:29 +0800
Subject: Re: Re: Re: [PATCH] vduse: avoid using __GFP_NOFAIL
To: Jason Wang
Cc: "Michael S. Tsirkin", Xuan Zhuo, Eugenio Perez Martin, Maxime Coquelin,
 virtualization@lists.linux.dev, linux-kernel, 21cnbao@gmail.com,
 penguin-kernel@i-love.sakura.ne.jp, linux-mm@kvack.org, Andrew Morton
References: <20240805082106.65847-1-jasowang@redhat.com>

On Mon, Aug 12, 2024 at 3:00 PM Jason Wang wrote:
>
> On Thu, Aug 8, 2024 at 6:52 PM Yongji Xie wrote:
> >
> > On Thu, Aug 8, 2024 at 10:58 AM Jason Wang wrote:
> > >
> > > On Wed, Aug 7, 2024 at 2:52 PM Yongji Xie wrote:
> > > >
> > > > On Mon, Aug 5, 2024 at 4:21 PM Jason Wang wrote:
> > > > >
> > > > > Barry said [1]:
> > > > >
> > > > > """
> > > > > mm doesn't support non-blockable __GFP_NOFAIL allocation. Because
> > > > > __GFP_NOFAIL without direct reclamation may just result in a busy
> > > > > loop within non-sleepable contexts.
> > > > > """
> > > > >
> > > > > Unfortunately, we do that under read lock. A possible way to fix that
> > > > > is to move the page allocation out of the lock into the caller, but
> > > > > having to allocate a huge number of pages and an auxiliary page array
> > > > > seems to be problematic as well per Tetsuo [2]:
> > > > >
> > > > > """
> > > > > You should implement proper error handling instead of using
> > > > > __GFP_NOFAIL if count can become large.
> > > > > """
> > > > >
> > > > > So I choose another way, which does not release kernel bounce pages
> > > > > when user tries to register userspace bounce pages. Then we don't need
> > > > > to do allocation in the path which is not expected to fail (e.g. in
> > > > > the release). We pay for this with more memory usage, but further
> > > > > optimizations could be done on top.
> > > > >
> > > > > [1] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m3caef86a66ea6318ef94f9976ddb3a0ccfe6fcf8
> > > > > [2] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m7ad10eaba48ade5abf2d572f24e185d9fb146480
> > > > >
> > > > > Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> > > > > Signed-off-by: Jason Wang
> > > > > ---
> > > >
> > > > Reviewed-by: Xie Yongji
> > > > Tested-by: Xie Yongji
> > >
> > > Thanks.
> > >
> > > >
> > > > Have tested it with qemu-storage-daemon [1]:
> > > >
> > > > $ qemu-storage-daemon \
> > > >     --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server=on,wait=off \
> > > >     --monitor chardev=charmonitor \
> > > >     --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
> > > >     --export type=vduse-blk,id=vduse-test,name=vduse-test,node-name=disk0,writable=on
> > > >
> > > > [1] https://github.com/bytedance/qemu/tree/vduse-umem
> > >
> > > Great, would you want to post them to the Qemu?
> > >
> >
> > Looks like qemu-storage-daemon would not benefit from this feature
> > which is designed for some hugepage users such as SPDK/DPDK.
>
> Yes, but maybe for testing purposes like here?
>

OK for me.

Thanks,
Yongji