From: Jason Wang
Date: Mon, 12 Aug 2024 15:00:08 +0800
Subject: Re: Re: [PATCH] vduse: avoid using __GFP_NOFAIL
To: Yongji Xie
Cc: "Michael S. Tsirkin", Xuan Zhuo, Eugenio Perez Martin, Maxime Coquelin, virtualization@lists.linux.dev, linux-kernel, 21cnbao@gmail.com, penguin-kernel@i-love.sakura.ne.jp, linux-mm@kvack.org, Andrew Morton
References: <20240805082106.65847-1-jasowang@redhat.com>

On Thu, Aug 8, 2024 at 6:52 PM Yongji Xie wrote:
>
> On Thu, Aug 8, 2024 at 10:58 AM Jason Wang wrote:
> >
> > On Wed, Aug 7, 2024 at 2:52 PM Yongji Xie wrote:
> > >
> > > On Mon, Aug 5, 2024 at 4:21 PM Jason Wang wrote:
> > > >
> > > > Barry said [1]:
> > > >
> > > > """
> > > > mm doesn't support non-blockable __GFP_NOFAIL allocation. Because
> > > > __GFP_NOFAIL without direct reclamation may just result in a busy
> > > > loop within non-sleepable contexts.
> > > > """
> > > >
> > > > Unfortunately, we do that under read lock. A possible way to fix that
> > > > is to move the pages allocation out of the lock into the caller, but
> > > > having to allocate a huge number of pages and auxiliary page array
> > > > seems to be problematic as well per Tetsuo [2]:
> > > >
> > > > """
> > > > You should implement proper error handling instead of using
> > > > __GFP_NOFAIL if count can become large.
> > > > """
> > > >
> > > > So I choose another way, which does not release kernel bounce pages
> > > > when user tries to register userspace bounce pages. Then we don't need
> > > > to do allocation in the path which is not expected to fail (e.g. in
> > > > the release). We pay for this with more memory usage, but further
> > > > optimizations could be done on top.
> > > >
> > > > [1] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m3caef86a66ea6318ef94f9976ddb3a0ccfe6fcf8
> > > > [2] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m7ad10eaba48ade5abf2d572f24e185d9fb146480
> > > >
> > > > Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> > > > Signed-off-by: Jason Wang
> > > > ---
> > >
> > > Reviewed-by: Xie Yongji
> > > Tested-by: Xie Yongji
> >
> > Thanks.
> >
> > >
> > > Have tested it with qemu-storage-daemon [1]:
> > >
> > > $ qemu-storage-daemon \
> > >     --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server=on,wait=off \
> > >     --monitor chardev=charmonitor \
> > >     --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
> > >     --export type=vduse-blk,id=vduse-test,name=vduse-test,node-name=disk0,writable=on
> > >
> > > [1] https://github.com/bytedance/qemu/tree/vduse-umem
> >
> > Great, would you want to post them to the Qemu?
> >
>
> Looks like qemu-storage-daemon would not benefit from this feature
> which is designed for some hugepage users such as SPDK/DPDK.

Yes, but maybe for testing purposes like here?

Thanks

>
> Thanks,
> Yongji
>