From: Yongji Xie
Date: Mon, 5 Aug 2024 18:42:07 +0800
Subject: Re: [PATCH] vduse: avoid using __GFP_NOFAIL
To: Jason Wang
Cc: Maxime Coquelin, Xuan Zhuo, "Michael S. Tsirkin", Eugenio Perez Martin, virtualization@lists.linux.dev, linux-kernel, 21cnbao@gmail.com, penguin-kernel@i-love.sakura.ne.jp, linux-mm@kvack.org, Andrew Morton
References: <20240805082106.65847-1-jasowang@redhat.com>

On Mon, Aug 5, 2024 at 4:24 PM Jason Wang
wrote:
>
> On Mon, Aug 5, 2024 at 4:21 PM Jason Wang wrote:
> >
> > Barry said [1]:
> >
> > """
> > mm doesn't support non-blockable __GFP_NOFAIL allocation. Because
> > __GFP_NOFAIL without direct reclamation may just result in a busy
> > loop within non-sleepable contexts.
> > """
> >
> > Unfortunately, we do that under read lock. A possible way to fix that
> > is to move the page allocation out of the lock into the caller, but
> > having to allocate a huge number of pages and auxiliary page array
> > seems to be problematic as well per Tetsuo [2]:
> >
> > """
> > You should implement proper error handling instead of using
> > __GFP_NOFAIL if count can become large.
> > """
> >

I think the problem is that it's currently hard to do error handling
in fops->release().

So can we temporarily hold the user page refcount and release it when
vduse_dev_open()/vduse_domain_release() is executed? The kernel page
allocation and memcpy can then be done in vduse_dev_open(), which
allows some error handling.

> > So I choose another way, which does not release kernel bounce pages
> > when user tries to register userspace bounce pages. Then we don't need
> > to do allocation in the path which is not expected to fail (e.g. in
> > the release). We pay for this with more memory usage, but further
> > optimizations could be done on top.
> >
> > [1] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m3caef86a66ea6318ef94f9976ddb3a0ccfe6fcf8
> > [2] https://lore.kernel.org/all/CACGkMEtcOJAA96SF9B8m-nZ1X04-XZr+nq8ZQ2saLnUdfOGOLg@mail.gmail.com/T/#m7ad10eaba48ade5abf2d572f24e185d9fb146480
> >
> > Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> > Signed-off-by: Jason Wang
> > ---
>
> Note for YongJi:
>
> I can only test it without userspace bounce pages as neither DPDK nor
> libvduse users use that. Would you want to review and have a test for
> this?
>

I can do some tests for it.
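
To make the alternative I mentioned above (hold the user page refcount
past release, do the fallible allocation and copy at open time) a bit
more concrete, here is a rough userspace model of the control flow. It
is not kernel code and not a patch: every name in it (model_domain,
model_release(), model_open(), and so on) is made up for illustration,
malloc()/free() stand in for page allocation, and a plain int stands
in for the page refcount. The only point is that the release path
never allocates, while the open path can fail cleanly and leave
everything as it was.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for a pinned userspace page. */
struct model_user_page {
	int refcount;
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Hypothetical stand-in for one bounce map entry. */
struct model_bounce_map {
	struct model_user_page *user_page; /* reference kept until copied */
	unsigned char *kernel_page;        /* NULL until allocated */
};

struct model_domain {
	struct model_bounce_map *maps;
	size_t count;
	int user_pages_pending; /* set by release, cleared by a good open */
};

static void model_put_user_page(struct model_user_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/* Allocator that always fails, used to exercise the error path. */
static void *model_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Release path: must not fail, so it allocates nothing and keeps the refs. */
static void model_release(struct model_domain *d)
{
	d->user_pages_pending = 1;
}

/*
 * Open path: allowed to fail. Allocate everything first, then copy and
 * drop the user page references. On failure, undo the allocations and
 * leave the user pages referenced. alloc must be malloc()-compatible
 * because the pages are returned with free().
 */
static int model_open(struct model_domain *d, void *(*alloc)(size_t))
{
	size_t i;

	if (!d->user_pages_pending)
		return 0;

	for (i = 0; i < d->count; i++) {
		if (!d->maps[i].user_page)
			continue;
		d->maps[i].kernel_page = alloc(MODEL_PAGE_SIZE);
		if (!d->maps[i].kernel_page)
			goto rollback;
	}

	for (i = 0; i < d->count; i++) {
		struct model_bounce_map *map = &d->maps[i];

		if (!map->user_page)
			continue;
		memcpy(map->kernel_page, map->user_page->data, MODEL_PAGE_SIZE);
		model_put_user_page(map->user_page);
		map->user_page = NULL;
	}
	d->user_pages_pending = 0;
	return 0;

rollback:
	/* Entry i itself holds NULL; free only what was allocated before it. */
	while (i-- > 0) {
		if (!d->maps[i].user_page)
			continue;
		free(d->maps[i].kernel_page);
		d->maps[i].kernel_page = NULL;
	}
	return -ENOMEM;
}

/* Final teardown: drop any references still held and free kernel pages. */
static void model_domain_destroy(struct model_domain *d)
{
	size_t i;

	for (i = 0; i < d->count; i++) {
		if (d->maps[i].user_page) {
			model_put_user_page(d->maps[i].user_page);
			d->maps[i].user_page = NULL;
		}
		free(d->maps[i].kernel_page);
		d->maps[i].kernel_page = NULL;
	}
}
```

In this model a failed model_open() returns -ENOMEM with the user page
references still held, so the caller can retry or tear down through
model_domain_destroy(). The cost is that the user pages stay pinned
longer than they do today.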

Thanks,
Yongji