From: Jason Wang
Date: Tue, 3 Sep 2024 08:35:58 +0800
Subject: Re: [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL
To: David Hildenbrand
Cc: Barry Song <21cnbao@gmail.com>, akpm@linux-foundation.org,
    linux-mm@kvack.org, virtualization@lists.linux.dev, 42.hyeyoo@gmail.com,
    cl@linux.com, hailong.liu@oppo.com, hch@infradead.org,
    iamjoonsoo.kim@lge.com, mhocko@suse.com, penberg@kernel.org,
    rientjes@google.com, roman.gushchin@linux.dev,
    torvalds@linux-foundation.org, urezki@gmail.com, v-songbaohua@oppo.com,
    vbabka@suse.cz, laoar.shao@gmail.com, Xie Yongji
References: <20240830202823.21478-1-21cnbao@gmail.com>
    <20240830202823.21478-2-21cnbao@gmail.com>
    <0804627b-49da-4ee0-a09a-19b87a7fdc3d@redhat.com>

On Mon, Sep 2, 2024 at 4:30 PM David Hildenbrand wrote:
>
> On 02.09.24 09:58, Jason Wang wrote:
> > On Mon, Sep 2, 2024 at 3:33 PM David Hildenbrand wrote:
> >>
> >> On 30.08.24 22:28, Barry Song wrote:
> >>> From: Jason Wang
> >>>
> >>> mm doesn't support non-blockable __GFP_NOFAIL allocation. Because
> >>> persisting in providing __GFP_NOFAIL services for non-block users
> >>> who cannot perform direct memory reclaim may only result in an
> >>> endless busy loop.
> >>>
> >>> Therefore, in such cases, the current mm-core may directly return
> >>> a NULL pointer:
> >>>
> >>> static inline struct page *
> >>> __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >>>                        struct alloc_context *ac)
> >>> {
> >>>         ...
> >>>         if (gfp_mask & __GFP_NOFAIL) {
> >>>                 /*
> >>>                  * All existing users of the __GFP_NOFAIL are blockable, so warn
> >>>                  * of any new users that actually require GFP_NOWAIT
> >>>                  */
> >>>                 if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> >>>                         goto fail;
> >>>                 ...
> >>>         }
> >>>         ...
> >>> fail:
> >>>         warn_alloc(gfp_mask, ac->nodemask,
> >>>                    "page allocation failure: order:%u", order);
> >>> got_pg:
> >>>         return page;
> >>> }
> >>>
> >>> Unfortunately, vdpa does that nofail allocation under non-sleepable
> >>> lock. A possible way to fix that is to move the pages allocation out
> >>> of the lock into the caller, but having to allocate a huge number of
> >>> pages and auxiliary page array seems to be problematic as well per
> >>> Tetsuon: " You should implement proper error handling instead of
> >>> using __GFP_NOFAIL if count can become large."
> >>>
> >>> So I choose another way, which does not release kernel bounce pages
> >>> when user tries to register userspace bounce pages. Then we can
> >>> avoid allocating in paths where failure is not expected (e.g. in
> >>> the release). We pay this for more memory usage as we don't release
> >>> kernel bounce pages but further optimizations could be done on top.
> >>>
> >>> Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> >>> Reviewed-by: Xie Yongji
> >>> Tested-by: Xie Yongji
> >>> Signed-off-by: Jason Wang
> >>> [v-songbaohua@oppo.com: Refine the changelog]
> >>> Signed-off-by: Barry Song
> >>> ---
> >>>  drivers/vdpa/vdpa_user/iova_domain.c | 19 +++++++++++--------
> >>>  drivers/vdpa/vdpa_user/iova_domain.h |  1 +
> >>>  2 files changed, 12 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> >>> index 791d38d6284c..58116f89d8da 100644
> >>> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> >>> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> >>> @@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
> >>>                                 enum dma_data_direction dir)
> >>>  {
> >>>         struct vduse_bounce_map *map;
> >>> +       struct page *page;
> >>>         unsigned int offset;
> >>>         void *addr;
> >>>         size_t sz;
> >>> @@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
> >>>                             map->orig_phys == INVALID_PHYS_ADDR))
> >>>                         return;
> >>>
> >>> -               addr = kmap_local_page(map->bounce_page);
> >>> +               page = domain->user_bounce_pages ?
> >>> +                      map->user_bounce_page : map->bounce_page;
> >>> +
> >>> +               addr = kmap_local_page(page);
> >>>                 do_bounce(map->orig_phys + offset, addr + offset, sz, dir);
> >>>                 kunmap_local(addr);
> >>>                 size -= sz;
> >>> @@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain,
> >>>                                 memcpy_to_page(pages[i], 0,
> >>>                                                page_address(map->bounce_page),
> >>>                                                PAGE_SIZE);
> >>> -                       __free_page(map->bounce_page);
> >>>                 }
> >>> -               map->bounce_page = pages[i];
> >>> +               map->user_bounce_page = pages[i];
> >>>                 get_page(pages[i]);
> >>>         }
> >>>         domain->user_bounce_pages = true;
> >>> @@ -297,17 +300,17 @@ void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
> >>>                 struct page *page = NULL;
> >>>
> >>>                 map = &domain->bounce_maps[i];
> >>> -               if (WARN_ON(!map->bounce_page))
> >>> +               if (WARN_ON(!map->user_bounce_page))
> >>>                         continue;
> >>>
> >>>                 /* Copy user page to kernel page if it's in use */
> >>>                 if (map->orig_phys != INVALID_PHYS_ADDR) {
> >>> -                       page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
> >>> +                       page = map->bounce_page;
> >>
> >> Why don't we need a kmap_local_page(map->bounce_page) here, but we might
> >> perform one / have performed one in vduse_domain_bounce?
> >
> > I think it's another bug that needs to be fixed.
> >
> > Yongji, do you want to fix this?
>
> Or maybe it works because "map->bounce_page" is now always a kernel
> page,

Yes, the userspace bounce page is now user_bounce_page.

> and never one from user space that might reside in highmem.

Right. So we are actually fine :)

Thanks

>
> --
> Cheers,
>
> David / dhildenb
>
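
For anyone following the thread, the bookkeeping discussed above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the kernel code: model_page, model_bounce_map, model_domain and the model_* helpers are hypothetical stand-ins, a small byte array stands in for a page, and memcpy() stands in for the kernel's page copy helpers. It shows the two points made in the discussion: bounce_page stays a kernel page for the whole lifetime of the map, and unregistering the userspace pages copies back into that retained page without allocating anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16	/* tiny stand-in for PAGE_SIZE */

/* Hypothetical stand-in for struct page: just a buffer. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Models the two page pointers the patch keeps side by side. */
struct model_bounce_map {
	struct model_page *bounce_page;		/* kernel page, never replaced */
	struct model_page *user_bounce_page;	/* set while user pages are registered */
};

struct model_domain {
	bool user_bounce_pages;
	struct model_bounce_map map;
};

/* Same selection as the patched vduse_domain_bounce(). */
static struct model_page *model_active_page(struct model_domain *d)
{
	return d->user_bounce_pages ? d->map.user_bounce_page
				    : d->map.bounce_page;
}

/* Register a user page: copy the contents over, keep the kernel page. */
static void model_add_user_page(struct model_domain *d,
				struct model_page *user)
{
	assert(d->map.bounce_page);
	memcpy(user->data, d->map.bounce_page->data, MODEL_PAGE_SIZE);
	d->map.user_bounce_page = user;
	d->user_bounce_pages = true;
}

/* Unregister: copy back into the retained kernel page, no allocation. */
static void model_remove_user_page(struct model_domain *d)
{
	assert(d->map.user_bounce_page && d->map.bounce_page);
	memcpy(d->map.bounce_page->data, d->map.user_bounce_page->data,
	       MODEL_PAGE_SIZE);
	d->map.user_bounce_page = NULL;
	d->user_bounce_pages = false;
}
```

The model leaves out the orig_phys check and the locking. Its only purpose is to show that the remove path has nothing left that could fail, which is what makes GFP_ATOMIC | __GFP_NOFAIL unnecessary there.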