From: Jason Wang
Date: Mon, 2 Sep 2024 15:58:22 +0800
Subject: Re: [PATCH v4 1/3] vduse: avoid using __GFP_NOFAIL
To: David Hildenbrand
Cc: Barry Song <21cnbao@gmail.com>, akpm@linux-foundation.org, linux-mm@kvack.org, virtualization@lists.linux.dev, 42.hyeyoo@gmail.com, cl@linux.com, hailong.liu@oppo.com, hch@infradead.org, iamjoonsoo.kim@lge.com, mhocko@suse.com, penberg@kernel.org, rientjes@google.com, roman.gushchin@linux.dev, torvalds@linux-foundation.org, urezki@gmail.com, v-songbaohua@oppo.com, vbabka@suse.cz, laoar.shao@gmail.com, Xie Yongji
References: <20240830202823.21478-1-21cnbao@gmail.com> <20240830202823.21478-2-21cnbao@gmail.com> <0804627b-49da-4ee0-a09a-19b87a7fdc3d@redhat.com>
In-Reply-To: <0804627b-49da-4ee0-a09a-19b87a7fdc3d@redhat.com>

On Mon, Sep 2, 2024 at 3:33 PM David Hildenbrand wrote:
>
> On 30.08.24 22:28, Barry Song wrote:
> > From: Jason Wang
> >
> > mm doesn't support non-blockable __GFP_NOFAIL allocation. Because
> > persisting in providing __GFP_NOFAIL services for non-block users
> > who cannot perform direct memory reclaim may only result in an
> > endless busy loop.
> >
> > Therefore, in such cases, the current mm-core may directly return
> > a NULL pointer:
> >
> > static inline struct page *
> > __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >                        struct alloc_context *ac)
> > {
> >     ...
> >     if (gfp_mask & __GFP_NOFAIL) {
> >         /*
> >          * All existing users of the __GFP_NOFAIL are blockable, so warn
> >          * of any new users that actually require GFP_NOWAIT
> >          */
> >         if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
> >             goto fail;
> >         ...
> >     }
> >     ...
> > fail:
> >     warn_alloc(gfp_mask, ac->nodemask,
> >                "page allocation failure: order:%u", order);
> > got_pg:
> >     return page;
> > }
> >
> > Unfortunately, vdpa does that nofail allocation under non-sleepable
> > lock. A possible way to fix that is to move the pages allocation out
> > of the lock into the caller, but having to allocate a huge number of
> > pages and auxiliary page array seems to be problematic as well per
> > Tetsuo: "You should implement proper error handling instead of
> > using __GFP_NOFAIL if count can become large."
> >
> > So I choose another way, which does not release kernel bounce pages
> > when user tries to register userspace bounce pages. Then we can
> > avoid allocating in paths where failure is not expected (e.g. in
> > the release). We pay this for more memory usage as we don't release
> > kernel bounce pages but further optimizations could be done on top.
> >
> > Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
> > Reviewed-by: Xie Yongji
> > Tested-by: Xie Yongji
> > Signed-off-by: Jason Wang
> > [v-songbaohua@oppo.com: Refine the changelog]
> > Signed-off-by: Barry Song
> > ---
> >  drivers/vdpa/vdpa_user/iova_domain.c | 19 +++++++++++--------
> >  drivers/vdpa/vdpa_user/iova_domain.h |  1 +
> >  2 files changed, 12 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> > index 791d38d6284c..58116f89d8da 100644
> > --- a/drivers/vdpa/vdpa_user/iova_domain.c
> > +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> > @@ -162,6 +162,7 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
> >                                 enum dma_data_direction dir)
> >  {
> >         struct vduse_bounce_map *map;
> > +       struct page *page;
> >         unsigned int offset;
> >         void *addr;
> >         size_t sz;
> > @@ -178,7 +179,10 @@ static void vduse_domain_bounce(struct vduse_iova_domain *domain,
> >                             map->orig_phys == INVALID_PHYS_ADDR))
> >                         return;
> >
> > -               addr = kmap_local_page(map->bounce_page);
> > +               page = domain->user_bounce_pages ?
> > +                       map->user_bounce_page : map->bounce_page;
> > +
> > +               addr = kmap_local_page(page);
> >                 do_bounce(map->orig_phys + offset, addr + offset, sz, dir);
> >                 kunmap_local(addr);
> >                 size -= sz;
> > @@ -270,9 +274,8 @@ int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain,
> >                                 memcpy_to_page(pages[i], 0,
> >                                                page_address(map->bounce_page),
> >                                                PAGE_SIZE);
> > -                       __free_page(map->bounce_page);
> >                 }
> > -               map->bounce_page = pages[i];
> > +               map->user_bounce_page = pages[i];
> >                 get_page(pages[i]);
> >         }
> >         domain->user_bounce_pages = true;
> > @@ -297,17 +300,17 @@ void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
> >                 struct page *page = NULL;
> >
> >                 map = &domain->bounce_maps[i];
> > -               if (WARN_ON(!map->bounce_page))
> > +               if (WARN_ON(!map->user_bounce_page))
> >                         continue;
> >
> >                 /* Copy user page to kernel page if it's in use */
> >                 if (map->orig_phys != INVALID_PHYS_ADDR) {
> > -                       page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);
> > +                       page = map->bounce_page;
>
> Why don't we need a kmap_local_page(map->bounce_page) here, but we might
> perform one / have performed one in vduse_domain_bounce?

I think it's another bug that needs to be fixed.

Yongji, do you want to fix this?

Thanks

>
> Maybe we should simply use
>
> memcpy_page(map->bounce_page, 0, map->user_bounce_page, 0, PAGE_SIZE)
>
> ?
>
>
> --
> Cheers,
>
> David / dhildenb
>
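
For illustration only, the memcpy_page() idea raised above can be modeled in userspace. This is a minimal sketch under stated assumptions, not the driver code and not a proposed patch: every model_* name and MODEL_PAGE_SIZE are hypothetical stand-ins, and the mapping helpers only count map/unmap calls instead of creating real kernel mappings. The point it tries to show is that a copy helper which maps both pages itself leaves no place for a caller to forget a mapping, and that the copy-back step needs no allocation once the kernel bounce page is kept around.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct page; not the kernel type. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Number of mappings currently held, so callers can check balance. */
static int model_live_mappings;

/* Models kmap_local_page(): hands out a temporary address for the page. */
static void *model_kmap_local_page(struct model_page *page)
{
	model_live_mappings++;
	return page->data;
}

/* Models kunmap_local(): drops one mapping taken above. */
static void model_kunmap_local(const void *addr)
{
	(void)addr;
	model_live_mappings--;
}

/* Models memcpy_page(): map both pages, copy, unmap in reverse order. */
static void model_memcpy_page(struct model_page *dst, size_t dst_off,
			      struct model_page *src, size_t src_off,
			      size_t len)
{
	unsigned char *d = model_kmap_local_page(dst);
	unsigned char *s = model_kmap_local_page(src);

	assert(dst_off + len <= MODEL_PAGE_SIZE);
	assert(src_off + len <= MODEL_PAGE_SIZE);
	memcpy(d + dst_off, s + src_off, len);
	model_kunmap_local(s);
	model_kunmap_local(d);
}

/* Hypothetical reduction of struct vduse_bounce_map to its two pages. */
struct model_bounce_map {
	struct model_page *bounce_page;      /* kernel page, kept allocated */
	struct model_page *user_bounce_page; /* page registered by userspace */
};

/* Copy-back step when user pages are removed: no allocation is needed. */
static void model_copy_back(struct model_bounce_map *map)
{
	model_memcpy_page(map->bounce_page, 0, map->user_bounce_page, 0,
			  MODEL_PAGE_SIZE);
}
```

Calling model_copy_back() on a map whose user page holds data copies the full page into the kernel page and leaves model_live_mappings at zero, which is the balance a kmap_local_page()/kunmap_local() pair is expected to keep.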