From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28D75CD4F25 for ; Wed, 4 Sep 2024 20:11:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 818588D0006; Wed, 4 Sep 2024 16:11:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7C9658D0005; Wed, 4 Sep 2024 16:11:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 68F7A8D0006; Wed, 4 Sep 2024 16:11:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 477428D0005 for ; Wed, 4 Sep 2024 16:11:32 -0400 (EDT) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 05902120455 for ; Wed, 4 Sep 2024 20:11:32 +0000 (UTC) X-FDA: 82528150824.04.8D1BD31 Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) by imf22.hostedemail.com (Postfix) with ESMTP id 6101FC0007 for ; Wed, 4 Sep 2024 20:11:30 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b=a+2jlSz2; dmarc=none; spf=none (imf22.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1725480619; a=rsa-sha256; cv=none; b=51BAmV11TVsb8yxxlxc395druVaJYw1bktMY/pcShuPrZC/wUJzO82BkyTP5nmoIBqqu7D PMFntPEK0fZWlc05n9M4YGOTD5dLQagJmh41nnYfm9YvzL8BtsyBl8D5tk2JX72cLZ0zXJ Tumu8NMh3h+qelffUnPYBSKTkqvlkco= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b=a+2jlSz2; dmarc=none; spf=none (imf22.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1725480619; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=1U7KUvPp1Czxb6kvJzP8pS/9Se/dVPf/3+p50vQO6XQ=; b=7Y63WmccRf9t4Gb8R8flOsFJSqE0zNmPMO45I1E8hk390PJgTRigTmiz/P1h2oRGNbBnfi KoZHA7r9InQjschEdFElMwK2MHtW0DXcRNnNkqzkK6MN4TLSD36rSYZMaIajiUVfglmNcW 2TEURZxKrD3P/IXB7ZjKHWK+nCBrQq4= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=1U7KUvPp1Czxb6kvJzP8pS/9Se/dVPf/3+p50vQO6XQ=; b=a+2jlSz2hdeTxhNuzbpkZVA9q1 n0zgdJsldJ9F847EeiVb0Y8iQQ9RGq5zX9yNveEQUZ/G3s7jIQhPx6udQ6N1TdB4aQ9tHAgbW8iLX cs8bFw8bBsREKWfZryxpn5ZDkYLEi4KEMBN6ZyEMDeJPa3PPzbA4NNIcnCu8i7AfVRcEsrCtx5/vr GXDZl5lF4FPKIYtTiG4pos4UvTkZWedloEF8BbSUKhA3uvEwJsEnH35NIG1PnXWdrxyAntX9YNuST mf7yQuSwtqxE5V32JNHZHYd0RC12/9ODw4/XmoBIpg/eWbVzZ/gBKx53+2FOrvV431AA9uBmePIay SQ9HXJaA==; Received: from willy by casper.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1slwLe-00000001M3F-3WPj; Wed, 04 Sep 2024 20:11:26 +0000 Date: Wed, 4 Sep 2024 21:11:26 +0100 From: Matthew Wilcox To: Steve Sistare Cc: linux-mm@kvack.org, Vivek Kasireddy , Muchun Song , Andrew Morton , Peter Xu , David Hildenbrand , Jason Gunthorpe Subject: Re: [PATCH] mm/hugetlb: simplify refs in memfd_alloc_folio Message-ID: References: <1725478868-61732-1-git-send-email-steven.sistare@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspamd-Queue-Id: 6101FC0007 X-Rspam-User: X-Rspamd-Server: rspam05 X-Stat-Signature: dqr91rof8ud5hcty7tq4q1dhca4f4mx9 X-HE-Tag: 1725480690-65906 X-HE-Meta: U2FsdGVkX1/TXpQryHRkuKRqoA8+iKbNVS6mludlq+6S6nP6JW7ldGrxyFUsYLvxuszbVIBhKiwglUXoeSiDL8GQHiHIUm5e+91v5FQ4/69JJJFaO5Y77g9t5eCmfPOkUofLyiyHz+lF4GqzPkjYJL3dOpK/PijtkE7psV4c+LdqFWBLWuSAzePo9Z/IDBglxrhbV2XdEMJdWZ0+Sajtcpk7KoGvhBqI2ASjFra3lvrB080Dl9tjs6LM/fY3TuGLL3BhJP/pDsBFc2Y+SPtZLXTJ10VLLCcQkNCjlmO4ZgctSwVS66QfHjHKIcX0xozvis1x10kFoOO8r/wwDFYJSQF2nRxXgGUYPVsXSldlMz07TMHM82gstuj/m56ZDARmfQwDEjGGkfhI1C6W/TchLo05BXXSw6KN/UdQxFrOcuS/JrEpMxx/rESs+y+1jeNNdS0ujTKpmVxIcf5LsSXBkXxjhPCFAlQuN81TVeje/LsM14fMOFggyHxIjarsmXGkcHVvrqZXrahwuR1T6dBHHipUmUh9rg2JcXBiM6/602jLCKcSQ0f+PdVWQe90ArdF4It1OCC1uIlzxa1SsxfLSuHrH7Ot5g3U3BJc5LPuQAH8BX/3+9u7qG2BVji00dXOKiWUdNrbfkktKPmELFhMY+KT6oxFpgQ7zfclLHxb6Rqn4TiuA60NCZ2LwV1M7s3m/+AALiqn/WXLrRdcQpGymPNG5ML0+zdU63uNIvKrIZeTiZ6eK/ArYp5AjtDCeEdS8IM0GzHOAtdgBAkPVsg9Jb97DqJ3kgHp2UfTutfrgcke7xaNEqy5y0Kcdcmf6w25yrcmJ0NqKCkhlcmeeYM5jOY2AqAdbx3HvrUV9oA372BMB9KPkXrhdQJmHJhAzA4Jd32LIppzWoiA9LCXASYLiNsCUjqLqDArxOe7D7MYgAz5pn1AY1W/NtbErT16oGYuyHN0fD7NMbae+dbgmKt FEBxC4MO 8yXY2sO2XMUloBb0vH0Gi8V4j7N9u1Jq4KjsQ1tkHETDv40CaroZKSL5GT5D3spO1p+sPSn6a7lnMcJBnrGciV9JsoCH9GMyi5Cs4R96U9MeWY1RSsLm1YSwphxGIvn1VMTJ0A5RoraOC7DdA5qJ1WDLhZppJ02Z6Cy0WWx7DpeviMyaK0qLyTueU6JIHquWsSv3G9jUpH48Pt6UuR7rsZ6orjxDfgTPx/zo+TrS61JpMdouj93F6te65F5Bvw3FvX/Eh2RVoiBD75AMmOR+GtlhDyrk1Vs6uNDrPGnOk56gk9D2QVbgmXcuxJtd0U2t9YH/d X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Sep 04, 2024 at 09:02:59PM +0100, Matthew Wilcox wrote: > On Wed, Sep 04, 2024 at 12:41:08PM -0700, Steve Sistare wrote: > > The folio_try_get in memfd_alloc_folio is not necessary. Delete it, and > > delete the matching folio_put in memfd_pin_folios. This also avoids > > leaking a ref if the memfd_alloc_folio call to hugetlb_add_to_page_cache > > fails, which would otherwise need an additional folio_put. This is a > > continuation of the fix > > "mm/hugetlb: fix memfd_pin_folios free_huge_pages leak" > > I think you're right, but don't we also need to get rid of the > folio_put() call in the 'if (err)' case after calling > hugetlb_add_to_page_cache()? After scratching my head about this a bit more, I was trying to preserve the existing semantics of the code, but I think the code was always buggy. The correct code would be: folio = alloc_hugetlb_folio_nodemask(...); folio_put(folio); The code as in tree today would trip an assertion: VM_BUG_ON_FOLIO(folio_ref_count(folio), folio); as alloc_hugetlb_folio_nodemask() returns a folio with a refcount 1, folio_try_get() would increment it to 2, folio_put() would decrement it to 1, and so we'd call free_huge_folio() with a refcount of 1. But after your patch, the code _is_ still wrong because we'll start with a refcount of 1, fail to add to the page cache, call folio_put() which will decrement the refcount to 0 _and call free_huge_folio() itself_. Then we'll call free_huge_folio() on an already freed and possibly reallocated folio. So every version suggested so far (current, yours, mine) is wrong, and the right code looks like: folio = alloc_hugetlb_folio_nodemask(...); if (folio) { err = hugetlb_add_to_page_cache(...); if (err) { folio_put(folio); return ERR_PTR(err); } ... Or have I got something wrong in that analysis? > > Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios") > > > > Suggested-by: Vivek Kasireddy > > Signed-off-by: Steve Sistare > > --- > > mm/gup.c | 4 +--- > > mm/memfd.c | 2 +- > > 2 files changed, 2 insertions(+), 4 deletions(-) > > > > diff --git a/mm/gup.c b/mm/gup.c > > index bccabaa..947881ff 100644 > > --- a/mm/gup.c > > +++ b/mm/gup.c > > @@ -3618,7 +3618,7 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end, > > pgoff_t start_idx, end_idx, next_idx; > > struct folio *folio = NULL; > > struct folio_batch fbatch; > > - struct hstate *h = NULL; > > + struct hstate *h; > > long ret = -EINVAL; > > > > if (start < 0 || start > end || !max_folios) > > @@ -3662,8 +3662,6 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end, > > &fbatch); > > if (folio) { > > folio_put(folio); > > - if (h) > > - folio_put(folio); > > folio = NULL; > > } > > > > diff --git a/mm/memfd.c b/mm/memfd.c > > index bcb131d..f715301 100644 > > --- a/mm/memfd.c > > +++ b/mm/memfd.c > > @@ -89,7 +89,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx) > > numa_node_id(), > > NULL, > > gfp_mask); > > - if (folio && folio_try_get(folio)) { > > + if (folio) { > > err = hugetlb_add_to_page_cache(folio, > > memfd->f_mapping, > > idx); > > -- > > 1.8.3.1 > > >