From: Mina Almasry <almasrymina@google.com>
Date: Sat, 22 May 2021 12:41:21 -0700
Subject: Re: [PATCH] userfaultfd: hugetlbfs: fix new flag usage in error path
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Linux-MM <linux-mm@kvack.org>, open list,
    Oscar Salvador, Michal Hocko, Muchun Song, Naoya Horiguchi,
    David Hildenbrand, Matthew Wilcox, Miaohe Lin, Andrew Morton,
    stable@vger.kernel.org

On Fri, May 21, 2021 at 4:40 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
> page flags") the use of PagePrivate to indicate a reservation count
> should be restored at free time was changed to the hugetlb specific flag
> HPageRestoreReserve.  Changes to a userfaultfd error path as well as a
> VM_BUG_ON() in remove_inode_hugepages() were overlooked.
>
> Users could see incorrect hugetlb reserve counts if they experience an
> error with a UFFDIO_COPY operation.  Specifically, this would be the
> result of an unlikely copy_huge_page_from_user error.  There is not an
> increased chance of hitting the VM_BUG_ON.
>
> Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  fs/hugetlbfs/inode.c |  2 +-
>  mm/userfaultfd.c     | 28 ++++++++++++++--------------
>  2 files changed, 15 insertions(+), 15 deletions(-)
>

Reviewed-by: Mina Almasry <almasrymina@google.com>

I'm guessing this may interact with my patch in review "[PATCH v3] mm,
hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY". I'll rebase my
patch and re-test.
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 9d9e0097c1d3..55efd3dd04f6 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -529,7 +529,7 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
>  			 * the subpool and global reserve usage count can need
>  			 * to be adjusted.
>  			 */
> -			VM_BUG_ON(PagePrivate(page));
> +			VM_BUG_ON(HPageRestoreReserve(page));
>  			remove_huge_page(page);
>  			freed++;
>  			if (!truncate_op) {
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 5508f7d9e2dc..9ce5a3793ad4 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -432,38 +432,38 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
>  		 * If a reservation for the page existed in the reservation
>  		 * map of a private mapping, the map was modified to indicate
>  		 * the reservation was consumed when the page was allocated.
> -		 * We clear the PagePrivate flag now so that the global
> +		 * We clear the HPageRestoreReserve flag now so that the global
>  		 * reserve count will not be incremented in free_huge_page.
>  		 * The reservation map will still indicate the reservation
>  		 * was consumed and possibly prevent later page allocation.
>  		 * This is better than leaking a global reservation.  If no
> -		 * reservation existed, it is still safe to clear PagePrivate
> -		 * as no adjustments to reservation counts were made during
> -		 * allocation.
> +		 * reservation existed, it is still safe to clear
> +		 * HPageRestoreReserve as no adjustments to reservation counts
> +		 * were made during allocation.
>  		 *
>  		 * The reservation map for shared mappings indicates which
>  		 * pages have reservations.  When a huge page is allocated
>  		 * for an address with a reservation, no change is made to
> -		 * the reserve map.  In this case PagePrivate will be set
> -		 * to indicate that the global reservation count should be
> +		 * the reserve map.  In this case HPageRestoreReserve will be
> +		 * set to indicate that the global reservation count should be
>  		 * incremented when the page is freed.  This is the desired
>  		 * behavior.  However, when a huge page is allocated for an
>  		 * address without a reservation a reservation entry is added
> -		 * to the reservation map, and PagePrivate will not be set.
> -		 * When the page is freed, the global reserve count will NOT
> -		 * be incremented and it will appear as though we have leaked
> -		 * reserved page.  In this case, set PagePrivate so that the
> -		 * global reserve count will be incremented to match the
> -		 * reservation map entry which was created.
> +		 * to the reservation map, and HPageRestoreReserve will not be
> +		 * set. When the page is freed, the global reserve count will
> +		 * NOT be incremented and it will appear as though we have
> +		 * leaked reserved page.  In this case, set HPageRestoreReserve
> +		 * so that the global reserve count will be incremented to
> +		 * match the reservation map entry which was created.
>  		 *
>  		 * Note that vm_alloc_shared is based on the flags of the vma
>  		 * for which the page was originally allocated.  dst_vma could
>  		 * be different or NULL on error.
>  		 */
>  		if (vm_alloc_shared)
> -			SetPagePrivate(page);
> +			SetHPageRestoreReserve(page);
>  		else
> -			ClearPagePrivate(page);
> +			ClearHPageRestoreReserve(page);
>  		put_page(page);
>  	}
>  	BUG_ON(copied < 0);
> --
> 2.31.1
>
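For anyone skimming the thread, the accounting the patch's comment block describes can be sketched as a tiny userspace model. This is a hypothetical simplification for illustration only, not the kernel's actual struct hstate / free_huge_page() code: the one invariant it captures is that the global reserve count is put back at free time only when the restore-reserve flag (HPageRestoreReserve in the patch) is set on the page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state, condensed for illustration. */
struct model_hstate { long resv_huge_pages; };          /* global reserve count */
struct model_page   { bool restore_reserve; };          /* models HPageRestoreReserve */

/* Models SetHPageRestoreReserve()/ClearHPageRestoreReserve(). */
static void set_restore_reserve(struct model_page *p)   { p->restore_reserve = true;  }
static void clear_restore_reserve(struct model_page *p) { p->restore_reserve = false; }

/* Models the slice of free_huge_page() the flag drives: the reserve is
 * restored on free only when the flag was left set. */
static void model_free_huge_page(struct model_hstate *h, struct model_page *p)
{
	if (p->restore_reserve)
		h->resv_huge_pages++;
}
```

In this model, the error path the patch fixes corresponds to choosing set_restore_reserve() vs. clear_restore_reserve() before the final put of the page, so that the reserve count and the reservation map stay consistent either way.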