Date: Tue, 23 May 2023 21:26:33 -0700 (PDT)
From: Hugh Dickins
To: Yang Shi
Cc: Hugh Dickins, Andrew Morton, Mike Kravetz, Mike Rapoport,
    "Kirill A. Shutemov", Matthew Wilcox, David Hildenbrand,
    Suren Baghdasaryan, Qi Zheng, Mel Gorman, Peter Xu, Peter Zijlstra,
    Will Deacon, Yu Zhao, Alistair Popple, Ralph Campbell, Ira Weiny,
    Steven Price, SeongJae Park, Naoya Horiguchi, Christophe Leroy,
    Zack Rusin, Jason Gunthorpe, Axel Rasmussen, Anshuman Khandual,
    Pasha Tatashin, Miaohe Lin, Minchan Kim, Christoph Hellwig, Song Liu,
    Thomas Hellstrom, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 25/31] mm/gup: remove FOLL_SPLIT_PMD use of pmd_trans_unstable()
Message-ID: <3d548f45-9ff9-d73a-83e0-bdd312f524@google.com>
References: <68a97fbe-5c1e-7ac6-72c-7b9c6290b370@google.com>

On Mon, 22 May 2023, Yang Shi wrote:
> On Mon, May 22, 2023 at 7:26 PM Yang Shi wrote:
> > On Sun, May 21, 2023 at 10:22 PM Hugh Dickins wrote:
> > >
> > > There is now no reason for follow_pmd_mask()'s FOLL_SPLIT_PMD block to
> > > distinguish huge_zero_page from a normal THP: follow_page_pte() handles
> > > any instability, and here it's a good idea to replace any pmd_none(*pmd)
> > > by a page table a.s.a.p, in the huge_zero_page case as for a normal THP.
> > > (Hmm, couldn't the normal THP case have hit an unstably refaulted THP
> > > before?  But there are only two, exceptional, users of FOLL_SPLIT_PMD.)
> > >
> > > Signed-off-by: Hugh Dickins
> > > ---
> > >  mm/gup.c | 19 ++++---------------
> > >  1 file changed, 4 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/mm/gup.c b/mm/gup.c
> > > index bb67193c5460..4ad50a59897f 100644
> > > --- a/mm/gup.c
> > > +++ b/mm/gup.c
> > > @@ -681,21 +681,10 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
> > >  		return follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
> > >  	}
> > >  	if (flags & FOLL_SPLIT_PMD) {
> > > -		int ret;
> > > -		page = pmd_page(*pmd);
> > > -		if (is_huge_zero_page(page)) {
> > > -			spin_unlock(ptl);
> > > -			ret = 0;
> > > -			split_huge_pmd(vma, pmd, address);
> > > -			if (pmd_trans_unstable(pmd))
> > > -				ret = -EBUSY;
> >
> > IIUC the pmd_trans_unstable() check was transferred to the implicit
> > pmd_none() in pte_alloc(). But it will return -ENOMEM instead of
> > -EBUSY. Won't it break some userspace? Or the pmd_trans_unstable() is
> > never true? If so it seems worth mentioning in the commit log about
> > this return value change.

Thanks a lot for looking at these, but I disagree here.

>
> Oops, the above comment is not accurate. It will call
> follow_page_pte() instead of returning -EBUSY if pmd is none.

Yes.  Ignoring secondary races, if pmd is none, pte_alloc() will allocate
an empty page table there, follow_page_pte() find !pte_present and return
NULL; or if pmd is not none, follow_page_pte() will return no_page_table()
i.e. NULL.  And page NULL ends up with __get_user_pages() having another
go round, instead of failing with -EBUSY.

Which I'd say is better handling for such a transient case - remember,
it's split_huge_pmd() (which should always succeed, but might be raced)
in use there, not split_huge_page() (which might take years for pins to
be removed before it can succeed).

> For other unstable cases, it will return -ENOMEM instead of -EBUSY.

I don't think so: the possibly-failing __pte_alloc() only gets called in
the pmd_none() case.

Hugh

>
> >
> > > -		} else {
> > > -			spin_unlock(ptl);
> > > -			split_huge_pmd(vma, pmd, address);
> > > -			ret = pte_alloc(mm, pmd) ? -ENOMEM : 0;
> > > -		}
> > > -
> > > -		return ret ? ERR_PTR(ret) :
> > > +		spin_unlock(ptl);
> > > +		split_huge_pmd(vma, pmd, address);
> > > +		/* If pmd was left empty, stuff a page table in there quickly */
> > > +		return pte_alloc(mm, pmd) ? ERR_PTR(-ENOMEM) :
> > >  			follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
> > >  	}
> > >  	page = follow_trans_huge_pmd(vma, address, pmd, flags);
> > > --
> > > 2.35.3
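
For anyone following the return-value argument, here is a rough user-space
sketch of the outcomes described in the reply. It is only a toy model, not
kernel code: every toy_* name and the four pmd states are made up for
illustration, and it takes the reply's description at face value (pte_alloc()
can only fail when the pmd is none, and follow_page_pte() is assumed to yield
NULL for an empty table or for any pmd that is not a page table).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pmd states after split_huge_pmd(); not kernel types. */
enum toy_pmd {
	TOY_PMD_NONE,            /* split (or a race) left the pmd empty */
	TOY_PMD_EMPTY_TABLE,     /* freshly allocated page table, no ptes */
	TOY_PMD_POPULATED_TABLE, /* normal result of a successful split */
	TOY_PMD_HUGE_AGAIN,      /* a racing fault put a huge pmd back */
};

/* Possible results of the patched block; note there is no -EBUSY. */
enum toy_outcome { TOY_GOT_PAGE, TOY_NULL_RETRY, TOY_ENOMEM };

/*
 * Stand-in for pte_alloc(): allocation is attempted only when the pmd
 * is none, so "true" (failure) is possible only in that case.
 */
static bool toy_pte_alloc(enum toy_pmd *pmd, bool alloc_fails)
{
	if (*pmd != TOY_PMD_NONE)
		return false;		/* nothing to allocate */
	if (alloc_fails)
		return true;		/* the only failing path */
	*pmd = TOY_PMD_EMPTY_TABLE;
	return false;
}

/*
 * Stand-in for follow_page_pte(): only a populated table gives a page;
 * an empty table (!pte_present) or a non-table pmd (no_page_table())
 * is modelled as NULL, so the caller goes round again.
 */
static enum toy_outcome toy_follow_page_pte(enum toy_pmd pmd)
{
	return pmd == TOY_PMD_POPULATED_TABLE ? TOY_GOT_PAGE : TOY_NULL_RETRY;
}

/* Stand-in for the tail of the patched FOLL_SPLIT_PMD block. */
static enum toy_outcome toy_split_pmd_tail(enum toy_pmd pmd, bool alloc_fails)
{
	return toy_pte_alloc(&pmd, alloc_fails) ? TOY_ENOMEM
						: toy_follow_page_pte(pmd);
}

/* The claims from the reply, written as checks against the model. */
static void toy_check_claims(void)
{
	/* pmd none, allocation fine: empty table, NULL page, retry */
	assert(toy_split_pmd_tail(TOY_PMD_NONE, false) == TOY_NULL_RETRY);
	/* pmd none, allocation fails: the one -ENOMEM case */
	assert(toy_split_pmd_tail(TOY_PMD_NONE, true) == TOY_ENOMEM);
	/* unstable but not none: no allocation attempted, so no -ENOMEM */
	assert(toy_split_pmd_tail(TOY_PMD_HUGE_AGAIN, true) == TOY_NULL_RETRY);
	/* ordinary successful split: page found */
	assert(toy_split_pmd_tail(TOY_PMD_POPULATED_TABLE, true) == TOY_GOT_PAGE);
}
```

Calling toy_check_claims() from a main() exercises all four cases. The third
check is the one that matters for the question raised above: in this model an
unstable but non-empty pmd never reaches the allocation, so it cannot turn
into -ENOMEM.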