From: Linus Torvalds
Date: Wed, 9 Oct 2019 12:20:36 -0700
Subject: Re: [PATCH v4 3/9] mm: pagewalk: Don't split transhuge pmds when a pmd_entry is present
To: Thomas Hellstrom
Cc: Thomas Hellström (VMware), "Kirill A. Shutemov", Linux Kernel Mailing List, Linux-MM, Matthew Wilcox, Will Deacon, Peter Zijlstra, Rik van Riel, Minchan Kim, Michal Hocko, Huang Ying, Jérôme Glisse
References: <20191008091508.2682-1-thomas_os@shipmail.org> <20191008091508.2682-4-thomas_os@shipmail.org> <20191009152737.p42w7w456zklxz72@box> <03d85a6a-e24a-82f4-93b8-86584b463471@shipmail.org>

On Wed, Oct 9, 2019 at 11:52 AM Thomas Hellstrom wrote:
>
> Hmm, so we have the following cases we need to handle when returning
> from the pmd_entry() handler.

No, we really don't.

> 1) Huge pmd was handled - Returns 0 and continues.

No.

That case simply DOES NOT EXIST.

The only case that exists is "pmd was seen, we return 0 and then look
at whether pte level is relevant".

Note that this has nothing to do with huge or not.

> 2) A pmd is otherwise unstable, typically someone just zapped a huge
> pmd. Returns PAGE_WALK_FALLBACK, gets caught in the pmd_trans_unstable()
> test and retries.

No.
PAGE_WALK_FALLBACK doesn't exist, is completely broken in your patch,
and is immaterial.

It falls under the previous heading: a pmd was seen, returns zero, and
we go on with life.

If you don't have a pte callback - like EVERY SINGLE CURRENT USER -
that "goes on with life" is just "go to the next pmd entry".

And if you do have a pte callback - like your new case, that "go on
with life" is to look at the pte cases.

> 3) A pte directory - Returns PAGE_WALK_FALLBACK, falls through, avoids
> the split and continues to the next level. Yeah that split avoidance
> test is indeed made unnecessary by the preceding pmd_trans_unstable() test.

Again, no. This case does not exist. It's the same case as above: it
returns 0 and goes on to the pte level.

> -               split_huge_pmd(walk->vma, pmd, addr);
> +               if (!ops->pmd_entry)
> +                       split_huge_pmd(walk->vma, pmd, addr);
>
> But as the commit message says, PAGE_WALK_FALLBACK is necessary to have
> a virtual address range being handled once and only once.

No. Your logic is garbage. The above code is completely broken.

YOU CAN NOT AVOID THE SPLIT AND THEN GO ON AT THE PTE LEVEL.

Don't you get it? There *is* no PTE level if you didn't split.

And your "being handled once and only once" is garbage too. If you ask
for both a pmd callback and a pte callback, you get both. It's that
simple.

There are zero users that actually do it now, and you don't want to do
it either, so all your arguments are just pointless.

> So we need the PAGE_WALK_FALLBACK.

No we don't. You make no sense. Your case doesn't want it, no existing
cases want it, nobody wants it. When you actually have a case that
wants it, let's look at it then. Right now, you introduced
fundamentally buggy code because your thinking is fuzzy and broken.

So what you should do is to just always return 0 in your pmd_entry().
Boom, done. The only reason for the pmd_entry existing at all is to
get the warning.
Then, if you don't want to split it, you make that warning just return
an error (or a positive value) instead and say "ok, that was bad, we
don't handle it at all".

And in some _future_ life, if anybody wants to actually say "yeah,
let's not split it", make it have some "yeah I handled it" case. In
fact, I would suggest that positive return values be exactly that "I
did it" case, and that they just add up instead of breaking out. Only
an actual error would break out, and 0 would then (continue to) mean
"continue with next level".

But right now, no such user even exists.

             Linus
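The return-value convention suggested above (negative breaks out,
0 continues with the next level, positive means "I did it" and gets
added up) can be illustrated with a small userspace model. This is only
a sketch under stated assumptions, not kernel code: struct model_pmd,
struct model_walk_ops, model_walk_pmd_range() and the example_*
callbacks are all hypothetical names invented for illustration, and
"splitting" a huge pmd is modelled as simply clearing a flag. Reading
a positive pmd_entry return as "skip the pte level" is one
interpretation of the suggestion, which no existing user needed at the
time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one pmd entry; not a kernel type. */
struct model_pmd {
	int huge;     /* non-zero: huge pmd, no pte level until it is split */
	int nr_ptes;  /* ptes below this pmd once a pte level exists */
};

/* Hypothetical callback table, loosely shaped like the walker ops. */
struct model_walk_ops {
	int (*pmd_entry)(struct model_pmd *pmd, void *priv);
	int (*pte_entry)(struct model_pmd *pmd, int idx, void *priv);
};

/*
 * Convention modelled here (assumed from the discussion):
 *   ret < 0   actual error: break out of the walk
 *   ret == 0  continue with the next level
 *   ret > 0   "I did it": add to the total, do not descend
 */
static long model_walk_pmd_range(struct model_pmd *pmds, size_t n,
				 const struct model_walk_ops *ops, void *priv)
{
	long handled = 0;

	assert(ops != NULL);
	for (size_t i = 0; i < n; i++) {
		struct model_pmd *pmd = &pmds[i];

		if (ops->pmd_entry) {
			int ret = ops->pmd_entry(pmd, priv);

			if (ret < 0)
				return ret;
			if (ret > 0) {
				handled += ret;
				continue;
			}
		}

		/* No pte callback: just go to the next pmd entry. */
		if (!ops->pte_entry)
			continue;

		/* There is no pte level unless the huge pmd is split. */
		if (pmd->huge)
			pmd->huge = 0;  /* stand-in for a real split */

		for (int j = 0; j < pmd->nr_ptes; j++) {
			int ret = ops->pte_entry(pmd, j, priv);

			if (ret < 0)
				return ret;
			handled += ret;
		}
	}
	return handled;
}

/* Example callback: claim huge pmds as handled, defer the rest. */
static int example_pmd_entry(struct model_pmd *pmd, void *priv)
{
	(void)priv;
	return pmd->huge ? 1 : 0;
}

/* Example callback: refuse huge pmds with an error. */
static int example_pmd_reject_huge(struct model_pmd *pmd, void *priv)
{
	(void)priv;
	return pmd->huge ? -1 : 0;
}

/* Example callback: count every pte visited. */
static int example_pte_entry(struct model_pmd *pmd, int idx, void *priv)
{
	(void)pmd;
	(void)idx;
	(void)priv;
	return 1;
}
```

In this model, a walk with only example_pte_entry set always splits
huge pmds before visiting ptes, while a walk that also sets
example_pmd_entry leaves huge pmds intact and counts each of them once.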