Date: Sun, 26 Mar 2023 02:26:58 +0000
From: Joel Fernandes
To: Linus Torvalds
Cc: "Kirill A. Shutemov", Michal Hocko, Naresh Kamboju, Andrew Morton,
    linux-mm@kvack.org, LKML
Subject: Re: WARN_ON in move_normal_pmd
Message-ID: <20230326022658.GB3142556@google.com>
References: <20230324130530.xsmqcxapy4j2aaik@box.shutemov.name>
    <20230325163323.GA3088525@google.com>

On Sat, Mar 25, 2023 at 10:06:59AM -0700, Linus Torvalds wrote:
> On Sat, Mar 25, 2023 at 9:33 AM Joel Fernandes wrote:
> >
> > I actually didn't follow what you meant by "mutually PMD-aligned". Could you
> > provide some example address numbers to explain?
>
> Sure, let me make this more clear with a couple of concrete examples.
>
> Let's say that we have a range '[old, old+len]' and we want to remap
> it to '[new, new+len]'.
>
> Furthermore, we'll say that overlapping is fine, but because we're
> always moving pages from "low address to high", we only allow
> overlapping when we're moving things down (ie 'new < old').
>
> And yes, I know that the overlapping case cannot actually happen with
> mremap() itself. So in practice the overlapping case only happens for
> the special "move the stack pages" around at execve() startup, but
> let's ignore that for now.
>
> So we'll talk about the generic "move pages around" case, not the more
> limited 'mremap()' case.
>
> I'll also simplify the thing to just assume that we have that
> CONFIG_HAVE_MOVE_PMD enabled, so I'll ignore some of the full grotty
> details.
>
> Ok?

[...]

>
> we could easily decide "let's just move the whole PMD", and expand the
> move to be
>
>    old = 0x1e00000
>    new = 0x1c00000
>    len = 0x400000
> instead. And then instead of moving PTE's around at first, we'd move
> PMD's around *all* the time, and turn this into that "simple case
> (a)".

Right, I totally get what you mean. You want to move more than just the
4k pages at the beginning of the mapping: in fact the whole PMD, which
extends further below the destination, to capture the full PMD that the
first 4k pages are located in. With that you get to move purely PMDs
all the way. I think that is a great idea.

> NOTE! For this to work, there must be no mapping right below 'old' or
> 'new', of course. But during the execve() startup, that should be
> trivially true.

Exactly, it won't work if there is something below old or new. So for
that very reason, we still have to handle the bad case where the source
PMD was not deleted, right? Because if there is something below new,
you'll need to copy 1 PTE at a time till you hit the 2MB boundary, since
you can't mess with that source PMD: it is in use to satisfy mappings
below new. Then you'll eventually hit the warning we are discussing.

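To make the expansion above concrete, here is a minimal userspace sketch
in C. It is not kernel code and not how mm/mremap.c does it. Everything
named sketch_* is hypothetical, the 2MB PMD size is an assumption (x86-64
with 4k pages), and the array of "other" mappings merely stands in for a
real VMA lookup. It only handles aligning the start of the range
downwards, and it refuses when old and new do not share the same offset
within a PMD or when something is mapped in the gap below either address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one PMD entry covers 2MB (x86-64 with 4k pages). */
#define SKETCH_PMD_SIZE 0x200000ULL
#define SKETCH_PMD_MASK (~(SKETCH_PMD_SIZE - 1))

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical description of one "move pages around" request. */
struct sketch_move {
	uint64_t old_addr;
	uint64_t new_addr;
	uint64_t len;
};

/* Return true if any entry of maps[] overlaps [start, end). */
static bool sketch_range_mapped(const struct sketch_range *maps, size_t n,
				uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < n; i++)
		if (maps[i].start < end && start < maps[i].end)
			return true;
	return false;
}

/*
 * Try to grow the move downwards so that old and new both start on a
 * PMD boundary. others[] lists mappings other than the one being moved.
 * Returns true and updates *mv on success; leaves *mv untouched otherwise.
 */
static bool sketch_try_align_down(struct sketch_move *mv,
				  const struct sketch_range *others, size_t n)
{
	uint64_t off = mv->old_addr & ~SKETCH_PMD_MASK;

	assert(mv->len > 0);

	/* Already PMD-aligned: nothing to expand. */
	if (off == 0)
		return false;

	/* old and new must sit at the same offset inside their PMDs. */
	if ((mv->new_addr & ~SKETCH_PMD_MASK) != off)
		return false;

	/* Nothing may be mapped right below old or right below new. */
	if (sketch_range_mapped(others, n, mv->old_addr - off, mv->old_addr) ||
	    sketch_range_mapped(others, n, mv->new_addr - off, mv->new_addr))
		return false;

	mv->old_addr -= off;
	mv->new_addr -= off;
	mv->len += off;
	return true;
}
```

For example, with made-up inputs old = 0x1fff000, new = 0x1dff000 and
len = 0x201000 and no other mappings, the sketch expands the move to
old = 0x1e00000, new = 0x1c00000, len = 0x400000. If a page were mapped
at 0x1c00000 it would return false, and the caller would have to fall
back to copying PTEs one at a time up to the next 2MB boundary.
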
I guess even if one can ensure that there is no mapping below new for
the execve() case, that still cannot be guaranteed for the mremap()
case, I think. But I agree, if there is no mapping below old/new, then
we can just do this as an optimization. I think all that is needed is to
check whether there are any VMAs at those locations, but correct me if
I'm wrong as I'm not an mm expert.

> See what I'm saying?

Yep. And as you pointed out in the mremap example, this issue can also
show up with non-overlapping ranges, if I'm not mistaken.

I get your idea. Allow me to digest all this a bit more. Since it is not
urgent and this stuff is going to take some careful work with proper
test cases etc, let me take this up and work on it. But your idea is
loud and clear. I am also working on sending you that RCU PR and working
hard to not screw that up, so it is a bit busy :-P.

And thank you again for the great idea and discussion! Looking forward
to working on this.

thanks,

 - Joel