From: Linus Torvalds
Date: Fri, 24 Mar 2023 16:38:03 -0700
Subject: Re: WARN_ON in move_normal_pmd
To: Joel Fernandes
Cc: "Kirill A. Shutemov", Michal Hocko, Naresh Kamboju, Andrew Morton, linux-mm@kvack.org, LKML
References: <20230324130530.xsmqcxapy4j2aaik@box.shutemov.name>

On Fri, Mar 24, 2023 at 6:43 AM Joel Fernandes wrote:
>
> Wouldn't it be better to instead fix it from the caller side? Like
> making it non-overlapping.

I wonder if we could just do something like this in mremap() instead

 - if old/new are mutually PMD_ALIGNED

 - *and* there is no vma below new within the same PMD

 - then just expand the mremap to be PMD-aligned downwards

IOW, the problem with the exec stack moving case isn't really that
it's overlapping: that part is fine. We're moving downwards, and we
start from the bottom, so the moving part works fine.

No, the problem is that we *start* by moving individual pages, and
then by the time we've moved a few pages down by a whole PMD, we
finish the source PMD (and we've cleared all the contents of it), but
it still exists.

And at *that* point, when we go and start copying the next page, we're
suddenly fully PMD-aligned, and now we try to copy a whole PMD, and
then that code is unhappy about the fact that the old (empty) PMD is
there in the target.

And for all of this to happen, we need to move things by an exact
multiple of PMD size, because otherwise we'd never get to that aligned
situation at all, and we'd always do all the movement in individual
pages, and everything would be just fine.

And more importantly, if we had just *started* with moving a whole
PMD, this also wouldn't have happened. But we didn't. We started
moving individual pages.

So you could see the warning not as a "this range overlaps" warning
(it's fine, and happens all the time, and we do individual pages that
way quite happily), but really as a "hey, this was very inefficient -
you shouldn't have moved those as several small independent individual
pages in the first place" warning.

So some kind of

        /* Is the movement mutually PMD-aligned? */
        if (((old_addr ^ new_addr) & ~PMD_MASK) == 0) {
                .. try to extend the move_vma() down to the *aligned* PMD case ..
        }

logic in move_page_tables() would get rid of the warning, and would
make the move more efficient since you'd skip the "move individual
pages and allocate a new PMD" case entirely.

This is all fairly complicated, and the "try to extend the move range"
would also have to depend on CONFIG_HAVE_MOVE_PMD etc, so I'm not
saying it's trivial.

But it would seem to be a really nice optimization, in addition to
getting rid of the warning.

It could even help real world cases outside of this odd stack
remapping case if users ever end up moving vma's by multiples of
PMD_SIZE, and there aren't other vma's around the source/target that
disable the optimization.

Hmm? Anybody want to look into that? It looks hairy enough that I
think that "you could test this with mutually aligned mremap()
source/targets in some test program" would be a good thing. Because
the pure execve() case is rare enough that using *that* as a test-case
seems like a fool's errand.

(To make things very clear: the important part is that the source and
targets aren't *actually* PMD-aligned, just mutually aligned so that
you *can* do the mremap() by just moving whole PMD's around)

                Linus
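
A user-space test along the lines suggested above might start from something like the following. It is only a sketch, not code from this thread: the helper names mutually_pmd_aligned() and move_mutually_aligned() are made up for illustration, and PMD_SIZE is assumed to be 2 MiB (x86-64 with 4 KiB pages). It maps a region that starts a few pages into a PMD, moves it down by exactly three PMDs with mremap(), and checks that the contents survived. The source and target are deliberately not PMD-aligned themselves, only mutually aligned.

```c
#define _GNU_SOURCE             /* for mremap() and MREMAP_* */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: 2 MiB PMDs, as on x86-64 with 4 KiB pages. */
#define PMD_SIZE (1UL << 21)
#define PMD_MASK (~(PMD_SIZE - 1))

/*
 * Nonzero if both addresses sit at the same offset within a PMD.
 * The parentheses around the '&' matter: '==' binds tighter than '&' in C.
 */
int mutually_pmd_aligned(uintptr_t old_addr, uintptr_t new_addr)
{
	return ((old_addr ^ new_addr) & ~PMD_MASK) == 0;
}

/*
 * Hypothetical test helper: map 2 * PMD_SIZE bytes starting offset_pages
 * pages into a PMD, move the mapping down by 3 * PMD_SIZE, verify contents.
 * Returns 0 on success, -1 on any failure.
 */
int move_mutually_aligned(size_t offset_pages)
{
	const size_t page = (size_t)sysconf(_SC_PAGESIZE);
	const size_t offset = offset_pages * page;
	const size_t len = 2 * PMD_SIZE;
	const size_t span = 8 * PMD_SIZE;
	int ret = 0;

	/* The point is that neither end is actually PMD-aligned. */
	if (offset == 0 || offset >= PMD_SIZE)
		return -1;

	/* Reserve address space only to find a free, PMD-aligned window. */
	unsigned char *area = mmap(NULL, span, PROT_NONE,
				   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (area == MAP_FAILED)
		return -1;
	uintptr_t aligned = ((uintptr_t)area + PMD_SIZE - 1) & PMD_MASK;
	/* Drop the reservation so no other vma shares the PMDs involved. */
	munmap(area, span);

	unsigned char *src = (unsigned char *)(aligned + 4 * PMD_SIZE + offset);
	unsigned char *dst = (unsigned char *)(aligned + 1 * PMD_SIZE + offset);

	if (mmap(src, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) != (void *)src)
		return -1;

	/* Touch every page so that page tables really get populated. */
	for (size_t off = 0; off < len; off += page)
		src[off] = (unsigned char)(off / page);

	assert(mutually_pmd_aligned((uintptr_t)src, (uintptr_t)dst));

	if (mremap(src, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dst) !=
	    (void *)dst) {
		munmap(src, len);
		return -1;
	}

	for (size_t off = 0; off < len; off += page)
		if (dst[off] != (unsigned char)(off / page))
			ret = -1;

	munmap(dst, len);
	return ret;
}
```

Calling move_mutually_aligned() in a loop over several page offsets would exercise the "individual pages first, whole PMD later" path described above. Whether the kernel actually takes the PMD-level move for a given layout depends on the kernel version and config, so a test like this can only check that the data arrives intact and that nothing warns in dmesg.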