From: Linus Torvalds
Date: Sat, 25 Mar 2023 10:26:06 -0700
Subject: Re: WARN_ON in move_normal_pmd
To: Joel Fernandes
Cc: "Kirill A. Shutemov", Michal Hocko, Naresh Kamboju, Andrew Morton,
    linux-mm@kvack.org, LKML
References: <20230324130530.xsmqcxapy4j2aaik@box.shutemov.name>
    <20230325163323.GA3088525@google.com>

On Sat, Mar 25, 2023 at 10:06 AM Linus Torvalds wrote:
>
> So what I'm saying is that *if* we start out with that situation, and
> we have that
>
>     old = 0x1fff000
>     new = 0x1dff000
>     len = 0x201000
>
> we could easily decide "let's just move the whole PMD", and expand the
> move to be
>
>     old = 0x1e00000
>     new = 0x1c00000
>     len = 0x400000
>
> instead. And then instead of moving PTE's around at first, we'd move
> PMD's around *all* the time, and turn this into that "simple case
> (a)".
>
> NOTE! For this to work, there must be no mapping right below 'old' or
> 'new', of course. But during the execve() startup, that should be
> trivially true.
>
> See what I'm saying?

Also note that my comments about "this can be tested with mremap()"
are because the above optimization works and is valid even when old
and new are not originally overlapping, but they overlap after the
expansion.

IOW, imagine that you have a 2MB mapping, but it is not 2MB-aligned
virtually, and you want to move that mapping down by 2MB.

Now, because that 2MB mapping is *not* 2MB-aligned, it actually takes
up *two* PMD entries.

But if that mapping is the only thing that exists in those two PMD
entries, and the PMD entry below it is clear (because there is no
mapping right below the new address), then we can still do that
unaligned 2MB mapping move entirely at the PMD level.

So instead of wasting time to move it one page at a time (until it is
2MB aligned), we could just move two PMD entries around.

Here's a (UNTESTED! It compiles, but that's it) user test-case for
this situation:

    #define _GNU_SOURCE
    #include <sys/mman.h>
    #include <string.h>

    /* Pick some random 2MB-aligned address that isn't near anything else */
    #define MB (1ul << 20)
    #define VA ((void *)(128 * MB))

    /* Source and destination each straddle a 2MB (PMD) boundary */
    #define old (VA+MB)
    #define new (VA-MB)
    #define len (2*MB)

    int main(int argc, char **argv)
    {
        void *addr;

        addr = mmap(old, len, PROT_READ | PROT_WRITE,
                    MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
        if (addr == MAP_FAILED)
            return 1;

        /* Touch every page so the page tables are actually populated */
        memset(addr, 0xff, len);

        mremap(old, len, len,
               MREMAP_MAYMOVE | MREMAP_FIXED, new);
        return 0;
    }

and I claim that that mremap() right now ends up doing the whole 2MB
page table move one page at a time, but it *should* be doable as just
two PMD entry moves.

See?

             Linus
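
For reference, the "expand the move to whole PMDs" arithmetic described
above can be sketched in plain user-space C. This is only an
illustration of the address math, not kernel code: the struct, the
expand_to_pmd() helper and demo_email_example() are hypothetical names,
and PMD_SIZE is assumed to be 2MB (x86-64 with 4K pages). The sketch
does not check the "no mapping right below old or new" precondition,
which a real implementation would have to verify.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: x86-64 with 4K pages, where one PMD entry maps 2MB. */
#define PMD_SIZE (1ul << 21)
#define PMD_MASK (~(PMD_SIZE - 1))

/* Hypothetical description of a page-table move (not a kernel type). */
struct move_range {
	unsigned long old_addr;
	unsigned long new_addr;
	unsigned long len;
};

/*
 * Hypothetical helper: widen a move so that it starts and ends on PMD
 * boundaries. This only works when old and new have the same offset
 * inside their PMD. Returns 1 if the range was expanded, 0 otherwise.
 * It does NOT check that the extra area is free of other mappings.
 */
static int expand_to_pmd(struct move_range *r)
{
	unsigned long off = r->old_addr & ~PMD_MASK;
	unsigned long end;

	if ((r->new_addr & ~PMD_MASK) != off)
		return 0;

	/* Round the end of the old range up to the next PMD boundary. */
	end = (r->old_addr + r->len + PMD_SIZE - 1) & PMD_MASK;

	/* Round both start addresses down to a PMD boundary. */
	r->old_addr -= off;
	r->new_addr -= off;
	r->len = end - r->old_addr;
	return 1;
}

/* Runs the numbers quoted in the mail through the helper. */
static void demo_email_example(void)
{
	struct move_range r = { 0x1fff000ul, 0x1dff000ul, 0x201000ul };

	assert(expand_to_pmd(&r));
	printf("old = %#lx\nnew = %#lx\nlen = %#lx\n",
	       r.old_addr, r.new_addr, r.len);
}
```

Calling demo_email_example() from a main() should reproduce the
expanded old/new/len values quoted above. Feeding in the test-case
numbers (old at 129MB, new at 127MB, len of 2MB) gives a 4MB range,
i.e. exactly two PMD entries.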