Date: Tue, 23 May 2023 20:45:45 -0700 (PDT)
From: Hugh Dickins
To: Alistair Popple
cc: Hugh Dickins, Andrew Morton, Mike Kravetz, Mike Rapoport,
    "Kirill A. Shutemov", Matthew Wilcox, David Hildenbrand,
    Suren Baghdasaryan, Qi Zheng, Yang Shi, Mel Gorman, Peter Xu,
    Peter Zijlstra, Will Deacon, Yu Zhao, Ralph Campbell, Ira Weiny,
    Steven Price, SeongJae Park, Naoya Horiguchi, Christophe Leroy,
    Zack Rusin, Jason Gunthorpe, Axel Rasmussen, Anshuman Khandual,
    Pasha Tatashin, Miaohe Lin, Minchan Kim, Christoph Hellwig, Song Liu,
    Thomas Hellstrom, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 24/31] mm/migrate_device: allow pte_offset_map_lock() to fail
In-Reply-To: <877csz943s.fsf@nvidia.com>
Message-ID: <838a5172-f7f2-43db-e990-d38b36b544a2@google.com>
References: <68a97fbe-5c1e-7ac6-72c-7b9c6290b370@google.com> <877csz943s.fsf@nvidia.com>

On Tue, 23 May 2023, Alistair Popple wrote:
> Hugh Dickins writes:
>
> > migrate_vma_collect_pmd(): remove the pmd_trans_unstable() handling after
> > splitting huge zero pmd, and the pmd_none() handling after successfully
> > splitting huge page: those are now managed inside pte_offset_map_lock(),
> > and by "goto again" when it fails.
> >
> > But the skip after unsuccessful split_huge_page() must stay: it avoids an
> > endless loop.  The skip when pmd_bad()?  Remove that: it will be treated
> > as a hole rather than a skip once cleared by pte_offset_map_lock(), but
> > with different timing that would be so anyway; and it's arguably best to
> > leave the pmd_bad() handling centralized there.
>
> So for a pmd_bad() the sequence would be:
>
> 1. pte_offset_map_lock() would return NULL and clear the PMD.
> 2. goto again marks the page as a migrating hole,
> 3. In migrate_vma_insert_page() a new PMD is created by pmd_alloc().
> 4. This leads to a new zero page getting mapped for the previously
>    pmd_bad() mapping.

Agreed.

> I'm not entirely sure what the pmd_bad() case is used for but is that
> ok? I understand that previously it was all a matter of timing, but I
> wouldn't rely on the previous code being correct in this regard either.

The pmd_bad() case is for when the pmd table got corrupted (overwritten,
cosmic rays, whatever), and that pmd entry is easily recognized as
nonsense: we try not to crash on it, but user data may have got lost.

My "timing" remark may not be accurate: I seem to be living in the past,
when we had a lot more "pmd_none_or_clear_bad()"s around than today - I
was thinking that any one of them could be racily changing the bad to none.
Though I suppose I am now making my timing remark accurate, by changing
the bad to none more often again.

Since data is liable to be lost anyway (unless the corrupted entry was
actually none before it got corrupted), it doesn't matter greatly what
we do with it (some would definitely prefer a crash, but traditionally
we don't): issue a "pmd bad" message and not get stuck in a loop is
the main thing.

> > migrate_vma_insert_page(): remove comment on the old pte_offset_map()
> > and old locking limitations; remove the pmd_trans_unstable() check and
> > just proceed to pte_offset_map_lock(), aborting when it fails (page has
> > now been charged to memcg, but that's so in other cases, and presumably
> > uncharged later).
>
> Correct, the non-migrating page will be freed later via put_page() which
> will uncharge the page.

Thanks for confirming, yes, it was more difficult once upon a time,
but nowadays just a matter of reaching the final put_page().

Hugh
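
For readers following the pmd_bad() sequence agreed above, here is a
minimal userspace sketch of it.  It is a toy model, not kernel code:
every toy_* name, the three-state pmd and the four-entry table are
assumptions made up for illustration, and no locking or real page
table layout is modelled.  It only shows why a bad entry is reported
once, ends up treated as a hole on the "goto again" pass, and cannot
loop forever.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are real kernel types. */
enum toy_pmd_state { TOY_PMD_NONE, TOY_PMD_TABLE, TOY_PMD_BAD };
enum toy_result { TOY_COLLECTED, TOY_HOLE };

struct toy_pmd {
	enum toy_pmd_state state;
	int ptes[4];		/* pretend page table, valid only for TABLE */
	int bad_reports;	/* how many "pmd bad" messages were issued */
};

/*
 * Toy version of the behaviour described in the thread: fail with NULL
 * when there is no page table, and for a bad entry report it and clear
 * it to none before failing.
 */
static int *toy_pte_offset_map_lock(struct toy_pmd *pmd)
{
	if (pmd->state == TOY_PMD_BAD) {
		pmd->bad_reports++;
		pmd->state = TOY_PMD_NONE;
		return NULL;
	}
	if (pmd->state == TOY_PMD_NONE)
		return NULL;
	return pmd->ptes;
}

/* Steps 1 and 2: a failed map retries, and the retry sees a hole. */
static enum toy_result toy_collect_pmd(struct toy_pmd *pmd, int *passes)
{
	*passes = 0;
again:
	(*passes)++;
	if (pmd->state == TOY_PMD_NONE)
		return TOY_HOLE;
	if (!toy_pte_offset_map_lock(pmd))
		goto again;
	return TOY_COLLECTED;
}

/* Steps 3 and 4: the insert path allocates a fresh table, then maps. */
static int toy_insert_page(struct toy_pmd *pmd)
{
	int *pte;

	if (pmd->state == TOY_PMD_NONE) {	/* stand-in for pmd_alloc() */
		memset(pmd->ptes, 0, sizeof(pmd->ptes));
		pmd->state = TOY_PMD_TABLE;
	}
	pte = toy_pte_offset_map_lock(pmd);
	if (!pte)
		return 0;	/* abort; the caller drops its page later */
	assert(pmd->state == TOY_PMD_TABLE);
	pte[0] = 1;		/* pretend the new page is now mapped */
	return 1;
}
```

Starting from a pmd in TOY_PMD_BAD, toy_collect_pmd() takes two passes
and returns TOY_HOLE with one report counted; toy_insert_page() then
succeeds on a freshly zeroed table.  The retry ends because each failed
map leaves the entry none, which the next pass returns on directly.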