From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 899DAC77B73 for ; Wed, 19 Apr 2023 04:37:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D1B0D8E0002; Wed, 19 Apr 2023 00:37:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CCAE68E0001; Wed, 19 Apr 2023 00:37:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BBA6A8E0002; Wed, 19 Apr 2023 00:37:50 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id AA29F8E0001 for ; Wed, 19 Apr 2023 00:37:50 -0400 (EDT) Received: from smtpin16.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 644A5140449 for ; Wed, 19 Apr 2023 04:37:50 +0000 (UTC) X-FDA: 80696882700.16.EE70319 Received: from mail-yw1-f172.google.com (mail-yw1-f172.google.com [209.85.128.172]) by imf06.hostedemail.com (Postfix) with ESMTP id 9B67F18000C for ; Wed, 19 Apr 2023 04:37:48 +0000 (UTC) Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=NMv2ixsd; spf=pass (imf06.hostedemail.com: domain of hughd@google.com designates 209.85.128.172 as permitted sender) smtp.mailfrom=hughd@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681879068; a=rsa-sha256; cv=none; b=sQDWGtWzGEsdzb1bPjpVqTWx919Dv8T6rz+ql7aNzIoe6RLW4LrB4gRNpWovR/etHl2zwo ef8TpwDQALfPo1t1ReUrNrcj7D2VVZOHoaxy2sTE85/cMDHhB9F4trQTRCokPTPE7wkxdY 12UCcrQOeEalFO1BZtV2WaNKpyepaoY= ARC-Authentication-Results: i=1; imf06.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=NMv2ixsd; spf=pass (imf06.hostedemail.com: domain of hughd@google.com designates 209.85.128.172 as permitted sender) smtp.mailfrom=hughd@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1681879068; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=pYXTmrYzjexfz2MYsEFttRvUhicjx4hCZtLdGvtQzsM=; b=WmrqMIaEaugt4uDK0f30aVipFJOXOyTFe0dLTBOtjhBFxuEgOOL2487Lg1mNhq3NN1CFLI P2c5y/1IdtGAsCgsY4SHgjn/5WcwTWjMdr1M5e6vImrEGzOIyrFRcsO7kjodyJL3FcfZ8c WC3iynMA8erEsP911Ap1eFeNqSvSpI0= Received: by mail-yw1-f172.google.com with SMTP id 00721157ae682-54f21cdfadbso434412327b3.7 for ; Tue, 18 Apr 2023 21:37:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1681879067; x=1684471067; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:from:to:cc:subject:date:message-id:reply-to; bh=pYXTmrYzjexfz2MYsEFttRvUhicjx4hCZtLdGvtQzsM=; b=NMv2ixsdeJ2a+1uurjusEhBS/ufWexNvFFga5/rJhOUoAgjMZdaah+/24pR9timYUR v7L0Z2K/ppuPRYJgoNMEHbWXR/W6MxWp7sHfbnlJZVRunnTw3NC6LV5l8awNm+dSOZu5 +dd14JRu4n4WGnbbaCkIWxK/KyQaDekD7oU7oKHBgwsx1eJh/feC5wR44B7zM7uBTyLH iYHKqJF3i1a0sx+9c4THEj8h2y6ET7sgvVC/JWjCf7kJj8Iv/r1lkoMguvT/7y3G9l+Z cuBXhHNuL57m+yq7+1Xn3aZ7I+DqEVJTqzAaUyiCfHPOeVngJRbDNtUvwCnPQOY+O2K0 P45g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681879067; x=1684471067; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=pYXTmrYzjexfz2MYsEFttRvUhicjx4hCZtLdGvtQzsM=; b=dM7NmNp862I2IUPiZltHlwtXb3UjTdSgMrcGwXz0jR//sdeVTMTPyoQqW4Ys0y0vOJ 6wNnh5jwVQC2aiVKxOoi+JzLw2o59sb7Etk4h/aLOmGWyFMMZMS/F1DiPleEApv9n1JR HqKV46a91j8+EDWU5X8wXdQUUw9gHa9CT73OreJ1ET2ju0PB4cgWO7BV3BN6bAs5aSqZ yHmiwlP1BYFaaxK9M5NelbI3OfYNXXh9/eBwmFuLlwYVoOWGI9S/dhLxntC1f7EEsqm9 bj2MeF16dBMSiU+aOiERFdzULttBP/phuugpKmIAJwMxMx/2QraR4zPgPGo8aNiHZZv+ RUXQ== X-Gm-Message-State: AAQBX9c3CUnQ58ucynMMwEdkjyi6JWM6V4h2c5Ni/NFxxRXAoiMfDiY9 2cUnS7ee+BkRfSAkOd1wHc/AUg== X-Google-Smtp-Source: AKy350aGHTz+Ltc4k7j3L7XZcD97vWaCIjBIDRGX3tdF0AU/AKX3ld5oQqd4/FYqRaJLBIYd2/JlSA== X-Received: by 2002:a81:a044:0:b0:54e:dcf2:705b with SMTP id x65-20020a81a044000000b0054edcf2705bmr2006239ywg.47.1681879067533; Tue, 18 Apr 2023 21:37:47 -0700 (PDT) Received: from ripple.attlocal.net (172-10-233-147.lightspeed.sntcca.sbcglobal.net. [172.10.233.147]) by smtp.gmail.com with ESMTPSA id cl22-20020a05690c0c1600b0054fbadd96c4sm4241038ywb.126.2023.04.18.21.37.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 18 Apr 2023 21:37:46 -0700 (PDT) Date: Tue, 18 Apr 2023 21:37:38 -0700 (PDT) From: Hugh Dickins X-X-Sender: hugh@ripple.attlocal.net To: Peter Xu cc: David Stevens , linux-mm@kvack.org, Hugh Dickins , Andrew Morton , Matthew Wilcox , "Kirill A . Shutemov" , Yang Shi , David Hildenbrand , Jiaqi Yan , linux-kernel@vger.kernel.org Subject: Re: [PATCH v6 4/4] mm/khugepaged: maintain page cache uptodate flag In-Reply-To: Message-ID: <2c32b136-f17c-8362-a9ec-4490b35be1c0@google.com> References: <20230404120117.2562166-1-stevensd@google.com> <20230404120117.2562166-5-stevensd@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Rspam-User: X-Rspamd-Queue-Id: 9B67F18000C X-Rspamd-Server: rspam01 X-Stat-Signature: xkf48szq9588gj95qcd7nrw4jsnpszwt X-HE-Tag: 1681879068-915420 X-HE-Meta: U2FsdGVkX19mxMlrWdVvylCX6eFMwgLpIlQihQp0UbW6zwlD3/RNURYZQx5p+VBXlj6tzpKiCTQQcsFWFz5ENMg5eWokfAM4zaqwP12XKAvPJLKeFj4tDDB5h/1Hb/elkGCxdvqKcRI0ctb3HQl+yRMIOl7KEwGwTVxw+BYXgusbvCJhSIqsJeTmhgR4J74NyVUvq2VWdSN5JFwMmRHe+ED3IYx9+2Mwt4WhxJCL53O7QWmh9kkqB+8W1ScyB6xrB29cwLFmE9+iAuWH3zP2OiczoNdpDzpF9ev0JcJxefO3mAFgk0UnXQrD9z7yzFBYYG2EySdYTj8cclPyW7YeRXiTJBBxgFHcHgghzZ32kkxeIA9a570BPG2QZWuY6xSwYf1zvjWADo5iiXFJy1T1A77k8+JnPuazzu3hnWqXIt1GISxXfiZUFrNfAK7euhwMinQDbDQKcaO7kIrC87ckKkt/7RHmNdXW/KoMxMmpxJoPfWJo0Thlu7bMMpzk6BhuRWptdZoGVvhwdav5X3gZ61gdkm0NOpwtWuk2gndxwvKY/3hk7xFt+SxiB1JsUu9xk01rmMrAY6cj8+ubYDPIJn1LhQ4+j4a4KAWhZWL7WdaFq0tvqRARNaZXAQ2TJWCsFVTM5jPaNEfTXWwgZ4zQ6xdTVHYWemNp+5fgbZ1mRVsSgMO2J6Oq0ljHf/sQEa2xJe/X1DG5K9bknUL7qb6dUGaHXodS32mUNEjjqQK73p/Wefbbshjw51LHj+jE2ZSvizTQqvR8sH9p1Evj1ZPJMTrRLY6KdxomWmaK6JhRSwsIpAGrz34Vg0YBoQFnkefhBOcLYDRTALTQCz+VlDG+oGrMSATN38eKD6+LbtKO0At0z6/5kkOC9SnwH+z5Gjab0ye3taw2lAFgzry86hYW5NNDsHDfhSFoZU+ApLLUi4mAyv9KCu3fFkTHYnjPVFP9ULF+zGPc8xe7fRAcWGi Atz6Gfzr aN6tD3hFCnKDsqzK0K/UB7MiuqeN7e+Kbu9VM490P7aWcKH5/03n0CGDqPMNAL+X39R9HgObLTgR1bK/y06XaYPVDk06E+nXSBc6hIhD3pT10DSJhm+ctWU/w6OHUL0d7CmGvPkX+xaxZn1rctKZckM2vbrqLLzkoL0M56afZoScA4ET1NeEKpgigUB1KeLFCs+asnY9hni7R6WCVRUlbxqyY8rhJa2XzmYt+cML5mVUKou+zzFYmAW8GngLjmhQylp1msyb3hywMKaza/7EvwBpAIiTfk6VDL6ZySUS7F8mdsBpMFaizieuqYB2scF4y74R/AXz5zqP+dWo2QrqcAAIFOQa/4QQ8bXKvH/vk1YNxZYsnH4RMvIhwbtEi36GQhjX9bsLofOgVJRQqv4ta65b/Pgu6OQhc9iJ9NqVyolsrNHmKFfEszUFG4g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, 4 Apr 2023, Peter Xu wrote: > On Tue, Apr 04, 2023 at 09:01:17PM +0900, David Stevens wrote: > > From: David Stevens > > > > Make sure that collapse_file doesn't interfere with checking the > > uptodate flag in the page cache by only inserting hpage into the page > > cache after it has been updated and marked uptodate. This is achieved by > > simply not replacing present pages with hpage when iterating over the > > target range. > > > > The present pages are already locked, so replacing them with the locked > > hpage before the collapse is finalized is unnecessary. However, it is > > necessary to stop freezing the present pages after validating them, > > since leaving long-term frozen pages in the page cache can lead to > > deadlocks. Simply checking the reference count is sufficient to ensure > > that there are no long-term references hanging around that would the > > collapse would break. Similar to hpage, there is no reason that the > > present pages actually need to be frozen in addition to being locked. > > > > This fixes a race where folio_seek_hole_data would mistake hpage for > > an fallocated but unwritten page. This race is visible to userspace via > > data temporarily disappearing from SEEK_DATA/SEEK_HOLE. This also fixes > > a similar race where pages could temporarily disappear from mincore. > > > > Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages") > > Signed-off-by: David Stevens ... > > Personally I don't see anything wrong with this change to resolve the dead > lock. E.g. fast gup race right before unmapping the pgtables seems fine, > since we'll just bail out with >3 refcounts (or fast-gup bails out by > checking pte changes). Either way looks fine here. > > So far it looks good to me, but that may not mean much per the history on > what I can overlook. It'll be always good to hear from Hugh and others. I'm uneasy about it, and haven't let it sink in for long enough: but haven't spotted anything wrong with it, nor experienced any trouble. I would have much preferred David to stick with the current scheme, and fix up seek_hole_data, and be less concerned with the mincore transients: this patch makes a significant change that is difficult to be sure of. I was dubious about the unfrozen "page_count(page) != 3" check (where another task can grab a reference an instant later), but perhaps it does serve a purpose, since we hold the page lock there: excludes concurrent shmem reads which grab but drop page lock before copying (though it's not clear that those do actually need excluding). I had thought shmem was peculiar in relying on page lock while writing, but turn out to be quite wrong about that: most filesystems rely on page lock while writing, though I'm not sure whether that's true of all (and it doesn't matter while collapse of non-shmem file is only permitted on read-only). We shall see. Hugh