From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6495BC54E58 for ; Tue, 12 Mar 2024 16:38:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EA59C8E0003; Tue, 12 Mar 2024 12:38:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E55E68D0017; Tue, 12 Mar 2024 12:38:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D1DB08E0003; Tue, 12 Mar 2024 12:38:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id C13228D0017 for ; Tue, 12 Mar 2024 12:38:39 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 636B7807B3 for ; Tue, 12 Mar 2024 16:38:39 +0000 (UTC) X-FDA: 81888945558.30.DBEF90B Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) by imf14.hostedemail.com (Postfix) with ESMTP id 6EDC9100017 for ; Tue, 12 Mar 2024 16:38:37 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b=jYe8NXlh; spf=none (imf14.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1710261517; a=rsa-sha256; cv=none; b=3PCWQt8s+/rHj59qwGfDX5lR5DCU6JdcZGU4WxhO6Kr8dn9JpjNaVkOeArgBTgHKp1f41i 2goXdKovAtoKi5z+4Qzhj3HKaw8TCn08hmstiAyhWu952L5mE1uxlIMBFIWvyuBNc+h+ZE ivk4NX5HTbeL74hT02PtssHtvTulA5M= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b=jYe8NXlh; spf=none (imf14.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1710261517; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=8LxRhGuwXJFpATs069eW/+reuCwEH54tH10y+l407N4=; b=ijLcU0tdS8JTbJdMJfw8m1D1bdXnC9dvS73+Yp9CVKrWspfNShjUIWVI9okP5lyhYDFte3 /Q/kRys+deBjI59YB+kBI/k+WIuh08L4nTXbKuLIQMM4qR9IgbayNGXuGSuShxfEpn8sx8 LdICoDXhXLfL92+5ChA31zNktvKD9CE= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=8LxRhGuwXJFpATs069eW/+reuCwEH54tH10y+l407N4=; b=jYe8NXlha4EYzdWWFia+M3+OgZ zcOr25Z8IhZ7NogxmHSvhJBS4mlncMVYqEA4ZTB0ctlMv62PpDh+A4O2uExc+OQyuSYVQLnxXZQFn vHmjzdp/5IwVV3oRbpUX5P9gMOmUfw0rFRIfJh6DofHPQ0Milb7vIGn/QehrzX3eTD2dmN+20PqCR x0d6t0Y8A9Eb1cIJok8pYcxtFfX63bOCfsyemI1wb4a0rkpqA0cVOUuwWX53m60MW2ElMvGow0+6X pgULWQrqUXt/XJ0LvA1AQqwDvqCw2ao2/VMK97VIZVIJJQTHnjPkC04Fq+4VhCezk/WwbWM7FD86p uJxSzQAQ==; Received: from willy by casper.infradead.org with local (Exim 4.97.1 #2 (Red Hat Linux)) id 1rk596-00000003RNz-1xJ1; Tue, 12 Mar 2024 16:38:32 +0000 Date: Tue, 12 Mar 2024 16:38:32 +0000 From: Matthew Wilcox To: Zi Yan Cc: linux-mm@kvack.org, Andrew Morton , Yang Shi , Huang Ying , "\"Kirill A . Shutemov\"" , Ryan Roberts , linux-kernel@vger.kernel.org Subject: Re: [PATCH v2] mm/migrate: put dest folio on deferred split list if source was there. Message-ID: References: <20240311195848.135067-1-zi.yan@sent.com> <74AF044A-A14A-4C66-A019-70F8F138A9AB@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 6EDC9100017 X-Stat-Signature: 557x58x1wnq9kpbz8oetdbx5skdki5w8 X-Rspam-User: X-HE-Tag: 1710261517-757489 X-HE-Meta: U2FsdGVkX18zwmfzSnQo1OgK4jzDfKb/w5O+Y8X0ZESV6uPKL6jdrz0XPGwHNvbcvgTKgid8D3mxS4Y0tHl1AqGfpEX86Kx2q6yfGPHov0Rc12MxeVlWGGUtQ9BiVHi8SD1108gmLzur9k6VY9YnmwJxfOQ676NMryG0lLb/nSXfhWiyeh+uVCg17R2yAzSJFxoapT+62j3GdYzzXsHw/gPpx3rYf0LNYg4NAmlH45RIosZG+b10b4hwruy0O8fdVfNZCDR1iYnVVPltBvimW/U7wy0YieA0wbL/e2I089IJS0BTS++zI/o4PrIlGEYUZf7IF8nSxiz27kt6cQxNY/R4oCBpzRsf/ddCw826qyvezwxzzpLLFPGVXkd3v5fIelNdUQtC6xoEc718PDCRcu8ir+fMM0nZWHpxsFrFLfa3+D0w7NMIKXuExyzcHWCWz5k4jZ9sKVeBI4IkLRYcW0G6OzdK6jsYziEehdFMIZZZyQ1+w3QNBLSK4iCLS1OCZ9HuslBNtwRbwfW8YrhKli/vfS1KgfZcQGglnh+lTIE/mbBbgIHH9UB4aw9441zC3L91NLMyiMapdcG3HAeonusoDzNzHbOBJv5aCvLcW3lN+QGvQWoXjGT4rcC8J7XUv6bJaMc53x6rNbrykke44WL7XnGRYAZJQ6q/x3B/UWTx2G9Je69dgP20i4vOnJZsNKfO0RvTq2D4pe6v9kM0pdJCkGw0/FA92ZLzuz99miGO2Los7SlaDL/tLG8Dw2MMcb6ZccqDs7RbjOLQcRAKmsi2RkovhoTWlAwtrRGs8cNfm/4iH8tGm6pLwtoY27mgEL4KFSInF6wNMiXSFYt38w== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Mar 12, 2024 at 11:51:13AM -0400, Zi Yan wrote: > On 12 Mar 2024, at 10:19, Matthew Wilcox wrote: > > > On Tue, Mar 12, 2024 at 10:13:16AM -0400, Zi Yan wrote: > >> On 11 Mar 2024, at 23:45, Matthew Wilcox wrote: > >>> Much more important: You're doing this with a positive refcount, which > >>> breaks the (undocumented) logic in deferred_split_scan() that a folio > >>> with a positive refcount will not be removed from the list. > >> > >> What is the issue here? I thought as long as the split_queue_lock is held, > >> it should be OK to manipulate the list. > > > > I just worked this out yesterday: > > https://lore.kernel.org/linux-mm/Ze9EFdFLXQEUVtKl@casper.infradead.org/ > > (the last chunk, starting with Ryan asking me "what about the first bug > > you found") > > Hmm, like you said a folio with a positive refcount will not be removed > from ds_queue->split_queue, it will have no chance going to the separate > list in deferred_list_scan() and list_del_init() will not corrupt > that list. You've misread it. Folios with a _zero_ refcount are not removed from the list in deferred_split_scan. Folios with a positive refcount are removed from the per-node or per-cgroup list _at which point there is an undocumented assumption_ that they will not be removed from the local list because they have a positive refcount. > So it should be safe. Or the issue is that before migration > adding a refcount, the folio is removed from ds_queue->split_queue > and put on the list in deferred_list_scan(), as a result, any manipulation > of folio->_deferred_list could corrupt the list. Basically, > !list_empty(&folio->_deferred_list) cannot tell if the folio is on > ds_queue->split_queue or another list. I am not sure about why "a positive > refcount" is related here. > > That makes me wonder whether ds_queue->split_queue_lock is also needed > for list_for_each_entry_safe() in deferred_split_scan(). Basically, > ds_queue->split_queue_lock protects folio->_deferred_list in addition to > ds_queue->split_queue. > > > > >>> Maximally important: Wer shouldn't be doing any of this! This folio is > >>> on the deferred split list. We shouldn't be migrating it as a single > >>> entity; we should be splitting it now that we're in a context where we > >>> can do the right thing and split it. Documentation/mm/transhuge.rst > >>> is clear that we don't split it straight away due to locking context. > >>> Splitting it on migration is clearly the right thing to do. > >>> > >>> If splitting fails, we should just fail the migration; splitting fails > >>> due to excess references, and if the source folio has excess references, > >>> then migration would fail too. > >> > >> You are suggesting: > >> 1. checking if the folio is on deferred split list or not > >> 2. if yes, split the folio > >> 3. if split fails, fail the migration as well. > >> > >> It sounds reasonable to me. The split folios should be migrated since > >> the before-split folio wants to be migrated. This split is not because > >> no new page cannot be allocated, thus the split folios should go > >> into ret_folios list instead of split_folios list. > > > > Yes, I'm happy for the split folios to be migrated. Bonus points if you > > want to figure out what order to split the folio to ;-) I don't think > > it's critical. > > > -- > Best Regards, > Yan, Zi