From: Vlastimil Babka
Date: Fri, 11 Feb 2022 19:49:11 +0100
To: Hugh Dickins, Andrew Morton
Cc: Michal Hocko, "Kirill A. Shutemov", Matthew Wilcox, David Hildenbrand, Alistair Popple, Johannes Weiner, Rik van Riel, Suren Baghdasaryan, Yu Zhao, Greg Thelen, Shakeel Butt, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 11/13] mm/munlock: page migration needs mlock pagevec drained
In-Reply-To: <90c8962-d188-8687-dc70-628293316343@google.com>
References: <8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com> <90c8962-d188-8687-dc70-628293316343@google.com>

On 2/6/22 22:49, Hugh Dickins wrote:
> Page migration of a VM_LOCKED page tends to fail, because when the old
> page is unmapped, it is put on the mlock pagevec with raised refcount,
> which then fails the freeze.
>
> At first I thought this would be fixed by a local mlock_page_drain() at
> the upper rmap_walk() level - which would have nicely batched all the
> munlocks of that page; but tests show that the task can too easily move
> to another cpu, leaving pagevec residue behind which fails the migration.
>
> So try_to_migrate_one() drain the local pagevec after page_remove_rmap()
> from a VM_LOCKED vma; and do the same in try_to_unmap_one(), whose
> TTU_IGNORE_MLOCK users would want the same treatment; and do the same
> in remove_migration_pte() - not important when successfully inserting
> a new page, but necessary when hoping to retry after failure.
>
> Any new pagevec runs the risk of adding a new way of stranding, and we
> might discover other corners where mlock_page_drain() or lru_add_drain()
> would now help. If the mlock pagevec raises doubts, we can easily add a
> sysctl to tune its length to 1, which reverts to synchronous operation.

Not a fan of adding new sysctls like those as that just pushes the failure
of kernel devs to poor admins :)

The old pagevec usage deleted by patch 1 was limited to the naturally larger
munlock_vma_pages_range() operation. The new per-cpu based one is more
general, which obviously has its advantages, but then it might bring new
corner cases.

So if this turns out to be a big problem, I would rather go back to the
limited scenario pagevec than a sysctl?

> Signed-off-by: Hugh Dickins

Acked-by: Vlastimil Babka

> ---
>  mm/migrate.c | 2 ++
>  mm/rmap.c    | 4 ++++
>  2 files changed, 6 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f4bcf1541b62..e7d0b68d5dcb 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -251,6 +251,8 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
>  				page_add_file_rmap(new, vma, false);
>  			set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
>  		}
> +		if (vma->vm_flags & VM_LOCKED)
> +			mlock_page_drain(smp_processor_id());
>
>  		/* No need to invalidate - it was non-present before */
>  		update_mmu_cache(vma, pvmw.address, pvmw.pte);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 5442a5c97a85..714bfdc72c7b 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1656,6 +1656,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>  		 * See Documentation/vm/mmu_notifier.rst
>  		 */
>  		page_remove_rmap(subpage, vma, PageHuge(page));
> +		if (vma->vm_flags & VM_LOCKED)
> +			mlock_page_drain(smp_processor_id());
>  		put_page(page);
>  	}
>
> @@ -1930,6 +1932,8 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>  		 * See Documentation/vm/mmu_notifier.rst
>  		 */
>  		page_remove_rmap(subpage, vma, PageHuge(page));
> +		if (vma->vm_flags & VM_LOCKED)
> +			mlock_page_drain(smp_processor_id());
>  		put_page(page);
>  	}
>
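
For readers following the thread, the failure mode in the commit message can be illustrated with a small userspace sketch in C. This is a toy model only, not kernel code: every `toy_*` name, the `limit` field, and the single-threaded setup are assumptions made for illustration. It shows why a page parked in a batch with a raised refcount fails a refcount freeze, why draining right after the unmap fixes that, and why a batch length of 1 behaves synchronously.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only a reference count (hypothetical). */
struct toy_page {
	int refcount;
};

#define TOY_BATCH_MAX 15

/* Toy stand-in for a per-CPU mlock pagevec (hypothetical). */
struct toy_batch {
	size_t nr;
	size_t limit;	/* drain once nr reaches this; 1 means synchronous */
	struct toy_page *pages[TOY_BATCH_MAX];
};

/* Drop the references the batch holds, as a drain would. */
static void toy_batch_drain(struct toy_batch *b)
{
	for (size_t i = 0; i < b->nr; i++)
		b->pages[i]->refcount--;
	b->nr = 0;
}

/* Queue a page, pinning it with an extra reference until the next drain. */
static void toy_batch_add(struct toy_batch *b, struct toy_page *p)
{
	assert(b->nr < TOY_BATCH_MAX);
	p->refcount++;
	b->pages[b->nr++] = p;
	if (b->nr >= b->limit)
		toy_batch_drain(b);
}

/* Freeze succeeds only when the count is exactly what the caller expects. */
static bool toy_ref_freeze(struct toy_page *p, int expected)
{
	if (p->refcount != expected)
		return false;
	p->refcount = 0;
	return true;
}

/*
 * Model one migration attempt: the unmap queues the page on the batch,
 * then migration tries to freeze the refcount at its own single reference.
 */
static bool toy_try_migrate(size_t limit, bool drain_after_unmap)
{
	struct toy_page page = { .refcount = 1 };
	struct toy_batch batch = { .nr = 0, .limit = limit };

	toy_batch_add(&batch, &page);
	if (drain_after_unmap)
		toy_batch_drain(&batch);
	return toy_ref_freeze(&page, 1);
}
```

In this model, `toy_try_migrate(15, false)` fails because the batch still pins the page, while `toy_try_migrate(15, true)` and `toy_try_migrate(1, false)` both succeed. The sketch deliberately ignores the cross-CPU case raised in the commit message, where residue sits in another CPU's batch and a local drain cannot reach it.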