From: Jinliang Zheng
To: mhocko@suse.com
Cc: akpm@linux-foundation.org, alexjlzheng@gmail.com, alexjlzheng@tencent.com,
    axboe@kernel.dk, brauner@kernel.org, ebiederm@xmission.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, mjguzik@gmail.com,
    oleg@redhat.com, tandersen@netflix.com, willy@infradead.org
Subject: Re: [PATCH v2] mm: optimize the redundant loop of mm_update_next_owner()
Date: Wed, 26 Jun 2024 14:43:59 +0800
Message-ID: <20240626064359.79119-1-alexjlzheng@tencent.com>

On Fri, 21 Jun 2024 10:50:10 +0200, Michal Hocko wrote:
> On Thu 20-06-24 19:30:19, Oleg Nesterov wrote:
> > Can't review, I forgot everything about mm_update_next_owner().
> > So I am sorry for the noise I am going to add, feel free to ignore.
> > Just in case, I see nothing wrong in this patch.
> >
> > On 06/20, alexjlzheng@gmail.com wrote:
> > >
> > > When mm_update_next_owner() is racing with swapoff (try_to_unuse()) or /proc or
> > > ptrace or page migration (get_task_mm()), it is impossible to find an
> > > appropriate task_struct in the loop whose mm_struct is the same as the target
> > > mm_struct.
> > >
> > > If the above race condition is combined with the stress-ng-zombie and
> > > stress-ng-dup tests, such a long loop can easily cause a Hard Lockup in
> > > write_lock_irq() for tasklist_lock.
> > >
> > > Recognize this situation in advance and exit early.
> >
> > But this patch won't help if (say) ptrace_access_vm() sleeps while
> > for_each_process() tries to find another owner, right?
> >
> > > @@ -484,6 +484,8 @@ void mm_update_next_owner(struct mm_struct *mm)
> > >  	 * Search through everything else, we should not get here often.
> > >  	 */
> > >  	for_each_process(g) {
> > > +		if (atomic_read(&mm->mm_users) <= 1)
> > > +			break;
> >
> > I think this deserves a comment to explain that this is optimization
> > for the case we race with the pending mmput(). mm_update_next_owner()
> > checks mm_users at the start.
> >
> > And. Can we drop tasklist and use rcu_read_lock() before for_each_process?
> > Yes, this will probably need more changes even if possible...
> >
> >
> > Or even better. Can't we finally kill mm_update_next_owner() and turn the
> > ugly mm->owner into mm->mem_cgroup ?
>
> Yes, dropping the mm->owner should be a way to go. Replacing that by
> mem_cgroup sounds like an improvement. I have a vague recollection that

Sorry for the late reply.

Replacing mm->owner with mem_cgroup may be a good idea, and using an RCU lock
looks good, too. But until those optimizations are implemented, I recommend
taking this patch to alleviate the problem. Both [PATCH] and [PATCH v2] are
acceptable; they differ only in the commit log.

Thanks for your reply. :)
Jinliang Zheng

> this has some traps on the way. E.g. tasks sharing the mm but living in
> different cgroups. Things have changed since the last time I've checked
> and for example memcg charge migration on task move will be deprecated
> soon so chances are that there are less roadblocks on the way.
> --
> Michal Hocko
> SUSE Labs
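
For illustration only, the early-exit check in the quoted hunk can be modelled
in userspace roughly as below. This is a sketch, not kernel code: mm_model,
task_model and scan_for_owner() are hypothetical stand-ins for mm_struct,
task_struct and the for_each_process() loop, and no locking is modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's mm_struct. */
struct mm_model {
	atomic_int mm_users;	/* number of users currently holding the mm */
};

/* Hypothetical userspace stand-in for task_struct. */
struct task_model {
	struct mm_model *mm;	/* NULL models a task that does not use this mm */
};

/*
 * Scan tasks[] for an entry whose mm matches the target, mirroring the shape
 * of the for_each_process() loop in the quoted hunk.  *visited reports how
 * many entries were examined.  Returns the matching task, or NULL if there
 * is no match or the scan stopped early.
 */
struct task_model *scan_for_owner(struct mm_model *mm,
				  struct task_model *tasks, size_t n,
				  size_t *visited)
{
	*visited = 0;
	for (size_t i = 0; i < n; i++) {
		/*
		 * Early exit as in the patch: once at most one user is left,
		 * this model assumes no other task can still share the mm,
		 * so walking the rest of the list is wasted work.
		 */
		if (atomic_load(&mm->mm_users) <= 1)
			break;
		(*visited)++;
		if (tasks[i].mm == mm)
			return &tasks[i];
	}
	return NULL;
}
```

In this model, a scan that starts with mm_users already at 1 touches no list
entries at all, which is the behaviour the patch relies on to shorten the time
spent under tasklist_lock.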