From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40433C47422 for ; Fri, 26 Jan 2024 19:27:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 954216B0092; Fri, 26 Jan 2024 14:27:42 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9048C6B0093; Fri, 26 Jan 2024 14:27:42 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7CBB16B0095; Fri, 26 Jan 2024 14:27:42 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 66B4E6B0092 for ; Fri, 26 Jan 2024 14:27:42 -0500 (EST) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id CC599120837 for ; Fri, 26 Jan 2024 19:27:41 +0000 (UTC) X-FDA: 81722446722.29.0C8B5DB Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf04.hostedemail.com (Postfix) with ESMTP id D0AFF40022 for ; Fri, 26 Jan 2024 19:27:39 +0000 (UTC) Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=QOY3h8Ks; dmarc=pass (policy=none) header.from=kernel.org; spf=pass (imf04.hostedemail.com: domain of "SRS0=zAkC=JE=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" designates 139.178.84.217 as permitted sender) smtp.mailfrom="SRS0=zAkC=JE=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1706297260; h=from:from:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=GmyTultXQTywcPbLh8m1Ot7j7IMr3eH5v/d6uFHfDLY=; b=qxn/e+K0d38R4G9Rltqf2yFBDgKITOlX/6qf/tYqpTTe3vo1CTx8x3aVBvlzQtMK+H/B7M oQVY8/X0af/qJyZ08Uk6iHF25Q+d1bOu5TuA8QgptBLjau9irtaoyyNhA+/Uq8eVOX7JiU d+BtIEG2hcqpmy3XPP2cxlYrBWGy7K8= ARC-Authentication-Results: i=1; imf04.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=QOY3h8Ks; dmarc=pass (policy=none) header.from=kernel.org; spf=pass (imf04.hostedemail.com: domain of "SRS0=zAkC=JE=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" designates 139.178.84.217 as permitted sender) smtp.mailfrom="SRS0=zAkC=JE=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1706297260; a=rsa-sha256; cv=none; b=1Aq3Xyhh/jm0XvpdBeEziDHoq4apN/zQ4ak+6jMStCaMTw3E5ECo2aLRxVoQkamw3yZNGS SF26zYUBB+Q3m53l9cNboHCt1EoqO+2maHDNGGZ3B76b5YPUAyphm1yibRNKCvLKTFxR7n 8o39oRRTjrwyWxuefy2qfKKMFrC4LsQ= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id 4E8C0625B8; Fri, 26 Jan 2024 19:27:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AB10CC433C7; Fri, 26 Jan 2024 19:27:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1706297256; bh=Brcoq3u1u6FjSbFdR28vkViQhVtyTTDP8moYCJxVZO4=; h=Date:From:To:Cc:Subject:Reply-To:References:In-Reply-To:From; b=QOY3h8Ksikuxb5AaqYWa4TnNSjnKWy0i18CJfgZ4SKlh0lnb+lcE0C+Y+lvV1XMXS RgDfGgrmgasjOn4DHIvHh9Z9Sy1/ndVzfn/vDTrjE8q1agvHVCSmXfpxWkRoI17C+W rdDTN469f9JuTGdT4YHJw2+CQi74dk7CiRuN9Qe3bj06FrNk39Bnmackb5IkX/NNzI CG4qvUCjahlr68RSG7Xdx6/FZSc12cKHawDfkK58WeorOprUdQYIOy8HY6eF4k/bSW fvHfyOQNBHYaZv6XbhtTd8Cba+eR7k1VTYmlITkukES7n9/Ptw6UWLWZnuWv+h2j4D i5YieLcZg0YhQ== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 4A6A6CE12D9; Fri, 26 Jan 2024 11:27:36 -0800 (PST) Date: Fri, 26 Jan 2024 11:27:36 -0800 From: "Paul E. McKenney" To: Suren Baghdasaryan Cc: kernel test robot , Matthew Wilcox , oe-kbuild-all@lists.linux.dev, Linux Memory Management List , Andrew Morton Subject: Re: [linux-next:master 1589/1892] fs/proc/task_mmu.c:143:45: sparse: sparse: incorrect type in argument 1 (different address spaces) Message-ID: Reply-To: paulmck@kernel.org References: <202401251829.0m6Eo4LI-lkp@intel.com> <68dc8592-6a5d-4cbe-bb8c-e0f4b5517684@paulmck-laptop> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: D0AFF40022 X-Stat-Signature: 4szmccpyzidp43y41a64w4k4wx5choqo X-Rspam-User: X-HE-Tag: 1706297259-504410 X-HE-Meta: U2FsdGVkX18CZaWJVLyNJrKNm1EFLTxlrwdsckknYJZ4xJgl5Fn5IBxQrCpQ2FNocBYjxKDf3Etr4jxtmYrXE3VJU1I4ATKdfn15DUbhuv4+2aTRIayISJDYStnqTD6PVy29URyGCMCmwhnnTa4hAqEKsD31f2IvmaJBIc/H+/FW59yYwe4nVwIYofS+zKAQlwFLLmVJSw37FnIYELuyrpA6ivrNcCLbTtSjn3uZ39HH4R3yYS14e9EBH2F1hoWptT4C3lZ8ImQ0xplTvkBKDp2pG0Sg7l9Ed11WiiJJhngebORZjyY1BoTmhQbYLr2Yp/VkA2jVorVnOsFK3c10agiHg8f6nWm5XE1jtEzmkeQfOv0uk8osCvswGnVwsNZQ5LmBLRAnuY51Cd0iIzyV+ClOOirdKXXWUKcvuVxz7w7q+2yzLoC2td93V8UdUmq4jw6QBb8p1UBhbQwE182Eff41H83tlqMzlGWw4pRQ4Un0hj1ydUbFWupOY5BQZGCOgxw+o361LFBlAE9r+vjQEgx5U+nsc+u0Gtrlvg39j5LLmhG+eBI7H37Enw43zM9NBQbkHmSBAwjLHkLxVP3JeoaU0RyKkFfp4GraB4gQYFWb8QABQMMc3aMLm7W+78ZELJ9tnBR1+nm8ecMriOCxZCvDmyb4nQdnLRV4NSO16OUP2/ThkWbOkHMPcuOcxqZe0zGtyomytb76mvsoGHsAvyiv3K3JlHm2XAVAClfBRUUlDZLXPN8Qva196NTiLhH+ATQIsZt5TkpX4Vq1Fqul/ljU08XkAre+UzkdsN6Cr8FfmXz0Xgke2KdJ//EIYt1qN8epqujSP7PZulLwgz2zYn1kOx4qzdTHaCDYpKI9q0OCK36t+Z/lBVTvSD1b/RrH/4fcw0vPc0LwhtQP2GPUiDRrWOFJyD0HgwVOTvlQ2F8Di3juazf5iAcEZWjlYCZqm6XI3kmf+TUQ7OpzLae WtQF/Y71 O1Nb7V+RMdTgLPzwoejoyD6yiAjel42p/EwDo27nTeyFwptggWea/2bAg9lXDsOkEZEVwl7xQ0GwJYjeGR8A6ub03xU1EivTjH2wKopdrR5O6ieOx7hF6Qxw9PKE7pm3MPzeN9XvHBWVoa1iwVn/xsyMkN7pEYUF3LYfVJotvhCKeJ6zSvtGJxjC7CEj37Uybr0TFImja/D364+pnvlZ8hMIkHpe6KvcIXWyhWz5Mt4qGyfSBT06gLUhUkqAKAWAboykhRvz1MZmVRFJHva67x9oDiUyCYdg5BuscvxM6ljlROmdtcOk7j0NrXEy5JuKqfBblFng02u3FGHtqGVO5PpjwI7fBOv3ivEHUbFlUA4vG6K5KLLdz19JtoPHb1B00aQN1NWyVuLzHg2H3XtqwPBHAZWdtRNoBsIW5/twHmd3ZJjTlCorFg7TthOtycv/2efKBtvKYnQOwgklb19rFAkV9krq4AMfFk4+E+VKC0FzaAnHHaF0K1pR6CuwS8daKoqoisPoJiuQgkrB7SdRijvdLHkUKp+xrA8hVqa6Gpy0LnVzUZLJzTCosgK5qTp280Qq4fD7oQgGO6RxLcMUd3t9UsHShrZgvpTAx5SD6bDViAtez78HaU+WRvhNOH0kKn2J7QGG6bLXuHiVThyGmao+cL2kcVY/I7CSTfQRQoqRq6CrjwG4s0tOTrYT2VjMaonajx4vg9lcaNnuADoD8GERYXuXNB762r6i4WwL9W4Q0uEYHqq+t5gneIA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Jan 26, 2024 at 08:55:38AM -0800, Suren Baghdasaryan wrote: > On Fri, Jan 26, 2024 at 2:22 AM Paul E. McKenney wrote: > > > > On Thu, Jan 25, 2024 at 06:24:05PM -0800, Suren Baghdasaryan wrote: > > > On Thu, Jan 25, 2024 at 4:35 PM Paul E. McKenney wrote: > > > > > > > > On Thu, Jan 25, 2024 at 03:17:17PM -0800, Suren Baghdasaryan wrote: > > > > > On Thu, Jan 25, 2024 at 2:44 PM Paul E. McKenney wrote: > > > > > > > > > > > > On Thu, Jan 25, 2024 at 01:27:34PM -0800, Suren Baghdasaryan wrote: > > > > > > > On Thu, Jan 25, 2024 at 2:23 AM kernel test robot wrote: > > > > > > > > > > > > > > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master > > > > > > > > head: 01af33cc9894b4489fb68fa35c40e9fe85df63dc > > > > > > > > commit: 0c30c4cd953025979b7689e49844837f762303ec [1589/1892] mm/maps: read proc/pid/maps under RCU > > > > > > > > config: x86_64-randconfig-121-20240125 (https://download.01.org/0day-ci/archive/20240125/202401251829.0m6Eo4LI-lkp@intel.com/config) > > > > > > > > compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18) > > > > > > > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240125/202401251829.0m6Eo4LI-lkp@intel.com/reproduce) > > > > > > > > > > > > > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of > > > > > > > > the same patch/commit), kindly add following tags > > > > > > > > | Reported-by: kernel test robot > > > > > > > > | Closes: https://lore.kernel.org/oe-kbuild-all/202401251829.0m6Eo4LI-lkp@intel.com/ > > > > > > > > > > > > > > > > sparse warnings: (new ones prefixed by >>) > > > > > > > > >> fs/proc/task_mmu.c:143:45: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct file [noderef] __rcu **f @@ got struct file ** @@ > > > > > > > > > > > > > > Uh, this is a problem. > > > > > > > I missed that get_file_rcu() is used only with mm->exe_file and > > > > > > > vma->vm_file is not really RCU-safe. It's freed via a call to fput() > > > > > > > which schedules its freeing using schedule_delayed_work(..., 1) but I > > > > > > > don't think that constitutes RCU grace period. Paul, Matthew, could > > > > > > > you please confirm? In the meantime I'm going to ask Andrew to remove > > > > > > > my patchset from mm-unstable to be safe. > > > > > > > > > > > > Sadly, no, schedule_delayed_work() does not imply an RCU grace period. > > > > > > > > > > > > There is a queue_rcu_work() that schedules work after a grace period, > > > > > > which could be combined with a timer to get the delay. > > > > > > > > > > > > Another approach would be to use get_state_synchronize_rcu() before > > > > > > the schedule_delayed_work() in fput(), then do cond_synchronize_rcu() > > > > > > in delayed_fput(). This would require adding an unsigned long to > > > > > > struct file to keep track of which grace period a given struct file > > > > > > needed to wait for. > > > > > > > > > > > > Perhaps something like this: > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > > > void fput(struct file *file) > > > > > > { > > > > > > if (atomic_long_dec_and_test(&file->f_count)) { > > > > > > struct task_struct *task = current; > > > > > > > > > > > > if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) { > > > > > > init_task_work(&file->f_rcuhead, ____fput); > > > > > > if (!task_work_add(task, &file->f_rcuhead, TWA_RESUME)) > > > > > > return; > > > > > > /* > > > > > > * After this task has run exit_task_work(), > > > > > > * task_work_add() will fail. Fall through to delayed > > > > > > * fput to avoid leaking *file. > > > > > > */ > > > > > > } > > > > > > > > > > > > file->f_rcu_seq = get_state_synchronize_rcu(); > > > > > > if (llist_add(&file->f_llist, &delayed_fput_list)) > > > > > > schedule_delayed_work(&delayed_fput_work, 1); > > > > > > } > > > > > > } > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > > > And this: > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > > > static void delayed_fput(struct work_struct *unused) > > > > > > { > > > > > > struct llist_node *node = llist_del_all(&delayed_fput_list); > > > > > > struct file *f, *t; > > > > > > > > > > > > llist_for_each_entry_safe(f, t, node, f_llist) { > > > > > > cond_synchronize_rcu(f->f_rcu_seq); > > > > > > __fput(f); > > > > > > } > > > > > > } > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > > > Note that if you called fput() on a long sequence of struct file > > > > > > structures, the cond_synchronize_rcu() would be a near-noop almost all the > > > > > > time, actually blocking at most about every once per every few jiffies. > > > > > > After all, once a grace period has been waited for, it covers all of > > > > > > the struct file structures that were passed to fput() during a given > > > > > > RCU grace period. > > > > > > > > > > > > Still, it would add the occasional delay. And it would increase the > > > > > > size of struct file, though there are workarounds for that, if size > > > > > > is an issue. > > > > > > > > > > > > Thoughts? > > > > > > > > > > Thanks for the suggestion, Paul. I'm worried about this occasional > > > > > delay but otherwise this seems like a nice and simple approach. > > > > > > > > One potential saving grace is that the more heavily loaded the mechanism, > > > > the smaller a fraction of the cond_synchronize_rcu() calls will do > > > > a delay. > > > > > > > > > Do you > > > > > guys think that making *all* files RCU-safe with this approach is > > > > > warranted? For my particular case I need only vma->vm_file to be > > > > > RCU-safe but maybe there are other cases which would benefit from > > > > > this? > > > > > > > > To this, I can only give an unqualified "I don't know". :-( > > > > > > > > But if there is some condition that can be sampled on a per-file-structure > > > > basis, you could use that to invoke cond_synchronize_rcu() only when > > > > needed. Or send only those file structures that need the extra delay > > > > through queue_rcu_work(), perhaps by accumulating a list of them. > > > > > > Thanks Paul! You gave me enough food for thought. Let me see if I can > > > come up with something usable. > > > > Do we have the same problem with the task_work_add() path in fput()? > > Yes, I think so. It's called with TWA_RESUME, so the file will be > freed when the task returns to user mode, but that again does not > guarantee that RCU grace period is over. That one is going to be annoying, as there doesn't seem to be a nice place to hide grace-period latency. The fget() function and friends want an FD, so I am not seeing an obvious way of grabbing an explicit reference. Now the fget() code path is protected by RCU. But maybe that does not apply in the case where the file is closed but still mapped? Thanx, Paul > > > Cheers, > > > Suren. > > > > > > > > > > > Thanx, Paul > > > > > > > > > > > > fs/proc/task_mmu.c:143:45: sparse: expected struct file [noderef] __rcu **f > > > > > > > > fs/proc/task_mmu.c:143:45: sparse: got struct file ** > > > > > > > > fs/proc/task_mmu.c: note: in included file (through include/linux/rbtree.h, include/linux/mm_types.h, include/linux/mmzone.h, ...): > > > > > > > > include/linux/rcupdate.h:781:9: sparse: sparse: context imbalance in 'get_vma_snapshot' - unexpected unlock > > > > > > > > fs/proc/task_mmu.c:264:22: sparse: sparse: context imbalance in 'm_start' - different lock contexts for basic block > > > > > > > > include/linux/rcupdate.h:781:9: sparse: sparse: context imbalance in 'm_stop' - unexpected unlock > > > > > > > > include/linux/rcupdate.h:781:9: sparse: sparse: context imbalance in 'smaps_pte_range' - unexpected unlock > > > > > > > > include/linux/rcupdate.h:781:9: sparse: sparse: context imbalance in 'clear_refs_pte_range' - unexpected unlock > > > > > > > > include/linux/rcupdate.h:781:9: sparse: sparse: context imbalance in 'pagemap_pmd_range' - unexpected unlock > > > > > > > > include/linux/rcupdate.h:781:9: sparse: sparse: context imbalance in 'pagemap_scan_pmd_entry' - unexpected unlock > > > > > > > > fs/proc/task_mmu.c: note: in included file (through arch/x86/include/asm/uaccess.h, include/linux/uaccess.h, include/linux/sched/task.h, ...): > > > > > > > > arch/x86/include/asm/uaccess_64.h:88:24: sparse: sparse: cast removes address space '__user' of expression > > > > > > > > arch/x86/include/asm/uaccess_64.h:88:24: sparse: sparse: cast removes address space '__user' of expression > > > > > > > > > > > > > > > > vim +143 fs/proc/task_mmu.c > > > > > > > > > > > > > > > > 132 > > > > > > > > 133 /* > > > > > > > > 134 * Take VMA snapshot and pin vm_file and anon_name as they are used by > > > > > > > > 135 * show_map_vma. > > > > > > > > 136 */ > > > > > > > > 137 static int get_vma_snapshot(struct proc_maps_private *priv, struct vm_area_struct *vma) > > > > > > > > 138 { > > > > > > > > 139 struct vm_area_struct *copy = &priv->vma_copy; > > > > > > > > 140 int ret = -EAGAIN; > > > > > > > > 141 > > > > > > > > 142 memcpy(copy, vma, sizeof(*vma)); > > > > > > > > > 143 if (copy->vm_file && !get_file_rcu(©->vm_file)) > > > > > > > > 144 goto out; > > > > > > > > 145 > > > > > > > > 146 if (!anon_vma_name_get_if_valid(copy)) > > > > > > > > 147 goto put_file; > > > > > > > > 148 > > > > > > > > 149 if (priv->mm_wr_seq == mmap_write_seq_read(priv->mm)) > > > > > > > > 150 return 0; > > > > > > > > 151 > > > > > > > > 152 /* Address space got modified, vma might be stale. Wait and retry. */ > > > > > > > > 153 rcu_read_unlock(); > > > > > > > > 154 ret = mmap_read_lock_killable(priv->mm); > > > > > > > > 155 mmap_write_seq_record(priv->mm, &priv->mm_wr_seq); > > > > > > > > 156 mmap_read_unlock(priv->mm); > > > > > > > > 157 rcu_read_lock(); > > > > > > > > 158 > > > > > > > > 159 if (!ret) > > > > > > > > 160 ret = -EAGAIN; /* no other errors, ok to retry */ > > > > > > > > 161 > > > > > > > > 162 anon_vma_name_put_if_valid(copy); > > > > > > > > 163 put_file: > > > > > > > > 164 if (copy->vm_file) > > > > > > > > 165 fput(copy->vm_file); > > > > > > > > 166 out: > > > > > > > > 167 return ret; > > > > > > > > 168 } > > > > > > > > 169 > > > > > > > > > > > > > > > > -- > > > > > > > > 0-DAY CI Kernel Test Service > > > > > > > > https://github.com/intel/lkp-tests/wiki