From: Suren Baghdasaryan <surenb@google.com>
Date: Sat, 17 May 2025 12:02:37 -0700
Subject: Re: BUG: unable to handle page fault for address
To: David Wang <00107082@163.com>
Cc: kent.overstreet@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20250516131246.6244-1-00107082@163.com> <6646d582.18f8.196dd0d5071.Coremail.00107082@163.com> <233aab47.38f2.196df28812a.Coremail.00107082@163.com> <5a1f5612.363e.196df64bd1f.Coremail.00107082@163.com>
In-Reply-To: <5a1f5612.363e.196df64bd1f.Coremail.00107082@163.com>

On Sat, May 17, 2025 at 10:57 AM David Wang <00107082@163.com> wrote:
>
>
>
> At 2025-05-18 01:29:35, "Suren Baghdasaryan" wrote:
> >On Sat, May 17, 2025 at 9:51 AM David Wang <00107082@163.com> wrote:
> >>
> >>
> >> At 2025-05-18 00:39:30, "Suren Baghdasaryan" wrote:
> >> >On Sat, May 17, 2025 at 12:02 AM David Wang <00107082@163.com> wrote:
> >> >>
> >> >>
> >> >> At 2025-05-17 08:11:24, "Suren Baghdasaryan" wrote:
> >> >> >On Fri, May 16, 2025 at 10:03 AM Suren Baghdasaryan wrote:
> >> >> >>
> >> >> >> Hi David,
> >> >> >>
> >> >> >> On Fri, May 16, 2025 at 6:13 AM David Wang <00107082@163.com> wrote:
> >> >> >> >
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > I caught a page fault when I was changing my nvidia driver:
> >> >> >> > (This happens randomly, I can reproduce it with about 1/3 probability)
> >> >> >> >
> >> >> >> > [Fri May 16 12:05:41 2025] BUG: unable to handle page fault for address: ffff9d28984c3000
> >> >> >> > [Fri May 16 12:05:41 2025] #PF: supervisor read access in kernel mode
> >> >> >> > [Fri May 16 12:05:41 2025] #PF: error_code(0x0000) - not-present page
> >> >> >> > ...
> >> >> >> > [Fri May 16 12:05:41 2025] RIP: 0010:release_module_tags+0x103/0x1b0
> >> >> >> > ...
> >> >> >> > [Fri May 16 12:05:41 2025] Call Trace:
> >> >> >> > [Fri May 16 12:05:41 2025]
> >> >> >> > [Fri May 16 12:05:41 2025] codetag_unload_module+0x135/0x160
> >> >> >> > [Fri May 16 12:05:41 2025] free_module+0x19/0x1a0
> >> >> >> > ...
> >> >> >> > (full kernel logs are pasted at the end.)
> >> >> >> >
> >> >> >> > Using an image with DEBUG_INFO, the call trace decodes as:
> >> >> >> >
> >> >> >> > RIP: 0010:release_module_tags (./include/linux/alloc_tag.h:134 lib/alloc_tag.c:352 lib/alloc_tag.c:573)
> >> >> >> > [Fri May 16 12:05:41 2025] codetag_unload_module (lib/codetag.c:355)
> >> >> >> > [Fri May 16 12:05:41 2025] free_module (kernel/module/main.c:1305)
> >> >> >> > [Fri May 16 12:05:41 2025] __do_sys_delete_module (kernel/module/main.c:795)
> >> >> >> >
> >> >> >> > The offending lines in my codebase:
> >> >> >> > 126 static inline struct alloc_tag_counters alloc_tag_read(struct alloc_tag *tag)
> >> >> >> > 127 {
> >> >> >> > ...
> >> >> >> > 131
> >> >> >> > 132         for_each_possible_cpu(cpu) {
> >> >> >> > 133                 counter = per_cpu_ptr(tag->counters, cpu);
> >> >> >> > >>>> 134            v.bytes += counter->bytes;   <--------------here
> >> >> >> > 135                 v.calls += counter->calls;
> >> >> >> >
> >> >> >> >
> >> >> >> > Nvidia drivers are out-of-tree... there could be some strange behavior in them that causes this.. but,
> >> >> >> > when I checked the code, I got concerned about the lifecycle of tag->counters.
> >> >> >> > Based on the following definition:
> >> >> >> > #define DEFINE_ALLOC_TAG(_alloc_tag)                                            \
> >> >> >> >         static DEFINE_PER_CPU(struct alloc_tag_counters, _alloc_tag_cntr);      \
> >> >> >> >         static struct alloc_tag _alloc_tag __used __aligned(8)                  \
> >> >> >> >         __section(ALLOC_TAG_SECTION_NAME) = {                                   \
> >> >> >> >                 .ct = CODE_TAG_INIT,                                            \
> >> >> >> >                 .counters = &_alloc_tag_cntr };
> >> >> >> >
> >> >> >> > _alloc_tag_cntr is the data referenced by tag->counters, but they are in different sections,
> >> >> >> > and alloc_tag only prepares storage for section ALLOC_TAG_SECTION_NAME.
> >> >> >> > right?
> >> >> >> > Then what happens to those ".data..percpu" sections when the module is unloaded?
> >> >> >> > Is it safe to keep using those ".data..percpu" sections after the module is unloaded,
> >> >> >> > or even while the module is unloading?
> >> >> >>
> >> >> >> Yes, I think you are right, free_module() calls percpu_modfree() which
> >> >> >> would free the per-cpu memory allocated for the module. Before
> >> >> >> 0db6f8d7820a ("alloc_tag: load module tags into separate contiguous
> >> >> >> memory") we would not unload the module if there were tags which were
> >> >> >> still in use. After that change we load module tags into separate
> >> >> >> memory, so I expected this to work but due to this external reference
> >> >> >> it indeed should lead to UAF.
> >> >> >> I think the simplest way to fix this would be to bypass
> >> >> >> percpu_modfree() inside free_module() when there are module tags still
> >> >> >> referenced, store mod->percpu inside alloc_tag_module_section and free
> >> >> >> it inside clean_unused_module_areas_locked() once we know the counters
> >> >> >> are not used anymore. I'll take a stab at it and will send a patch for
> >> >> >> testing today.
> >> >> >
> >> >> >Ok, I went with another implementation, instead dynamically allocating
> >> >> >percpu memory for modules at the module load time. This has another
> >> >> >advantage of not needing extra PERCPU_MODULE_RESERVE currently
> >> >> >required for memory allocation tagging to work.
> >> >> >David, the patch is posted at [1]. Please give it a try and let me
> >> >> >know if the fix works for you.
> >> >> >Thanks,
> >> >> >Suren.
> >> >> >
> >> >> >[1] https://lore.kernel.org/all/20250517000739.5930-1-surenb@google.com/
> >> >> >
> >> >> >
> >> >> >> Thanks,
> >> >> >> Suren.
> >> >> >>
> >> >>
> >> >> Hi, the patch does fix my issue.
> >> >> I now have another, similar concern about the module's RO data.
> >> >> The codetag is defined as
> >> >> struct codetag {
> >> >>         unsigned int flags; /* used in later patches */
> >> >>         unsigned int lineno;
> >> >>         const char *modname;
> >> >>         const char *function;
> >> >>         const char *filename;
> >> >> } __aligned(8);
> >> >>
> >> >> Those modname/function/filename would refer to the RO data section, right?
> >> >> When the module is unloaded, its RO data section would be released at some point.
> >> >> My question is: is it safe to use RO data during module unload?
> >> >> I ask because these lines seem to access that data:
> >> >>
> >> >> +                       pr_info("%s:%u module %s func:%s has %llu allocated at module unload\n",
> >> >> +                               tag->ct.filename, tag->ct.lineno, tag->ct.modname,
> >> >> +                               tag->ct.function, counter.bytes);
> >> >
> >> >These lines are called from release_module_tags() using this call chain:
> >> >
> >> >delete_module()
> >> >  free_module()
> >> >    codetag_unload_module()
> >> >      release_module_tags()
> >> >
> >> >and codetag_unload_module() is called at the very beginning of
> >> >free_module(), when no other module memory has been freed yet. So,
> >> >from what I understand, this should be safe.
> >> >After we unload the module these pointers inside the tags will be
> >> >dangling but we should not be using them anymore since we do not
> >> >report unloaded modules. Do you see some usage that I missed?
> >>
> >> Why is data..percpu. different? The page fault I caught when reinstalling the nvidia drivers was also
> >> raised from release_module_tags().
> >>
> >> Is the data..percpu. section released earlier?
> >
> >No but counters are different because the allocations that still
> >reference these tags from unloaded modules will be decrementing them
> >when they are freed. That's where UAF is coming from. So, the counters
> >might be accessed after the module is unloaded, other fields should
> >not.
> >
>
> I do notice there are places where counters are referenced "after" free_module, but the logs I attached
> happened "during" free_module():
>
> [Fri May 16 12:05:41 2025] BUG: unable to handle page fault for address: ffff9d28984c3000
> [Fri May 16 12:05:41 2025] #PF: supervisor read access in kernel mode
> [Fri May 16 12:05:41 2025] #PF: error_code(0x0000) - not-present page
> ...
> [Fri May 16 12:05:41 2025] RIP: 0010:release_module_tags+0x103/0x1b0
> ...
> [Fri May 16 12:05:41 2025] Call Trace:
> [Fri May 16 12:05:41 2025]
> [Fri May 16 12:05:41 2025] codetag_unload_module+0x135/0x160
> [Fri May 16 12:05:41 2025] free_module+0x19/0x1a0
>
> The call chain is the same as you mentioned above.

Is this failure happening before or after my fix? With my fix, percpu
data should not be freed at all if tags are still used. Please
clarify.

> This part confuses me a lot. The log indicates that during free_module, the ..data..percpu access failed;
> I doubt those sections would be released that quickly.
>
> The only guess left that I find reasonable is that ..data_percpu was not paged in at all, probably because nothing had accessed it,
> and when the section is accessed during free_module, somehow the access is refused. Just guessing.....
>
> Or, am I missing something here?
>
>
> >>
> >> Thanks
> >> David
> >>
> >>
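
For readers following the thread, the lifetime rule stated in the reply above (counters must not be freed while allocations still reference the module's tags) can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the kernel patch: NR_CPUS_MODEL, struct tag_model, struct counters_model and the tag_*() helpers are hypothetical names, and a heap array stands in for real per-CPU memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 4 /* hypothetical CPU count for this model */

/* Mirrors the bytes/calls pair summed in the quoted alloc_tag_read(). */
struct counters_model {
	uint64_t bytes;
	uint64_t calls;
};

/* Hypothetical stand-in for a tag: counters live outside the "module". */
struct tag_model {
	const char *function;
	struct counters_model *counters; /* one slot per modelled CPU */
};

/* "Module load": allocate the counters dynamically. */
bool tag_load(struct tag_model *t, const char *function)
{
	t->function = function;
	t->counters = calloc(NR_CPUS_MODEL, sizeof(*t->counters));
	return t->counters != NULL;
}

/* Sum every CPU slot, in the same way as the quoted loop. */
struct counters_model tag_read(const struct tag_model *t)
{
	struct counters_model v = { 0, 0 };

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		v.bytes += t->counters[cpu].bytes;
		v.calls += t->counters[cpu].calls;
	}
	return v;
}

/* Account an allocation on the given CPU. */
void tag_add(struct tag_model *t, int cpu, uint64_t bytes)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	t->counters[cpu].bytes += bytes;
	t->counters[cpu].calls++;
}

/*
 * Account a free, possibly on a different CPU than the allocation.
 * A single slot may wrap around; only the sum over all slots matters.
 */
void tag_sub(struct tag_model *t, int cpu, uint64_t bytes)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	t->counters[cpu].bytes -= bytes;
	t->counters[cpu].calls--;
}

/*
 * "Module unload": free the counters only if nothing is outstanding.
 * Returns true when the counters were released.
 */
bool tag_unload(struct tag_model *t)
{
	if (tag_read(t).bytes != 0)
		return false; /* live allocations still point at this tag */
	free(t->counters);
	t->counters = NULL;
	return true;
}
```

In this model, an unload attempted while 128 bytes are still accounted leaves the counters in place, so a later tag_sub() from another CPU touches valid memory; once the sum drops to zero, tag_unload() releases them.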