From: Pasha Tatashin
Date: Wed, 20 Nov 2024 11:40:15 -0500
Subject: Re: [RFCv1 0/6] Page Detective
To: Andi Kleen
References: <20241116175922.3265872-1-pasha.tatashin@soleen.com> <87wmgxvs81.fsf@linux.intel.com>
In-Reply-To: <87wmgxvs81.fsf@linux.intel.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org,
 akpm@linux-foundation.org, corbet@lwn.net, derek.kiernan@amd.com, dragan.cvetic@amd.com,
 arnd@arndb.de, gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk, brauner@kernel.org,
 jack@suse.cz, tj@kernel.org, hannes@cmpxchg.org, mhocko@kernel.org,
 roman.gushchin@linux.dev, shakeel.butt@linux.dev, muchun.song@linux.dev,
 Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, vbabka@suse.cz, jannh@google.com,
 shuah@kernel.org, vegard.nossum@oracle.com, vattunuru@marvell.com, schalla@marvell.com,
 david@redhat.com,
 willy@infradead.org, osalvador@suse.de, usama.anjum@collabora.com, andrii@kernel.org,
 ryan.roberts@arm.com, peterx@redhat.com, oleg@redhat.com

On Wed, Nov 20, 2024 at 10:29 AM Andi Kleen wrote:
>
> Pasha Tatashin writes:
>
> > Page Detective is a new kernel debugging tool that provides detailed
> > information about the usage and mapping of physical memory pages.
> >
> > It is often known that a particular page is corrupted, but it is hard to
> > extract more information about such a page from a live system. Examples
> > are:
> >
> > - Checksum failure during live migration
> > - Filesystem journal failure
> > - dump_page warnings on the console log
> > - Unexpected segfaults
> >
> > Page Detective helps to extract more information from the kernel, so it
> > can be used by developers to root cause the associated problem.
> >
> > It operates through the Linux debugfs interface, with two files: "virt"
> > and "phys".
> >
> > The "virt" file takes a virtual address and PID and outputs information
> > about the corresponding page.
> >
> > The "phys" file takes a physical address and outputs information about
> > that page.
> >
> > The output is presented via kernel log messages (can be accessed with
> > dmesg), and includes information such as the page's reference count,
> > mapping, flags, and memory cgroup. It also shows whether the page is
> > mapped in the kernel page table, and if so, how many times.
>
> A lot of all that is already covered in /proc/kpage{flags,cgroup,count}
> Also we already have /proc/pid/pagemap to resolve virtual addresses.
>
> At a minimum you need to discuss why these existing mechanisms are not
> suitable for you and how your new one is better.

Hi Andi,

Thanks for your feedback! I will extend the cover letter in the next
version to address your comment about comparing with the existing
methods.

We periodically receive rare reports of page corruptions detected
through various methods (journaling, live migrations, crashes, etc.)
from userland. To effectively root cause these corruptions, we need to
automatically and quickly gather comprehensive data about the affected
pages from the kernel.
This includes the ability to:

- Obtain all metadata associated with a page.
- Quickly identify all user processes mapping a given page.
- Determine if and where the kernel maps the page, which is also
  important given the opportunity to remove guest memory from the kernel
  direct map (as discussed at LPC'24).

We also plan to extend this functionality to include KVM and IOMMU page
tables in the future.

[...] provides an interface for traversing user page tables, but the
other information cannot be extracted using the existing interfaces.

To ensure data integrity, even when dealing with potential memory
corruptions, Page Detective minimizes reliance on kernel data
structures. Instead, it leverages direct access to hardware structures
like page tables, providing a more reliable view of page mappings.

> If something particular is missing perhaps the existing mechanisms
> can be extended?
> Outputting in the dmesg seems rather clumsy for a production mechanism.

I am going to change the output to a file in the next version.

> I personally would just use live crash or live gdb on /proc/kcore to get
> extra information, although I can see that might have races.

For security reasons crash is currently not available on our production
fleet machines as it potentially provides access to all kernel memory.

Thank you,
Pasha
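
For context on the existing interfaces mentioned in this thread, here is
a minimal userspace sketch (not part of the Page Detective series) of
what /proc/pid/pagemap, /proc/kpagecount and /proc/kpageflags can report
for a single virtual address. The bit positions follow the kernel's
pagemap documentation. The helper names (decode_pagemap_entry, read_u64,
lookup) are hypothetical, and the sketch assumes a Linux system and a
caller with CAP_SYS_ADMIN, since the PFN otherwise reads back as zero.

```python
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE
ENTRY_SIZE = 8  # pagemap, kpagecount and kpageflags all use 64-bit entries

PM_PFN_MASK = (1 << 55) - 1  # bits 0-54: PFN, valid only if the page is present
PM_SWAPPED = 1 << 62         # bit 62: page is swapped out
PM_PRESENT = 1 << 63         # bit 63: page is present in RAM


def decode_pagemap_entry(entry):
    """Split one 64-bit pagemap entry into the fields used below."""
    present = bool(entry & PM_PRESENT)
    return {
        "present": present,
        "swapped": bool(entry & PM_SWAPPED),
        "pfn": (entry & PM_PFN_MASK) if present else None,
    }


def read_u64(path, index):
    """Read the index-th native-endian 64-bit word from a /proc file."""
    with open(path, "rb") as f:
        f.seek(index * ENTRY_SIZE)
        data = f.read(ENTRY_SIZE)
    if len(data) != ENTRY_SIZE:
        raise OSError(f"short read from {path} at index {index}")
    return struct.unpack("=Q", data)[0]


def lookup(pid, vaddr):
    """Resolve vaddr in process pid, then fetch map count and flags by PFN."""
    raw = read_u64(f"/proc/{pid}/pagemap", vaddr // PAGE_SIZE)
    info = decode_pagemap_entry(raw)
    if info["pfn"]:  # None if not present, 0 if the caller lacks privilege
        info["mapcount"] = read_u64("/proc/kpagecount", info["pfn"])
        info["flags"] = read_u64("/proc/kpageflags", info["pfn"])
    return info
```

Running lookup(pid, vaddr) as root returns a dict with the PFN, the map
count and the raw flag word. Going the other way, from a physical page
to every process mapping it or to its kernel mappings, is not something
these files provide directly.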