Subject: Re: [PATCH 14/26] drm/xe/eudebug: implement userptr_vma access
From: Joonas Lahtinen
To: Andrzej Hajda, Christian König, Jonathan Cavitt, Linux MM,
 Maciej Patelczyk, Mika Kuoppala, dri-devel@lists.freedesktop.org,
 intel-xe@lists.freedesktop.org, lkml
Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
Date: Tue, 10 Dec 2024 11:33:33 +0200
Message-ID: <173382321353.8959.8314520413901294535@jlahtine-mobl.ger.corp.intel.com>
References: <20241209133318.1806472-1-mika.kuoppala@linux.intel.com>
 <20241209133318.1806472-15-mika.kuoppala@linux.intel.com>

Quoting Christian König (2024-12-09 17:42:32)
> On 09.12.24 at 16:31, Simona Vetter wrote:
> > On Mon, Dec 09, 2024 at 03:03:04PM +0100, Christian König wrote:
> >> On 09.12.24 at 14:33, Mika Kuoppala wrote:
> >>> From: Andrzej Hajda
> >>>
> >>> Debugger needs to read/write program's vmas including userptr_vma.
> >>> Since hmm_range_fault is used to pin userptr vmas, it is possible
> >>> to map those vmas from debugger context.
> >> Oh, this implementation is extremely questionable as well.
> >> Adding the LKML and the MM list as well.
> >>
> >> First of all hmm_range_fault() does *not* pin anything!
> >>
> >> In other words you don't have a page reference when the function returns,
> >> but rather just a sequence number you can check for modifications.
> > I think it's all there, holds the invalidation lock during the critical
> > access/section, drops it when reacquiring pages, retries until it works.
> >
> > I think the issue is more that everyone hand-rolls userptr.
>
> Well that is part of the issue.
>
> The general problem here is that the eudebug interface tries to simulate
> the memory accesses as they would have happened by the hardware.

Could you elaborate on what exactly the problem in that is?

It's pretty much the equivalent of ptrace() poke/peek but for GPU memory.
And it is exactly the kind of interface that makes sense for a debugger, as
GPU memory != CPU memory, and they don't need to align at all.

> What the debugger should probably do is to cleanly attach to the
> application, get the information which CPU address is mapped to which
> GPU address and then use the standard ptrace interfaces.

I don't quite agree here -- at all. "Which CPU address is mapped to
which GPU address" makes no sense when the GPU address space and CPU
address space are completely controlled by the userspace driver/application.

Please try to consider things outside of the ROCm architecture.

Something like a register scratch region or EU instructions should not
even be mapped to the CPU address space, as the CPU has no business accessing
them during normal operation. And the backing of such a region will vary per
context/LRC at the same virtual address per EU thread.

You seem to be suggesting to rewrite even our userspace driver to behave
the same way as the ROCm driver does just so that we could implement debug
memory accesses via ptrace() to the CPU address space.
That seems a bit of a radical suggestion, especially given the drawbacks
pointed out in your suggested design.

> The whole interface re-invents a lot of functionality which is already
> there

I'm not really sure I would call adding a single interface for memory
reading and writing "re-inventing a lot of functionality".

All the functionality behind this interface will be needed by GPU core
dumping, anyway. Just like for the other patch series.

> just because you don't like the idea to attach to the debugged
> application in userspace.

Several drawbacks of GPU debugging through ptrace() have already been
brought up, but to recap a few relevant ones for this discussion:

- You can only really support GDB stop-all mode, or at least have to stop
  all the CPU threads while you control the GPU threads to avoid
  interference. I elaborated on this more in the other threads.

- Controlling the GPU threads will always interfere with CPU threads.
  It doesn't seem feasible to single-step an EU thread while CPU threads
  continue to run freely?

- You are very much restricted by the CPU VA ~ GPU VA alignment
  requirement, which is not true for OpenGL or Vulkan etc. This seems like
  one of the reasons why ROCm debugging is not easily extendable outside
  compute?

- You have to expose extra memory to the CPU process just for GPU
  debugger access and keep track of the GPU VA for each. This makes the GPU
  more prone to OOB writes from the CPU. Exactly what not mapping the memory
  to the CPU tried to protect the GPU from to begin with.

> As far as I can see this whole idea is extremely questionable. This
> looks like re-inventing the wheel in a different color.

I see it like reinventing a round wheel compared to an octagonal wheel.

Could you elaborate with facts much more on your position why the ROCm
debugger design is an absolute must for others to adopt?
Otherwise it just looks like you are trying to prevent others from
implementing a more flexible debugging interface through vague comments about
"questionable design" without going into details. Not listing many concrete
benefits nor addressing the very concretely expressed drawbacks of your
suggested design makes it seem like a very biased non-technical discussion.

So while review interest and any comments are very much appreciated, please
also work on providing a bit more reasoning and facts instead of just
claiming things. That'll help make the discussion much more fruitful.

Regards, Joonas