From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7DB60C369AB for ; Thu, 24 Apr 2025 17:56:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 84A706B00B0; Thu, 24 Apr 2025 13:56:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7ACD56B00B3; Thu, 24 Apr 2025 13:56:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5D7DC6B00B8; Thu, 24 Apr 2025 13:56:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 38F176B00B0 for ; Thu, 24 Apr 2025 13:56:40 -0400 (EDT) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id C4E8B1603A5 for ; Thu, 24 Apr 2025 17:56:41 +0000 (UTC) X-FDA: 83369692602.17.5CE29EB Received: from mail-pf1-f172.google.com (mail-pf1-f172.google.com [209.85.210.172]) by imf20.hostedemail.com (Postfix) with ESMTP id E3F901C0009 for ; Thu, 24 Apr 2025 17:56:39 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=sh43yzbf; spf=pass (imf20.hostedemail.com: domain of debug@rivosinc.com designates 209.85.210.172 as permitted sender) smtp.mailfrom=debug@rivosinc.com; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1745517400; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=KOLy/80XFfLxXJDNfbBZwCZ4HqYR8Mbf5OU8mj5q3uc=; b=KTWchOeNA7/3WUvCytSzPvHShHtyXY+0NGlJuYI8ONXe9QS31wgriVotu/mUtwlN11Yqqb V4yYTI5jgFdsDfh8I3rOtjtvacfNizFdmpJWhrOrtLVgH/k/j/HU4fQoQH6tmkQSvczZLP rUUbFOEqwYQBQ+0TpPOxIsJLeG2JmmA= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=sh43yzbf; spf=pass (imf20.hostedemail.com: domain of debug@rivosinc.com designates 209.85.210.172 as permitted sender) smtp.mailfrom=debug@rivosinc.com; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1745517400; a=rsa-sha256; cv=none; b=zOCr4UnY9Zr89TpfNQyyJbmn9ZFjOnItgj5WE3i0x8Bsoacdl/+WBMM27Mtb9dkfe3Qsx9 vbRSRLMUCuo2GTZsx1uh3BDr4SP947z5GLZ3SlHwPlS0YkUXig2o3/q+BgtRxk58M5B0gf BKse6sa7dFiRXTnD0ZZqwnQycadaGGM= Received: by mail-pf1-f172.google.com with SMTP id d2e1a72fcca58-736c1138ae5so1329732b3a.3 for ; Thu, 24 Apr 2025 10:56:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1745517398; x=1746122198; darn=kvack.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=KOLy/80XFfLxXJDNfbBZwCZ4HqYR8Mbf5OU8mj5q3uc=; b=sh43yzbfIOvQ4SvK38Dld1J0d7INS9OfMJJjbFgaZ9nbXPRbOvWIJemaZAmSBQBPBg 1sgj0usYS1fYOvotyCsmFWkel6Mlh28OoUQvltkEsyQvqmbt7/ZqfpOM8x+av1YeOm4o en9BC2Zk4LmR6LreI5j98C2OZ48ugg1r4yjxAujixRG1Vh96vnsWxXCpArCv4EZ56CnV VQNX0i5nkSrgX5I60RyLcg5PRs5cMHIPfFnIbEdOWbzFJqKAUlioRwCEIURtR+w0AdAz BN0FVjS2wlnnQ55VseFhNdoeEhps3po+4VuDDO5GgvVNyqOQE7Hw9LuGfV7uIdblBehZ V/TA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1745517398; x=1746122198; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=KOLy/80XFfLxXJDNfbBZwCZ4HqYR8Mbf5OU8mj5q3uc=; b=sT9RzAOBijP3VwGkiKX2MRHSCqX14pytLct2R6AVzBRmAZqhhMA8Tnl7C5C0dqNYb2 5G6Rw1YJ6vGvZGpKcTOp8BBJS+Covd4BjdQFQD7tm7yyDKWy2RT6vg98llI1EOeuwGGh ePQNdcQ+b9pn/rE3L6okJa0N/mLQYNcG2Y6n55wYe31yhrGSgLiBeM5QqSebalemfegh 5R/UPveLWDOxkG5FRokaJSTNNaY26tNEbOSfygfTkzIq7eE9IWtv1c4IrFOxdYjMcHGl 5Z04YSJbq7lhkswIOetSC4qUXYvjjrhMkd3lAanRLwp3TerC+HstmnaeSZlzvR+3mHZH YaEg== X-Forwarded-Encrypted: i=1; AJvYcCXgTUtkvx6aqNGq+epZ320l7Mg0vVrkxlO+D6byzkW6eb9jzKbHhgZ0UNe2FkMivEGssB6HxOgSrA==@kvack.org X-Gm-Message-State: AOJu0YyhgNxjGQ3xEb8Ka1SJD5sMPknk5QuhFmZRCE6hUZSgnX5b8tIc y9xRLIsmqrDZ2K5pIkJ7i7fufKw/XA4PErEg+XtoAYb/jEzn01HetvF9qensT5s= X-Gm-Gg: ASbGncsqwQ7PBl/KqMk9J4Owkz483uDbPLnu6evM1UdEtg/gfLGh/dcGYUqVkU0oSuZ NUEblAQp2YoT0Q2HVZp9iqsPr4FMUzCVaBLUFWwTxgWB76c9yByrSPEkxa1fX6/w/QyDhIqO0XG ASssxCOoGBPJBHkDoz55R9cUdKdjyh2M1s7ZzLtaYDu6KOBkmPn3t3oik3SRFTxPiw2DRtJGCxS T+HiyJVhHbFrDhwg5LMG+4jgJCfQMNsJRXdWMq/VCGx/uRdlqZIMKBA34qj/oDVVZ5vIVuIzJRd 7idK8DXqZtbiW6sxZW02HZLuJ5hdfkMHBq5MB38Z7+ZksJJEl5w= X-Google-Smtp-Source: AGHT+IEvwKAWcL1BB3CDYo6zsE5OWQQndt5nVMjDwtZXDpnZ2rdOW1imvzSm4O5fk1CTzTK9UpqvkQ== X-Received: by 2002:a05:6a00:1744:b0:735:d89c:4b8e with SMTP id d2e1a72fcca58-73e32fbafeemr709517b3a.5.1745517398496; Thu, 24 Apr 2025 10:56:38 -0700 (PDT) Received: from debug.ba.rivosinc.com ([64.71.180.162]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-73e25a9940esm1697082b3a.129.2025.04.24.10.56.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 24 Apr 2025 10:56:38 -0700 (PDT) Date: Thu, 24 Apr 2025 10:56:34 -0700 From: Deepak Gupta To: Radim =?utf-8?B?S3LEjW3DocWZ?= Cc: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Andrew Morton , "Liam R. Howlett" , Vlastimil Babka , Lorenzo Stoakes , Paul Walmsley , Palmer Dabbelt , Albert Ou , Conor Dooley , Rob Herring , Krzysztof Kozlowski , Arnd Bergmann , Christian Brauner , Peter Zijlstra , Oleg Nesterov , Eric Biederman , Kees Cook , Jonathan Corbet , Shuah Khan , Jann Horn , Conor Dooley , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-riscv@lists.infradead.org, devicetree@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, alistair.francis@wdc.com, richard.henderson@linaro.org, jim.shu@sifive.com, andybnac@gmail.com, kito.cheng@sifive.com, charlie@rivosinc.com, atishp@rivosinc.com, evan@rivosinc.com, cleger@rivosinc.com, broonie@kernel.org, rick.p.edgecombe@intel.com, Zong Li , linux-riscv Subject: Re: [PATCH v12 05/28] riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit Message-ID: References: <20250314-v5_user_cfi_series-v12-0-e51202b53138@rivosinc.com> <20250314-v5_user_cfi_series-v12-5-e51202b53138@rivosinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Rspam-User: X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: E3F901C0009 X-Stat-Signature: q7gum15sodp97tdreot5ed8p4qg3c57t X-HE-Tag: 1745517399-803804 X-HE-Meta: U2FsdGVkX18BsDws1WLNQOcgJsTzRP/QYiGiSPjVqBomRU7juRE9VhVe+rCvDWtugs5BcRS3vn31sWoPdoTRZXEjYYnP6GTPlMnFsuq4e0jUOOHbowpnxOZdjm7gnwsMxKo4iIVKPi++xke4G2sMZcgNHIC0qcoiV2xplHE3p6kQGj9dwt3fzLtGHCYQi7D8FxBwQHztXtn1RLURMnBXe+pfDA/vkMNj9AviKljB1frzQ/GXGBocVpQWn+j0nGfwvYQ3vQM5ZMYXRaXzWV8ZZdqRfhWJo6pFwgZ4OSRV2pKJ++u/Y9N1WUnt3OaEcii999rRNGuW9K29c47K1FEViCZzSdSJ8TUxDw4MPwCp/gNDbfNMb5XQIcJCI+pTYipjWbdB19lY9rvChSYJleZGz6GUcw/O0qPyP+gddFeCcefJ6tN1tAXX2n9LWeK0MjbAhiw3TZek94gDROD9GPsUCVKIyggehI0DJU8STJqAGOEM3zlMgQVdxy/Q24UsNmOynwG9Rma6oSWjSDGXPnsJ+q0zAtYPrAYNsCveHoTUZ0kRAZZbd3pEBxpeaRfol6P8i7Ja7gT32VGS/BJ4Of7FfJ+VzY8ksjoAFIXiHF97MVuVM6hV7wu7ErgUxbHDyV+zb+sqW0lHp1RivnstcATBGfRkvywjcOakz2nC66r4jYXzdP1nzNqB5JnIfaE0eq/BphSH/OOSqkKBRTr0TfBpwTsa8/EvQydfwCzqaBMm8yMa3XgWl9nMVXLc7/QDDHxfmI9zEYgPDtnWZVfKim2iMfECgL6d2m7mpT8ei2ABFkpNkhi3UX3PHUhCAdJXe3aiBDiukdBHHCkBnjslvjhoMFSAHcRoqrlgXSS/GlVqnx0aVWpo07tPESMfxHSnW7jRKqvycKM/aY3k2KEUWY3ceUlo+WKjFAgrQvLKMu9fvjHcRdFM9fN9CG15Jvi6ZyF6KhLSkpOIkKyEcD0Sosb 9jDYTEDh MqiE7qFNyhVEJUWtf31SLlZgdbE0nZNhma36Ovn3/HYIxPVWxuc6RAx9OdgMYzxWQXf1BYvTmuUcmm65pOZDJAarZw3UQKAzpRtOxWL3NSnqDaFgarF7x9nljzz/Tm90HtdUc5fL8PVqETlhWMFfKyguvmTbl1Aa9hZ4QWymbTwTsWyh+bWTiMvEqhHKITgngHiVy X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Apr 24, 2025 at 01:52:43PM +0200, Radim Krčmář wrote: >2025-04-23T17:00:29-07:00, Deepak Gupta : >> On Thu, Apr 10, 2025 at 01:04:39PM +0200, Radim Krčmář wrote: >>>2025-03-14T14:39:24-07:00, Deepak Gupta : >>>> diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h >>>> @@ -62,6 +62,9 @@ struct thread_info { >>>> long user_sp; /* User stack pointer */ >>>> int cpu; >>>> unsigned long syscall_work; /* SYSCALL_WORK_ flags */ >>>> +#ifdef CONFIG_RISCV_USER_CFI >>>> + struct cfi_status user_cfi_state; >>>> +#endif >>> >>>I don't think it makes sense to put all the data in thread_info. >>>kernel_ssp and user_ssp is more than enough and the rest can comfortably >>>live elsewhere in task_struct. >>> >>>thread_info is supposed to be as small as possible -- just spanning >>>multiple cache-lines could be noticeable. >> >> I can change it to only include only `user_ssp`, base and size. > >No need for base and size either -- we don't touch that in the common >exception code. got it. > >> But before we go there, see below: >> >> $ pahole -C thread_info kbuild/vmlinux >> struct thread_info { >> long unsigned int flags; /* 0 8 */ >> int preempt_count; /* 8 4 */ >> >> /* XXX 4 bytes hole, try to pack */ >> >> long int kernel_sp; /* 16 8 */ >> long int user_sp; /* 24 8 */ >> int cpu; /* 32 4 */ >> >> /* XXX 4 bytes hole, try to pack */ >> >> long unsigned int syscall_work; /* 40 8 */ >> struct cfi_status user_cfi_state; /* 48 32 */ >> /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */ >> long unsigned int a0; /* 80 8 */ >> long unsigned int a1; /* 88 8 */ >> long unsigned int a2; /* 96 8 */ >> >> /* size: 104, cachelines: 2, members: 10 */ >> /* sum members: 96, holes: 2, sum holes: 8 */ >> /* last cacheline: 40 bytes */ >> }; >> >> If we were to remove entire `cfi_status`, it would still be 72 bytes (88 bytes >> if shadow call stack were enabled) and already spans across two cachelines. > >It has only 64 bytes of data without shadow call stack, but it wasted 8 >bytes on the holes. >a2 is somewhat an outlier that is not used most exception paths and >excluding it makes everything fit nicely even now. But we can't exclude shadow call stack. It'll lead to increased size if that config is selected. A solution has to work for all the cases and not half hearted effort. > >> if shadow call stack were enabled) and already spans across two cachelines. I >> did see the comment above that it should fit inside a cacheline. Although I >> assumed its stale comment given that it already spans across cacheline and I >> didn't see any special mention in commit messages of changes which grew this >> structure above one cacheline. So I assumed this was a stale comment. >> >> On the other hand, whenever enable/lock bits are checked, there is a high >> likelyhood that user_ssp and other fields are going to be accessed and >> thus it actually might be helpful to have it all in one cacheline during >> runtime. > >Yes, although accessing enable/lock bits will be relatively rare. >It seems better to have the overhead during thread setup, rather than on >every trap. > >> So I am not sure if its helpful sticking to the comment which already is stale. > >We could fix the holes and also use sp instead of a0 in the >new_vmalloc_check, so everything would fit better. > >We are really close to fitting into a single cache-line, so I'd prefer >if shadow stack only filled thread_info with data that is used very >often in the exception handling code. I don't get what's the big deal if it results in two cachelines. We can (re)organize data structure in a way the most frequently accessed members are together in a single cacheline. We just need to find those members. In the hot path of exception handling, I see accesses to pt_regs on stack as well. These are definitley different cacheline than thread_info. I understand the argument of one member field crossing into two cachelines can have undesired perf effects. I do not understand reasoning that thread_info exactly has to fit inside one cacheline. If this was always supposed to fit in a single cacheline, clearly this invariant isn't/wasn't maintained as changes trickled in. I would like to see what maintainers have to say or someone who did data analysis on this. > >I think we could do without user_sp in thread_info as well, so there are >other packing options. Sure, probably somewhere in task_struct. But fact of the matter is that it has to be saved/restore during exception entry/exit. But then load/store to task_struct is essentially a different cachline. Not sure what we will achieve here? > >Btw. could ssp be added to pt_regs? I had that earlier. It breaks user abi. And it was a no go. > >Thanks.