From: Andy Lutomirski
Subject: Re: [PATCH V3.1] entry: Pass irqentry_state_t by reference
Date: Thu, 17 Dec 2020 07:35:18 -0800
Message-Id: <24F5DC49-1FB3-42CF-8323-B0B39D936F7F@amacapital.net>
References: <20201217131924.GW3040@hirez.programming.kicks-ass.net>
In-Reply-To: <20201217131924.GW3040@hirez.programming.kicks-ass.net>
To: Peter Zijlstra
Cc: Thomas Gleixner, Andy Lutomirski, Weiny Ira, Ingo Molnar,
    Borislav Petkov, Dave Hansen, X86 ML, LKML, Andrew Morton,
    Fenghua Yu, open list:DOCUMENTATION, linux-nvdimm, Linux-MM,
    open list:KERNEL SELFTEST FRAMEWORK, Dan Williams, Greg KH

> On Dec 17, 2020, at 5:19 AM, Peter Zijlstra wrote:
>
> On Thu, Dec 17, 2020 at 02:07:01PM +0100, Thomas Gleixner wrote:
>>> On Fri, Dec 11 2020 at 14:14, Andy Lutomirski wrote:
>>>> On Mon, Nov 23, 2020 at 10:10 PM wrote:
>>> After contemplating this for a bit, I think this isn't really the
>>> right approach. It *works*, but we've mostly just created a bit of an
>>> unfortunate situation.
>>> Our stack, on a (possibly nested) entry looks like:
>>>
>>>   previous frame (or empty if we came from usermode)
>>>   ---
>>>   SS
>>>   RSP
>>>   FLAGS
>>>   CS
>>>   RIP
>>>   rest of pt_regs
>>>
>>>   C frame
>>>
>>>   irqentry_state_t (maybe -- the compiler is within its rights to play
>>>   almost arbitrary games here)
>>>
>>>   more C stuff
>>>
>>> So what we've accomplished is having two distinct arch register
>>> regions, one called pt_regs and the other stuck in irqentry_state_t.
>>> This is annoying because it means that, if we want to access this
>>> thing without passing a pointer around or access it at all from outer
>>> frames, we need to do something terrible with the unwinder, and we
>>> don't want to go there.
>>>
>>> So I propose a somewhat different solution: lay out the stack like this.
>>>
>>>   SS
>>>   RSP
>>>   FLAGS
>>>   CS
>>>   RIP
>>>   rest of pt_regs
>>>   PKS
>>>   ^^^^^^^^ extended_pt_regs points here
>>>
>>>   C frame
>>>   more C stuff
>>>   ...
>>>
>>> IOW we have:
>>>
>>>   struct extended_pt_regs {
>>>     bool rcu_whatever;
>>>     other generic fields here;
>>>     struct arch_extended_pt_regs arch_regs;
>>>     struct pt_regs regs;
>>>   };
>>>
>>> and arch_extended_pt_regs has unsigned long pks;
>>>
>>> and instead of passing a pointer to irqentry_state_t to the generic
>>> entry/exit code, we just pass a pt_regs pointer.
>>
>> While I agree vs. PKS which is architecture specific state and needed in
>> other places e.g. #PF, I'm not convinced that sticking the existing
>> state into the same area buys us anything more than an indirect access.
>>
>> Peter?
>
> Agreed; that immediately solves the confusion Ira had as well. While
> extending pt_regs sounds scary, I think we've isolated our pt_regs
> implementation from actual ABI pretty well, but of course, that would
> need an audit. We don't want to leak this into signals for example.
>

I’m okay with this.
My suggestion for having an extended pt_regs that contains pt_regs is to
keep extensions like this invisible to unsuspecting parts of the kernel.
In particular, BPF seems to pass around struct pt_regs *, and I don’t
know what the implications of effectively offsetting all the registers
relative to the pointer would be.

Anything that actually broke the signal regs ABI should be noticed by the
x86 selftests -- the tests read and write registers through ucontext.
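
To make the "contains pt_regs" idea concrete, here is a minimal userspace sketch, not kernel code. It assumes a cut-down two-field struct pt_regs, and the helpers to_extended(), unsuspecting_user(), read_pks() and demo() are hypothetical names invented for illustration. The point it tries to show: code that only holds a struct pt_regs * sees the registers at their usual offsets, while code that knows about the wrapper can step back to it with offsetof(), in the spirit of container_of().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the real struct pt_regs (two fields only). */
struct pt_regs {
    unsigned long ip;
    unsigned long sp;
};

/* Arch-specific extra state, as named in the thread. */
struct arch_extended_pt_regs {
    unsigned long pks;
};

/* Wrapper with pt_regs as the last member, mirroring the proposal. */
struct extended_pt_regs {
    bool rcu_whatever;                      /* placeholder generic field */
    struct arch_extended_pt_regs arch_regs;
    struct pt_regs regs;
};

/* Recover the wrapper from a pointer to its embedded regs member. */
static struct extended_pt_regs *to_extended(struct pt_regs *regs)
{
    return (struct extended_pt_regs *)
        ((char *)regs - offsetof(struct extended_pt_regs, regs));
}

/* Code like this only ever sees a plain pt_regs pointer. */
static unsigned long unsuspecting_user(const struct pt_regs *regs)
{
    return regs->ip;
}

/* Code that needs the extra state derives it from the same pointer. */
static unsigned long read_pks(struct pt_regs *regs)
{
    return to_extended(regs)->arch_regs.pks;
}

/* Build one frame and hand out only &frame.regs; returns 1 on success. */
static int demo(void)
{
    struct extended_pt_regs frame = {
        .rcu_whatever = false,
        .arch_regs = { .pks = 0x5 },
        .regs = { .ip = 0x1000, .sp = 0x2000 },
    };
    struct pt_regs *regs = &frame.regs;

    return unsuspecting_user(regs) == 0x1000 && read_pks(regs) == 0x5;
}
```

Calling demo() from a main() exercises both paths. The sketch only covers the layout question; it says nothing about how the entry assembly would actually reserve or fill the extra slot.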