From: Linus Torvalds
Date: Fri, 30 Dec 2022 16:42:05 -0800
Subject: Re: [PATCHv13 05/16] x86/uaccess: Provide untagged_addr() and remove tags before address check
To: "Kirill A. Shutemov"
Cc: "Kirill A. Shutemov", Dave Hansen, Andy Lutomirski, Peter Zijlstra,
    x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov,
    Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu",
    Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20221227030829.12508-1-kirill.shutemov@linux.intel.com>
    <20221227030829.12508-6-kirill.shutemov@linux.intel.com>
    <20221231001029.5nckrhtmwahb65jo@box>
In-Reply-To: <20221231001029.5nckrhtmwahb65jo@box>

On Fri, Dec 30, 2022 at 4:10 PM Kirill A.
Shutemov wrote:
>
> I made it a per-cpu variable (outside struct tlb_state to be visible in
> modules). __get/put_user_X() now have a single instruction to untag the
> address and it is gated by X86_FEATURE_LAM.

Yeah, that looks more reasonable to me.

> BTW, am I blind or we have no infrastructure to hook up static branches
> from assembly?

I think you're right.

> It would be a better fit than ALTERNATIVE here. It would allow to defer
> overhead until the first user of the feature.

Well, it would make the overhead worse once people actually start
using it. So it's not obvious that a static branch is really the right
thing to do.

That said, while I think that UNTAG_ADDR is quite reasonable now, the
more I look at getuser.S and putuser.S, the more I'm thinking that
getting rid of the TASK_SIZE comparison entirely is the right thing to
do on x86-64.

It's really rather nasty, with not just that whole LA57 alternative,
but it's doing a large 64-bit constant too.

Now, on 32-bit, we do indeed have to compare against TASK_SIZE
explicitly, but on 32-bit we could just use an immediate for the cmp
instruction, so even there that whole "load constant" isn't really
optimal.

And on 64-bit, we really only need to check the high bit.

In fact, we don't even want to *check* it, because then we need to do
that disgusting array_index_mask_nospec thing to mask the bits for it,
so it would be even better to use pure arithmetic with no
conditionals anywhere.

And that's exactly what we could do on x86-64:

        movq %rdx,%rax
        sarq $63,%rax
        orq %rax,%rdx

would actually be noticeably better than what we do now for
TASK_SIZE checking _and_ for the array index masking (for putuser.S,
we'd use %rbx instead of %rax in that sequence).

The above three simple instructions would replace all of the games we
now play with

        LOAD_TASK_SIZE_MINUS_N(0)
        cmp %_ASM_DX,%_ASM_AX
        jae bad_get_user
        sbb %_ASM_DX, %_ASM_DX          /* array_index_mask_nospec() */
        and %_ASM_DX, %_ASM_AX

entirely.
It would just turn all kernel addresses into all ones, which
is then guaranteed to fault. So no need for any conditional that never
triggers in real life anyway.

On 32-bit, we'd still have to do that old sequence, but we'd replace the

        LOAD_TASK_SIZE_MINUS_N(0)
        cmp %_ASM_DX,%_ASM_AX

with just the simpler

        cmp $TASK_SIZE_MAX-(n),%_ASM_AX

since the only reason we do that immediate load is because there is no
64-bit immediate compare instruction.

And once we don't test against TASK_SIZE, the need for UNTAG_ADDR just
goes away, so now LAM is better too.

In other words, we could actually improve on our current code _and_
simplify the LAM situation. Win-win.

Anyway, I do not hate the version of the patch you posted, but I do
think that the win-win of just making LAM not _have_ this issue in the
first place might be the preferable one.

The one thing that that "shift by 63 and bitwise or" trick does
require is that the _ASM_EXTABLE_UA() thing for getuser/putuser would
have to have an extra annotation to shut up the

        WARN_ONCE(trapnr == X86_TRAP_GP,
                "General protection fault in user access. Non-canonical address?");

in ex_handler_uaccess() for the GP trap that users can now cause by
giving a non-canonical address with the high bit clear. So we'd
probably just want a new EX_TYPE_* for these cases, but that still
looks fairly straightforward.

Hmm?

              Linus
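
For readers who want to try the "shift by 63 and bitwise or" idea outside the kernel, here is a minimal user-space C sketch. It is an illustration under stated assumptions, not kernel code: the helper names mask_user_address(), is_canonical_48() and uaccess_mask_selftest() are hypothetical, the sample addresses are made up, and the canonical check assumes 48-bit virtual addresses (4-level paging). The mask replicates bit 63 across the whole word, as an arithmetic right shift by 63 does, so that the OR yields all ones for any address with the high bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: OR the address with its own
 * top bit replicated across all 64 bits. Addresses with bit 63 set
 * become all ones; everything else passes through unchanged. There is
 * no compare and no branch.
 */
static inline uint64_t mask_user_address(uint64_t addr)
{
	/* 0 when bit 63 is clear, ~0 when it is set (unsigned wraparound). */
	uint64_t mask = 0 - (addr >> 63);

	return addr | mask;
}

/*
 * Assumption: 48-bit virtual addresses, so bits 63..47 must be
 * either all zero or all one for the address to be canonical.
 */
static inline bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* 17 bits: 63..47 */

	return top == 0 || top == 0x1ffff;
}

/* Small self-check using made-up example addresses. */
static inline void uaccess_mask_selftest(void)
{
	uint64_t user   = 0x00007fffdeadb000ULL;	/* ordinary user pointer */
	uint64_t kernel = 0xffff888000000000ULL;	/* high bit set */
	uint64_t tagged = 0x7e007fffdeadb000ULL;	/* tag in bits 62:57 */
	uint64_t bad    = 0x0100000000001000ULL;	/* high bit clear, non-canonical */

	/* User pointers are left alone. */
	assert(mask_user_address(user) == user);

	/* Anything with the high bit set collapses to all ones. */
	assert(mask_user_address(kernel) == UINT64_MAX);

	/* Tag bits below bit 63 do not affect the result, so no untagging is needed first. */
	assert(mask_user_address(tagged) == tagged);

	/* This one passes through untouched, yet is not a canonical address. */
	assert(mask_user_address(bad) == bad);
	assert(!is_canonical_48(bad));
}
```

Calling uaccess_mask_selftest() from a test program exercises all four cases. The last case is the one the extra exception-table annotation is about: such an address is not filtered by the mask, so the access itself has to fault, and on real hardware that would be a general protection fault rather than a page fault.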