From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8FC19C4320A for ; Thu, 19 Aug 2021 13:21:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2F8266103A for ; Thu, 19 Aug 2021 13:21:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2F8266103A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id CFE9F6B0071; Thu, 19 Aug 2021 09:21:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C85E46B0072; Thu, 19 Aug 2021 09:21:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B27116B0073; Thu, 19 Aug 2021 09:21:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0135.hostedemail.com [216.40.44.135]) by kanga.kvack.org (Postfix) with ESMTP id 973496B0071 for ; Thu, 19 Aug 2021 09:21:39 -0400 (EDT) Received: from smtpin33.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 26653269D2 for ; Thu, 19 Aug 2021 13:21:39 +0000 (UTC) X-FDA: 78491892318.33.56A5E15 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com (Postfix) with ESMTP id B25D21007ABA for ; Thu, 19 Aug 2021 13:21:38 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 9114D6108F; Thu, 19 Aug 2021 13:21:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1629379298; bh=PkpTsJt7pyHtfJ1gyD/LaaCvVbcW8fH7saHC271sUdo=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=Z/38zcOUspTE5hna0KVWPiy+4K78IMj5xykRT/ye+MXCFMBLZFpcpXBwcdVdYa95t gzYXVxDeWhs6TETU/f/vP45/hIAnMJ+a9ycHKtaWDPnYSC9bRglvRe0GSSedLWLdql IZljRIZZwwWMjNtvvcusfzYXQavc9QDj0rEVUsMq81iJsyyLPxgNuzWPG+5LqKh4Of 3/LclWqgwqVF2mRG+1x8IToAjb6ZDsdy6RWe7sda/TvbY1dlC3Dy2suBH1khLr5gnu VA37+pOzMxIsDtjfM0xRDRAf+OO1+AAcToLsT5lOa/MQW/LiD3IXQqUokg9U+DbSyg Bfu2u8skPTYtg== Date: Thu, 19 Aug 2021 14:21:32 +0100 From: Will Deacon To: Andrew Morton , peterz@infradead.org, bp@alien8.de Cc: Xiyu Yang , Alistair Popple , Yang Shi , Shakeel Butt , Hugh Dickins , Miaohe Lin , linux-kernel@vger.kernel.org, linux-mm@kvack.org, yuanxzhang@fudan.edu.cn, Xin Tan , Will Deacon , Linus Torvalds Subject: Re: [PATCH] mm/rmap: Convert from atomic_t to refcount_t on anon_vma->refcount Message-ID: <20210819132131.GA15779@willie-the-truck> References: <1626665029-49104-1-git-send-email-xiyuyang19@fudan.edu.cn> <20210720160127.ac5e76d1e03a374b46f25077@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210720160127.ac5e76d1e03a374b46f25077@linux-foundation.org> User-Agent: Mutt/1.10.1 (2018-07-13) X-Rspamd-Queue-Id: B25D21007ABA Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b="Z/38zcOU"; dmarc=pass (policy=none) header.from=kernel.org; spf=pass (imf07.hostedemail.com: domain of will@kernel.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=will@kernel.org X-Rspamd-Server: rspam04 X-Stat-Signature: 7ew5ezddkiq9ipx3ktirfzg4cppw3pjh X-HE-Tag: 1629379298-923031 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: [+Peter and Boris] Sorry for not responding sooner; I looked at this and couldn't see a way to improve it at first, but then I had an idea the other day. On Tue, Jul 20, 2021 at 04:01:27PM -0700, Andrew Morton wrote: > On Mon, 19 Jul 2021 11:23:35 +0800 Xiyu Yang wrote: > > > refcount_t type and corresponding API can protect refcounters from > > accidental underflow and overflow and further use-after-free situations. > > Grumble. > > For x86_64 defconfig this takes rmap.o text size from 13226 bytes to > 13622. > > For x86_64 allmodconfig this takes rmap.o text size from 66576 bytes to > 67858. > > I didn't check which config option is making the bloat so much worse, > but this really is quite bad. We bust a gut to make savings which are > 1% the size of this! Is the refcount_t really so much better than a > bare atomic_t that this impact is justified? I think so: the saturation semantics have been shown to mitigate attacks that rely on over/underflow so it's a good security feature. On top of that, the memory ordering is what you'd expect so you don't have to worry about adding (or missing) the necessary barriers -- there's more info about that in Documentation/core-api/refcount-vs-atomic.rst. > Can someone pleeeeeeze take a look at what is happening here and put > the refcount code on a very serious diet? I suppose the sanitisers might contribute a fair bit to the cmpxchg() loops in the allmodconfig case, but looking at how defconfig affects mm/rmap.o the *big* difference there seems to be due to the dec_and_test() operation: atomic_dec_and_test() looks like: lock decl (%rdi) sete %al retq whereas refcount_dec_and_test() looks like: push %r12 mov $0xffffffff,%eax lock xadd %eax,(%rdi) cmp $0x1,%eax je 97d xor %r12d,%r12d test %eax,%eax jle 989 mov %r12d,%eax pop %r12 retq mov $0x1,%r12d mov %r12d,%eax pop %r12 retq mov $0x3,%esi callq 993 mov %r12d,%eax pop %r12 retq which is a monster by comparison! This is because refcount_dec_and_test() wraps __refcount_sub_and_test(), which is necessary so that we can look at the old value: (1) If it was equal to 1, then ensure we have acquire semantics and return true. (2) If it was < 0 or the new value written is < 0, then saturate (UAF) However, talking to Peter and Boris, we may be able to achieve the same thing by looking at the flags after a DECL instruction: * We can implement (1) by checking if we hit zero (ZF=1) * We can implement (2) by checking if the new value is < 0 (SF=1). We then need to catch the case where the old value was < 0 but the new value is 0. I think this is (SF=0 && OF=1). So maybe the second check is actually SF != OF? I could benefit from some x86 expertise here, but hopefully you get the idea. We could do something similar for refcount_dec() and refcount_inc(), although commit a435b9a14356 ("locking/refcount: Provide __refcount API to obtain the old value") add versions which return the old value, which we can't reconstruct from the flags. Since nobody is using these variants, I suggest we remove them. The other thing we could do is to inline the saturation logic and allow the WARN_ONCE() to be disabled with a Kconfig option, but I'm not sure that the gain is really worth it considering that you lose the useful diagnostic (and silent saturation is likely to be very confusing if it manifests as a crash somewhere else). Thoughts? Will