From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4863CCA479 for ; Thu, 21 Jul 2022 11:07:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1EC7F8E0005; Thu, 21 Jul 2022 07:07:51 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 19BD98E0002; Thu, 21 Jul 2022 07:07:51 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 014658E0005; Thu, 21 Jul 2022 07:07:50 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id E632D8E0002 for ; Thu, 21 Jul 2022 07:07:50 -0400 (EDT) Received: from smtpin21.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 0CF76AAD1F for ; Thu, 21 Jul 2022 07:28:55 +0000 (UTC) X-FDA: 79710280230.21.88B0BD8 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf10.hostedemail.com (Postfix) with ESMTP id 7CD65C008C for ; Thu, 21 Jul 2022 07:28:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1658388534; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gBu0C6SmChVBYE3EWfPl4wKuLdjdTei9NtZmcfmPFKU=; b=f4qQ8US7t9jx0olrJWlzEMbXIKkb8Hf9Veybp57/uck1GcCs0MiSkRc0uWvVrEkIXn6R0O Txhk/UFRK1L6Vlz7zpxfrLSIAUGqbVew3G8Tnhtk2HEhG+AXqx27ULUF3oljDIz5ojw3qJ /xdrib1uOqgrMueHTAqrQXvzUHqF0gM= Received: from mail-wr1-f71.google.com (mail-wr1-f71.google.com [209.85.221.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-558-Sf5WpyufNYyl0llBo1KpSA-1; Thu, 21 Jul 2022 03:28:50 -0400 X-MC-Unique: Sf5WpyufNYyl0llBo1KpSA-1 Received: by mail-wr1-f71.google.com with SMTP id v18-20020adf8b52000000b0021d641d2bb0so100348wra.11 for ; Thu, 21 Jul 2022 00:28:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent :content-language:to:cc:references:from:organization:subject :in-reply-to:content-transfer-encoding; bh=gBu0C6SmChVBYE3EWfPl4wKuLdjdTei9NtZmcfmPFKU=; b=gwYgPGoyXq2xViGY//fCXPoh8JKxDnZ4SaTO39eebZK59GTylTj5ebCCzVl+Sz5mD1 XjVBiOhnmeio/0XUy2aH9BZn0DbQxLzQB5TKW09AET2DCIov1mCCjND70uXoRPEMcI7A afKOfD9vnjCpqYe+Yh/qNbYEFeEoLcPglfU3fvnQ3eEi1XqGFdpHWO1oheznU+IaD4b1 T2gNNseUKbPzar+os/3T66fpKesW3wWQl3eBZ+l1edv4Fl3cC1LIphNMrf9VKSpAqUOM FWmRcu/l8ybLG8h6g44sVcgalhkY4bANYW0/lnjUmEj+BAWPuszusBpsiuX1ss6C/NIh sjcw== X-Gm-Message-State: AJIora/ue39bmJszlQvI9NirKFmU0bFneN97WMOF57N+UdepEdiHWXvO 41UZNjvuMBkljVHFVD2NM832q1CtZkD6iKVBD+Z5oDR3Be5PxAI185VXcMKq8QQHnZ88KCCQSQw LwMVD6p0KiGY= X-Received: by 2002:a5d:59a5:0:b0:21d:abc4:29ec with SMTP id p5-20020a5d59a5000000b0021dabc429ecmr33151771wrr.666.1658388529215; Thu, 21 Jul 2022 00:28:49 -0700 (PDT) X-Google-Smtp-Source: AGRyM1tzlvlSraysNcDTugirMAuzJNK+5byYzk1ueqL/RDMyI6446bYemuMQjrAQvkbKKV2hZBumYw== X-Received: by 2002:a5d:59a5:0:b0:21d:abc4:29ec with SMTP id p5-20020a5d59a5000000b0021dabc429ecmr33151748wrr.666.1658388528719; Thu, 21 Jul 2022 00:28:48 -0700 (PDT) Received: from ?IPV6:2003:cb:c707:e000:25d3:15fa:4c8b:7e8d? (p200300cbc707e00025d315fa4c8b7e8d.dip0.t-ipconnect.de. [2003:cb:c707:e000:25d3:15fa:4c8b:7e8d]) by smtp.gmail.com with ESMTPSA id e6-20020a05600c4e4600b003a3188bef63sm973238wmq.11.2022.07.21.00.28.48 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 21 Jul 2022 00:28:48 -0700 (PDT) Message-ID: Date: Thu, 21 Jul 2022 09:28:47 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0 To: Peter Xu , linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Andrea Arcangeli , Nadav Amit , Andrew Morton References: <20220720220324.88538-1-peterx@redhat.com> From: David Hildenbrand Organization: Red Hat Subject: Re: [PATCH v2] mm/mprotect: Fix soft-dirty check in can_change_pte_writable() In-Reply-To: <20220720220324.88538-1-peterx@redhat.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=f4qQ8US7; spf=none (imf10.hostedemail.com: domain of david@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1658388534; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=gBu0C6SmChVBYE3EWfPl4wKuLdjdTei9NtZmcfmPFKU=; b=TecOTm6/0bSyCBW8Xe8pjhAXCp9cw3H8hlrguGHHj9n92Ocm3TD8eDYtLXs0lPDV8Do4Ad KXtNgRIbd6q5keICirBYCwqjktBer78W7C8mnP3qrtfq15Q4STlpFkp/XgqyyNlH/cr/Ro cf/tUBO9YJ8hY/+0pa1sqOglX6OtpwM= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1658388534; a=rsa-sha256; cv=none; b=kX619BFpA4CkSmOeUJAunRwmYGvVNvPdIWsYIt4WmkwIS6F08oNIO30lOj5Slt9M7U7uIa BbBqQ1R+vVttQ/h8O4j7IDwl7rA5PTRGy9ranv6yRe4R541R9i4FNnEb7VEtSWFNRf7gjW LYVu0O1hipdK0r8/XdlrxIlrEkXjpvA= X-Rspam-User: X-Rspamd-Queue-Id: 7CD65C008C Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=f4qQ8US7; spf=none (imf10.hostedemail.com: domain of david@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Stat-Signature: p5k9ydonuuba143iguwge3qetz96upe4 X-Rspamd-Server: rspam07 X-HE-Tag: 1658388534-176338 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 21.07.22 00:03, Peter Xu wrote: > The check wanted to make sure when soft-dirty tracking is enabled we won't > grant write bit by accident, as a page fault is needed for dirty tracking. > The intention is correct but we didn't check it right because VM_SOFTDIRTY > set actually means soft-dirty tracking disabled. Fix it. Thanks for digging into this and writing the reproducer. The softdirty logic was rather confusing for me. > > It wasn't a bug for a long time because we used to only optimize the write > bit settings in change_pte_range() for page caches, and since we've got a > higher level check in vma_wants_writenotify(), we will never set the bit > MM_CP_TRY_CHANGE_WRITABLE for soft-dirty enabled page caches, hence even if > we checked with the wrong value of VM_SOFTDIRTY in change_pte_range() it'll > just be an no-op. Functionally it was still correct, even if cpu cycles > wasted. I don't quite follow that explanation and most probably I am missing something. Modifying your test to map page from a file MAP_SHARED gives me under 5.18.11-100.fc35.x86_64: 53,54d52 < FILE *file = fopen("tmpfile", "w+"); < int file_fd; 56d53 < assert(file); 59,61d55 < < file_fd = fileno(file); < ftruncate(file_fd, psize); 63c57 < MAP_SHARED, file_fd, 0); --- > MAP_ANONYMOUS|MAP_PRIVATE, -1, 0); t480s: ~ $ sudo ./tmp ERROR: Wrote page again, soft-dirty=0 (expect: 1 IMHO, while the check in vma_wants_writenotify() is correct and makes sure that the pages are kept R/O by the generic machinery. We get vma_wants_writenotify(), so we activate MM_CP_TRY_CHANGE_WRITABLE. The wrong logic in can_change_pte_writable(), however, maps the page writable again without caring about softdirty. At least that would be my explanation for the failure. But maybe I messes up something else :) > > However after the recent work of anonymous page optimization on exclusive > pages we'll start to make it wrong because anonymous page does not require > the check in vma_wants_writenotify() hence it'll suffer from the wrong > check here in can_change_pte_writable(). > > We can easily verify this with any exclusive anonymous page, like program > below: > > =======8<====== > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > #define BIT_ULL(nr) (1ULL << (nr)) > #define PM_SOFT_DIRTY BIT_ULL(55) > > unsigned int psize; > char *page; > > uint64_t pagemap_read_vaddr(int fd, void *vaddr) > { > uint64_t value; > int ret; > > ret = pread(fd, &value, sizeof(uint64_t), > ((uint64_t)vaddr >> 12) * sizeof(uint64_t)); > assert(ret == sizeof(uint64_t)); > > return value; > } > > void clear_refs_write(void) > { > int fd = open("/proc/self/clear_refs", O_RDWR); > > assert(fd >= 0); > write(fd, "4", 2); > close(fd); > } > > #define check_soft_dirty(str, expect) do { \ > bool dirty = pagemap_read_vaddr(fd, page) & PM_SOFT_DIRTY; \ > if (dirty != expect) { \ > printf("ERROR: %s, soft-dirty=%d (expect: %d)\n", str, dirty, expect); \ > exit(-1); \ > } \ > } while (0) > > int main(void) > { > int fd = open("/proc/self/pagemap", O_RDONLY); > > assert(fd >= 0); > psize = getpagesize(); > page = mmap(NULL, psize, PROT_READ|PROT_WRITE, > MAP_ANONYMOUS|MAP_PRIVATE, -1, 0); > assert(page != MAP_FAILED); > > *page = 1; > check_soft_dirty("Just faulted in page", 1); > clear_refs_write(); > check_soft_dirty("Clear_refs written", 0); > mprotect(page, psize, PROT_READ); > check_soft_dirty("Marked RO", 0); > mprotect(page, psize, PROT_READ|PROT_WRITE); > check_soft_dirty("Marked RW", 0); > *page = 2; > check_soft_dirty("Wrote page again", 1); > > munmap(page, psize); > close(fd); > printf("Test passed.\n"); > > return 0; > } Can we turn that into a vm selftest in tools/testing/selftests/vm/soft-dirty.c, and also extend it by MAP_SHARED froma file as above? > =======8<====== > > So even if commit 64fe24a3e05e kept the old behavior and didn't attempt to > change the behavior here, the bug will only be able to be triggered after > commit 64fe24a3e05e because only anonymous page will suffer from it. > > Fixes: 64fe24a3e05e ("mm/mprotect: try avoiding write faults for exclusive anonymous pages when changing protection") > Signed-off-by: Peter Xu > --- > mm/mprotect.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/mm/mprotect.c b/mm/mprotect.c > index 0420c3ed936c..804807ab14e6 100644 > --- a/mm/mprotect.c > +++ b/mm/mprotect.c > @@ -48,8 +48,11 @@ static inline bool can_change_pte_writable(struct vm_area_struct *vma, > if (pte_protnone(pte) || !pte_dirty(pte)) > return false; > > - /* Do we need write faults for softdirty tracking? */ > - if ((vma->vm_flags & VM_SOFTDIRTY) && !pte_soft_dirty(pte)) > + /* > + * Do we need write faults for softdirty tracking? Note, > + * soft-dirty is enabled when !VM_SOFTDIRTY. > + */ > + if (!(vma->vm_flags & VM_SOFTDIRTY) && !pte_soft_dirty(pte)) > return false; I wonder if we now want, just as in vma_wants_writenotify(), an early check for IS_ENABLED(CONFIG_MEM_SOFT_DIRTY), to optimize this out completely. > > /* Do we need write faults for uffd-wp tracking? */ -- Thanks, David / dhildenb