From: David Hildenbrand
Date: Mon, 15 Aug 2022 17:07:32 +0200
Subject: Re: [PATCH v2 2/2] mm/hugetlb: support write-faults in shared mappings
To: Gerald Schaefer
Cc: Mike Kravetz, Linux Kernel Mailing List, Linux MM, stable, linux-s390
References: <20220811103435.188481-1-david@redhat.com> <20220811103435.188481-3-david@redhat.com> <20220815153549.0288a9c6@thinkpad>
In-Reply-To: <20220815153549.0288a9c6@thinkpad>

On Mon, Aug 15, 2022 at 3:36 PM Gerald Schaefer wrote:
>
> On Thu, 11 Aug 2022 11:59:09 -0700
> Mike Kravetz wrote:
>
> > On 08/11/22 12:34, David Hildenbrand wrote:
> > > If we ever get a write-fault on a write-protected page in a shared mapping,
> > > we'd be in trouble (again). Instead, we can simply map the page writable.
> > >
> >
> > >
> > > Reason is that uffd-wp doesn't clear the uffd-wp PTE bit when
> > > unregistering and consequently keeps the PTE writeprotected. Reason for
> > > this is to avoid the additional overhead when unregistering. Note
> > > that this is the case also for !hugetlb and that we will end up with
> > > writable PTEs that still have the uffd-wp PTE bit set once we return
> > > from hugetlb_wp(). I'm not touching the uffd-wp PTE bit for now, because it
> > > seems to be a generic thing -- wp_page_reuse() also doesn't clear it.
> > >
> > > VM_MAYSHARE handling in hugetlb_fault() for FAULT_FLAG_WRITE
> > > indicates that MAP_SHARED handling was at least envisioned, but could never
> > > have worked as expected.
> > >
> > > While at it, make sure that we never end up in hugetlb_wp() on write
> > > faults without VM_WRITE, because we don't support maybe_mkwrite()
> > > semantics as commonly used in the !hugetlb case -- for example, in
> > > wp_page_reuse().
> >
> > Nit,
> > to me 'make sure that we never end up in hugetlb_wp()' implies that
> > we would check for condition in callers as opposed to first thing in
> > hugetlb_wp(). However, I am OK with description as it.
>

Hi Gerald,

> Is that new WARN_ON_ONCE() in hugetlb_wp() meant to indicate a real bug?

Most probably, unless I am missing something important.

Something triggers FAULT_FLAG_WRITE on a VMA without VM_WRITE and
hugetlb_wp() would map the pte writable. Consequently, we'd have a
writable pte inside a VMA that does not have write permissions, which
is dubious. My check prevents that and bails out.

Ordinary (!hugetlb) faults have maybe_mkwrite() (e.g., for FOLL_FORCE
or breaking COW) semantics such that we won't be mapping PTEs writable
if the VMA does not have write permissions.

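To make the difference between the two behaviours concrete, here is a
minimal user-space sketch. It is not kernel code: the types, the flag
value and every model_* name are hypothetical stand-ins invented for
illustration, and the real hugetlb_wp() does considerably more. It only
models (1) maybe_mkwrite()-style handling, which leaves the PTE
read-only when the VMA lacks write permission, and (2) a check that
refuses to map the PTE writable in that situation.

```c
#include <stdbool.h>

/* Hypothetical flag value for this model; not the kernel's definition. */
#define MODEL_VM_WRITE 0x1u

struct model_vma { unsigned int vm_flags; };
struct model_pte { bool writable; };

/*
 * maybe_mkwrite()-style semantics: the fault is handled either way, but
 * the PTE only becomes writable if the VMA has write permission.
 */
static struct model_pte model_maybe_mkwrite(struct model_pte pte,
					    const struct model_vma *vma)
{
	if (vma->vm_flags & MODEL_VM_WRITE)
		pte.writable = true;
	return pte;
}

/*
 * Model of a write-fault handler without maybe_mkwrite() semantics:
 * it would always map the PTE writable, so it bails out first when the
 * VMA lacks write permission. Returns true if the PTE was mapped
 * writable, false if the handler bailed out (where the patch under
 * discussion warns once).
 */
static bool model_hugetlb_wp(struct model_pte *pte,
			     const struct model_vma *vma)
{
	if (!(vma->vm_flags & MODEL_VM_WRITE))
		return false;
	pte->writable = true;
	return true;
}
```

In this model, a write fault on a VMA without MODEL_VM_WRITE never
produces a writable PTE in either variant; the second variant
additionally reports the situation to its caller instead of silently
keeping the PTE read-only.
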
I suspect that either

a) Some write fault misses a protection check and ends up triggering a
   FAULT_FLAG_WRITE where we should actually fail early.

b) The write fault is valid and some VMA misses proper flags (VM_WRITE).

c) The write fault is valid (e.g., for breaking COW or FOLL_FORCE) and
   we'd actually want maybe_mkwrite semantics.

> It is triggered by libhugetlbfs testcase "HUGETLB_ELFMAP=R linkhuge_rw"
> (at least on s390), and crashes our CI, because it runs with panic_on_warn
> enabled.
>
> Not sure if this means that we have bug elsewhere, allowing us to
> get to the WARN in hugetlb_wp().

That's what I suspect. Do you have a backtrace?

Note that I'm on vacation this week and might not reply as fast as usual.