From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.3 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1DF18C352A5 for ; Mon, 10 Feb 2020 17:27:18 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C3F4920842 for ; Mon, 10 Feb 2020 17:27:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C3F4920842 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=de.ibm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 6946E6B013A; Mon, 10 Feb 2020 12:27:17 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 645556B013B; Mon, 10 Feb 2020 12:27:17 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 50DF16B013C; Mon, 10 Feb 2020 12:27:17 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0067.hostedemail.com [216.40.44.67]) by kanga.kvack.org (Postfix) with ESMTP id 37BAA6B013A for ; Mon, 10 Feb 2020 12:27:17 -0500 (EST) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id AF511181AC9CB for ; Mon, 10 Feb 2020 17:27:16 +0000 (UTC) X-FDA: 76474898472.29.pail66_5c9443e54361a X-HE-Tag: pail66_5c9443e54361a X-Filterd-Recvd-Size: 11783 Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by imf19.hostedemail.com (Postfix) with ESMTP for ; Mon, 10 Feb 2020 17:27:16 +0000 (UTC) Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 01AHO9ox023889 for ; Mon, 10 Feb 2020 12:27:15 -0500 Received: from e06smtp01.uk.ibm.com (e06smtp01.uk.ibm.com [195.75.94.97]) by mx0b-001b2d01.pphosted.com with ESMTP id 2y1u80g1pt-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 10 Feb 2020 12:27:15 -0500 Received: from localhost by e06smtp01.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 10 Feb 2020 17:27:13 -0000 Received: from b06cxnps4076.portsmouth.uk.ibm.com (9.149.109.198) by e06smtp01.uk.ibm.com (192.168.101.131) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Mon, 10 Feb 2020 17:27:07 -0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 01AHR6qO54132752 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 10 Feb 2020 17:27:06 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 30F8F52050; Mon, 10 Feb 2020 17:27:06 +0000 (GMT) Received: from oc7455500831.ibm.com (unknown [9.145.7.195]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 2CFB952059; Mon, 10 Feb 2020 17:27:05 +0000 (GMT) Subject: Re: [PATCH 01/35] mm:gup/writeback: add callbacks for inaccessible pages To: Janosch Frank , Andrew Morton , Marc Zyngier , Sean Christopherson , Tom Lendacky Cc: KVM , Cornelia Huck , David Hildenbrand , Thomas Huth , Ulrich Weigand , Claudio Imbrenda , Andrea Arcangeli , linux-s390 , Michael Mueller , Vasily Gorbik , linux-mm@kvack.org, kvm-ppc@vger.kernel.org, Paolo Bonzini References: <20200207113958.7320-1-borntraeger@de.ibm.com> <20200207113958.7320-2-borntraeger@de.ibm.com> From: Christian Borntraeger Autocrypt: addr=borntraeger@de.ibm.com; prefer-encrypt=mutual; keydata= xsFNBE6cPPgBEAC2VpALY0UJjGmgAmavkL/iAdqul2/F9ONz42K6NrwmT+SI9CylKHIX+fdf J34pLNJDmDVEdeb+brtpwC9JEZOLVE0nb+SR83CsAINJYKG3V1b3Kfs0hydseYKsBYqJTN2j CmUXDYq9J7uOyQQ7TNVoQejmpp5ifR4EzwIFfmYDekxRVZDJygD0wL/EzUr8Je3/j548NLyL 4Uhv6CIPf3TY3/aLVKXdxz/ntbLgMcfZsDoHgDk3lY3r1iwbWwEM2+eYRdSZaR4VD+JRD7p8 0FBadNwWnBce1fmQp3EklodGi5y7TNZ/CKdJ+jRPAAnw7SINhSd7PhJMruDAJaUlbYaIm23A +82g+IGe4z9tRGQ9TAflezVMhT5J3ccu6cpIjjvwDlbxucSmtVi5VtPAMTLmfjYp7VY2Tgr+ T92v7+V96jAfE3Zy2nq52e8RDdUo/F6faxcumdl+aLhhKLXgrozpoe2nL0Nyc2uqFjkjwXXI OBQiaqGeWtxeKJP+O8MIpjyGuHUGzvjNx5S/592TQO3phpT5IFWfMgbu4OreZ9yekDhf7Cvn /fkYsiLDz9W6Clihd/xlpm79+jlhm4E3xBPiQOPCZowmHjx57mXVAypOP2Eu+i2nyQrkapaY IdisDQfWPdNeHNOiPnPS3+GhVlPcqSJAIWnuO7Ofw1ZVOyg/jwARAQABzUNDaHJpc3RpYW4g Qm9ybnRyYWVnZXIgKDJuZCBJQk0gYWRkcmVzcykgPGJvcm50cmFlZ2VyQGxpbnV4LmlibS5j b20+wsF5BBMBAgAjBQJdP/hMAhsDBwsJCAcDAgEGFQgCCQoLBBYCAwECHgECF4AACgkQEXu8 gLWmHHy/pA/+JHjpEnd01A0CCyfVnb5fmcOlQ0LdmoKWLWPvU840q65HycCBFTt6V62cDljB kXFFxMNA4y/2wqU0H5/CiL963y3gWIiJsZa4ent+KrHl5GK1nIgbbesfJyA7JqlB0w/E/SuY NRQwIWOo/uEvOgXnk/7+rtvBzNaPGoGiiV1LZzeaxBVWrqLtmdi1iulW/0X/AlQPuF9dD1Px hx+0mPjZ8ClLpdSp5d0yfpwgHtM1B7KMuQPQZGFKMXXTUd3ceBUGGczsgIMipZWJukqMJiJj QIMH0IN7XYErEnhf0GCxJ3xAn/J7iFpPFv8sFZTvukntJXSUssONnwiKuld6ttUaFhSuSoQg OFYR5v7pOfinM0FcScPKTkrRsB5iUvpdthLq5qgwdQjmyINt3cb+5aSvBX2nNN135oGOtlb5 tf4dh00kUR8XFHRrFxXx4Dbaw4PKgV3QLIHKEENlqnthH5t0tahDygQPnSucuXbVQEcDZaL9 WgJqlRAAj0pG8M6JNU5+2ftTFXoTcoIUbb0KTOibaO9zHVeGegwAvPLLNlKHiHXcgLX1tkjC DrvE2Z0e2/4q7wgZgn1kbvz7ZHQZB76OM2mjkFu7QNHlRJ2VXJA8tMXyTgBX6kq1cYMmd/Hl OhFrAU3QO1SjCsXA2CDk9MM1471mYB3CTXQuKzXckJnxHkHOwU0ETpw8+AEQAJjyNXvMQdJN t07BIPDtbAQk15FfB0hKuyZVs+0lsjPKBZCamAAexNRk11eVGXK/YrqwjChkk60rt3q5i42u PpNMO9aS8cLPOfVft89Y654Qd3Rs1WRFIQq9xLjdLfHh0i0jMq5Ty+aiddSXpZ7oU6E+ud+X Czs3k5RAnOdW6eV3+v10sUjEGiFNZwzN9Udd6PfKET0J70qjnpY3NuWn5Sp1ZEn6lkq2Zm+G 9G3FlBRVClT30OWeiRHCYB6e6j1x1u/rSU4JiNYjPwSJA8EPKnt1s/Eeq37qXXvk+9DYiHdT PcOa3aNCSbIygD3jyjkg6EV9ZLHibE2R/PMMid9FrqhKh/cwcYn9FrT0FE48/2IBW5mfDpAd YvpawQlRz3XJr2rYZJwMUm1y+49+1ZmDclaF3s9dcz2JvuywNq78z/VsUfGz4Sbxy4ShpNpG REojRcz/xOK+FqNuBk+HoWKw6OxgRzfNleDvScVmbY6cQQZfGx/T7xlgZjl5Mu/2z+ofeoxb vWWM1YCJAT91GFvj29Wvm8OAPN/+SJj8LQazd9uGzVMTz6lFjVtH7YkeW/NZrP6znAwv5P1a DdQfiB5F63AX++NlTiyA+GD/ggfRl68LheSskOcxDwgI5TqmaKtX1/8RkrLpnzO3evzkfJb1 D5qh3wM1t7PZ+JWTluSX8W25ABEBAAHCwV8EGAECAAkFAk6cPPgCGwwACgkQEXu8gLWmHHz8 2w//VjRlX+tKF3szc0lQi4X0t+pf88uIsvR/a1GRZpppQbn1jgE44hgF559K6/yYemcvTR7r 6Xt7cjWGS4wfaR0+pkWV+2dbw8Xi4DI07/fN00NoVEpYUUnOnupBgychtVpxkGqsplJZQpng v6fauZtyEcUK3dLJH3TdVQDLbUcL4qZpzHbsuUnTWsmNmG4Vi0NsEt1xyd/Wuw+0kM/oFEH1 4BN6X9xZcG8GYUbVUd8+bmio8ao8m0tzo4pseDZFo4ncDmlFWU6hHnAVfkAs4tqA6/fl7RLN JuWBiOL/mP5B6HDQT9JsnaRdzqF73FnU2+WrZPjinHPLeE74istVgjbowvsgUqtzjPIG5pOj cAsKoR0M1womzJVRfYauWhYiW/KeECklci4TPBDNx7YhahSUlexfoftltJA8swRshNA/M90/ i9zDo9ySSZHwsGxG06ZOH5/MzG6HpLja7g8NTgA0TD5YaFm/oOnsQVsf2DeAGPS2xNirmknD jaqYefx7yQ7FJXXETd2uVURiDeNEFhVZWb5CiBJM5c6qQMhmkS4VyT7/+raaEGgkEKEgHOWf ZDP8BHfXtszHqI3Fo1F4IKFo/AP8GOFFxMRgbvlAs8z/+rEEaQYjxYJqj08raw6P4LFBqozr nS4h0HDFPrrp1C2EMVYIQrMokWvlFZbCpsdYbBI= Date: Mon, 10 Feb 2020 18:27:04 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.3.0 MIME-Version: 1.0 In-Reply-To: <20200207113958.7320-2-borntraeger@de.ibm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US X-TM-AS-GCONF: 00 x-cbid: 20021017-4275-0000-0000-0000039FD488 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 20021017-4276-0000-0000-000038B409F8 Message-Id: <28792269-e053-ac70-a344-45612ee5c729@de.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138,18.0.572 definitions=2020-02-10_06:2020-02-10,2020-02-10 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 adultscore=0 mlxlogscore=999 spamscore=0 priorityscore=1501 suspectscore=2 impostorscore=0 lowpriorityscore=0 phishscore=0 mlxscore=0 clxscore=1015 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2001150001 definitions=main-2002100131 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: CC Marc Zyngier for KVM on ARM. Marc, see below. Will there be any use for this on KVM/ARM in the future? CC Sean Christopherson/Tom Lendacky. Any obvious use case for Intel/AMD to have a callback before a page is used for I/O? Andrew (or other mm people) any chance to get an ACK for this change? I could then carry that via s390 or KVM tree. Or if you want to carry that yourself I can send an updated version (we need to kind of=20 synchronize that Linus will pull the KVM changes after the mm changes). Andrea asked if others would benefit from this, so here are some more information about this (and I can also put this into the patch description). So we have talked to the POWER folks. They do not use the standard normal memory management, instead they have a hard split between secure and normal memory. The secure memory is the handled by the hypervisor as device memory and the ultravisor and the hypervisor move this forth and back when needed. On s390 there is no *separate* pool of physical pages that are secure. Instead, *any* physical page can be marked as secure or not, by setting a bit in a per-page data structure that hardware uses to stop unauthorized access. (That bit is under control of the ultravisor.) Note that one side effect of this strategy is that the decision *which* secure pages to encrypt and then swap out is actually done by the hypervisor, not the ultravisor. In our case, the hypervisor is Linux/KVM, so we're using the regular Linux memory management scheme (active/inactive LRU lists etc.) to make this decision. The advantage is that the Ultravisor code does not need to itself implement any memory management code, making it a lot simpler. However, in the end this is why we need the hook into Linux memory management: once Linux has decided to swap a page out, we need to get a chance to tell the Ultravisor to "export" the page (i.e., encrypt its contents and mark it no longer secure). As outlined below this should be a no-op for anybody not opting in. Christian =20 On 07.02.20 12:39, Christian Borntraeger wrote: > From: Claudio Imbrenda >=20 > With the introduction of protected KVM guests on s390 there is now a > concept of inaccessible pages. These pages need to be made accessible > before the host can access them. >=20 > While cpu accesses will trigger a fault that can be resolved, I/O > accesses will just fail. We need to add a callback into architecture > code for places that will do I/O, namely when writeback is started or > when a page reference is taken. >=20 > Signed-off-by: Claudio Imbrenda > Signed-off-by: Christian Borntraeger > --- > include/linux/gfp.h | 6 ++++++ > mm/gup.c | 2 ++ > mm/page-writeback.c | 1 + > 3 files changed, 9 insertions(+) >=20 > diff --git a/include/linux/gfp.h b/include/linux/gfp.h > index e5b817cb86e7..be2754841369 100644 > --- a/include/linux/gfp.h > +++ b/include/linux/gfp.h > @@ -485,6 +485,12 @@ static inline void arch_free_page(struct page *pag= e, int order) { } > #ifndef HAVE_ARCH_ALLOC_PAGE > static inline void arch_alloc_page(struct page *page, int order) { } > #endif > +#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE > +static inline int arch_make_page_accessible(struct page *page) > +{ > + return 0; > +} > +#endif > =20 > struct page * > __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferr= ed_nid, > diff --git a/mm/gup.c b/mm/gup.c > index 7646bf993b25..a01262cd2821 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -257,6 +257,7 @@ static struct page *follow_page_pte(struct vm_area_= struct *vma, > page =3D ERR_PTR(-ENOMEM); > goto out; > } > + arch_make_page_accessible(page); > } > if (flags & FOLL_TOUCH) { > if ((flags & FOLL_WRITE) && > @@ -1870,6 +1871,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long= addr, unsigned long end, > =20 > VM_BUG_ON_PAGE(compound_head(page) !=3D head, page); > =20 > + arch_make_page_accessible(page); > SetPageReferenced(page); > pages[*nr] =3D page; > (*nr)++; > diff --git a/mm/page-writeback.c b/mm/page-writeback.c > index 2caf780a42e7..0f0bd14571b1 100644 > --- a/mm/page-writeback.c > +++ b/mm/page-writeback.c > @@ -2806,6 +2806,7 @@ int __test_set_page_writeback(struct page *page, = bool keep_write) > inc_lruvec_page_state(page, NR_WRITEBACK); > inc_zone_page_state(page, NR_ZONE_WRITE_PENDING); > } > + arch_make_page_accessible(page); > unlock_page_memcg(page); As outlined by Ulrich, we can move the callback after the unlock. > return ret; > =20 >=20