From: Peter Gonda <pgonda@google.com>
Date: Fri, 24 Jun 2022 08:19:52 -0600
Subject: Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled
To: "Kalra, Ashish"
Cc: the arch/x86 maintainers, LKML, kvm list, linux-coco@lists.linux.dev, linux-mm@kvack.org, Linux Crypto Mailing List, Thomas Gleixner, Ingo Molnar, Joerg Roedel, "Lendacky, Thomas", H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Borislav Petkov, "Roth, Michael", Vlastimil Babka, Kirill A. Shutemov, Andi Kleen, Tony Luck, Marc Orr, Sathyanarayanan Kuppuswamy, Alper Gun, "Dr. David Alan Gilbert", jarkko@kernel.org
References: <3a51840f6a80c87b39632dc728dbd9b5dd444cd7.1655761627.git.ashish.kalra@amd.com>
Peter Anvin" , Ard Biesheuvel , Paolo Bonzini , Sean Christopherson , Vitaly Kuznetsov , Jim Mattson , Andy Lutomirski , Dave Hansen , Sergio Lopez , Peter Zijlstra , Srinivas Pandruvada , David Rientjes , Dov Murik , Tobin Feldman-Fitzthum , Borislav Petkov , "Roth, Michael" , Vlastimil Babka , "Kirill A . Shutemov" , Andi Kleen , Tony Luck , Marc Orr , Sathyanarayanan Kuppuswamy , Alper Gun , "Dr. David Alan Gilbert" , "jarkko@kernel.org" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=VfEBDmF+; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf30.hostedemail.com: domain of pgonda@google.com designates 209.85.208.171 as permitted sender) smtp.mailfrom=pgonda@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1656080405; a=rsa-sha256; cv=none; b=WwbIkbtYe9o8LM70fSty6U7sJAh3tbW4g/PPqV2RiKbxtnXEb1GQq6i69vaLBYwgUjNvTD ygKF6JRbJC3pqI4YhOGL97Hr6KXZlCMM0BfStXLhoieJ4Qy6Ds8zSGBlb8Zl6x6Q46kG7L Z0WfTSzhLg4PD/1n6A/yAL60JtEpYcA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1656080405; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=DX4gKze+gSLugENwiJT/N67T0nY4kiuroWR4tE5f6EI=; b=71k925qkj9PHUN841Ce9F6JuOuqBBktT4/WnMysaKCM1swXP1piPYr7uMzEa16sINC9RjR At8YufP+YdLB9rwT7twPlSUeILGbLps7di/irtQyOpLEE/5eQ3nUkX1nYvjlEiu/9GTBXA tw4OJZvZcH7Yumz/crbqDvoG1hsyZfc= X-Stat-Signature: tfyr9wtkfxas7dtih8qk7a59kzypyfot X-Rspamd-Server: rspam08 X-Rspam-User: X-Rspamd-Queue-Id: D12CC80035 Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=VfEBDmF+; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf30.hostedemail.com: domain of pgonda@google.com designates 209.85.208.171 as permitted sender) smtp.mailfrom=pgonda@google.com X-HE-Tag: 1656080405-159157 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Jun 21, 2022 at 2:17 PM Kalra, Ashish wrote: > > [Public] > > Hello Peter, > > >> +static int snp_reclaim_pages(unsigned long pfn, unsigned int npages, > >> +bool locked) { > >> + struct sev_data_snp_page_reclaim data; > >> + int ret, err, i, n =3D 0; > >> + > >> + for (i =3D 0; i < npages; i++) { > > >What about setting |n| here too, also the other increments. > > >for (i =3D 0, n =3D 0; i < npages; i++, n++, pfn++) > > Yes that is simpler. > > >> + memset(&data, 0, sizeof(data)); > >> + data.paddr =3D pfn << PAGE_SHIFT; > >> + > >> + if (locked) > >> + ret =3D __sev_do_cmd_locked(SEV_CMD_SNP_PAGE_R= ECLAIM, &data, &err); > >> + else > >> + ret =3D sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, > >> + &data, &err); > > > Can we change `sev_cmd_mutex` to some sort of nesting lock type? That c= ould clean up this if (locked) code. 
>
> > +static inline int rmp_make_firmware(unsigned long pfn, int level)
> > +{
> > +       return rmp_make_private(pfn, 0, level, 0, true);
> > +}
> > +
> > +static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, bool to_fw, bool locked,
> > +                            bool need_reclaim)
>
> > This function can do a lot, and when I read the call sites it's hard to see what it's doing, since we have a combination of arguments which tell us what behavior is happening, some of which are not valid (ex: to_fw == true and need_reclaim == true is an invalid argument combination).
>
> to_fw is used to make a firmware page and need_reclaim is for freeing the firmware page, so they are going to be mutually exclusive.
>
> I actually can connect it quite logically with the callers:
> snp_alloc_firmware_pages will call with to_fw = true and need_reclaim = false,
> and snp_free_firmware_pages will do the opposite, to_fw = false and need_reclaim = true.
>
> That seems straightforward to look at.

This might be a preference thing but I find it not straightforward. When
I am reading through unmap_firmware_writeable() and I see

        /* Transition the pre-allocated buffer to the firmware state. */
        if (snp_set_rmp_state(__pa(map->host), npages, true, true, false))
                return -EFAULT;

I don't actually know what snp_set_rmp_state() is doing unless I go look
at the definition and see what all those booleans mean. This is unlike
the rmp_make_shared() and rmp_make_private() functions, each of which
tells me a lot more about what the function will do just from the name.

> > Also this for loop over |npages| is duplicated from snp_reclaim_pages(). One improvement here is that in the current snp_reclaim_pages(), if we fail to reclaim a page we assume we cannot reclaim the next pages; this may cause us to snp_leak_pages() more pages than we actually need to.
>
> Yes that is true.
>
> > What about something like this?
> >
> > static void snp_leak_page(u64 pfn, enum pg_level level)
> > {
> >         memory_failure(pfn, 0);
> >         dump_rmpentry(pfn);
> > }
> >
> > static int snp_reclaim_page(u64 pfn, enum pg_level level)
> > {
> >         int ret;
> >         struct sev_data_snp_page_reclaim data;
> >
> >         ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
> >         if (ret)
> >                 goto cleanup;
> >
> >         ret = rmp_make_shared(pfn, level);
> >         if (ret)
> >                 goto cleanup;
> >
> >         return 0;
> >
> > cleanup:
> >         snp_leak_page(pfn, level);
> > }
> >
> > typedef int (*rmp_state_change_func) (u64 pfn, enum pg_level level);
> >
> > static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, rmp_state_change_func state_change, rmp_state_change_func cleanup)
> > {
> >         struct sev_data_snp_page_reclaim data;
> >         int ret, err, i, n = 0;
> >
> >         for (i = 0, n = 0; i < npages; i++, n++, pfn++) {
> >                 ret = state_change(pfn, PG_LEVEL_4K);
> >                 if (ret)
> >                         goto cleanup;
> >         }
> >
> >         return 0;
> >
> > cleanup:
> >         for (; i >= 0; i--, n--, pfn--) {
> >                 cleanup(pfn, PG_LEVEL_4K);
> >         }
> >
> >         return ret;
> > }
> >
> > Then inside of __snp_alloc_firmware_pages():
> >
> > snp_set_rmp_state(paddr, npages, rmp_make_firmware, snp_reclaim_page);
> >
> > And inside of __snp_free_firmware_pages():
> >
> > snp_set_rmp_state(paddr, npages, snp_reclaim_page, snp_leak_page);
> >
> > Just a suggestion, feel free to ignore. The readability comment could be addressed much less invasively by just making separate functions for each valid combination of arguments here, like snp_set_rmp_fw_state(), snp_set_rmp_shared_state(), snp_set_rmp_release_state() or something.
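
To make that last, less invasive option concrete, the named wrappers could
be as thin as the sketch below (the wrapper names are hypothetical; only
snp_set_rmp_state() and its argument order come from the patch hunk quoted
above):

static int snp_set_rmp_fw_state(unsigned long paddr, unsigned int npages, bool locked)
{
        /* to_fw = true, need_reclaim = false: hand the pages to firmware. */
        return snp_set_rmp_state(paddr, npages, true, locked, false);
}

static int snp_set_rmp_shared_state(unsigned long paddr, unsigned int npages, bool locked)
{
        /* to_fw = false, need_reclaim = true: reclaim and return to shared. */
        return snp_set_rmp_state(paddr, npages, false, locked, true);
}

A call site like the unmap_firmware_writeable() example above would then
read snp_set_rmp_fw_state(__pa(map->host), npages, true), with no boolean
soup to decode.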
>
> >> +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order, bool locked)
> >> +{
> >> +       unsigned long npages = 1ul << order, paddr;
> >> +       struct sev_device *sev;
> >> +       struct page *page;
> >> +
> >> +       if (!psp_master || !psp_master->sev_data)
> >> +               return NULL;
> >> +
> >> +       page = alloc_pages(gfp_mask, order);
> >> +       if (!page)
> >> +               return NULL;
> >> +
> >> +       /* If SEV-SNP is initialized then add the page in RMP table. */
> >> +       sev = psp_master->sev_data;
> >> +       if (!sev->snp_inited)
> >> +               return page;
> >> +
> >> +       paddr = __pa((unsigned long)page_address(page));
> >> +       if (snp_set_rmp_state(paddr, npages, true, locked, false))
> >> +               return NULL;
>
> > So what about the case where snp_set_rmp_state() fails but we were able to reclaim all the pages? Should we be able to signal that to callers so that we could free |page| here? But given this is an error path already, maybe we can optimize this in a follow-up series.
>
> Yes, we should actually tie in to snp_reclaim_pages() success or failure here in the case we were able to successfully unroll some or all of the firmware state change.
>
> > +
> > +       return page;
> > +}
> > +
> > +void *snp_alloc_firmware_page(gfp_t gfp_mask)
> > +{
> > +       struct page *page;
> > +
> > +       page = __snp_alloc_firmware_pages(gfp_mask, 0, false);
> > +
> > +       return page ? page_address(page) : NULL;
> > +}
> > +EXPORT_SYMBOL_GPL(snp_alloc_firmware_page);
> > +
> > +static void __snp_free_firmware_pages(struct page *page, int order, bool locked)
> > +{
> > +       unsigned long paddr, npages = 1ul << order;
> > +
> > +       if (!page)
> > +               return;
> > +
> > +       paddr = __pa((unsigned long)page_address(page));
> > +       if (snp_set_rmp_state(paddr, npages, false, locked, true))
> > +               return;
>
> > Here we may be able to free some of |page| depending on where inside of snp_set_rmp_state() we failed. But again, given this is an error path already, maybe we can optimize this in a follow-up series.
>
> Yes, we probably should be able to free some of the page(s) depending on how many page(s) got reclaimed in snp_set_rmp_state().
> But these reclamation failures may not be very common, so any failure is indicative of a bigger issue. It may well be that when a single page fails to reclaim, all the subsequent pages fail too, so it is simpler to follow one recovery procedure than to handle a more complex recovery where one chunk of pages is reclaimed and another chunk is not.
>
> Thanks,
> Ashish
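
For reference, a purely illustrative sketch of the partial recovery
discussed above (none of this is in the patch: snp_reclaim_pages_count()
is an invented variant that would return how many leading 4K pages were
successfully reclaimed, and the split_page() path assumes the pages were
not allocated with __GFP_COMP):

static void __snp_free_firmware_pages_partial(struct page *page, int order, bool locked)
{
        unsigned long npages = 1ul << order;
        unsigned long paddr = __pa(page_address(page));
        int reclaimed;

        /* Hypothetical helper: count of pages reclaimed before the first failure. */
        reclaimed = snp_reclaim_pages_count(paddr, npages, locked);
        if (reclaimed <= 0)
                return;                         /* nothing reclaimed: leak the whole allocation */

        if (reclaimed == npages) {
                __free_pages(page, order);      /* everything came back, free normally */
                return;
        }

        /* Partial reclaim: split the high-order block and free only the reclaimed prefix. */
        split_page(page, order);
        while (reclaimed--)
                __free_page(page++);
}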