Subject: Re: Onlining CXL Type2 device coherent memory
To: Vikram Sethi, Dan Williams
Cc: "linux-cxl@vger.kernel.org", "Natu, Mahesh", "Rudoff, Andy", Jeff Smith, Mark Hairgrove, "jglisse@redhat.com", Linux MM, Linux ACPI, Anshuman Khandual, "alex.williamson@redhat.com", Samer El-Haj-Mahmoud, Shanker Donthineni
References: <451b2571-c3e8-97d8-bfd0-f8054a1b75c5@redhat.com> <958912b2-1436-378f-43d7-cbc5c8955ffd@redhat.com>
From: David Hildenbrand
Organization: Red Hat GmbH
Message-ID: <2f9fa312-e080-d995-eb82-1ac9e6128a33@redhat.com>
Date: Mon, 2 Nov 2020 18:53:32 +0100

On 02.11.20 17:17, Vikram Sethi wrote:
> Hi David,
>> From: David Hildenbrand
>> On 31.10.20 17:51, Dan Williams wrote:
>>> On Sat, Oct 31, 2020 at 3:21 AM David Hildenbrand wrote:
>>>>
>>>> On 30.10.20 21:37, Dan Williams wrote:
>>>>> On Wed, Oct 28, 2020 at 4:06 PM Vikram Sethi wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I wanted to kick off a discussion on how Linux onlining of CXL [1]
>>>>>> type 2 device Coherent memory aka Host managed device memory (HDM)
>>>>>> will work for type 2 CXL devices which are available/plugged in at
>>>>>> boot. A type 2 CXL device can be simply thought of as an accelerator
>>>>>> with coherent device memory, that also has a CXL.cache to cache
>>>>>> system memory.
>>>>>>
>>>>>> One could envision that BIOS/UEFI could expose the HDM in EFI memory
>>>>>> map as conventional memory as well as in ACPI SRAT/SLIT/HMAT.
>>>>>> However, at least on some architectures (arm64) EFI conventional
>>>>>> memory available at kernel boot memory cannot be offlined, so this
>>>>>> may not be suitable on all architectures.
>>>>>
>>>>> That seems an odd restriction. Add David, linux-mm, and linux-acpi as
>>>>> they might be interested / have comments on this restriction as well.
>>>>>
>>>>
>>>> I am missing some important details.
>>>>
>>>> a) What happens after offlining? Will the memory be remove_memory()'ed?
>>>> Will the device get physically unplugged?
>>>>
> Not always IMO. If the device was getting reset, the HDM memory is going to be
> unavailable while device is reset. Offlining the memory around the reset would

Ouch, that speaks IMHO completely against exposing it as System RAM as
default.

> be sufficient, but depending if driver had done the add_memory in probe,
> it perhaps would be onerous to have to remove_memory as well before reset,
> and then add it back after reset. I realize you’re saying such a procedure
> would be abusing hotplug framework, and we could perhaps require that memory
> be removed prior to reset, but not clear to me that it *must* be removed for
> correctness.
>
> Another usecase of offlining without removing HDM could be around
> Virtualization/passing entire device with its memory to a VM. If device was
> being used in the host kernel, and is then unbound, and bound to vfio-pci
> (vfio-cxl?), would we expect vfio-pci to add_memory_driver_managed?

At least for passing through memory to VMs (via KVM), you don't actually
need struct pages / memory exposed to the buddy via
add_memory_driver_managed(). Actually, doing that sounds like the wrong
approach.

E.g., you would "allocate" the memory via devdax/dax_hmat and directly
map the resulting device into guest address space. At least that's what
some people are doing with

-- 
Thanks,

David / dhildenb