From: Axel Rasmussen
Date: Thu, 21 Jan 2021 14:13:50 -0800
Subject: Re: [PATCH 0/9] userfaultfd: add minor fault handling
To: Peter Xu
Cc: Alexander Viro, Alexey Dobriyan, Andrea Arcangeli, Andrew Morton,
 Anshuman Khandual, Catalin Marinas, Chinwen Chang, Huang Ying,
 Ingo Molnar, Jann Horn, Jerome Glisse, Lokesh Gidra,
 "Matthew Wilcox (Oracle)", Michael Ellerman, Michal Koutný,
 Michel Lespinasse, Mike Kravetz, Mike Rapoport, Nicholas Piggin,
 Shaohua Li, Shawn Anastasio, Steven Rostedt, Steven Price,
 Vlastimil Babka, LKML, linux-fsdevel@vger.kernel.org, Linux MM,
 Adam Ruprecht, Cannon Matthews, "Dr . David Alan Gilbert",
 David Rientjes, Oliver Upton
References: <20210115190451.3135416-1-axelrasmussen@google.com> <20210121191241.GG260413@xz-x1>
In-Reply-To: <20210121191241.GG260413@xz-x1>

On Thu, Jan 21, 2021 at 11:12 AM Peter Xu wrote:
>
> On Fri, Jan 15, 2021 at 11:04:42AM -0800, Axel Rasmussen wrote:
> > UFFDIO_COPY and UFFDIO_ZEROPAGE cannot be used to resolve minor faults. Without
> > modifications, the existing codepath assumes a new page needs to be allocated.
> > This is okay, since userspace must have a second non-UFFD-registered mapping
> > anyway, thus there isn't much reason to want to use these in any case (just
> > memcpy or memset or similar).
> >
> > - If UFFDIO_COPY is used on a minor fault, -EEXIST is returned.
>
> When minor fault the dst VM will report to src with the address. The src could
> checkup whether dst contains the latest data on that (pmd) page and either:
>
>   - it's latest, then tells dst, dst does UFFDIO_CONTINUE
>
>   - it's not latest, then tells dst (probably along with the page data? if
>     hugetlbfs doesn't support double map, we'd need to batch all the dirty
>     small pages in one shot), dst does whatever to replace the page
>
> Then, I'm thinking what would be the way to replace an old page.. is that one
> FALLOC_FL_PUNCH_HOLE plus one UFFDIO_COPY at last?

When I wrote this, my thinking was that users of this feature would
have two mappings, one of which is not UFFD registered at all. So, to
replace the existing page contents, userspace would just write to the
non-UFFD mapping (with memcpy() or whatever else, or we could get
fancy and imagine using some RDMA technology to copy the page over the
network from the live migration source directly in place). After
performing the write, we just UFFDIO_CONTINUE.

I believe FALLOC_FL_PUNCH_HOLE / MADV_REMOVE doesn't work with
hugetlbfs? Once shmem support is implemented, I would expect
FALLOC_FL_PUNCH_HOLE + UFFDIO_COPY to work, but I wonder if such an
operation would be more expensive than just copying using the other
side of the shared mapping?

>
> Thanks,
>
> --
> Peter Xu
>
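
To make the two-mapping idea concrete, here is a rough sketch of just the
"write through the other side of the shared mapping" part. It is not code
from the series. The names two_mapping_demo() and install_page() are made up
for illustration, an ordinary tmpfile() stands in for the hugetlbfs file, and
neither mapping is actually registered with userfaultfd, so the
UFFDIO_CONTINUE step appears only as a comment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: place fresh contents via the non-UFFD alias mapping. */
static void install_page(char *alias, size_t offset, const char *src, size_t len)
{
    assert(len > 0);
    memcpy(alias + offset, src, len);
    /*
     * In the real flow, this is where userspace would issue UFFDIO_CONTINUE
     * on the UFFD-registered range to wake the faulting thread. Omitted
     * here because this sketch registers nothing with userfaultfd.
     */
}

/* Returns 0 if data written via the alias is visible via the other mapping. */
int two_mapping_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    /* Assumption: a plain temporary file stands in for hugetlbfs/shmem. */
    FILE *f = tmpfile();
    if (!f)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, (off_t)len) != 0) {
        fclose(f);
        return -1;
    }

    /* "guest" plays the UFFD-registered mapping, "alias" the plain one. */
    char *guest = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *alias = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int rc = -1;

    if (guest != MAP_FAILED && alias != MAP_FAILED) {
        const char fresh[64] = "page contents from the migration source";
        install_page(alias, 0, fresh, sizeof(fresh));
        /* Both mappings share the same page cache pages, so no second copy. */
        rc = (memcmp(guest, fresh, sizeof(fresh)) == 0) ? 0 : -1;
    }

    if (guest != MAP_FAILED)
        munmap(guest, len);
    if (alias != MAP_FAILED)
        munmap(alias, len);
    fclose(f);
    return rc;
}
```

Calling two_mapping_demo() from a small main() is enough to try it. The point
is only that the update costs a single memcpy() into the alias, with no hole
punch and no page allocation on the registered side.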