Date: Mon, 27 Jun 2022 10:59:08 -0400
From: Peter Xu
To: David Hildenbrand
Cc: Nadav Amit, Linux MM, Mike Kravetz, Hugh Dickins, Andrew Morton,
    Axel Rasmussen, Mike Rapoport
Subject: Re: [PATCH v1 2/5] userfaultfd: introduce access-likely mode for common operations

On Mon, Jun 27, 2022 at 03:27:49PM +0200, David Hildenbrand wrote:
> > Fundamentally, access bit has more meaningful context (0 means cold, 1
> > means hot), for dirty it's really more a perf thing to me (when clear,
> > it'll take extra cycles to set it when memory write happens to it; being
> > clear _may_ help only for the tlb flush example you mentioned but I'm not
> > fully convinced that's correct).
> >
> > Maybe with the to be proposed RFC patch for tlb flush we can know whether
> > that should be something we can rely on. It'll add more dependency on this
> > work which I'm sorry to say. It's just that IMHO we should think carefully
> > for the write-hint because this is a solid new uABI we're talking about.
> >
> > The other option is we can introduce the access hint first and think more
> > on the dirty one (we can always add it when proper). What do you think?
> > Also, David please chime in anytime if I missed the whole point when you
> > proposed the idea.
>
> Well, if we have an ABI that places pages without further information
> *why* we're doing that makes us having to guess what to do or what not
> to do, and I think the zeropage placement is a prime example for that.
> Personally, I think communicating the intention in forms of hints is
> something that doesn't leak implementation details into an ABI.
>
> So "no planned access" vs. "read_likely" vs. "write_likely" conceptually
> makes sense to me.
>
> As I raised previously, *if* we want to let the user affect the dirty
> bit setting (1) is then a pure implementation detail. Or whatever else
> we might want to do.
>
> But I also want to raise awareness that architectures that don't have a
> hw-set dirty bit have to use page faults to mimic dirty tracking. IIRC,
> s390x is a prime example for that: pte_mkclean() sets the WP bit and
> marks the page dirty from the write fault. So it's even more expensive
> than on other architectures.

That last point seems to support the view that we'd rather have a
redundant dirty bit in the ptes than accidentally not have it, even when
both are safe.

So to me WRITE_LIKELY is still mostly about the dirty bit, apart from the
ZEROPAGE case.

I don't have a strong opinion on how we should name that flag. If we want
to insist on WRITE_LIKELY but only for ZEROPAGE, I think that's fine. It's
just that if I were the user app, I'd prefer to be sure the page is
allocated after UFFDIO_ZEROPAGE returns, rather than only providing a hint
and having the kernel say "we'll do something but nothing is guaranteed".

I also fully agree that we don't want to expose implementation details, but
my question was more about whether we want that hint at all as a generic
one, and in which cases it helps outside ZEROPAGE. For an "it can be
accessed" hint I have an answer, and it seems to apply to most of the uffd
ioctls; a "it can be written" hint does not look as generic.

Thanks,

-- 
Peter Xu