To: Matthew Wilcox
Cc: Khalid Aziz, "Longpeng (Mike, Cloud Infrastructure Service Product Dept.)", Steven Sistare, Anthony Yznaga, "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "Gonglei (Arei)"
References: <43471cbb-67c6-f189-ef12-0f8302e81b06@oracle.com> <55720e1b39cff0a0f882d8610e7906dc80ea0a01.camel@oracle.com> <88884f55-4991-11a9-d330-5d1ed9d5e688@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [RFC PATCH 0/5] madvise MADV_DOEXEC
Message-ID: <40bad572-501d-e4cf-80e3-9a8daa98dc7e@redhat.com>
Date: Mon, 16 Aug 2021 16:10:28 +0200

>>> Until recently, the CPUs only had 4 1GB TLB entries. I'm sure we
>>> still have customers using that generation of CPUs.
>>> 2MB pages perform
>>> better than 1GB pages on the previous generation of hardware, and I
>>> haven't seen numbers for the next generation yet.
>>
>> I read that somewhere else before, yet we have heavy 1 GiB page users,
>> especially in the context of VMs and DPDK.
>
> I wonder if those users actually benchmarked. Or whether the memory
> savings worked out so well for them that the loss of TLB performance
> didn't matter.

These applications are extremely performance sensitive (i.e., RT
workloads), which is why I'm wondering. I recall that they are most
certainly using more than 4 GiB of memory in real applications.

E.g., the doc [1] even has a note that "For 64-bit applications, it is
recommended to use 1 GB hugepages if the platform supports them."

[1] https://doc.dpdk.org/guides-16.04/linux_gsg/sys_reqs.html

>
>> So, it only works for hugetlbfs in case uffd is not in place (-> no
>> per-process data in the page table) and we have an actual shared mapping.
>> When unsharing, we zap the PUD entry, which will result in allocating a
>> per-process page table on next fault.
>
> I think uffd was a huge mistake. It should have been a filesystem
> instead of a hack on the side of anonymous memory.

Yes, it was. Especially when looking at all the special-casing, for
example even in mm/pagewalk.c.

>
>> I will rephrase my previous statement "hugetlbfs just doesn't raise these
>> problems because we are special casing it all over the place already". For
>> example, not allowing to swap such pages. Disallowing MADV_DONTNEED. Special
>> hugetlbfs locking.
>
> Sure, that's why I want to drag this feature out of "oh this is a
> hugetlb special case" and into "this is something Linux supports".

I would have understood a move to optimize SHMEM internally, similar
to how we seem to optimize hugetlbfs SHMEM internally right now
(although sharing page tables for shmem can still be quite tricky).

I did not follow why we have to play games with MAP_PRIVATE, have
private anonymous pages shared between processes that don't COW,
introduce new syscalls, etc.

-- 
Thanks,

David / dhildenb