From: "Huang, Ying"
To: Byungchul Park
Cc: Dave Hansen
Subject: Re: [PATCH v10 00/12] LUF(Lazy Unmap Flush) reducing tlb numbers over 90%
In-Reply-To: <20240527015732.GA61604@system.software.com> (Byungchul Park's message of "Mon, 27 May 2024 10:57:32 +0900")
References: <20240510065206.76078-1-byungchul@sk.com> <982317c0-7faa-45f0-82a1-29978c3c9f4d@intel.com> <20240527015732.GA61604@system.software.com>
Date: Mon, 27 May 2024 11:10:15 +0800
Message-ID: <8734q46jc8.fsf@yhuang6-desk2.ccr.corp.intel.com>

Byungchul Park writes:

> On Fri, May 24, 2024 at 10:16:39AM -0700, Dave Hansen wrote:
>> On 5/9/24 23:51, Byungchul Park wrote:
>> > To achieve that:
>> >
>> > 1. For the folios that map only to non-writable tlb entries, prevent
>> >    tlb flush during unmapping but perform it just before the folios
>> >    actually become used, out of buddy or pcp.
>>
>> Is this just _pure_ unmapping (like MADV_DONTNEED), or does it apply to
>> changing the memory map, like munmap() itself?
>
> I think it can be applied to any unmapping of ro ones but LUF for now is
> working only with unmapping during folio migration and reclaim.
>
>> > 2. When any non-writable ptes change to writable e.g. through fault
>> >    handler, give up luf mechanism and perform tlb flush required
>> >    right away.
>> >
>> > 3. When a writable mapping is created e.g. through mmap(), give up
>> >    luf mechanism and perform tlb flush required right away.
>>
>> Let's say you do this:
>>
>> 	fd = open("/some/file", O_RDONLY);
>> 	ptr1 = mmap(-1, size, PROT_READ, ..., fd, ...);
>> 	foo1 = *ptr1;
>>
>> You now have a read-only PTE pointing to the first page of /some/file.
>> Let's say try_to_unmap() comes along and decides it can_luf_folio().
>> The page gets pulled out of the page cache and freed, the PTE is zeroed.
>> But the TLB is never flushed.
>>
>> Now, someone does:
>>
>> 	fd2 = open("/some/other/file", O_RDONLY);
>> 	ptr2 = mmap(ptr1, size, PROT_READ, MAP_FIXED, fd2, ...);
>> 	foo2 = *ptr2;
>>
>> and they overwrite the old VMA.  Does foo2 have the contents of the new
>> "/some/other/file" or the old "/some/file"?
>> How does the new mmap()
>
> Good point.  It should've given up LUF at the 2nd mmap() in this case.
> I will fix it by introducing a new flag in task_struct indicating if LUF
> has left stale maps for the task so that LUF can give up and flush right
> away in mmap().
>
>> know that there was something to flush?
>>
>> BTW, the same thing could happen without a new mmap().  Someone could
>> modify the file in the middle, maybe even from another process.
>
> Thank you for pointing that out.  I will fix it too by introducing a new
> flag in inode or something to make LUF aware if updating the file has
> been tried so that LUF can give up and flush right away in that case.
>
> Plus, I will add another give-up in the code that changes the permission
> of a vma to writable.

I guess that you need a framework similar to
"flush_tlb_batched_pending()" to deal with the interaction with other
TLB-related operations.

--
Best Regards,
Huang, Ying

> 	Thank you very much.
>
> 	Byungchul
>
>> 	fd = open("/some/file", O_RDONLY);
>> 	ptr1 = mmap(-1, size, PROT_READ, ..., fd, ...);
>> 	foo1 = *ptr1;
>> 	// LUF happens here
>> 	// "/some/file" changes
>> 	foo2 = *ptr1; // Does this see the change?