From: James Houghton
Date: Wed, 25 Jan 2023 08:22:24 -0800
Subject: Re: A mapcount riddle
To: Peter Xu
Cc: Mike Kravetz, linux-mm@kvack.org, Naoya Horiguchi, David Rientjes, Michal Hocko, Matthew Wilcox, David Hildenbrand, Muchun Song

On Wed, Jan 25, 2023 at 7:54 AM Peter Xu wrote:
>
> On Wed, Jan 25, 2023 at 07:26:49AM -0800, James Houghton wrote:
> > > At first thought this seems bad. However, I believe this has been the
> > > behavior since hugetlb PMD sharing was introduced in 2006 and I am
> > > unaware of any reported issues. I did an audit of code looking at
> > > mapcount. In addition to the above issue with smaps, there appears
> > > to be an issue with 'migrate_pages' where shared pages could be migrated
> > > without appropriate privilege.
> > >
> > >         /* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
> > >         if (flags & (MPOL_MF_MOVE_ALL) ||
> > >             (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
> > >                 if (isolate_hugetlb(page, qp->pagelist) &&
> > >                         (flags & MPOL_MF_STRICT))
> > >                         /*
> > >                          * Failed to isolate page but allow migrating pages
> > >                          * which have been queued.
> > >                          */
> > >                         ret = 1;
> > >         }
> >
> > This isn't the exact same problem you're fixing Mike, but I want to
> > point out a related problem.
> >
> > This is the generic-mm-equivalent of the hugetlb code above:
> >
> > static int migrate_page_add(struct page *page, struct list_head *pagelist,
> >                             unsigned long flags)
> > {
> >         struct page *head = compound_head(page);
> >         /*
> >          * Avoid migrating a page that is shared with others.
> >          */
> >         if ((flags & MPOL_MF_MOVE_ALL) || page_mapcount(head) == 1) {
> >                 if (!isolate_lru_page(head)) {
> >                         list_add_tail(&head->lru, pagelist);
> >                         mod_node_page_state(page_pgdat(head),
> >                                 NR_ISOLATED_ANON + page_is_file_lru(head),
> >                                 thp_nr_pages(head));
> >         ...
> > }
> >
> > If you have a partially PTE-mapped THP, page_mapcount(head) will not
> > accurately determine if a page is mapped in multiple VMAs or not (it
> > only tells you how many times the head page is mapped).
> >
> > For example...
> > 1) You could have the THP PMD-mapped in one VMA, and then one tail
> > page of the THP can be mapped in another. page_mapcount(head) will be
> > 1.
> > 2) You could have two VMAs map two separate tail pages of the THP, in
> > which case page_mapcount(head) will be 0.
> >
> > I bring this up because we have the same problem with HugeTLB
> > high-granularity mapping.
>
> Maybe a better match here is total_mapcount() rather than page_mapcount()
> (despite the overheads on the sub-page loop)?

This would kind of fix the problem, but it would be too conservative now. :)

In both examples 1 and 2 above, total_mapcount(head) would be 2, so
that's ok. But now consider: you have one VMA that is PTE-mapping two
pieces of the same THP. total_mapcount(head) is still 2, even though
only a single VMA is mapping the page.

- James
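
The three cases above can be checked with a small userspace model. This is only a sketch, not kernel code: every toy_* name is hypothetical, the 512-subpage size assumes a 2 MiB THP made of 4 KiB pages, and the two counters only approximate what page_mapcount() and total_mapcount() report. The per-VMA bitmask is something the kernel does not track; it is here purely to show what the "is this page shared?" check would ideally know.

```c
#include <assert.h>

/* Toy userspace model; none of these names exist in the kernel. */
#define TOY_NR_SUBPAGES 512	/* assumption: 2 MiB THP of 4 KiB pages */
#define TOY_MAX_VMAS    32

struct toy_thp {
	int pmd_mapcount;			/* mappings of the whole THP */
	int pte_mapcount[TOY_NR_SUBPAGES];	/* PTE mappings per subpage */
	unsigned int vma_mask;			/* bit v set: VMA v maps part of it */
};

/* Record that VMA 'vma' maps the whole THP with a PMD. */
static void toy_map_pmd(struct toy_thp *t, int vma)
{
	assert(vma >= 0 && vma < TOY_MAX_VMAS);
	t->pmd_mapcount++;
	t->vma_mask |= 1u << vma;
}

/* Record that VMA 'vma' maps subpage 'idx' with a PTE (idx 0 is the head). */
static void toy_map_pte(struct toy_thp *t, int vma, int idx)
{
	assert(vma >= 0 && vma < TOY_MAX_VMAS);
	assert(idx >= 0 && idx < TOY_NR_SUBPAGES);
	t->pte_mapcount[idx]++;
	t->vma_mask |= 1u << vma;
}

/* Rough analogue of page_mapcount(): PMD maps plus this subpage's PTE maps. */
static int toy_page_mapcount(const struct toy_thp *t, int idx)
{
	return t->pmd_mapcount + t->pte_mapcount[idx];
}

/* Rough analogue of total_mapcount(): PMD maps plus every subpage's PTE maps. */
static int toy_total_mapcount(const struct toy_thp *t)
{
	int i, sum = t->pmd_mapcount;

	for (i = 0; i < TOY_NR_SUBPAGES; i++)
		sum += t->pte_mapcount[i];
	return sum;
}

/* Number of distinct VMAs mapping any part of the THP. */
static int toy_nr_vmas(const struct toy_thp *t)
{
	unsigned int m = t->vma_mask;
	int n = 0;

	for (; m; m >>= 1)
		n += m & 1;
	return n;
}
```

In this model, example 1 (toy_map_pmd() from VMA 0, then toy_map_pte() of one tail page from VMA 1) gives a head mapcount of 1 and a total of 2 with two VMAs. Example 2 (two VMAs each PTE-mapping a different tail page) gives a head mapcount of 0 and a total of 2 with two VMAs. The last case (one VMA PTE-mapping two tail pages) also gives a total of 2, but with only one VMA, so neither counter separates "shared" from "not shared" in all three cases.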