From: David Hildenbrand
Organization: Red Hat
Date: Tue, 30 Aug 2022 14:33:42 +0200
To: Rik van Riel, alexlzhu@fb.com, linux-mm@kvack.org
Cc: willy@infradead.org, hannes@cmpxchg.org, akpm@linux-foundation.org,
 kernel-team@fb.com, linux-kernel@vger.kernel.org
Subject: Re: [RFC 2/3] mm: changes to split_huge_page() to free zero filled tail pages
Message-ID: <00f2dee2-ebc1-e732-f230-bc5b17da9f80@redhat.com>
In-Reply-To: <37db29410990991555362154a371b58f47d3cb0c.camel@surriel.com>
References: <490fcdd204ae129a2e43614a569a1cf4bdde9196.1661461643.git.alexlzhu@fb.com>
 <6448b9a8dba8ef39e42e56a3c0ce0633fff7c6a6.camel@surriel.com>
 <42c164c6-8c69-7b4b-d965-ac62d1607061@redhat.com>
 <37db29410990991555362154a371b58f47d3cb0c.camel@surriel.com>

On 29.08.22 15:17, Rik van Riel wrote:
> On Mon, 2022-08-29 at 12:02 +0200, David Hildenbrand wrote:
>> On 26.08.22 23:18, Rik van Riel wrote:
>>> On Fri, 2022-08-26 at 12:18 +0200, David Hildenbrand wrote:
>>>> On 25.08.22 23:30, alexlzhu@fb.com wrote:
>>>>> From: Alexander Zhu
>>>
>>> I could see wanting to maybe consolidate the scanning between
>>> KSM and this thing at some point, if it could be done without
>>> too much complexity, but keeping this change to split_huge_page
>>> looks like it might make sense even when KSM is enabled, since
>>> it will get rid of the unnecessary memory much faster than KSM
>>> could.
>>>
>>> Keeping a hundred MB of unnecessary memory around for longer
>>> would simply result in more THPs getting split up, and more
>>> memory pressure for a longer time than we need.
>>
>> Right. I was wondering if we want to map the shared zeropage instead
>> of the "detected to be zero" page, similar to how KSM would do it. For
>> example, with userfaultfd there would be an observable difference.
>>
>> (maybe that's already done in this patch set)
>>
> The patch does not currently do that, but I suppose it could?
>

It would be interesting to know why KSM decided to replace the mapped
page with the shared zeropage instead of dropping the page and letting
the next read fault populate the shared zeropage. That code predates
userfaultfd IIRC.

> What exactly are the userfaultfd differences here, and how does
> dropping 4kB pages break things vs. using the shared zeropage?

Once userfaultfd (missing mode) is enabled on a VMA:

1) khugepaged will no longer collapse pte_none(pteval), independent of
khugepaged_max_ptes_none setting -- see __collapse_huge_page_isolate.
[it will also not collapse zeropages, but I recall that that's not
actually required]

So it will not close holes, because the user space fault handler is in
charge of making a decision when something will get mapped there and
with which content.

2) Page faults will no longer populate a THP -- the user space handler
is notified instead and has to decide how the fault will be resolved
(place pages).

If you unmap something (resulting in pte_none()) where previously
something used to be mapped in a page table, you might suddenly inform
the user space fault handler about a page fault that it doesn't expect,
because it previously placed a page and did not zap that page itself
(MADV_DONTNEED).

So at least with userfaultfd I think we have to be careful. Not sure if
there are other corner cases (again, KSM behavior is interesting)

-- 
Thanks,

David / dhildenb