Date: Thu, 10 Mar 2022 11:26:15 -0800 (PST)
From: David Rientjes
To: Yang Shi
cc: Zach O'Keefe, Alex Shi, David Hildenbrand, Michal Hocko,
    Pasha Tatashin, SeongJae Park, Song Liu, Vlastimil Babka, Zi Yan,
    Linux MM, Andrea Arcangeli, Andrew Morton, Arnd Bergmann,
    Axel Rasmussen, Chris Kennelly, Chris Zankel, Helge Deller,
    Hugh Dickins, Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
    "Kirill A. Shutemov", Matthew Wilcox, Matt Turner, Max Filippov,
    Miaohe Lin, Minchan Kim, Patrick Xia, Pavel Begunkov, Peter Xu,
    Richard Henderson, Thomas Bogendoerfer
Subject: Re: [RFC PATCH 12/14] mm/madvise: introduce batched madvise(MADV_COLLPASE) collapse
References: <20220308213417.1407042-1-zokeefe@google.com> <20220308213417.1407042-13-zokeefe@google.com>

On Wed, 9 Mar 2022, Yang Shi wrote:

> > Introduce the main madvise collapse batched logic, including the overall
> > locking strategy.  Stubs for individual batched actions, such as
> > scanning pmds in batch, have been stubbed out, and will be added later
> > in the series.
> >
> > Note the main benefit from doing all this work in a batched manner is
> > that __madvise__collapse_pmd_batch() (stubbed out) can be called inside
> > a single mmap_lock write.
>
> I don't get why this is preferred? Isn't it more preferred to minimize
> the scope of write mmap_lock? Assuming you batch large number of PMDs,
> MADV_COLLAPSE may hold write mmap_lock for a long time, it doesn't
> seem it could scale.
>

One concern might be the queueing of read locks needed for page faults
behind a collapser of a long range of memory that is otherwise looping
and repeatedly taking the write lock.

Minimal impact on concurrent page faults is what I think we should be
optimizing for, but I don't know the answer without data.  Do you have any
ideas for a general rule of thumb on what would be optimal here: collapsing
one page at a time, or handling multiple collapses per mmap_lock write so
that readers aren't constantly getting queued?

The easiest answer would be to not do batching at all and leave the impact
on readers up to the userspace doing the MADV_COLLAPSE :)  I was wondering,
however, whether there is a better default behavior we could implement in
the kernel.
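
To make the "leave it up to userspace" option concrete, here is a rough
sketch of how a caller might pace its own collapses by choosing how many
PMD-sized pages each call covers.  This is not code from the series.
collapse_in_batches(), do_collapse(), count_pmds(), the batch_fn type and
HPAGE_PMD_SIZE_ASSUMED are hypothetical names invented for illustration.
The 2 MiB hugepage size is an assumption, and so is the fallback
MADV_COLLAPSE value of 25, which matches later mainline uapi headers and
may not match this RFC.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* assumption: value from later mainline headers */
#endif

/* Assumption: 2 MiB PMD-sized huge pages (the x86-64 default). */
#define HPAGE_PMD_SIZE_ASSUMED (2UL << 20)

/* Per-batch action; returns 0 on success, nonzero to stop the walk. */
typedef int (*batch_fn)(void *start, size_t len, void *ctx);

/*
 * Walk the hugepage-aligned part of [addr, addr + len) and call fn once
 * per batch of at most batch_pmds huge pages.  Returns the number of
 * batches issued, or -1 if fn reported a failure.
 */
long collapse_in_batches(void *addr, size_t len, size_t batch_pmds,
			 batch_fn fn, void *ctx)
{
	const uintptr_t mask = HPAGE_PMD_SIZE_ASSUMED - 1;
	uintptr_t start, end;
	size_t step;
	long batches = 0;

	assert(batch_pmds > 0);

	start = ((uintptr_t)addr + mask) & ~mask;	/* round up */
	end = ((uintptr_t)addr + len) & ~mask;		/* round down */
	step = batch_pmds * HPAGE_PMD_SIZE_ASSUMED;

	while (start < end) {
		size_t chunk = (end - start < step) ? end - start : step;

		if (fn((void *)start, chunk, ctx) != 0)
			return -1;
		start += chunk;
		batches++;
	}
	return batches;
}

/* Real action: one madvise() call per batch. */
int do_collapse(void *start, size_t len, void *ctx)
{
	(void)ctx;
	return madvise(start, len, MADV_COLLAPSE);
}

/* Dry-run action: only counts the huge pages that would be collapsed. */
int count_pmds(void *start, size_t len, void *ctx)
{
	(void)start;
	*(size_t *)ctx += len / HPAGE_PMD_SIZE_ASSUMED;
	return 0;
}
```

Passing do_collapse with batch_pmds == 1 corresponds to the "no batching"
end of the spectrum, and a larger batch_pmds hands the kernel a longer
range per call.  Which value keeps page fault readers from queueing is
exactly the part that needs data.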