From: James Houghton
Date: Tue, 17 Jan 2023 13:38:24 -0800
Subject: Re: [PATCH 35/46] hugetlb: add MADV_COLLAPSE for hugetlb
To: Peter Xu
Cc: Mike Kravetz, Muchun Song, David Hildenbrand, David Rientjes,
 Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra,
 Naoya Horiguchi, "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)",
 Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20230105101844.1893104-1-jthoughton@google.com>
 <20230105101844.1893104-36-jthoughton@google.com>

> > +	if (curr < end) {
> > +		/* Don't hold the VMA lock for too long. */
> > +		hugetlb_vma_unlock_write(vma);
> > +		cond_resched();
> > +		hugetlb_vma_lock_write(vma);
>
> The intention is good here but IIUC this will cause vma lock to be taken
> after the i_mmap_rwsem, which can cause circular deadlocks. If to do this
> properly we'll need to also release the i_mmap_rwsem.

Sorry if you spent a long time debugging this!
I sent a reply a week ago about this too.

>
> However it may make the resched() logic overcomplicated; meanwhile, for 2M
> huge pages I think this will be called for each 2M range, which can be too
> fine-grained, so it looks like the "cur < end" check is a bit too aggressive.
>
> The other thing is I noticed that the long period of mmu notifier
> invalidate between start -> end will (in a real-life VM context) cause vcpu
> threads to spin.
>
> I _think_ it's because is_page_fault_stale() (during a vmexit
> following a kvm page fault) always reports true during the long procedure
> of MADV_COLLAPSE if it is called upon a large range, so even if we release
> both locks here it may not help tremendously on the VM migration use case
> because of the long-standing mmu notifier invalidation procedure.

Oh... indeed. Thanks for pointing that out.

>
> To summarize.. I think a simpler start version of hugetlb MADV_COLLAPSE can
> drop this "if" block, and let the userapp decide the step size of COLLAPSE?

I'll drop this resched logic. Thanks Peter.
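
For illustration only, here is a rough userspace sketch of what "let the
userapp decide the step size" could look like: the caller walks the range
and issues one MADV_COLLAPSE per step, so no single call covers a huge
range. The names collapse_in_steps(), advise_fn, real_advise() and
dry_run_advise() are hypothetical and not part of this series; the
fallback value 25 for MADV_COLLAPSE is the asm-generic uapi value, and
whether madvise() accepts it on a hugetlb mapping assumes a kernel with
this series applied.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* asm-generic uapi value; older libc headers may lack it */
#endif

/* Hypothetical callback type so the advice call can be swapped out. */
typedef int (*advise_fn)(void *addr, size_t len, int advice);

/* Thin wrapper giving the real madvise(2) the advise_fn signature. */
static int real_advise(void *addr, size_t len, int advice)
{
	return madvise(addr, len, advice);
}

/* Dry-run stand-in: counts calls instead of touching any mapping. */
static size_t dry_run_calls;

static int dry_run_advise(void *addr, size_t len, int advice)
{
	(void)addr;
	(void)len;
	(void)advice;
	dry_run_calls++;
	return 0;
}

/*
 * Collapse [addr, addr + len) in pieces of at most `step` bytes.
 * Returns 0 on success; returns -1 on the first failing call, with
 * errno as set by the callback (or EINVAL if step is 0).
 */
static int collapse_in_steps(void *addr, size_t len, size_t step,
			     advise_fn advise)
{
	char *cur = addr;
	char *end;

	assert(addr != NULL || len == 0);

	if (step == 0) {
		errno = EINVAL;
		return -1;
	}

	end = cur + len;
	while (cur < end) {
		size_t chunk = (size_t)(end - cur);

		if (chunk > step)
			chunk = step;
		if (advise(cur, chunk, MADV_COLLAPSE) != 0)
			return -1;
		cur += chunk;
	}
	return 0;
}
```

A caller would pass real_advise and a step that is a multiple of the
hugepage size (e.g. collapse_in_steps(addr, len, step, real_advise));
dry_run_advise is only there to check how many calls a given step size
produces.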