From mboxrd@z Thu Jan  1 00:00:00 1970
From: John Hubbard <jhubbard@nvidia.com>
Date: Tue, 4 Jan 2022 17:25:07 -0800
Subject: Re: [PATCH 05/17] gup: Add try_get_folio()
To: "Matthew Wilcox (Oracle)", linux-mm@kvack.org
Cc: Andrew Morton
Message-ID: <3ac8af50-dff6-4a0f-dba6-8b8fe5f611d4@nvidia.com>
In-Reply-To: <20220102215729.2943705-6-willy@infradead.org>
References: <20220102215729.2943705-1-willy@infradead.org> <20220102215729.2943705-6-willy@infradead.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
MIME-Version: 1.0
On 1/2/22 13:57, Matthew Wilcox (Oracle) wrote:
> This replaces try_get_compound_head(). It includes a small optimisation
> for the race where a folio is split between being looked up from its
> tail page and the reference count being obtained. Before, it returned
> NULL, which presumably triggered a retry under the mmap_lock, whereas
> now it will retry without the lock. Finding a frozen page will still
> return NULL.
>
> Signed-off-by: Matthew Wilcox (Oracle)
> ---
>  mm/gup.c | 69 +++++++++++++++++++++++++++++---------------------------
>  1 file changed, 36 insertions(+), 33 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 2c51e9748a6a..58e5cfaaa676 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -29,12 +29,11 @@ struct follow_page_context {
>  	unsigned int page_mask;
>  };
>
> -static void hpage_pincount_add(struct page *page, int refs)
> +static void folio_pincount_add(struct folio *folio, int refs)
>  {
> -	VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
> -	VM_BUG_ON_PAGE(page != compound_head(page), page);
> +	VM_BUG_ON_FOLIO(!folio_pincount_available(folio), folio);
>
> -	atomic_add(refs, compound_pincount_ptr(page));
> +	atomic_add(refs, folio_pincount_ptr(folio));
>  }
>
>  static void hpage_pincount_sub(struct page *page, int refs)
> @@ -63,33 +62,35 @@ static void put_page_refs(struct page *page, int refs)
>  }
>
>  /*
> - * Return the compound head page with ref appropriately incremented,
> + * Return the folio with ref appropriately incremented,
>   * or NULL if that failed.
>   */
> -static inline struct page *try_get_compound_head(struct page *page, int refs)
> +static inline struct folio *try_get_folio(struct page *page, int refs)
>  {
> -	struct page *head = compound_head(page);
> +	struct folio *folio;
>
> -	if (WARN_ON_ONCE(page_ref_count(head) < 0))
> +retry:

Yes, this new retry looks like a solid improvement. Retrying at this low
level makes a lot of sense, given that it is racing with a very transient
sort of behavior.

> +	folio = page_folio(page);
> +	if (WARN_ON_ONCE(folio_ref_count(folio) < 0))
>  		return NULL;
> -	if (unlikely(!page_cache_add_speculative(head, refs)))
> +	if (unlikely(!folio_ref_try_add_rcu(folio, refs)))

I'm a little lost about the meaning and intended use of the _rcu aspects
of folio_ref_try_add_rcu() here. For example, try_get_folio() does not
require that callers are in an rcu read section, right? This is probably
just a documentation question, sorry if it's obvious--I wasn't able to
work it out on my own.

>  		return NULL;
>
>  	/*
> -	 * At this point we have a stable reference to the head page; but it
> -	 * could be that between the compound_head() lookup and the refcount
> -	 * increment, the compound page was split, in which case we'd end up
> -	 * holding a reference on a page that has nothing to do with the page
> +	 * At this point we have a stable reference to the folio; but it
> +	 * could be that between calling page_folio() and the refcount
> +	 * increment, the folio was split, in which case we'd end up
> +	 * holding a reference on a folio that has nothing to do with the page
>  	 * we were given anymore.
> -	 * So now that the head page is stable, recheck that the pages still
> -	 * belong together.
> +	 * So now that the folio is stable, recheck that the page still
> +	 * belongs to this folio.
>  	 */
> -	if (unlikely(compound_head(page) != head)) {
> -		put_page_refs(head, refs);
> -		return NULL;
> +	if (unlikely(page_folio(page) != folio)) {
> +		folio_put_refs(folio, refs);
> +		goto retry;
>  	}
>
> -	return head;
> +	return folio;
>  }
>
>  /**
> @@ -128,8 +129,10 @@ struct page *try_grab_compound_head(struct page *page,
>  			int refs, unsigned int flags)
>  {
>  	if (flags & FOLL_GET)
> -		return try_get_compound_head(page, refs);
> +		return &try_get_folio(page, refs)->page;

Did you want to use folio_page() here, instead?

>  	else if (flags & FOLL_PIN) {
> +		struct folio *folio;
> +
>  		/*
>  		 * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
>  		 * right zone, so fail and let the caller fall back to the slow
> @@ -143,29 +146,29 @@ struct page *try_grab_compound_head(struct page *page,
>  		 * CAUTION: Don't use compound_head() on the page before this
>  		 * point, the result won't be stable.
>  		 */
> -		page = try_get_compound_head(page, refs);
> -		if (!page)
> +		folio = try_get_folio(page, refs);
> +		if (!folio)
>  			return NULL;
>
>  		/*
> -		 * When pinning a compound page of order > 1 (which is what
> +		 * When pinning a folio of order > 1 (which is what
>  		 * hpage_pincount_available() checks for), use an exact count to
> -		 * track it, via hpage_pincount_add/_sub().
> +		 * track it, via folio_pincount_add/_sub().
>  		 *
> -		 * However, be sure to *also* increment the normal page refcount
> -		 * field at least once, so that the page really is pinned.
> +		 * However, be sure to *also* increment the normal folio refcount
> +		 * field at least once, so that the folio really is pinned.
>  		 * That's why the refcount from the earlier
> -		 * try_get_compound_head() is left intact.
> +		 * try_get_folio() is left intact.
>  		 */
> -		if (hpage_pincount_available(page))
> -			hpage_pincount_add(page, refs);
> +		if (folio_pincount_available(folio))
> +			folio_pincount_add(folio, refs);
>  		else
> -			page_ref_add(page, refs * (GUP_PIN_COUNTING_BIAS - 1));
> +			folio_ref_add(folio,
> +					refs * (GUP_PIN_COUNTING_BIAS - 1));
>
> -		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED,
> -				    refs);
> +		node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs);
>
> -		return page;
> +		return &folio->page;
>  	}
>
>  	WARN_ON_ONCE(1);

Just minor questions above. This all looks solid though.

thanks,
-- 
John Hubbard
NVIDIA