From: Dan Williams
Date: Thu, 15 Apr 2021 21:19:18 -0700
Subject: Re: [PATCH v4] kernel/resource: Fix locking in request_free_mem_region
To: Alistair Popple
Cc: Andrew Morton, Linux MM, Linux Kernel Mailing List, David Hildenbrand, Daniel Vetter, Greg KH, John Hubbard, Jérôme Glisse, Balbir Singh, Muchun Song, kernel test robot
In-Reply-To: <20210416025745.8698-1-apopple@nvidia.com>

On Thu, Apr 15, 2021 at 7:58 PM Alistair Popple wrote:
>
> request_free_mem_region() is used to find an empty range of physical
> addresses for hotplugging ZONE_DEVICE memory. It does this by iterating
> over the range of possible addresses using region_intersects() to see if
> the range is free.
>
> region_intersects() obtains a read lock before walking the resource tree
> to protect against concurrent changes. However it drops the lock prior
> to returning. This means by the time request_mem_region() is called in
> request_free_mem_region() another thread may have already reserved the
> requested region resulting in unexpected failures and a message in the
> kernel log from hitting this condition:
>
>	/*
>	 * mm/hmm.c reserves physical addresses which then
>	 * become unavailable to other users. Conflicts are
>	 * not expected. Warn to aid debugging if encountered.
>	 */
>	if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
>		pr_warn("Unaddressable device %s %pR conflicts with %pR",
>			conflict->name, conflict, res);
>
> To fix this create versions of region_intersects() and
> request_mem_region() that allow the caller to take the appropriate lock
> such that it may be held over the required calls.
>
> Instead of creating another version of devm_request_mem_region() that
> doesn't take the lock open-code it to allow the caller to pre-allocate
> the required memory prior to taking the lock.
>
> On some architectures and kernel configurations revoke_iomem() also
> calls resource code so cannot be called with the resource lock held.
> Therefore call it only after dropping the lock.

The patch is difficult to read because too many things are being
changed at once, and the changelog seems to confirm that. Can you try
breaking this down into a set of incremental changes? Not only will
this ease review, it will distribute any regressions over multiple
bisection targets.

Something like:

* Refactor region_intersects() to allow external locking
* Refactor __request_region() to allow external locking
* Push revoke_iomem() down into...
* Fix resource_lock usage in [devm_]request_free_mem_region()

The revoke_iomem() change seems like something that should be moved
into a leaf helper and not called by __request_free_mem_region()
directly.