From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <8201bd1d-f617-491b-a10d-1fe689e9eb9b@redhat.com>
Date: Mon, 27 May 2024 09:53:45 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH 1/2] mm,memory_hotplug: Remove un-taken lock
To: Brendan Jackman
Cc: Oscar Salvador, Andrew Morton, Mike Rapoport, Michal Hocko,
 Anshuman Khandual, Vlastimil Babka, Pavel Tatashin,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240521-mm-hotplug-sync-v1-0-6d53706c1ba8@google.com>
 <20240521-mm-hotplug-sync-v1-1-6d53706c1ba8@google.com>
 <78e646af-e8b5-4596-8fbf-17b139cfdddd@redhat.com>
 <0506ae4e-e17d-4c3c-aa3e-1cea04909e5a@redhat.com>
From: David Hildenbrand
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 24.05.24 at 14:02, Brendan Jackman wrote:
> On Wed, May 22, 2024 at 05:24:17PM +0200, David Hildenbrand wrote:
>> On 22.05.24 16:27, Brendan Jackman wrote:
>>> On Wed, May 22, 2024 at 04:09:41PM +0200, David Hildenbrand wrote:
>
>>> By the way, some noob questions: am I OK with my assumption that it's fine
>>> for reader code to operate on zone spans that are both stale and
>>> "from the future"? thinking abstractly I guess that seeing a stale
>>> value when racing with offline_pages is roughly the same as seeing a
>>> value "from the future" when racing with online_pages?
>>
>> Right. PFN walkers should be using pfn_to_online_page(), where races are
>> possible but barely seen in practice.
>>
>> zone handlers like mm/compaction.c can likely deal with races, although it
>> might all be cleaner (and safer?) when using start+end. I recall it also
>> relies on pfn_to_online_page().
>>
>> Regarding page_outside_zone_boundaries(), it should be fine if we can read
>> start+end atomically, that way we would not accidentally report "page
>> outside ..." when changing the start address. I think with your current
>> patch that might happen (although likely extremely hard to trigger) when
>> growing the zone at the start, reducing zone_start_pfn.
>
> Thanks a lot, this is very helpful
>
>>> Also, is it ever possible for pages to get removed and then added back
>>> and end up in a different zone than before?
>>
>> Yes. Changing between MOVABLE and NORMAL is possible and can easily be
>> triggered by offlining+re-onlining memory blocks.
>
> So, even if we make it impossible to see a totally bogus zone span,
> you can observe a stale/futuristic span which currently contains pages
> from a different zone?

Yes. Note that zones/nodes can easily overlap, so a page being spanned by
another zone is common and supported already.

>
> That seems to imply you could look up a page from a PFN within
> zone A's apparent span, lock zone A and assume you can safely modify
> the freelist the page is on, but actually that page is now in zone B.

That's why we always obtain the zone/node from the page itself (stored in
page flags). This data can only change when offlining+re-onlining memory
(and pfn_to_online_page() would refuse to hand out the page while it is
temporarily offline).

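A rough sketch of that idea, as a userspace toy model rather than kernel
code. All toy_* names, the flag layout and the two-zone setup are made up
for illustration; in the kernel the corresponding lookup is page_zone().
The point is only that a span answers "might this PFN be mine?", while the
owning zone is read from the page:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; every name here is hypothetical, not kernel API. */
#define TOY_ZONE_SHIFT 62	/* assumption: zone id lives in the top two flag bits */
#define TOY_NR_ZONES   2

enum toy_zone_id { TOY_ZONE_NORMAL = 0, TOY_ZONE_MOVABLE = 1 };

struct toy_zone {
	unsigned long start_pfn;	/* span start, may be stale */
	unsigned long spanned_pages;	/* span length, may be stale */
};

struct toy_page {
	unsigned long long flags;	/* encodes the owning zone id */
};

static struct toy_zone toy_zones[TOY_NR_ZONES];

/* A span only says "this PFN might be mine"; spans of two zones may overlap. */
static bool toy_zone_spans_pfn(const struct toy_zone *z, unsigned long pfn)
{
	return pfn >= z->start_pfn && pfn < z->start_pfn + z->spanned_pages;
}

/* Record the owning zone in the page, leaving the other flag bits alone. */
static void toy_set_page_zone(struct toy_page *page, enum toy_zone_id id)
{
	page->flags &= ~(3ULL << TOY_ZONE_SHIFT);
	page->flags |= (unsigned long long)id << TOY_ZONE_SHIFT;
}

/* The authoritative answer comes from the page, not from any span. */
static struct toy_zone *toy_page_zone(const struct toy_page *page)
{
	unsigned int id = (unsigned int)(page->flags >> TOY_ZONE_SHIFT);

	assert(id < TOY_NR_ZONES);
	return &toy_zones[id];
}
```

With two overlapping spans, toy_zone_spans_pfn() is true for both zones for
a PFN in the overlap, so a caller that needs to lock "the" zone would use
toy_page_zone() on the page it looked up.
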
There were discussions around using RCU to improve pfn_to_online_page()
racing with memory offlining, but the motivation to do that has been rather
small: we barely see such races in practice. Memory offlining+re-onlining
simply takes too long.

>
> So for example:
>
> 1. compact_zone() sets cc->free_pfn based on zone_end_pfn
> 2. isolate_freepages() sets isolate_start_pfn = cc->free_pfn
> 3. isolate_freepages_block() looks up a page based on that PFN
> 3. ... then takes the cc->zone lock
> 4. ... then calls __isolate_free_page which removes the page from
>    whatever freelist it's on.
>
> Is anything stopping part 4 from modifying a zone that wasn't locked
> in part 3?

Most likely the fact that overlapping zones already exist and are handled
accordingly.

-- 
Thanks,

David / dhildenb