From: Jann Horn
Date: Wed, 16 Aug 2023 19:12:27 +0200
Subject: Re: maple tree change made it possible for VMA iteration to see same VMA twice due to late vma_merge() failure
To: "Liam R. Howlett", Jann Horn, Andrew Morton, kernel list, Linux-MM
References: <20230816161758.avedpxvqpwngzmut@revolver>
In-Reply-To: <20230816161758.avedpxvqpwngzmut@revolver>

On Wed, Aug 16, 2023 at 6:18 PM Liam R.
Howlett wrote:
> * Jann Horn [230815 15:37]:
> > commit 18b098af2890 ("vma_merge: set vma iterator to correct
> > position.") added a vma_prev(vmi) call to vma_merge() at a point where
> > it's still possible to bail out. My understanding is that this moves
> > the VMA iterator back by one VMA.
> >
> > If you patch some extra logging into the kernel and inject a fake
> > out-of-memory error at the vma_iter_prealloc() call in vma_merge() (a
> > real out-of-memory error there is very unlikely to happen in practice,
> > I think - my understanding is that the kernel will basically kill
> > every process on the system except for init before it starts failing
> > GFP_KERNEL allocations that fit within a single slab, unless the
> > allocation uses GFP_ACCOUNT or stuff like that, which the maple tree
> > doesn't):
[...]
> > then you'll get this fun log output, showing that the same VMA
> > (ffff88810c0b5e00) was visited by two iterations of the VMA iteration
> > loop, and on the second iteration, prev==vma:
> >
> > [ 326.765586] userfaultfd_register: begin vma iteration
> > [ 326.766985] userfaultfd_register: prev=ffff88810c0b5ef0,
> > vma=ffff88810c0b5e00 (0000000000101000-0000000000102000)
> > [ 326.768786] userfaultfd_register: vma_merge returned 0000000000000000
> > [ 326.769898] userfaultfd_register: prev=ffff88810c0b5e00,
> > vma=ffff88810c0b5e00 (0000000000101000-0000000000102000)
> >
> > I don't know if this can lead to anything bad but it seems pretty
> > clearly unintended?
>
> Yes, unintended.
>
> So we are running out of memory, but since vma_merge() doesn't
> differentiate between failure and 'nothing to merge', we end up in a
> situation that we will revisit the same VMA.
>
> I've been thinking about a way to work this into the interface and I
> don't see a clean way because we (could) do different things before the
> call depending on the situation.
>
> I think we need to undo any vma iterator changes in the failure
> scenarios if there is a chance of the iterator continuing to be used,
> which is probably not limited to just this case.

I don't fully understand the maple tree interface - in the specific
case of vma_merge(), could you move the vma_prev() call down below the
point of no return, after vma_iter_prealloc()? Or does
vma_iter_prealloc() require that the iterator is already in the insert
position?

> I will audit these areas and CC you on the result.

Thanks!
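
A toy model of the failure mode discussed in this thread, as a minimal
sketch. It is not kernel code, and every name in it (toy_iter, toy_next,
toy_prev, toy_merge_buggy, toy_merge_fixed, toy_walk) is made up for
illustration. It only shows why rewinding an iterator before a step that
can still fail makes the caller's loop see the same entry twice, and how
doing the fallible step first avoids that. Whether the real
vma_iter_prealloc() permits that ordering is exactly the open question
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an iterator over a sequence of entries. */
struct toy_iter {
	size_t next;   /* index that the next toy_next() call returns */
	size_t count;  /* total number of entries */
};

/* Return the next index, or -1 once the walk is finished. */
static int toy_next(struct toy_iter *it)
{
	if (it->next >= it->count)
		return -1;
	return (int)it->next++;
}

/* Move the iterator back by one entry. */
static void toy_prev(struct toy_iter *it)
{
	assert(it->next > 0);
	it->next--;
}

/* Buggy ordering: rewind first, then hit a step that may still fail. */
static bool toy_merge_buggy(struct toy_iter *it, bool alloc_fails)
{
	toy_prev(it);
	if (alloc_fails)
		return false;  /* bails out with the iterator left rewound */
	it->next++;            /* success: step past the entry again */
	return true;
}

/* Fixed ordering: do the fallible step before touching the iterator. */
static bool toy_merge_fixed(struct toy_iter *it, bool alloc_fails)
{
	if (alloc_fails)
		return false;  /* iterator position is unchanged */
	toy_prev(it);
	it->next++;            /* success: step past the entry again */
	return true;
}

/*
 * Walk all entries, injecting one failure at entry fail_at.
 * Returns how many times the loop visited entry fail_at.
 */
static int toy_walk(bool (*merge)(struct toy_iter *, bool),
		    size_t count, size_t fail_at)
{
	struct toy_iter it = { .next = 0, .count = count };
	bool injected = false;
	int visits = 0;
	int idx;

	while ((idx = toy_next(&it)) >= 0) {
		bool fail = false;

		if ((size_t)idx == fail_at) {
			visits++;
			if (!injected) {
				fail = true;   /* fail only the first time */
				injected = true;
			}
		}
		merge(&it, fail);
	}
	return visits;
}
```

With these definitions, toy_walk(toy_merge_buggy, 4, 2) returns 2 (the
entry is seen twice) and toy_walk(toy_merge_fixed, 4, 2) returns 1. The
other option raised in the thread, saving the position and restoring it
on every failure path, would give the same result in this model.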