From: Suren Baghdasaryan
Date: Thu, 14 Sep 2023 22:21:07 +0000
Subject: Re: [syzbot] [mm?] kernel BUG in vma_replace_policy
To: Matthew Wilcox
Cc: syzbot, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, syzkaller-bugs@googlegroups.com

On Thu, Sep 14, 2023 at 9:24 PM Matthew Wilcox wrote:
>
> On Thu, Sep 14, 2023 at 08:53:59PM +0000, Suren Baghdasaryan wrote:
> > On Thu, Sep 14, 2023 at 8:00 PM Suren Baghdasaryan wrote:
> > >
> > > On Thu, Sep 14, 2023 at 7:09 PM Matthew Wilcox wrote:
> > > >
> > > > On Thu, Sep 14, 2023 at 06:20:56PM +0000, Suren Baghdasaryan wrote:
> > > > > I think I found the problem and the explanation is much simpler. While
> > > > > walking the page range, queue_folios_pte_range() encounters an
> > > > > unmovable page and queue_folios_pte_range() returns 1. That causes a
> > > > > break from the loop inside walk_page_range() and no more VMAs get
> > > > > locked. After that the loop calling mbind_range() walks over all VMAs,
> > > > > even the ones which were skipped by queue_folios_pte_range() and that
> > > > > causes this BUG assertion.
> > > > >
> > > > > Thinking what's the right way to handle this situation (what's the
> > > > > expected behavior here)...
> > > > > I think the safest way would be to modify walk_page_range() and make
> > > > > it continue calling process_vma_walk_lock() for all VMAs in the range
> > > > > even when __walk_page_range() returns a positive err. Any objection or
> > > > > alternative suggestions?
> > > >
> > > > So we only return 1 here if MPOL_MF_MOVE* & MPOL_MF_STRICT were
> > > > specified. That means we're going to return an error, no matter what,
> > > > and there's no point in calling mbind_range(). Right?
> > > >
> > > > +++ b/mm/mempolicy.c
> > > > @@ -1334,6 +1334,8 @@ static long do_mbind(unsigned long start, unsigned long len,
> > > >         ret = queue_pages_range(mm, start, end, nmask,
> > > >                           flags | MPOL_MF_INVERT, &pagelist, true);
> > > >
> > > > +       if (ret == 1)
> > > > +               ret = -EIO;
> > > >         if (ret < 0) {
> > > >                 err = ret;
> > > >                 goto up_out;
> > > >
> > > > (I don't really understand this code, so it can't be this simple, can
> > > > it? Why don't we just return -EIO from queue_folios_pte_range() if
> > > > this is the right answer?)
> > >
> > > Yeah, I'm trying to understand the expected behavior of this function
> > > to make sure we are not missing anything. I tried a simple fix that I
> > > suggested in my previous email and it works but I want to understand a
> > > bit more about this function's logic before posting the fix.
> >
> > So, current functionality is that after queue_pages_range() encounters
> > an unmovable page, terminates the loop and returns 1, mbind_range()
> > will still be called for the whole range
> > (https://elixir.bootlin.com/linux/latest/source/mm/mempolicy.c#L1345),
> > all pages in the pagelist will be migrated
> > (https://elixir.bootlin.com/linux/latest/source/mm/mempolicy.c#L1355)
> > and only after that the -EIO code will be returned
> > (https://elixir.bootlin.com/linux/latest/source/mm/mempolicy.c#L1362).
> > So, if we follow Matthew's suggestion we will be altering the current
> > behavior which I assume is not what we want to do.
>
> Right, I'm intentionally changing the behaviour. My thinking is
> that mbind(MPOL_MF_MOVE | MPOL_MF_STRICT) is going to fail. Should
> such a failure actually move the movable pages before reporting that
> it failed? I don't know.
>
> > The simple fix I was thinking about that would not alter this behavior
> > is smth like this:
>
> I don't like it, but can we run it past syzbot to be sure it solves the
> issue and we're not chasing a ghost here?

Yes, I just finished running the reproducer on both upstream and
linux-next builds listed in
https://syzkaller.appspot.com/bug?extid=b591856e0f0139f83023 and the
problem does not happen anymore. I'm fine with your suggestion too,
just wanted to point out it would introduce change in the behavior.
Let me know how you want to proceed.
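
For readers following the thread, the failure mode and the two candidate fixes discussed above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not kernel code: every identifier prefixed with toy_ or TOY_ is hypothetical, a "VMA" is reduced to two flags, and the BUG assertion is modelled as a counter rather than a crash. It assumes, as described in the thread, that each VMA is locked just before it is scanned and that a positive return from the scan ends the walk.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_EIO 5 /* stand-in for EIO; hypothetical constant for this model */

/* Toy stand-in for a VMA: only the two facts this model cares about. */
struct toy_vma {
    bool has_unmovable_page; /* scanning this VMA would return 1 */
    bool write_locked;       /* set by the walk's per-VMA lock step */
};

/* The three variants compared in the thread. */
enum toy_fix {
    TOY_NO_FIX,    /* walk stops at the first positive return */
    TOY_LOCK_ALL,  /* keep locking the remaining VMAs after a positive return */
    TOY_EARLY_EIO  /* turn the positive return into -EIO before applying policy */
};

struct toy_result {
    int err;             /* modelled return code */
    size_t assert_hits;  /* VMAs that would trip the "must be locked" check */
    bool policy_applied; /* whether the policy loop ran at all */
};

/* Models the page walk: lock each VMA, then scan it. Returns 0 or 1. */
static int toy_walk_range(struct toy_vma *vmas, size_t n, bool lock_all)
{
    int ret = 0;

    for (size_t i = 0; i < n; i++) {
        vmas[i].write_locked = true;
        if (ret == 0 && vmas[i].has_unmovable_page)
            ret = 1;
        if (ret != 0 && !lock_all)
            break; /* later VMAs are never locked */
    }
    return ret;
}

/* Models the policy loop: visits every VMA and counts the unlocked ones. */
static size_t toy_apply_policy(const struct toy_vma *vmas, size_t n)
{
    size_t unlocked = 0;

    for (size_t i = 0; i < n; i++)
        if (!vmas[i].write_locked)
            unlocked++;
    return unlocked;
}

/* Models the overall flow for one of the three variants. */
static struct toy_result toy_do_mbind(struct toy_vma *vmas, size_t n,
                                      enum toy_fix fix)
{
    struct toy_result res = { 0, 0, false };
    int ret = toy_walk_range(vmas, n, fix == TOY_LOCK_ALL);

    if (fix == TOY_EARLY_EIO && ret == 1) {
        res.err = -TOY_EIO; /* fail before touching any policy */
        return res;
    }
    res.assert_hits = toy_apply_policy(vmas, n);
    res.policy_applied = true;
    res.err = ret ? -TOY_EIO : 0; /* error is reported only at the end */
    return res;
}
```

Calling toy_do_mbind() on three VMAs where the second one holds an unmovable page shows the trade-off: TOY_NO_FIX leaves the third VMA unlocked when the policy loop reaches it, TOY_LOCK_ALL avoids that while still applying the policy before failing, and TOY_EARLY_EIO avoids it by never running the policy loop, which is the behaviour change pointed out above.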