Date: Mon, 7 Feb 2022 10:46:36 -0800 (PST)
From: Hugh Dickins
To: Hillf Danton
cc: Hugh Dickins, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 07/13] mm/munlock: mlock_pte_range() when mlocking or munlocking
In-Reply-To: <20220207033518.2602-1-hdanton@sina.com>
Message-ID: <203c549d-ad8-948d-1a3a-13be026864e@google.com>
References: <8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com> <20220207033518.2602-1-hdanton@sina.com>

On Mon, 7 Feb 2022, Hillf Danton wrote:
> On Sun, 6 Feb 2022 13:42:09 -0800 (PST) Hugh Dickins wrote:
> > +static void mlock_vma_pages_range(struct vm_area_struct *vma,
> > +	unsigned long start, unsigned long end, vm_flags_t newflags)
> >  {
> > -	/* Reimplementation to follow in later commit */
> > +	static const struct mm_walk_ops mlock_walk_ops = {
> > +		.pmd_entry = mlock_pte_range,
> > +	};
> > +
> > +	/*
> > +	 * There is a slight chance that concurrent page migration,
> > +	 * or page reclaim finding a page of this now-VM_LOCKED vma,
> > +	 * will call mlock_vma_page() and raise page's mlock_count:
> > +	 * double counting, leaving the page unevictable indefinitely.
> > +	 * Communicate this danger to mlock_vma_page() with VM_IO,
> > +	 * which is a VM_SPECIAL flag not allowed on VM_LOCKED vmas.
> > +	 * mmap_lock is held in write mode here, so this weird
> > +	 * combination should not be visible to others.
> > +	 */
> > +	if (newflags & VM_LOCKED)
> > +		newflags |= VM_IO;
> > +	WRITE_ONCE(vma->vm_flags, newflags);
>
> Nit
>
> The WRITE_ONCE is not needed, given the certainty of invisibility to
> others - it will quiesce syzbot reporting the case of visibility.

Ah, maybe I can rewrite that comment better: when I said "visible to
others", I meant visible to "the outside world", those participating
in the usual mmap_lock'ed access, syscalls and /proc/pid/maps and
smaps etc.

The point here is that some kernel low-level internals (page migration
and page reclaim) peek at vma->vm_flags without mmap_lock (but with
anon_vma lock or i_mmap_rwsem).

Originally I had VM_LOCKED set in vma->vm_flags before calling
mlock_vma_pages_range(), no need for a newflags parameter. Then
realized that left a tiny window in which VM_LOCKED was visible to
migration and reclaim without the safening VM_IO, so changed it to
pass in newflags, then "newflags |= VM_IO", then
"vma->vm_flags = newflags" there.

Then realized that perhaps an uncooperative compiler might be inspired
to mutate that into "vma->vm_flags = newflags" followed by
"vma->vm_flags |= VM_IO". I hope it would not, but can I be sure that
it would not? That's why I ended up with WRITE_ONCE() there.

Maybe all rather overkill: but trying to ensure that we undercount
mmap_locked rather than risk overcounting it.

Hugh
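
To make the concern about a split store concrete, here is a minimal userspace sketch of the pattern discussed above. It is not the kernel code: struct demo_vma, demo_set_flags(), demo_reader_may_count() and the flag values are hypothetical names invented for illustration, and the WRITE_ONCE()/READ_ONCE() macros are simplified stand-ins built on a volatile cast, assuming a compiler that supports __typeof__ (GCC or Clang).

```c
/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define VM_LOCKED 0x1UL
#define VM_IO     0x2UL

/*
 * Simplified stand-ins for the kernel macros (an assumption of this
 * sketch). Accessing the field through a volatile-qualified lvalue asks
 * the compiler to perform the access as written, rather than splitting
 * it into "store newflags" followed by a separate "|= VM_IO".
 */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the one field of interest in a vma. */
struct demo_vma {
	unsigned long vm_flags;
};

/*
 * Writer side: compute the complete flag word in a local first, then
 * publish it with a single marked store, so a lockless reader should
 * never observe VM_LOCKED without the accompanying VM_IO.
 */
static void demo_set_flags(struct demo_vma *vma, unsigned long newflags)
{
	if (newflags & VM_LOCKED)
		newflags |= VM_IO;
	WRITE_ONCE(vma->vm_flags, newflags);
}

/*
 * Reader side, standing in for code that peeks at the flags without
 * mmap_lock: only treat the vma as countable when VM_LOCKED is set
 * and VM_IO is clear.
 */
static int demo_reader_may_count(struct demo_vma *vma)
{
	unsigned long flags = READ_ONCE(vma->vm_flags);

	return (flags & (VM_LOCKED | VM_IO)) == VM_LOCKED;
}
```

With a plain assignment the result in this single-threaded sketch would be identical; the marked store only matters for what a concurrent lockless reader could observe in between, which is the window described above.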