References: <20191018094304.37056-1-glider@google.com> <20191018094304.37056-24-glider@google.com> <20191018162222.GM32665@bombadil.infradead.org>
In-Reply-To: <20191018162222.GM32665@bombadil.infradead.org>
From: Alexander Potapenko
Date: Tue, 29 Oct 2019 15:45:29 +0100
Subject: Re: [PATCH RFC v1 23/26] kmsan: unpoisoning buffers from devices etc.
To: Matthew Wilcox
Cc: Andrew Morton, Jens Axboe, "Theodore Ts'o", Dmitry Torokhov,
 "Martin K . Petersen", "Michael S. Tsirkin", Christoph Hellwig,
 Eric Dumazet, Eric Van Hensbergen, Takashi Iwai, Vegard Nossum,
 Dmitry Vyukov, Linux Memory Management List

On Fri, Oct 18, 2019 at 6:22 PM Matthew Wilcox wrote:
>
> On Fri, Oct 18, 2019 at 11:43:01AM +0200, glider@google.com wrote:
> > When data is copied to memory from a device KMSAN should treat it as
> > initialized. In most cases it's enough to just unpoison the buffer that
> > is known to come from a device.
> > In the case with __do_page_cache_readahead() and bio_copy_user_iov() we
> > have to mark the whole pages as ignored by KMSAN, as it's not obvious
> > where these pages are read again.
>
> ...
>
> > +++ b/mm/filemap.c
> > @@ -18,6 +18,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include
> >  #include
> >  #include
> > @@ -2810,6 +2811,8 @@ static struct page *do_read_cache_page(struct address_space *mapping,
> >  		page = wait_on_page_read(page);
> >  		if (IS_ERR(page))
> >  			return page;
> > +		/* Assume all pages in page cache are initialized. */
> > +		kmsan_unpoison_shadow(page_address(page), PAGE_SIZE);
>
> Why would you do that?  The page cache already keeps track of which
> pages are initialised -- the PageUptodate flag is set on them.  Indeed,
> just adding a kmsan call to SetPageUptodate and __SetPageUptodate would
> probably be a very straightforward way of handling things, and probably
> means you can get rid of a lot of these other calls.

This seems like a very good thing to do, I'll definitely try that.
I however noticed that __SetPageUptodate is used when copying pages, not
during disk I/O. Is that really so?
We basically need the following behavior: if a device writes to a page,
the contents of that page are considered initialized.
However when the kernel copies one page to another, we must explicitly
copy the source shadow page to the destination.

On a related note, there seems to be a PG_dirty bit that indicates the
page is to be flushed to disk. What's the best place to check such pages
for being initialized, so that we can also report writes of uninitialized
data to the disk?

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Managing Directors: Paul Manicle, Halimah DeLaine Prado
Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
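
To make the two rules in the reply above concrete (a device write marks the written bytes as initialized; a page-to-page copy must carry the source shadow along), here is a minimal user-space sketch in C. It is only a toy model, not KMSAN code: it assumes one shadow byte per data byte, and every name in it (`struct model_page`, `model_poison_page`, `model_device_write`, `model_copy_page`, `model_count_poisoned`, `model_page_fully_initialized`) is hypothetical and does not exist in the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/*
 * Toy model (assumption): each data byte has one shadow byte.
 * Shadow 0xFF means "uninitialized" (poisoned), 0x00 means "initialized".
 */
struct model_page {
	uint8_t data[MODEL_PAGE_SIZE];
	uint8_t shadow[MODEL_PAGE_SIZE];
};

/* Mark the whole page as uninitialized, as for a freshly allocated page. */
void model_poison_page(struct model_page *page)
{
	memset(page->shadow, 0xFF, sizeof(page->shadow));
}

/*
 * Rule 1: data written by a device is considered initialized, so the
 * shadow of the written range is cleared. The caller must ensure that
 * off + len does not exceed MODEL_PAGE_SIZE.
 */
void model_device_write(struct model_page *page, const void *src,
			size_t off, size_t len)
{
	memcpy(page->data + off, src, len);
	memset(page->shadow + off, 0x00, len);
}

/*
 * Rule 2: a page-to-page copy does not initialize anything by itself.
 * The destination inherits the source shadow byte for byte.
 */
void model_copy_page(struct model_page *dst, const struct model_page *src)
{
	memcpy(dst->data, src->data, sizeof(dst->data));
	memcpy(dst->shadow, src->shadow, sizeof(dst->shadow));
}

/* Count the bytes that are still marked uninitialized. */
size_t model_count_poisoned(const struct model_page *page)
{
	size_t i, count = 0;

	for (i = 0; i < MODEL_PAGE_SIZE; i++)
		if (page->shadow[i] != 0x00)
			count++;
	return count;
}

/* The kind of check one would want before a page is written out. */
int model_page_fully_initialized(const struct model_page *page)
{
	return model_count_poisoned(page) == 0;
}
```

In this model, poisoning a page, letting the "device" fill its first 512 bytes and then copying it to a second page leaves both pages with the same 3584 poisoned bytes. Unconditionally unpoisoning the destination on copy would hide exactly that. Where an equivalent of `model_page_fully_initialized()` should be called for dirty pages in the real kernel is the open question asked above; the sketch does not answer it.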