Subject: Re: Bad rss-counter state from drm/ttm, drm/vmwgfx: Support huge TTM pagefaults
To: "Alex Xu (Hello71)", dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Andrew Morton, Christian König, Dan Williams, Jérôme Glisse, "Kirill A. Shutemov", linux-graphics-maintainer@vmware.com, Michal Hocko, pv-drivers@vmware.com, Ralph Campbell, Roland Scheidegger, "Matthew Wilcox (Oracle)"
References: <1586138158.v5u7myprlp.none.ref@localhost> <1586138158.v5u7myprlp.none@localhost> <0b12b28c-5f42-b56b-ea79-6e3d1052b332@shipmail.org> <1586219716.1a3fyi6lh5.none@localhost> <37624a1f-8e6b-fe9c-8e0e-a9139e1bbe18@shipmail.org> <1586273767.0q72rozj3x.none@localhost>
From: Thomas Hellström (VMware)
Organization: VMware Inc.
Date: Tue, 7 Apr 2020 21:57:27 +0200
In-Reply-To: <1586273767.0q72rozj3x.none@localhost>

On 4/7/20 5:36 PM, Alex Xu (Hello71) wrote:
> Excerpts from Thomas Hellström (VMware)'s message of April 7, 2020 7:26 am:
>> On 4/7/20 2:38 AM, Alex Xu (Hello71) wrote:
>>> Excerpts from Thomas Hellström (VMware)'s message of April 6, 2020 5:04 pm:
>>>> Hi,
>>>>
>>>> On 4/6/20 9:51 PM, Alex Xu (Hello71) wrote:
>>>>> Using 314b658 with amdgpu, starting sway and firefox causes "BUG: Bad
>>>>> rss-counter state" and "BUG: non-zero pgtables_bytes on freeing mm" to
>>>>> start filling dmesg, and then closing programs causes more BUGs and
>>>>> hangs, and then everything grinds to a halt (can't start more programs,
>>>>> can't even reboot through systemd).
>>>>>
>>>>> Using master and reverting that branch up to that point fixes the
>>>>> problem.
>>>>>
>>>>> I'm using a Ryzen 1600 and AMD Radeon RX 480 on an ASRock B450 Pro4
>>>>> board with IOMMU enabled.
>>>> If you could try the attached patch, that'd be great!
>>>>
>>>> Thanks,
>>>>
>>>> Thomas
>>>>
>>> Yeah, that works too. Kernel config sent off-list.
>>>
>>> Regards,
>>> Alex.
>> Thanks. Do you want me to add your
>>
>> Reported-by: and Tested-by: To this patch?
>>
>> /Thomas
>>
>>
> Sure. Shouldn't we fix it properly though?

It's still enabled for vmwgfx for which it is reasonably well tested and
where I can't see any such errors.

The code we remove with this patch enables huge page-table entries in
some circumstances for other drivers, but given the problems you're
seeing for amdgpu, it's better to enable this on a per-driver basis
after thorough testing. Since I don't have amdgpu hardware I'm not sure
what it's doing differently, and can't debug the issue properly.

/Thomas