Subject: Re: Regression: QCA6390 fails with "mm/page_alloc: place pages to tail in __free_pages_core()"
To: Pavel Procopiuc
Cc: Vlastimil Babka, Kalle Valo, ath11k@lists.infradead.org, linux-mm@kvack.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-wireless@vger.kernel.org
References:
 <225718f1-c4b0-8683-427a-059148a39350@gmail.com>
 <15e33a0a-9a76-0966-125a-5941e2cdfb09@gmail.com>
From: David Hildenbrand
Organization: Red Hat GmbH
Message-ID: <31f66d70-95eb-12dd-1d01-0830d118f55a@redhat.com>
Date: Fri, 6 Nov 2020 21:41:37 +0100
In-Reply-To: <15e33a0a-9a76-0966-125a-5941e2cdfb09@gmail.com>

On 06.11.20 18:32, Pavel Procopiuc wrote:
> On 05.11.2020 at 21:23, David Hildenbrand wrote:
>>> So just to make sure I understand you correctly, you'd like to see if the problem with ath11k driver on my hardware persists when I boot pristine 5.10-rc2 kernel (without reverting commit 7fef431be9c9ac255838a9578331567b9dba4477) and with page_alloc.shuffle=1, right?
>>>
>>
>> Right, but as lists are randomized then it might take a couple of tries to reproduce. I'll have a look at the driver code / failing path on Monday, when back to work.
>
> I have done 5 boots of pristine 5.10-rc2 with page_alloc.shuffle=1. Out of those: 1st, 2nd, 4th and 5th resulted in
> working ath11k driver, logs were the same as with the commit 7fef431be9c9ac255838a9578331567b9dba4477 reverted.
> The 3rd one failed, but in a different way, I just had no output from the driver after initialization lines:
>
> Nov 06 18:19:41 razor kernel: Linux version 5.10.0-rc2 (root@razor) (gcc (Gentoo 9.3.0-r1 p3) 9.3.0, GNU ld (Gentoo 2.34 p6) 2.34.0) #8 SMP Fri Nov 6 18:14:36 CET 2020
> Nov 06 18:19:41 razor kernel: pci 0000:05:00.0: [17cb:1101] type 00 class 0x028000
> Nov 06 18:19:41 razor kernel: pci 0000:05:00.0: reg 0x10: [mem 0xd2100000-0xd21fffff 64bit]
> Nov 06 18:19:41 razor kernel: pci 0000:05:00.0: PME# supported from D0 D3hot D3cold
> Nov 06 18:19:41 razor kernel: pci 0000:05:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0000:00:1c.1 (capable of 7.876 Gb/s with 8.0 GT/s PCIe x1 link)
> Nov 06 18:19:41 razor kernel: pci 0000:05:00.0: Adding to iommu group 21
> Nov 06 18:19:42 razor kernel: ath11k_pci 0000:05:00.0: WARNING: ath11k PCI support is experimental!
> Nov 06 18:19:42 razor kernel: ath11k_pci 0000:05:00.0: BAR 0: assigned [mem 0xd2100000-0xd21fffff 64bit]
> Nov 06 18:19:42 razor kernel: ath11k_pci 0000:05:00.0: enabling device (0000 -> 0002)
> Nov 06 18:19:42 razor kernel: mhi 0000:05:00.0: Requested to power ON
> Nov 06 18:19:42 razor kernel: mhi 0000:05:00.0: Power on setup success
>
> I had this before and usually it was fixed after rebooting into Windows and back. This time I just went and rebooted
> into Linux again and driver was working on that boot (4th).

I'm sorry, but "WARNING: ath11k PCI support is experimental!"
and such occasional issues don't give me the best feeling that everything is
operating as it should :)

>
> After that I removed page_alloc.shuffle=1 and did 2 additional boots, both of them resulted in a non-working driver with
> the error messages about not being able to talk to firmware like I had before on the clean 5.10-rc2:
>
> Nov 06 18:24:07 razor kernel: Linux version 5.10.0-rc2 (root@razor) (gcc (Gentoo 9.3.0-r1 p3) 9.3.0, GNU ld (Gentoo 2.34 p6) 2.34.0) #9 SMP Fri Nov 6 18:22:43 CET 2020
> Nov 06 18:24:07 razor kernel: pci 0000:05:00.0: [17cb:1101] type 00 class 0x028000
> Nov 06 18:24:07 razor kernel: pci 0000:05:00.0: reg 0x10: [mem 0xd2100000-0xd21fffff 64bit]
> Nov 06 18:24:07 razor kernel: pci 0000:05:00.0: PME# supported from D0 D3hot D3cold
> Nov 06 18:24:07 razor kernel: pci 0000:05:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0000:00:1c.1 (capable of 7.876 Gb/s with 8.0 GT/s PCIe x1 link)
> Nov 06 18:24:07 razor kernel: pci 0000:05:00.0: Adding to iommu group 21
> Nov 06 18:24:08 razor kernel: ath11k_pci 0000:05:00.0: WARNING: ath11k PCI support is experimental!
> Nov 06 18:24:08 razor kernel: ath11k_pci 0000:05:00.0: BAR 0: assigned [mem 0xd2100000-0xd21fffff 64bit]
> Nov 06 18:24:08 razor kernel: ath11k_pci 0000:05:00.0: enabling device (0000 -> 0002)
> Nov 06 18:24:08 razor kernel: mhi 0000:05:00.0: Requested to power ON
> Nov 06 18:24:08 razor kernel: mhi 0000:05:00.0: Power on setup success
> Nov 06 18:24:08 razor kernel: ath11k_pci 0000:05:00.0: Respond mem req failed, result: 1, err: 0
> Nov 06 18:24:08 razor kernel: ath11k_pci 0000:05:00.0: qmi failed to respond fw mem req:-22
> Nov 06 18:24:13 razor kernel: ath11k_pci 0000:05:00.0: qmi failed memory request, err = -110
> Nov 06 18:24:13 razor kernel: ath11k_pci 0000:05:00.0: qmi failed to respond fw mem req:-110
> Nov 06 18:25:39 razor kernel: mhi 0000:05:00.0: Device failed to exit MHI Reset state
>

Okay, that means that you should be able to reproduce the failure on a
kernel from before 7fef431be9c9ac255838a9578331567b9dba4477 with
page_alloc.shuffle=1 as well ... it just might take a lot of tries to
get a problematic page.

I could also imagine that deferring the driver load until after quite
some system/mm activity could result in the same issue.

Looks like something either cannot handle a specific address we received
via dma_alloc_coherent(), or something is reading out of bounds, and the
content after our allocated page doesn't have the expected value anymore
(e.g., used to be zero, now no longer zero).

What puzzles me is that "err: 0". That should have been properly set by
HW, no?

-- 
Thanks,

David / dhildenb
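
A toy model may help picture why the placement of freed pages matters for the reproduction attempts discussed in this thread. The sketch below is an illustration under loud assumptions, not kernel code: build_free_list, allocate, hit_rate and the "head"/"tail"/"shuffle" labels are hypothetical names, pages are plain integers, and the shuffle is an ordinary random permutation standing in for page_alloc.shuffle=1. It only shows that the same sequence of frees hands out a different first page depending on placement, and that a randomized list reaches any particular page only some of the time.

```python
import random
from collections import deque


def build_free_list(pfns, placement, rng=None):
    """Toy free list built by 'freeing' pfns in the given order.

    placement is a hypothetical label, not a kernel flag:
      "head"    - each freed page is put at the front of the list
      "tail"    - each freed page is put at the back of the list
      "shuffle" - pages end up in a random order
    """
    if placement not in ("head", "tail", "shuffle"):
        raise ValueError(f"unknown placement: {placement!r}")
    free_list = deque()
    for pfn in pfns:
        if placement == "head":
            free_list.appendleft(pfn)
        else:
            free_list.append(pfn)
    if placement == "shuffle":
        order = list(free_list)
        (rng or random).shuffle(order)
        free_list = deque(order)
    return free_list


def allocate(free_list):
    """Hand out the page at the front of the list."""
    return free_list.popleft()


def hit_rate(bad_pfn, pfns, boots, seed=0):
    """Fraction of simulated 'boots' whose first allocation is bad_pfn."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(boots):
        if allocate(build_free_list(pfns, "shuffle", rng)) == bad_pfn:
            hits += 1
    return hits / boots


pages = list(range(8))
print(allocate(build_free_list(pages, "head")))  # the last page freed
print(allocate(build_free_list(pages, "tail")))  # the first page freed
print(hit_rate(bad_pfn=0, pfns=pages, boots=1000))
```

In this model, deterministic placement means the first allocation is always the same page, so a problematic page is either hit on every boot or never. With a randomized list it is hit only occasionally, which is one possible reading of the mixed boot results reported above.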