On Thu, Oct 23, 2025 at 12:49:14PM +0100, Jonathan Cameron wrote:
> On Wed, 22 Oct 2025 22:39:28 +0100
> Conor Dooley wrote:
> 
> Hi Conor,
> 
> > On Wed, Oct 22, 2025 at 12:33:49PM +0100, Jonathan Cameron wrote:
> > 
> > > +static int hisi_soc_hha_wbinv(struct cache_coherency_ops_inst *cci,
> > > +			       struct cc_inval_params *invp)
> > > +{
> > > +	struct hisi_soc_hha *soc_hha =
> > > +		container_of(cci, struct hisi_soc_hha, cci);
> > > +	phys_addr_t top, addr = invp->addr;
> > > +	size_t size = invp->size;
> > > +	u32 reg;
> > > +
> > > +	if (!size)
> > > +		return -EINVAL;
> > > +
> > > +	addr = ALIGN_DOWN(addr, HISI_HHA_MAINT_ALIGN);
> > > +	top = ALIGN(addr + size, HISI_HHA_MAINT_ALIGN);
> > > +	size = top - addr;
> > > +
> > > +	guard(mutex)(&soc_hha->lock);
> > > +
> > > +	if (!hisi_hha_cache_maintain_wait_finished(soc_hha))
> > > +		return -EBUSY;
> > > +
> > > +	/*
> > > +	 * Hardware will search for addresses ranging [addr, addr + size - 1],
> > > +	 * last byte included, and perform maintain in 128 byte granule
> > > +	 * on those cachelines which contain the addresses.
> > > +	 */
> > 
> > Hmm, does this mean that the IP has some built-in handling for there
> > being more than one "agent" in a system? IOW, if the address is not in
> > its range, then the search will just fail into a NOP?
> 
> Exactly that. NOP if nothing to do. The hardware is only tracking
> a subset of what it might contain (depending on which cachelines are
> actually in caches) so it's very much a 'clear this if you happen to
> have it' command. Even if it is in the subset of PA being covered by
> an instance, many cases will be a 'miss' and hence a NOP.

Okay, cool. I kinda figured this was the likely outcome, when you put
"search" into the comment.

> > If that's not the case, is this particular "agent" by design not suitable
> > for a system like that? Or will a dual hydra home agent system come with
> > a new ACPI ID that we can use to deal with that kind of situation?
> 
> Existing systems have multiple instances of this hardware block.
> 
> Simplifying things relative to reality just to make this explanation less
> messy (ignoring other levels of interleaving beyond the Point of
> Coherency etc.).
> 
> In servers the DRAM accesses are pretty much always interleaved
> (usually at cache line granularity). That interleaving may go to very
> different physical locations on a die or across multiple dies.
> 
> Similarly the agent responsible for tracking the coherency state
> (easy to think of this as a complete directory but it's never that
> simple) is distributed so that it is on the path to the DRAM. Hence
> if we have N way interleave there may be N separate agents responsible for
> different parts of the range 0..(64*N-1) (taking the smallest possible
> flush that would have to go to all those agents).

Well, thanks for the explanation. I was only looking to know if there
were multiple, since it wasn't clear, but the explanation of why is
welcome.

> > (Although I don't know enough about ACPI to know where you'd even get
> > the information about what instance handles what range from...)
> 
> We don't today. It would be easy to encode that information
> as a resource and it may make sense for larger systems depending
> on exactly how the coherency fabric in a system works. I'd definitely
> expect to see some drivers doing this. Those drivers could then prefilter.

Okay, cool. I can clearly see how it'd be done in DT land, if required,
but didn't know if it was possible on ACPI systems.
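Just to check my understanding of the prefiltering idea, I'd picture
something roughly like this at the top of hisi_soc_hha_wbinv() (totally
untested, and the pa_start/pa_end fields are made up, filled in from
whatever resource ends up describing the range an instance covers):

	/*
	 * Hypothetical prefilter: treat a request that cannot overlap
	 * the PA window this instance covers as the NOP it would be in
	 * hardware anyway. soc_hha->pa_start and soc_hha->pa_end do not
	 * exist today; they would come from a new ACPI/DT resource.
	 */
	if (invp->addr > soc_hha->pa_end ||
	    invp->addr + invp->size - 1 < soc_hha->pa_start)
		return 0;

Not asking for that now, obviously, just making sure I follow what such
a driver would do.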
> 
> Interleaving gets really complex so any description is likely to only
> provide a conservative superset of what is actually handled by a given
> agent.
> 
> > 
> > > +	size -= 1;
> > > +
> > > +	writel(lower_32_bits(addr), soc_hha->base + HISI_HHA_START_L);
> > > +	writel(upper_32_bits(addr), soc_hha->base + HISI_HHA_START_H);
> > > +	writel(lower_32_bits(size), soc_hha->base + HISI_HHA_LEN_L);
> > > +	writel(upper_32_bits(size), soc_hha->base + HISI_HHA_LEN_H);
> > > +
> > > +	reg = FIELD_PREP(HISI_HHA_CTRL_TYPE, 1); /* Clean Invalid */
> > > +	reg |= HISI_HHA_CTRL_RANGE | HISI_HHA_CTRL_EN;
> > > +	writel(reg, soc_hha->base + HISI_HHA_CTRL);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int hisi_soc_hha_done(struct cache_coherency_ops_inst *cci)
> > > +{
> > > +	struct hisi_soc_hha *soc_hha =
> > > +		container_of(cci, struct hisi_soc_hha, cci);
> > > +
> > > +	guard(mutex)(&soc_hha->lock);
> > > +	if (!hisi_hha_cache_maintain_wait_finished(soc_hha))
> > > +		return -ETIMEDOUT;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static const struct cache_coherency_ops hha_ops = {
> > > +	.wbinv = hisi_soc_hha_wbinv,
> > > +	.done = hisi_soc_hha_done,
> > > +};
> > > +
> > > +static int hisi_soc_hha_probe(struct platform_device *pdev)
> > > +{
> > > +	struct hisi_soc_hha *soc_hha;
> > > +	struct resource *mem;
> > > +	int ret;
> > > +
> > > +	soc_hha = cache_coherency_ops_instance_alloc(&hha_ops,
> > > +						     struct hisi_soc_hha, cci);
> > > +	if (!soc_hha)
> > > +		return -ENOMEM;
> > > +
> > > +	platform_set_drvdata(pdev, soc_hha);
> > > +
> > > +	mutex_init(&soc_hha->lock);
> > > +
> > > +	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > +	if (!mem) {
> > > +		ret = -ENOMEM;
> > > +		goto err_free_cci;
> > > +	}
> > > +
> > > +	/*
> > > +	 * HHA cache driver share the same register region with HHA uncore PMU
> > > +	 * driver in hardware's perspective, none of them should reserve the
> > > +	 * resource to itself only. Here exclusive access verification is
> > > +	 * avoided by calling devm_ioremap instead of devm_ioremap_resource to
> > 
> > The comment here doesn't exactly match the code, dunno if you went away
> > from devm for some reason and just forgot to make the change or the other
> > way around? Not a big deal obviously, but maybe you forgot to do
> > something you intended to do. It's mentioned in the commit message too.
> 
> Ah. Indeed stale comment, I'll drop that.
> 
> Going away from devm was mostly a hangover from similar discussions in fwctl
> where I copied the pattern of embedded device (there)/kref (here) and reluctance
> to hide away the final put().
> 
> > Other than the question I have about the multi-"agent" stuff, this
> > looks fine to me. I assume it's been thought about and is fine for w/e
> > reason, but I'd like to know what that is.
> 
> I'll see if I can craft a short intro bit of documentation for the
> top of this driver file to state clearly that there are lots of instances
> of this in a system and that a request to clear something that isn't 'theirs'
> results in a NOP. Better to have that available so anyone writing
> a similar driver thinks about whether that applies to what they have or
> if they need to do in-driver filtering.

Yeah, adding a comment would be ideal, thanks.
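FWIW, something along these lines at the top of the file would do it
for me (wording is obviously yours to pick, this is just a sketch of
the points I'd like to see captured):

	/*
	 * A system contains many instances of this home agent, each
	 * tracking coherency for an (often interleaved) subset of the
	 * physical address space. A maintenance request for an address
	 * that an instance does not track is a NOP in hardware, so the
	 * driver does no filtering of its own. Authors of similar
	 * drivers should check whether the same applies to their
	 * hardware or whether in-driver filtering is needed.
	 */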