On 18/02/2022 10:46, Joonas Lahtinen wrote:
Quoting Andi Shyti (2022-02-17 17:53:58)
Hi Tvrtko,
Now tiles have their own sysfs interfaces under the gt/ directory. Because RC6 is a property that can be configured on a per-tile basis, each tile should have its own interface.
The new sysfs structure will have a similar layout for the 4 tile case:
/sys/.../card0
├── gt
│   ├── gt0
│   │   ├── id
│   │   ├── rc6_enable
│   │   ├── rc6_residency_ms
│   .   .
│   .   .
│   └── gtN
│       ├── id
│       ├── rc6_enable
│       ├── rc6_residency_ms
│       .
│       .
│
└── power/                -+
    ├── rc6_enable         |    Original interface
    ├── rc6_residency_ms   +->  kept as existing ABI;
    .                      |    it multiplexes over
    .                      |    the GTs
                          -+
The existing interfaces have been kept in their original location to preserve the existing ABI. They act on all the GTs: when reading they provide the average value from all the GTs.
Average feels very odd to me. I'd ask if we can get away with providing an errno instead? Or tile zero data?
Tile zero data is always wrong, in my opinion. If we have round-robin scaling workloads like some media cases, part of the system load might just disappear when it goes to tile 1.
I was thinking that in conjunction with a deprecation log message it wouldn't be wrong - I mean if the route taken was to eventually retire the legacy files altogether.
Real multiplexing would mean providing something both when reading and when writing. The idea of the average came up while reviewing the write multiplexing with Chris. Indeed, it makes sense to provide some common value, but I don't know how useful it would be to the user (if the user needs an average at all).
I think all read/write controls like min/max/boost_freq should return an error from the global interface unless all the tiles return the same value. A write will always overwrite the per-tile values.
That would work I think, if the option chosen was not to retire the legacy files.
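To make the read/write proposal above concrete, here is a minimal sketch in Python. It is a toy model, not i915 code: the `LegacyControl` class and the choice of `EINVAL` are assumptions made for illustration, since the thread does not say which errno the global file would return.

```python
import errno


class LegacyControl:
    """Hypothetical model of a legacy (global) read/write control,
    e.g. a max frequency, multiplexed over per-tile values."""

    def __init__(self, per_tile_values):
        self.per_tile = list(per_tile_values)

    def read(self):
        # Reading succeeds only when every tile agrees on the value.
        # EINVAL is an assumed placeholder errno.
        if len(set(self.per_tile)) != 1:
            raise OSError(errno.EINVAL, "per-tile values differ")
        return self.per_tile[0]

    def write(self, value):
        # Writing always overwrites the value on every tile.
        self.per_tile = [value] * len(self.per_tile)
```

In this model, a user who only ever touches the global file keeps seeing consistent values, because every write resynchronises the tiles. The error shows up only after something has changed an individual tile.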
When we have frequency readbacks without control, returning MAX() across tiles would be the logical thing. The fact that parts of the hardware can be clocked lower when one part is fully utilized is the "new feature".
After that we're only really left with rc6_residency_ms. And that is the tough one. I'm inclined to think that MIN() across tiles would be the right answer. If you are fully utilizing a single tile, you should be able to see it.
So we have MIN, AVG or SUM, or errno, or remove the file (which is just a different kind of errno?) to choose from. :)
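For comparison, a small Python sketch of what each read-side option would report. The numbers are invented for illustration: a two-tile device observed over a 1000 ms window, with tile 1 fully busy (0 ms in RC6) and tile 0 idle (1000 ms in RC6), which is the round-robin case mentioned earlier. The `aggregate` helper and the policy names are hypothetical.

```python
from statistics import mean


def aggregate(values, policy):
    """Combine per-tile readings into one value for the legacy file."""
    policies = {
        "min": min,
        "max": max,
        "avg": mean,
        "sum": sum,
        "tile0": lambda v: v[0],  # report tile zero only
    }
    return policies[policy](values)


# Invented sample: RC6 residency per tile over a 1000 ms window.
# Tile 0 is idle, tile 1 is fully busy.
rc6_residency_ms = [1000, 0]

# Invented sample: current frequency per tile, in MHz.
cur_freq_mhz = [300, 1500]
```

With these inputs, `aggregate(rc6_residency_ms, "min")` gives 0, so the busy tile is visible, while `"tile0"` gives 1000 and hides the load entirely. `"avg"` gives 500 and `"sum"` gives 1000, which exceeds nothing here but would exceed the window length as soon as more than one tile is idle. For the frequency readback, `aggregate(cur_freq_mhz, "max")` gives 1500.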
Regards,
Tvrtko
This all would be what feels natural for a user who has their setup tuned for a single-tile device. And it would allow simple round-robin balancing across the tiles in a somewhat coherent manner.