On Thu, 10 Apr 2014 21:30:03 +0200 Christian König <deathsimple@vodafone.de> wrote:
Quick thought from someone entirely unfamiliar with the hardware: perhaps you can get the performance benefit without the size increase by moving the else portion into a non-inline function? I'm guessing that most accesses happen in the "if" branch.
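For illustration, a minimal sketch of that suggestion in C. Everything here is hypothetical: struct fake_dev, fake_rreg() and fake_rreg_indirect() are made-up names, and the array-backed "registers" only stand in for real MMIO; this is not the radeon code. It assumes GCC or Clang for the noinline attribute. The idea is that the common direct read stays inline at every call site, while the rarer indirect path lives in a single out-of-line function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model, not the radeon structures: a small directly
 * mapped window plus an index/data style access for everything beyond it. */
#define FAKE_MMIO_BYTES 16u   /* registers below this offset are "direct" */
#define FAKE_REG_WORDS  16u   /* total 32-bit registers in the fake device */

struct fake_dev {
    uint32_t regs[FAKE_REG_WORDS]; /* backing store standing in for hardware */
    uint32_t index;                /* stands in for an index register */
    unsigned slow_calls;           /* counts trips through the slow path */
};

/* Slow path, kept out of line so call sites only carry the fast path. */
__attribute__((noinline))
uint32_t fake_rreg_indirect(struct fake_dev *dev, uint32_t reg)
{
    dev->slow_calls++;
    dev->index = reg;                  /* select the register ... */
    return dev->regs[dev->index / 4];  /* ... then read it back */
}

/* Fast path: one compare and one load when the register is in the window. */
static inline uint32_t fake_rreg(struct fake_dev *dev, uint32_t reg,
                                 bool always_indirect)
{
    if (reg < FAKE_MMIO_BYTES && !always_indirect)
        return dev->regs[reg / 4];
    return fake_rreg_indirect(dev, reg);
}
```

With this split, a caller that passes a constant false for always_indirect lets the compiler drop that half of the condition after inlining; the branch on the register offset remains unless the offset is also a compile-time constant.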
Function call overhead is about equal to branch overhead, so splitting it would give only about half the benefit. It's called from many places, and many times per second.
Actually direct register access shouldn't be necessary so often. Apart from page flips, write/read pointer updates and irq processing there shouldn't be so many of them. Could you clarify a bit more what issue you are seeing here?
Too much CPU usage for such a simple function. At 2% it is #2 among the top 10 radeon.ko functions, right after evergreen_cs_parse. For reference, #3 (radeon_cs_packet_parse) is only 0.5%, one fourth of this function's usage.
As shown by the perf increase, it's called often enough that getting rid of the function call overhead (and compiling the if out at compile time) helps measurably.
- Lauri