On Thu, Apr 10, 2014 at 2:46 PM, Lauri Kasanen cand@gmx.com wrote:
On Thu, 10 Apr 2014 12:19:10 -0400 Ilia Mirkin imirkin@alum.mit.edu wrote:
+static inline uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg,
bool always_indirect)
+{
if (reg < rdev->rmmio_size && !always_indirect)
return readl(((void __iomem *)rdev->rmmio) + reg);
Quick thought from someone entirely unfamiliar with the hardware: perhaps you can get the performance benefit without the size increase by moving the else portion into a non-inline function? I'm guessing that most accesses happen in the "if" branch.
The function call overhead is about equal to branching overhead, so splitting it would only help about half that. It's called from many places, and a lot of calls per sec.
Is that really true? I haven't profiled it, but a function call has to save/restore registers, set up new stack, etc. And the jump is to some far-away code, so perhaps less likely to be in i-cache? (But maybe not, not sure on the details of how all that works.) And I'm guessing most of the size increase is coming from the spinlock/unlock, which, I'm guessing again, is the much-rarer case.
And the branch would happen either way... so that's a sunk cost. (Except I bet that the always_indirect param is always constant and that bit of the if can be resolved at compile time with that part being inlined.)
Anyways, it was just a thought.
-ilia