There are two basic explanations for not getting consistent bisection results:

* In each case, at least one bad commit was accidentally marked as good. To avoid this, test longer or more often before declaring a commit good.
* The problem (or at least its trigger) isn't in the kernel but somewhere else.
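For an intermittent problem, "test more times" can be given a rough number. The following is a minimal sketch, not a definitive procedure: it assumes the problem shows up independently in each test run with a roughly known probability, and that the test can be started by a command that exits with 0 on success. The names `runs_needed`, `verdict` and `main` are made up for this illustration. The exit codes follow the `git bisect run` convention (0 means good, 1 means bad).

```python
import math
import subprocess
import sys

GOOD, BAD = 0, 1  # exit codes as interpreted by `git bisect run`


def runs_needed(p_fail, max_false_good):
    """Number of clean runs required before calling a commit good.

    p_fail: assumed chance that one run on a bad commit shows the problem.
    max_false_good: accepted chance that a bad commit passes every run.
    A bad commit survives n runs with probability (1 - p_fail) ** n.
    """
    if not (0 < p_fail < 1 and 0 < max_false_good < 1):
        raise ValueError("both probabilities must lie strictly between 0 and 1")
    return math.ceil(math.log(max_false_good) / math.log(1 - p_fail))


def verdict(run_once, repeats):
    """Return BAD on the first failing run, GOOD only if every run passes."""
    for _ in range(repeats):
        if not run_once():
            return BAD
    return GOOD


def main(argv):
    # argv is the test command; 0.2 and 0.01 are placeholder assumptions.
    repeats = runs_needed(p_fail=0.2, max_false_good=0.01)
    return verdict(lambda: subprocess.run(argv).returncode == 0, repeats)


if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

With these placeholder numbers, a problem that appears in about one of five runs needs 21 clean runs before a commit is marked good. This only addresses the first explanation; no amount of repetition helps if the trigger lies outside the kernel.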