If they controlled for it, the results from the mixed-gender groups should be more applicable, assuming a normal distribution of attractiveness. That doesn't contradict the hypothesis that a non-normal distribution (that is, one skewed toward the attractive side) can hurt said performance.