## Power

### Power

Hi,

I am in the process of writing a presentation on my undergraduate dissertation for an interview. I used one Mann-Whitney U test to compare performance between conditions A and B, and a second to compare performance between conditions A and C. As hypothesized, there was a significant difference between A and B, but not between A and C. My effect sizes were medium-large.

However, I did not calculate power. Am I right in thinking that my significant result must have had enough power, but that the non-sig comparison may not have and therefore I may have missed a significant result?? Would the implication be that if I were to do the analysis again, I would calculate power and possibly use a larger sample size if the analysis indicated this?

How would you go about calculating power for non-parametric analysis?

Thanks

### Re: Power

Power is the probability of detecting an effect if it is really there; it equals 1 minus beta, where beta is the type 2 error rate (the counterpart of alpha, the type 1 error rate). You are not correct to assume that because your result was significant, you must have had enough power. An underpowered study can still produce a significant result; the problem is that when power is low, a significant finding is more likely to be a false positive, and the estimated effect size is more likely to be inflated. You can read a 2005 paper by Ioannidis for more detailed information. It is freely available here.

You can calculate power post hoc and this might help to answer your question. You can use G*Power to perform this calculation. In terms of correcting for low power, you are right that increasing the sample size will help, as will using a within-subjects design, if possible. Furthermore, in some cases you can improve measurement accuracy by using instruments that are more specific or precise at measuring your particular phenomenon of interest. You can also improve accuracy by specifying with greater precision the nature of the phenomenon under investigation.
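
If you want a rough sense of the power of such a design without relying on any particular software, you can estimate it by simulation: generate many fake datasets under an assumed effect, run the Mann-Whitney test on each, and count how often it comes out significant. Here is a minimal Python sketch; the group sizes, effect size (Cohen's d = 0.8), and assumption of normal data are illustrative stand-ins, not the poster's actual study:

```python
# Power of a Mann-Whitney U test, estimated by Monte Carlo simulation.
# All inputs (n = 30 per group, d = 0.8, normal data) are hypothetical.
import numpy as np
from scipy.stats import mannwhitneyu

def simulated_power(n1=30, n2=30, d=0.8, alpha=0.05, reps=2000, seed=1):
    """Fraction of simulated experiments in which the test rejects H0,
    assuming normal data with group means d standard deviations apart."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        a = rng.normal(0.0, 1.0, n1)
        b = rng.normal(d, 1.0, n2)       # group B shifted by d SDs
        _, p = mannwhitneyu(a, b, alternative="two-sided")
        rejections += p < alpha
    return rejections / reps

print(round(simulated_power(), 2))
```

One nice property of this approach is that it works for any test and any assumed data-generating distribution, which is why simulation is a popular way to get power for non-parametric tests.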

Hope this helps. Good luck.

### Re: Power

Thank you for replying. I'm so confused.

Power is the chance of finding a difference when one really exists, right? So if I didn't have enough power, does this mean that the significant result that I found may not be genuine?! How does that work?? And does it also mean that although I found a non-sig result, a difference here may exist but my test wasn't powerful enough to find it?

My effect size for the significant result was med-large; am I right in thinking that this means the association between the two variables is relatively strong (compared to a small effect size)? And how come my effect sizes are good if my power might not be?

I've downloaded G*Power but I've never used it before and am struggling with it: which "test family" and statistical test would the Mann-Whitney U test come under? I don't know what all of the values it's asking for are!

Sorry for all of my questions, I feel like I should have learned all this at undergrad but it wasn't covered in my course.

### Re: Power

I'd recommend looking in a good stats textbook such as Andy Field's 'discovering statistics' book. That will hopefully answer a lot of your queries and help make it all a bit clearer.

Good luck!

*You can't stop the waves, but you can learn to surf*- Jon Kabat-Zinn

### Re: Power

> Alexander wrote: You can read a 2005 paper by Ioannidis for more detailed information. It is freely available here.

Very interesting paper, thank you!

I find the concept of power very interesting, because it is very much a massive focus in clinical research - but not so much within other areas of psychology, and I wonder why this is. Perhaps it is the specific research questions being asked.

### Re: Power

Thanks for the suggestion Toria, but I had a quick look at the contents before I tried to get it recalled to the library, and it doesn't seem to have a power section :S Also, I think I have a general understanding of power, having researched it A LOT on the internet to try to work this out, but I am struggling to apply it to the interpretation of/reflection on my research.

I would be forever grateful if somebody could just let me know if my understanding is on the right lines, and also possibly give me some guidance on using G Power for Mann Whitney.

### Re: Power

> Angel28 wrote: So if I didn't have enough power, does this mean that the significant result that I found may not be genuine?! And does it also mean that although I found a non-sig result, a difference here may exist but my test wasn't powerful to find it?

Yes and yes. I would also echo the advice to get your head round things a bit. You need a sufficiently powered sample to keep the chances of both type 1 and type 2 errors acceptably low.

I would do post-hoc power using G*Power. There should be instructions on where to put things somewhere online - it isn't difficult (promise) but it is really hard to explain to someone online, short of just doing it all for them, as it depends on the specifics of your study.

What you have done is pretty okay, in the sense that you have good effect sizes, but in some quarters doing a post-hoc power calculation is controversial. The important thing is to know that what you have done is not ideal, and to explain why; hence needing to read. Just Google it; even Wikipedia is good for these concepts. The objection is that post-hoc power is a bit like doing a study and saying: if the result is significant, we will devise a hypothesis that supports it, and if it isn't, we won't.

### Re: Power

Ah sorry if it's not in there, it's such a thorough textbook I assumed it would be!

I've just looked at G*Power, and I think this is how you use it for Mann-Whitney, but please read this with the disclaimer that I haven't used it for this test before (although I have used it to calculate sample size for a study using a different stats analysis):

Test family = t-tests

Statistical test = Means: Wilcoxon-Mann-Whitney test (two groups)

Type of power analysis = Post hoc: Compute achieved power - given alpha, sample size, and effect size

Then select whether your hypothesis/hypotheses were one or two tailed

Enter your effect size, alpha value (probably .05, it's always .05!), and your sample size, and (fingers-crossed!) it should give you the power value.

Hope that helps and makes some sort of sense. Good luck!

Edited to add: oops, just read enid's good advice of looking online for instructions, I didn't think of that! I'll leave my hopefully-correct how-to guide above just in case it is helpful, but it might be worth checking against other instructions too.
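
Also, for anyone who wants to sanity-check whatever number G*Power reports: a common analytic approximation (the basis of the "A.R.E. method" style of calculation) treats the Wilcoxon-Mann-Whitney test as a t-test on deflated sample sizes, using the asymptotic relative efficiency of 3/π ≈ 0.955 for normally distributed data. A Python sketch of that approximation, with purely illustrative inputs (n = 30 per group, d = 0.8), not anyone's actual study:

```python
# Approximate power of a Wilcoxon-Mann-Whitney test via the t-test
# approximation on A.R.E.-deflated sample sizes (A.R.E. = 3/pi for
# normal data). The inputs here are illustrative assumptions.
import math
from scipy.stats import nct, t

def wmw_power(n_per_group, d, alpha=0.05, are=3 / math.pi):
    n_eff = are * n_per_group              # effective n per group
    df = 2 * n_eff - 2
    delta = d * math.sqrt(n_eff / 2)       # noncentrality parameter
    crit = t.ppf(1 - alpha / 2, df)        # two-tailed critical value
    # P(reject H0) under the noncentral t distribution
    return (1 - nct.cdf(crit, df, delta)) + nct.cdf(-crit, df, delta)

print(round(wmw_power(30, 0.8), 3))
```

If this and G*Power disagree wildly for the same inputs, it usually means a setting (tails, effect size metric, or test family) differs between the two.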

### Re: Power

Glad it made sense! Yes, I think you have to put in your actual effect size and .05 for alpha (alpha is the significance threshold you set in advance, i.e. the type 1 error rate you are willing to accept, while p is the actual probability your test produced; G*Power wants alpha, not p).

I found this link which might be helpful if you haven't already come across it:

http://www.ats.ucla.edu/stat/gpower/indepsamps.htm

### Re: Power

> sarahg wrote: I find the concept of power very interesting, because it is very much a massive focus in clinical research - but not so much within other areas of psychology, and I wonder why this is.

Sarah, from my experience of working on an NRES ethics committee and doing an MSc in Research Methods, I would say the difference in attitude to power in clinical vs. non-clinical areas is pretty much down to statisticians. A statistician will always advise researchers conducting clinical trials within the NHS; in fact it's basically an ethical requirement. A statistician (unlike a psychologist) will quickly identify whether the proposed study has sufficient power, given the intended number of participants and the estimated effect sizes. By comparison, psychologists are informed amateurs when it comes to statistics, and often the importance of power is not taught at undergraduate level. I know it was not emphasised on my undergraduate course.

Although the power calculation is in fact trivial, I am informed that psychologists not working within clinical trials routinely do not perform power calculations for their studies. In fact, recruitment numbers are based more on convention and intuition. For reference, remember the number 30. Whenever you read a psychology journal article in future, glance at the number of participants. It is very often around 30 participants, or 30 participants per group for a between-subjects design. I've experienced this recently with my dissertation supervisor, who, when we discussed required participants, appeared to pluck numbers from the air. Of course, 30 participants is woefully inadequate to detect the sort of effect sizes that are often being investigated by psychologists in the 21st century.
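
To put a number on how inadequate the 30-per-group convention can be: for a "medium" effect (Cohen's d = 0.5) and a two-tailed test at alpha = .05, power comes out at only around 50%, a coin flip. A quick check using the standard normal approximation for a two-sample comparison (illustrative values only):

```python
# Rough power of a two-sample comparison with the conventional 30
# participants per group and a "medium" effect (Cohen's d = 0.5),
# using the normal approximation: power ~ Phi(d*sqrt(n/2) - z_crit).
# Illustrative numbers, not any particular study.
from math import sqrt
from scipy.stats import norm

n, d, alpha = 30, 0.5, 0.05
z_crit = norm.ppf(1 - alpha / 2)             # two-tailed critical value, ~1.96
power = norm.cdf(d * sqrt(n / 2) - z_crit)   # ~0.49
print(round(power, 2))
```

Rearranging the same formula for n shows that detecting d = 0.5 with the usual 80% power needs roughly 64 participants per group, more than double the convention.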

I think journals and reviewers are also complicit in this problem, partly because the reviewers are the convention setters and partly because psychology (and frankly much of academia) is fairly resistant to methodological change. For more interesting resources about power, check out Geoff Cumming's website and book. He's also done a fun YouTube video called "The Dance of the P Values" that makes a mockery of the relevance of the p statistic.
