Inconsistency in W test statistic #951

@maximelepetit

I would like to thank you for this very interesting package.

I need help interpreting and clarifying certain values.

I calculated an apoptosis score for two cell samples: BASE cells and LPS cells. I would like to test whether there is a statistically significant difference between the two groups.
Group 1 (LPS) contains 8126 cells and group 2 (BASE) contains 7942 cells.

Naively, I ran a Wilcoxon test between these two groups.

# Extract data
apoptosis_data <- FetchData(neurons_v5_cb_subset_neurons_silvia, vars = c("ApoptosisScore1", "orig.ident"))
rownames(apoptosis_data) <- NULL
head(apoptosis_data)
  ApoptosisScore1 orig.ident
            <dbl>      <chr>
1      0.04673351       BASE
2      0.03632951       BASE
3      0.05176500       BASE
4      0.04276227       BASE
5      0.03331517       BASE
6      0.03697204       BASE
group1 <- apoptosis_data[apoptosis_data$orig.ident == "LPS", "ApoptosisScore1"]
length(group1)
8126

and

group2 <- apoptosis_data[apoptosis_data$orig.ident == "BASE", "ApoptosisScore1"]
length(group2)
7942

Perform the Wilcoxon rank-sum test:

wilcox_test <- wilcox.test(group1, group2)
print(wilcox_test)

This gives:

	Wilcoxon rank sum test with continuity correction

data:  group1 and group2
W = 42939709, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0

Conclusion:
The p-value < 2.2e-16 suggests a statistically significant difference in apoptosis scores between the two groups. The null hypothesis that the apoptosis scores in the two groups follow the same distribution can therefore be rejected.

Then I discovered the ggstatsplot package.

After reading the documentation, I decided to use the ggbetweenstats function to compare the two groups. According to the documentation:
Type: Non-parametric; number of groups: 2; test: Mann-Whitney U test; function: [stats::wilcox.test()](https://rdrr.io/r/stats/wilcox.test.html)
I therefore set type = "nonparametric" in order to reproduce the p-value obtained previously.

Here is the code I used:

p <- ggbetweenstats(
  data  = apoptosis_data,
  x     = orig.ident,
  y     = ApoptosisScore1,
  type = "nonparametric",
  ylab = "Apoptosis score",
  xlab = "Condition",
  title = "Distribution of Apoptosis Score across condition"
) 

This gives:
[Image: comparaison_lps_base_withoutggsignif]

I am wondering why the test statistic (W) differs between the two approaches: wilcox.test reports W = 42939709, whereas the statistic shown on the plot is 2.16e+07.
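One check that may explain the gap: the W reported by wilcox.test depends on which sample is passed first, and the two possible values always sum to n1 * n2. The sketch below applies that identity to the numbers reported above, then confirms it on simulated data (the vectors x and y are made up for illustration). Whether ggbetweenstats really passes BASE first is an assumption here, based on factor levels being sorted alphabetically by default; levels(factor(apoptosis_data$orig.ident)) would show the actual order.

```r
# Values reported in this issue
n_lps       <- 8126      # length(group1), LPS cells
n_base      <- 7942      # length(group2), BASE cells
w_lps_first <- 42939709  # W from wilcox.test(group1, group2), LPS first

# The W statistics for the two argument orders sum to n1 * n2,
# so the statistic with BASE passed first would be:
w_base_first <- n_lps * n_base - w_lps_first
w_base_first                                        # 21596983
format(signif(w_base_first, 3), scientific = TRUE)  # "2.16e+07"

# Sanity check of the identity on simulated data (hypothetical vectors)
set.seed(1)
x <- rnorm(30, mean = 0.5)
y <- rnorm(40)
w_xy <- unname(wilcox.test(x, y)$statistic)
w_yx <- unname(wilcox.test(y, x)$statistic)
w_xy + w_yx == length(x) * length(y)                # TRUE
```

If this is indeed the cause, the two results describe the same test: only the reference group differs, and the two-sided p-value is unaffected by the order.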

I need help!

Thanks.

Maxime
