How to use a weighted mean estimator in seaborn factor plot (incl bootstrapping)? #722

@timalthoff

[Note: I am reviving a stackoverflow question that I was unable to figure out with some new insights on how it might work. See: http://stackoverflow.com/questions/32771520/how-to-use-a-weighted-mean-estimator-in-seaborn-factor-plot-incl-bootstrapping]

I have a dataframe where each of the rows has a certain weight which needs to be accounted for in the mean computations. I love seaborn factorplots and their bootstrapped 95% confidence intervals but haven't been able to get seaborn to accept a new weighted mean estimator.

Here is an example of what I would like to do.

import numpy as np
import seaborn as sns

tips_all = sns.load_dataset("tips")
# give every row a random weight for the sake of the example
tips_all["weight"] = 10 * np.random.rand(len(tips_all))
sns.factorplot(x="size", y="total_bill",
               data=tips_all, kind="point")
# here I would like to have a mean estimator that computes a weighted mean
# the bootstrapped confidence intervals should also use this weighted mean estimator
# something like (tips_all["weight"] * tips_all["total_bill"]).sum() / tips_all["weight"].sum()
# but on bootstrapped samples (for the confidence interval)

The problem I have is that the estimator function only gets to see the "main variable" (y axis) instead of the full dataframe that would allow the estimator to access more than just "y".
See here:

boots = bootstrap(stat_data, func=estimator,
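
To make the request concrete, here is a rough sketch of the computation I have in mind, written with plain NumPy outside of seaborn. The helper names `weighted_mean` and `weighted_bootstrap_ci` are made up for illustration and are not part of seaborn. I am also assuming a simple percentile bootstrap, which may not match what seaborn does internally. The key point is that rows are resampled by index, so each value keeps its own weight.

```python
import numpy as np


def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    return np.average(values, weights=weights)


def weighted_bootstrap_ci(values, weights, n_boot=1000, ci=95, seed=None):
    """Percentile bootstrap interval for the weighted mean.

    Hypothetical helper, not a seaborn function.
    Returns (estimate, lower bound, upper bound).
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(values)

    # Resample row indices (with replacement) so that every resampled
    # value is paired with the weight from the same row.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_values = values[idx]
    boot_weights = weights[idx]

    # One weighted mean per bootstrap replicate.
    boot_means = (boot_values * boot_weights).sum(axis=1) / boot_weights.sum(axis=1)

    half = (100 - ci) / 2
    low, high = np.percentile(boot_means, [half, 100 - half])
    return weighted_mean(values, weights), low, high


# Synthetic stand-in for one category of the plot (e.g. one "size" group).
rng = np.random.default_rng(0)
bills = rng.normal(20, 5, size=200)
w = 10 * rng.random(200)
estimate, low, high = weighted_bootstrap_ci(bills, w, seed=1)
```

This would have to be run once per category (and per hue level), which is exactly the part the categorical plotting code handles and where a one-dimensional estimator has no access to the weights.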

Is there any simple way to do this?

If not, what is the easiest way to extend the categorical plotting to allow for weighted estimators?

Thanks a lot,
Tim

PS: I couldn't figure out how to set labels. My guess is "question" and "wishlist".
