This repository was archived by the owner on May 4, 2019. It is now read-only.

@johnmyleswhite
Member

This branch restarts the work of adding NA-skipping support to our basic statistics functions. The code is quite repetitive and can be DRY'ed out in a future pass.

For now, what I'd like to do is agree on what functionality these functions should offer. So far, I've taken every function I'm replacing and added a skipna keyword that allows one to skip over NA values. For skewness and kurtosis, this keyword has to be passed to the function that computes the center when one is not pre-specified, so the center is now also a keyword, called m. (FWIW, I'm not a big fan of specifying centers that aren't the mean, so we might take that out. I'd argue it also doesn't belong in Base: neither R nor SciPy offers that functionality. I'm not sure why we do.)
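To make the proposed interface concrete, here is a minimal sketch of how the skipna and m keywords could interact. It is not the code in this branch: the `_sketch` names and the `prepare` helper are hypothetical, and `missing` stands in for NA so that the example runs on plain Julia without any packages.

```julia
# Hypothetical sketch; `missing` is used as a stand-in for NA.

# Drop missing values when skipna is true; otherwise return the data untouched.
prepare(xs, skipna::Bool) = skipna ? collect(skipmissing(xs)) : xs

function mean_sketch(xs; skipna::Bool = false)
    vals = prepare(xs, skipna)
    return sum(vals) / length(vals)
end

# When the center m is not given, it is computed with the same skipna setting,
# which is why skipna has to be forwarded to the function computing the center.
function skewness_sketch(xs; skipna::Bool = false,
                         m = mean_sketch(xs; skipna = skipna))
    vals = prepare(xs, skipna)
    n = length(vals)
    cm2 = sum((v - m)^2 for v in vals) / n   # second moment about m
    cm3 = sum((v - m)^3 for v in vals) / n   # third moment about m
    return cm3 / cm2^1.5
end
```

With skipna left at false, any NA in the input propagates to the result; with skipna = true, both the center and the moments are computed from the non-NA values only.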

Things we're not doing that R does:

  • For mean, R also offers the ability to trim out extreme data points via its trim argument.
  • For median, R also offers the ability to use the lo or hi median, which is simply the lower or higher of the two middle values in an array with an even number of elements, instead of their average.
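For reference, the two R features listed above could look roughly like this in Julia. This is only an illustration of what we are leaving out: `trimmed_mean`, `median_sketch` and the `kind` keyword are hypothetical names, and the trimming rule (drop `floor(n * trim)` values from each end, with `0 <= trim < 0.5`) is an assumption about the intended behavior.

```julia
# Hypothetical helpers, not part of this branch.

# Mean after dropping floor(n * trim) values from each end of the sorted data.
# Assumes 0 <= trim < 0.5 so that at least one value is kept.
function trimmed_mean(xs; trim::Real = 0.0)
    sorted = sort(xs)
    n = length(sorted)
    k = floor(Int, n * trim)
    kept = sorted[(k + 1):(n - k)]
    return sum(kept) / length(kept)
end

# Median with a choice between the lower middle value (:lo), the upper
# middle value (:hi), or their average (:avg, the default).
function median_sketch(xs; kind::Symbol = :avg)
    sorted = sort(xs)
    n = length(sorted)
    lo = sorted[(n + 1) ÷ 2]   # lower middle value
    hi = sorted[n ÷ 2 + 1]     # upper middle value
    kind == :lo && return lo
    kind == :hi && return hi
    return (lo + hi) / 2
end
```

For an odd number of elements the two middle indices coincide, so all three kinds return the same value.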

Unlike R, Julia Base expects that we will offer:

  • Slicing, to compute statistics column-wise, row-wise and along other dimensions.
  • Use of non-standard centers.
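As a rough illustration of these two expectations combined with NA skipping, the sketch below computes a mean along a chosen dimension and a variance about an arbitrary center. The names `mean_skipna`, `mean_along` and `var_about` are hypothetical, `missing` again stands in for NA, and dividing by n rather than n - 1 is a simplifying assumption.

```julia
# Hypothetical sketch; `missing` is used as a stand-in for NA.

# Mean of the non-missing entries of one slice.
function mean_skipna(v)
    vals = collect(skipmissing(v))
    return sum(vals) / length(vals)
end

# Apply mean_skipna along dimension `dim` (1 reduces each column,
# 2 reduces each row).
mean_along(A, dim::Integer) = mapslices(mean_skipna, A; dims = dim)

# Mean squared deviation of the non-missing entries about an arbitrary
# center m (divides by n, not n - 1).
function var_about(xs, m)
    vals = collect(skipmissing(xs))
    return sum((v - m)^2 for v in vals) / length(vals)
end
```

Here `mean_along(A, 1)` returns a 1×ncols matrix and `mean_along(A, 2)` an nrows×1 matrix, following the shape convention of `mapslices`.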
