Description
We may be able to get some insights into distributions within IP addresses, and their influence on aggregate distributions.
- Compute CDFs with 5 or 10 logarithmic bins per decade. Compute at a modest aggregation level, e.g. state, county, or possibly city. Use a fairly large time interval, such as 3 months or a year.
a. Include all tests from all clients.
b. Use the 1st, 5th, 10th, 25th, 50th, 75th, 90th, 95th, and 99th percentiles per IP.
c. Repeat excluding the hottest IPs, those with more than 2 tests/day.
d. Repeat with only IPs that have very few tests, fewer than 3 per week.
e. Repeat with only IPs that have frequent tests, more than 3 per week.
f. Repeat with only hot IPs, those with more than 2 tests/day.
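The steps above could be prototyped roughly as follows. This is only a sketch under stated assumptions: the input is a list of `(ip, value)` pairs with positive values, the function names are hypothetical, and the "warm" class (at least 3 tests/week but not hot) is one possible reading of the thresholds, since a rate of exactly 3 per week is not assigned to either group above.

```python
import numpy as np

PERCENTILES = [1, 5, 10, 25, 50, 75, 90, 95, 99]

def per_ip_percentiles(tests):
    """Map each IP to its 9 percentiles. `tests` is an iterable of (ip, value)."""
    by_ip = {}
    for ip, value in tests:
        by_ip.setdefault(ip, []).append(value)
    return {ip: np.percentile(vals, PERCENTILES) for ip, vals in by_ip.items()}

def classify_ip(n_tests, days):
    """Label an IP by test rate over an interval of `days` days.

    Assumption: 'warm' means at least 3 tests/week but not more than 2/day.
    """
    per_day = n_tests / days
    if per_day > 2:
        return "hot"
    if per_day * 7 < 3:
        return "cold"
    return "warm"

def log_binned_cdf(values, bins_per_decade=5):
    """Empirical CDF on log-spaced bins; values must be strictly positive."""
    values = np.asarray(values, dtype=float)
    lo = np.floor(np.log10(values.min()))
    hi = max(np.ceil(np.log10(values.max())), lo + 1)  # span at least one decade
    n_bins = int(round((hi - lo) * bins_per_decade))
    edges = np.logspace(lo, hi, n_bins + 1)
    counts, _ = np.histogram(values, bins=edges)
    return edges, np.cumsum(counts) / counts.sum()
```

One CDF per percentile could then be built by passing, for example, the 50th-percentile value of every IP in a county to `log_binned_cdf`, after filtering IPs with `classify_ip`.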
Repeat using WScale or CWnd to distinguish clients within an IP address?
Repeat for individual ASNs or ISPs. Some will have higher rates of CG-NAT than others.
Compare US vs EU vs later internet adopters.
With cold IPs, there should be little spread between percentiles, because most IPs will have only 1 or 2 tests.
With warm and hot IPs, the spread will be greater if there are multiple clients per IP, and smaller if there is very little CG-NAT influence.
Possibly repeat, but use ratios, e.g. the ratio of the 5th percentile to the median.
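A minimal sketch of the ratio idea, assuming positive per-test values for a single IP. The helper name `spread_ratio` is hypothetical. An IP with a single test gives a ratio of exactly 1, which matches the expectation of little spread for cold IPs.

```python
import numpy as np

def spread_ratio(values, low=5):
    """Ratio of a low percentile (default 5th) to the median for one IP.

    Assumes all values are positive, so the median is nonzero.
    """
    values = np.asarray(values, dtype=float)
    return np.percentile(values, low) / np.median(values)
```

Ratios well below 1 on warm or hot IPs would be consistent with several distinct clients sharing the address, though other causes of spread are possible.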