This repository contains the source code used to generate the Prefix2Org dataset presented in the paper "Prefix2Org: Mapping BGP Prefixes to Organizations" by Gouda et al., published at the ACM Internet Measurement Conference (IMC) 2025.
The Prefix2Org dataset is available at https://github.com/ISS-GT/Prefix2Org.
To run the code in this repository, you will need to download the bulk WHOIS databases from the five Regional Internet Registries (RIRs). Each RIR makes its database available under its own Acceptable Use Policy.
- RIPE: https://ftp.ripe.net/ripe/dbase/
- AFRINIC: https://ftp.afrinic.net/dbase/
- ARIN: https://www.arin.net/reference/research/bulkwhois/
- APNIC: https://www.apnic.net/manage-ip/using-whois/bulk-access/
- LACNIC: https://www.lacnic.net/2472/2/lacnic/accessing-bulk-whois
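Once a bulk database has been downloaded, its records have to be read into memory before any prefix-to-organization mapping can happen. The sketch below is only an illustration, not the parser used in this repository. It assumes an RPSL-style text dump (as published by RIPE, for example) in which objects are separated by blank lines, each attribute is a `key: value` line, continuation lines start with whitespace or `+`, and comment lines start with `#` or `%`. Dump formats differ between RIRs, so other registries may need different handling. The function name `parse_rpsl_objects` and the sample records are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List


def parse_rpsl_objects(lines: Iterable[str]) -> Iterator[Dict[str, List[str]]]:
    """Yield one dict per WHOIS object, mapping attribute name to its values.

    Assumes an RPSL-style dump; this is an illustrative helper, not the
    repository's own parser.
    """
    current: Dict[str, List[str]] = {}
    last_key = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the object being built, if any.
            if current:
                yield current
            current, last_key = {}, None
            continue
        if line.startswith(("#", "%")):
            continue  # comment line in the dump
        if line[0] in " \t+" and last_key is not None:
            # Continuation line: append to the most recent attribute value.
            current[last_key][-1] += " " + line[1:].strip()
            continue
        # Split on the first colon only, so IPv6 values keep their colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not an attribute line; skip it
        last_key = key.strip().lower()
        current.setdefault(last_key, []).append(value.strip())
    if current:
        yield current


# Hypothetical sample records using documentation address space.
sample = """\
% hypothetical sample records, not real registry data
inetnum:        192.0.2.0 - 192.0.2.255
netname:        EXAMPLE-NET
descr:          Example network
                second line of description
org:            ORG-EX1-TEST

organisation:   ORG-EX1-TEST
org-name:       Example Org
"""

objects = list(parse_rpsl_objects(sample.splitlines()))
for obj in objects:
    print(obj)
```

Because the function accepts any iterable of lines, a real dump could be streamed from an open (possibly decompressed) file handle instead of being loaded into a string first.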
If you find the framework presented in the paper useful, please cite the paper:
@inproceedings{10.1145/3730567.3764485,
author = {Gouda, Deepak and Dainotti, Alberto and Testart, Cecilia},
title = {Prefix2Org: Mapping BGP Prefixes to Organizations},
year = {2025},
isbn = {9798400718601},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3730567.3764485},
doi = {10.1145/3730567.3764485},
abstract = {Accurately mapping Internet address space to organizations is critical to understanding the Internet’s organizational ecosystem. Traditional approaches, which rely on individual WHOIS queries often suffer from unclear ownership structure of IP addresses and inconsistent organization names, resulting in ambiguous inferences. Alternative methods that map BGP prefixes to Autonomous Systems Numbers (ASNs) and ASNs to organizations are also inaccurate since ASes often originate prefixes on behalf of their customers. This paper introduces Prefix2Org, a comprehensive prefix-to-organization mapping framework. We introduce a taxonomy for the holders of IP addresses and a methodology to map IP addresses to organizations, based on the operational rights over them. We develop string processing heuristics and leverage RPKI Certificates and routing data to address inconsistencies in organizational names and aggregate prefixes under unified management. Our public dataset covers 99.96% (99.99%) of IPv4 (IPv6) prefixes. We validate 9.3% of routed IPv4 addresses with a 99% recall, and 5.6% of IPv6 prefixes with a 99.34% recall. For the two large organizations where we obtained complete ground truth, Prefix2Org produced no false positives. Finally, in two case studies, (i) we characterize organizations that hold address space without an ASN and (ii) demonstrate how RPKI adoption measured through Prefix2Org differs from the previously used AS-centric view.},
booktitle = {Proceedings of the 2025 ACM Internet Measurement Conference},
pages = {397–414},
numpages = {18},
keywords = {Prefix-to-Organization Mapping, IP Ownership, BGP, RPKI},
location = {USA},
series = {IMC '25}
}
If you use this dataset in your research, please cite the dataset:
@software{gouda_2025_17237945,
author = {Gouda, Deepak and
Dainotti, Alberto and
Testart, Cecilia},
title = {Prefix2Org: Mapping BGP Prefixes to Organizations},
month = sep,
year = 2025,
publisher = {Zenodo},
version = {v1.0},
doi = {10.5281/zenodo.17237945},
url = {https://doi.org/10.5281/zenodo.17237945},
swhid = {swh:1:dir:7a6e744426503a5e88ba336db206e882916c04ef;origin=https://doi.org/10.5281/zenodo.17237945;visit=swh:1:snp:eaee977231fc014e300f74918ff6aac3e642e6dd;anchor=swh:1:rel:b5b6e4912543ed69d11403945e34c7dd9fecf090;path=ISS-GT-Prefix2Org-0dd56bc},
}