Reports, Papers, and Tools


  1. Biannual Reports: This section comprises the public edition of the project's semiannual reports, showcasing the progress and outcomes achieved.
  2. Geoping Project: This report documents the tools and technologies used for router geolocation.
  3. Active Measurement: Reports detailing the active measurement needs and desired infrastructure capabilities identified by the community, along with potential technical solutions we could explore.
  4. Data Storage Requirements: This report provides an overview of the latest trends and technologies in big data storage and management.
  5. Data and Metadata APIs: Year-2 report showcasing the development and enhancements of the project's APIs.
  6. Anonymization Tools: Summary of state-of-the-art anonymization tools.
  7. Software for Disclosure Controls: This report documents gaps between privacy preservation techniques and the needs of network and security research.

Working Papers

  1. Data Needs for Securing Internet Infrastructure: This document compiles the datasets we have identified as having an impact (or potential impact) on improving the security posture of the foundational layers of Internet infrastructure. We encourage public feedback.
  2. Exploring the Limits of Differential Privacy: We explore the use of differential privacy (DP) to protect corporate proprietary information when computing aggregate industry-wide query results.
  3. Annotated Schema: Mapping Ontologies onto Dataset Schemas: The goal of this Annotated Schema (AS) is to provide a limited ontology of annotations for dataset metadata that inform a prospective user of the classes, properties, and identifiers contained in the data.
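
As a rough illustration of the differential privacy idea mentioned above, the following minimal sketch applies the Laplace mechanism to a sum query. It is not taken from the working paper: the function names, the clipping bounds, and the add-or-remove-one-record neighboring model are assumptions chosen for illustration only.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_sum(values, lower, upper, epsilon, rng=None):
    """Return a noisy sum of `values` using the Laplace mechanism (illustrative).

    Assumption: neighboring datasets differ by adding or removing one record,
    so after clipping to [lower, upper] the sensitivity of the sum is
    max(|lower|, |upper|).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    rng = rng or random.Random()
    # Clip each contribution so that no single record can move the sum
    # by more than the sensitivity.
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = max(abs(lower), abs(upper))
    if sensitivity == 0:
        return float(sum(clipped))  # every clipped value is 0; nothing to hide
    return sum(clipped) + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical per-company figures; smaller epsilon means more noise.
    revenues = [12.0, 7.5, 30.0, 4.2]
    print(private_sum(revenues, lower=0.0, upper=25.0, epsilon=0.5))
```

A smaller `epsilon` gives stronger protection at the cost of a noisier aggregate, and the clipping bounds trade bias against noise.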

Tools

  1. AS Rank is CAIDA's ranking of Autonomous Systems (AS) (which approximately map to Internet Service Providers) and organizations (Orgs) (which are a collection of one or more ASes). This ranking is derived from topological data collected by CAIDA's Archipelago Measurement Infrastructure and Border Gateway Protocol (BGP) routing data collected by the Route Views Project and RIPE NCC.
  2. Hoiho: Hostname-based Geolocation of IP addresses is an open-source tool released as a part of scamper. It uses CAIDA's Macroscopic Internet Topology Data Kit (ITDK) and observed round trip times to infer regular expressions that extract apparent geolocation hints from hostnames. The ITDK contains a large dataset of routers with annotated hostnames, which are used as input.
  3. Periscope: Unified Interface to Looking Glass Internet Measurement Servers provides a uniform interface to hundreds of Looking Glass and Scamper servers with access to thousands of network probing vantage points (monitors) that can perform traceroute and BGP queries.
  4. Spoofer is a suite of open-source software tools to assess and report on the deployment of source address validation (SAV), an anti-spoofing best practice. This client-server system periodically tests a network's ability to both send and receive packets with forged source IP addresses (spoofed packets). The CAIDA Spoofer Data API provides a public interface to the publicly shareable data collected by the Spoofer service.
  5. DNS Zone Database (DZDB) is a platform providing access to time-series data derived from current and historical zone files provided by generic Top-Level Domains (gTLDs) participating in the Centralized Zone Data Service (CZDS), or directly by registry operators, in compliance with the corresponding license agreements.
  6. BGP2Go: To facilitate security research and analysis, we study the feasibility of indexing numeric identifiers over time. We index BGP prefixes, ASNs, communities, and IP addresses to the data sets in which they occur. Currently, we process all BGP update files from RouteViews' route collectors. We prototyped BGP2Go, a web application that assists in selecting and obtaining relevant MRT data sets for further analysis. We hope to extend it to other types of data, e.g., RIR allocation files and DNS data (OpenINTEL). Indexing more data will make it easier to correlate the activity of an identifier across data sets.
  7. The Facilitating Advances in Network Topology Analysis (FANTAIL) system was developed to help researchers realize the full value of massive raw Internet end-to-end path measurement data sets. It lets researchers use high-level queries to run data processing and analysis tasks on matching traces without owning or operating a cluster and without learning big data programming.
  8. BGPStream is an open-source software framework for live and historical BGP data analysis, supporting scientific research, operational monitoring, and post-event analysis. It provides access to real-time and historical RouteViews and RIPE RIS BGP data.
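
To make the hostname-based geolocation idea behind Hoiho more concrete, here is a minimal sketch that applies one hand-written regular expression to router hostnames and looks up the extracted hint. It is not Hoiho's implementation: Hoiho infers such expressions automatically and uses round trip times as part of that inference, whereas the naming convention, the `example.net` hostnames, and the small code table below are all hypothetical.

```python
import re

# Hypothetical lookup table from three-letter location codes to places.
LOCATION_CODES = {
    "lax": "Los Angeles, US",
    "fra": "Frankfurt, DE",
    "nrt": "Tokyo, JP",
}

# Hypothetical naming convention for one operator:
#   <interface>.<role><n>.<code><n>.example.net
# The capture group isolates the three-letter location code.
HINT_RE = re.compile(r"^[^.]+\.[a-z]+\d+\.([a-z]{3})\d+\.example\.net$")


def geohint(hostname):
    """Return the location implied by a hostname, or None if no hint is found."""
    match = HINT_RE.match(hostname.lower())
    if match is None:
        return None  # hostname does not follow the assumed convention
    # Unknown codes also yield None rather than a guess.
    return LOCATION_CODES.get(match.group(1))


if __name__ == "__main__":
    for name in ("xe-0-1-0.cr1.lax2.example.net", "www.example.net"):
        print(name, "->", geohint(name))
```

In practice an extracted hint would still need to be checked for plausibility, for example against measured round trip times, before it is trusted.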