Pierre Auger Observatory Open Data

February 2021 release

The Pierre Auger 2021 Open Data is the public release of 10% of the Pierre Auger Observatory data presented at the 36th International Cosmic Ray Conference held in 2019 in Madison, USA, following the Auger collaboration open data policy.

This website hosts the datasets for download. An online event display is available to explore the released events, and example analysis codes are provided. See below for a brief overview of the Pierre Auger Observatory and of the Auger Open Data.

About the Pierre Auger Observatory

The Pierre Auger Observatory, located on a vast, high-altitude plain in Argentina, in the Province of Mendoza, is the world's largest cosmic ray observatory and is used to study the extensive air-showers produced by cosmic rays above ~10¹⁷ eV. The intensity of high energy cosmic rays (those above about 10¹⁴ eV) is only a few particles per square meter per day and thus too low to allow for direct measurement with satisfactory statistical precision. Above 10¹⁹ eV the rate is only about 1 per km² per year. The phenomenon of extensive air-showers must therefore be exploited to study cosmic rays at very high energies. An air-shower is a cascade of particles created by the interaction of a single cosmic ray with the Earth's atmosphere. Air-showers can be observed by telescopes that pick up the fluorescence radiation emitted by nitrogen molecules excited as the shower crosses the atmosphere, while the particles reaching the ground can be sampled by large arrays of detectors. The properties of these extensive air-showers are measured to determine the energy and arrival direction of each cosmic ray and to provide a statistical determination of the distribution of primary masses.


Schematic map of the Pierre Auger Observatory

The Observatory features an array of 1600 water-Cherenkov particle detector stations (SD, black dots in the map) spread over 3000 km² on a 1500 m triangular grid, overlooked by 24 air-fluorescence telescopes (FD, blue dots in the map). The telescopes are located at four sites: Los Leones, Los Morados, Loma Amarilla and Coihueco. In addition, three high-elevation fluorescence telescopes (HEAT) overlook a more compact 23.5 km² array of 61 detectors with 750 m spacing and an array of radio antennae (AERA). The Observatory is at a mean altitude of about 1400 m, corresponding to an atmospheric overburden of about 875 g cm⁻². The site is located between latitudes 35.0°S and 35.3°S and between longitudes 69.0°W and 69.4°W. Data-taking started on 1 January 2004 with 154 water-Cherenkov detectors and one fluorescence detector in operation. Installation was completed in June 2008 and data-taking has been ongoing since then. The hole visible in the array map, south-east of Loma Amarilla, is due to difficulties with a local landowner.
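The array figures quoted above can be cross-checked with a little arithmetic. The sketch below is only an illustration and is not part of the Auger software. It assumes an ideal triangular grid in which each station accounts for a cell of area (√3/2)·d², ignores edge effects and gaps in the array, and takes the rate of about 1 cosmic ray per km² per year above 10¹⁹ eV at face value, without any correction for detector efficiency or angular acceptance. All function names are hypothetical.

```python
import math


def cell_area_km2(spacing_m: float) -> float:
    """Area per station on an ideal triangular grid, in km^2.

    Assumption: each station owns a cell of area (sqrt(3)/2) * d^2.
    """
    spacing_km = spacing_m / 1000.0
    return math.sqrt(3) / 2 * spacing_km ** 2


def approx_station_count(area_km2: float, spacing_m: float) -> float:
    """Rough number of stations needed to tile an area (no edge effects)."""
    return area_km2 / cell_area_km2(spacing_m)


def expected_events_per_year(area_km2: float,
                             rate_per_km2_year: float = 1.0) -> float:
    """Crude yearly count of cosmic rays landing on the array.

    Assumption: a flat rate per km^2 per year and full efficiency.
    """
    return area_km2 * rate_per_km2_year


if __name__ == "__main__":
    print("cell area, 1500 m grid [km^2]:", round(cell_area_km2(1500), 3))
    print("stations for 3000 km^2:", round(approx_station_count(3000, 1500)))
    print("events/yr above 1e19 eV:", expected_events_per_year(3000))
```

The station count obtained this way comes out somewhat below the 1600 quoted, which is expected for an estimate that neglects the array boundary.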

Each water-Cherenkov station is filled to a depth of 1.2 m with highly-purified water enclosed within a diffusively-reflective liner. The water is viewed from above by three 9-inch photomultiplier tubes (PMTs) in contact with it. These detect Cherenkov light emitted by charged particles that enter the detectors. Each PMT provides two signals which are tagged with the GPS time stamps to an absolute time accuracy of 12 ns and are digitized using 40 MHz, 10-bit Flash Analog-to-Digital Converters (FADCs). A low-gain signal is taken directly from the anode of the PMT, while a high-gain signal is provided by the last dynode and amplified to be nominally 32 times larger than the low-gain signal, enhancing the total dynamic range to span more than three orders of magnitude in integrated signal.
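The numbers in this paragraph fix the time binning and the per-sample range of a station trace: 40 MHz sampling gives 25 ns bins, and a 10-bit FADC saturates at 1023 counts. The sketch below shows one simple way the two gain channels could be combined. It is an assumption for illustration, not the Observatory's calibration procedure: a high-gain sample is treated as saturated when it reaches the top of the 10-bit range, and the low-gain sample scaled by the nominal ratio of 32 is used in its place. Baseline subtraction and calibration are omitted, and the function name and sample traces are made up.

```python
ADC_BITS = 10
ADC_MAX = 2 ** ADC_BITS - 1       # highest count of a 10-bit FADC
SAMPLING_MHZ = 40
BIN_NS = 1000 / SAMPLING_MHZ      # width of one time bin in nanoseconds
GAIN_RATIO = 32                   # nominal high-gain / low-gain ratio


def merge_gains(high_gain, low_gain, saturation=ADC_MAX):
    """Combine two traces sample by sample, in high-gain count units.

    Assumption: use the high-gain sample unless it is saturated,
    otherwise fall back to the low-gain sample times GAIN_RATIO.
    """
    if len(high_gain) != len(low_gain):
        raise ValueError("traces must have the same length")
    merged = []
    for hg, lg in zip(high_gain, low_gain):
        if hg >= saturation:
            merged.append(lg * GAIN_RATIO)
        else:
            merged.append(hg)
    return merged


if __name__ == "__main__":
    # Made-up traces: the high-gain channel saturates in bins 2 and 3.
    high = [3, 40, 1023, 1023, 320, 5]
    low = [0, 1, 45, 60, 10, 0]
    print("bin width [ns]:", BIN_NS)
    print("merged trace:", merge_gains(high, low))
```

With this scheme the largest representable sample is 1023 × 32 high-gain counts, which is how the second channel extends the range of a single 10-bit converter.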

Information about the time and the amplitudes of station signals above a trigger level is sent, via a purpose-built communications network, at a rate of about 20 Hz to a computer at the Central Campus. If spatial and temporal coincidences are identified, data from the triggered stations are recorded and an event is reconstructed from the temporal and signal information.
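The idea of a spatial and temporal coincidence can be illustrated with a toy check. The sketch below is not the Observatory's trigger algorithm: the minimum of three stations, the 20 µs time window, the 3000 m distance limit, the tuple layout and the function name are all assumptions chosen only to make the example concrete.

```python
import math
from itertools import combinations


def is_coincidence(triggers, max_dt_us=20.0, max_dist_m=3000.0,
                   min_stations=3):
    """Return True if some group of stations is close in space and time.

    triggers: list of (station_id, x_m, y_m, t_us) tuples.
    Every pair in the group must satisfy both limits (assumed criterion).
    """
    for group in combinations(triggers, min_stations):
        if all(
            abs(a[3] - b[3]) <= max_dt_us
            and math.hypot(a[1] - b[1], a[2] - b[2]) <= max_dist_m
            for a, b in combinations(group, 2)
        ):
            return True
    return False


if __name__ == "__main__":
    # Made-up triggers: three neighbours on a 1500 m grid plus a
    # distant, late station that should not matter.
    triggers = [
        (101, 0.0, 0.0, 0.0),
        (102, 1500.0, 0.0, 3.0),
        (103, 750.0, 1299.0, 5.0),
        (250, 30000.0, 12000.0, 400.0),
    ]
    print(is_coincidence(triggers))
```

Testing every combination is fine for a handful of stations but grows quickly with the number of triggers; a real system would need a more economical search.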

The data from the fluorescence emission are collected by a set of six telescopes at each of the FD sites, together covering 30° in elevation from the ground up and 6 × 30° = 180° in azimuth over the array. Each telescope has a camera with 440 photomultipliers (pixels), recording the ultraviolet light received in each 100 ns time interval. At each site, an event is recorded whenever there are several pixels with signals above the night-sky background light, compatible with the image of a line. The GPS time is used to connect the fluorescence event to those seen simultaneously in other FDs and with SD stations that have signals.

Lasers are located at the positions CLF and XLF towards the center of the array. They fire beams into the atmosphere that can be seen from the fluorescence detectors, and can thus be used to check the accuracy of the directional reconstruction made with the fluorescence data. These lasers are part of a battery of instruments used to monitor the state of the atmosphere at the time an event is recorded, including dedicated lidars and cloud cameras at each of the FD sites.

The Auger Observatory is operated by a Collaboration of more than 400 scientists, engineers, technicians and students from more than 90 institutions in 18 countries. You can find further information about the Observatory and the Collaboration in Nucl.Instrum.Meth.A 798 (2015) 172-213 (arXiv) and on the official Auger website https://www.auger.org/.

About the Auger Open Data

Data and analysis tools

The following are provided through this portal:

Downloadable datasets

  • Pseudo-raw data: For each event, a list of SD stations, with their relevant PMT traces, is available. If an event is detected simultaneously with the SD and FD it is called a hybrid event and a list of FD telescopes with a camera view is also provided. The main parameters from the SD and FD reconstruction are also given. The ‘ready-to-use’ Event Display is a good way to become familiar with the Open Data.
  • Reconstructed data: For each event, only ‘high-level’ information is provided. Different parameters are extracted from the pseudo-raw dataset to be used in physics analysis. Examples of how to use them can be found in the Analysis page.
  • Auxiliary data: These are extra data that are necessary for a full physics analysis but are not extracted directly from the raw data. They include the position of the SD stations, the position of the FD pixels, the SD exposure and the FD acceptance.

Pseudo-raw and reconstructed data are provided in JSON format. Reconstructed data are also available in CSV format, representing a “summary” of the JSON files and containing the information that is needed for analysis. Similarly, auxiliary data are in CSV format.
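Both formats can be read with the Python standard library alone. The sketch below is a minimal illustration that parses small in-memory samples rather than the released files. The field names used here ("id", "sd_energy", "sd_theta", "stations", "signal") are placeholders invented for the example; the actual names and units should be taken from the documentation shipped with the datasets.

```python
import csv
import io
import json

# Made-up samples standing in for one JSON event and a CSV summary.
SAMPLE_JSON = (
    '{"id": 1, "sd_energy": 12.5, '
    '"stations": [{"id": 101, "signal": 35.2}, {"id": 102, "signal": 8.1}]}'
)
SAMPLE_CSV = "id,sd_energy,sd_theta\n1,12.5,34.0\n2,4.2,51.3\n"


def load_event(text):
    """Parse one event from a JSON string into nested dicts and lists."""
    return json.loads(text)


def load_summary(text):
    """Parse a CSV summary into a list of dicts with numeric values."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "id": int(row["id"]),
            "sd_energy": float(row["sd_energy"]),
            "sd_theta": float(row["sd_theta"]),
        })
    return rows


if __name__ == "__main__":
    event = load_event(SAMPLE_JSON)
    print("stations in event:", [s["id"] for s in event["stations"]])
    for row in load_summary(SAMPLE_CSV):
        print(row)
```

To read a downloaded file instead, the same functions can be given the text returned by `open(path, encoding="utf-8").read()`, with `path` pointing at the file on disk.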

Tools

  • Ready-to-use event display
  • Simple software, reading the JSON and CSV files and producing examples of basic histograms of different data parameters
  • Analysis examples, reading the reconstructed data and producing derived data and graphs
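
A basic histogram of a reconstructed parameter needs very little code. The sketch below is an independent illustration and is not the software distributed through the portal. It bins log10(E/eV) for a handful of made-up energies, assumed here to be given in EeV, and prints a text bar per bin. The binning range and the function name are arbitrary choices.

```python
import math


def histogram(values, lo, hi, n_bins):
    """Count values falling in n_bins equal-width bins over [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            # min() guards against rounding pushing a value past the end.
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


if __name__ == "__main__":
    # Made-up energies in EeV (1 EeV = 1e18 eV); not real Auger data.
    energies_eev = [3.1, 4.5, 5.2, 8.0, 12.0, 25.0, 3.3, 6.1, 40.0, 3.9]
    log_e = [math.log10(e * 1e18) for e in energies_eev]

    lo, hi, n_bins = 18.4, 19.8, 7
    width = (hi - lo) / n_bins
    for i, count in enumerate(histogram(log_e, lo, hi, n_bins)):
        left = lo + i * width
        print(f"{left:5.1f} - {left + width:5.1f} | {'#' * count}")
```

Values outside the chosen range are silently dropped, so the range should be checked against the data before the counts are interpreted.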

Other Auger Open Data

  • All Auger publications are available as Open Access. Some of them also include Open Data in the form of additional tables, plots, graphs.
  • Simplified data from the SD that have been widely used for outreach purposes since 2007 can be found at https://labdpr.cab.cnea.gov.ar/ED/. Their format is currently slightly different from that used in this portal. The Auger Open Data plan includes moving the Outreach Data into this portal.

Disclaimer

  • The Open Data are released under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).
  • All datasets have a unique DOI that you are requested to cite in any applications or publications.
  • The current release should be cited as: Pierre Auger Collaboration (2021), Auger Open Data release 1-2021, DOI:10.5281/zenodo.4487613
  • The Auger Collaboration does not endorse any work, scientific or otherwise, produced using these data, even if available on, or linked from, this portal.
  • The spreadsheet-based datasets allow the user to undertake basic analyses. More complex analyses, however, require some knowledge of the underlying physics and of the instruments.
  • The analysis methods, including the reconstruction of the data, have evolved over time, and will continue to evolve. The reconstructed Open Data are processed with the most up-to-date software. Updates are thus foreseen, for either the reconstructed data or the software needed to analyse them. These will be detailed in later releases.
  • If you are interested in joining or working with the Auger Collaboration, please contact auger-join-request@auger.unam.mx.

Policy

The policy of the Auger Collaboration on Data Release and Open Access can be found here.