Top500: Top 500 Datasets

All Top500 lists are available up to the release version of the package. For example, package 14.06 contains all Top500 datasets up to June of 2014. Top500 data is released twice a year, every June and November, at the ISC and SC conferences, respectively.
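As a rough illustration of the versioning scheme, the sketch below lists the release dates covered by a given package version. The helper `releases_up_to` is hypothetical (it is not part of RTop500), it assumes versions are written as `YY.MM` with years in the 2000s, and it assumes coverage starts with the first Top500 list in June 1993; adjust `first_year` if the package ships a different range.

```r
# Hypothetical helper, not part of RTop500: enumerate the June/November
# releases covered by a package version written as "YY.MM".
releases_up_to <- function(version, first_year = 1993L) {
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  last_year <- 2000L + parts[1]   # assumes a 21st-century version number
  last_month <- parts[2]

  # One row per (month, year) pair; month varies fastest, so rows are
  # already in chronological order.
  grid <- expand.grid(month = c(6L, 11L), year = first_year:last_year)

  # Drop any release in the final year that comes after the version's month.
  keep <- grid$year < last_year | grid$month <= last_month
  sprintf("%04d-%02d", grid$year[keep], grid$month[keep])
}

tail(releases_up_to("14.06"), 3)
# "2013-06" "2013-11" "2014-06"
```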

All Top500 datasets are stored as data frames, essentially the output of XML::xmlToDataFrame. The raw datasets are available at the official Top500 website and are mirrored in the GitHub repository of the RTop500 project (along with the parser used by the package).

All datasets are named TOP500_<year><month>, where <year> is the 4-digit year and <month> is the 2-digit month in which the list was released (either 06 or 11).
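The naming convention can be sketched as a small helper that builds a dataset name from a release year and month. `top500_name` is a hypothetical function written for illustration; it is not provided by the package.

```r
# Hypothetical helper, not part of RTop500: build a dataset name that
# follows the TOP500_<year><month> convention.
top500_name <- function(year, month) {
  # Lists are only released in June (06) and November (11).
  stopifnot(month %in% c(6, 11))
  sprintf("TOP500_%04d%02d", as.integer(year), as.integer(month))
}

top500_name(2014, 6)
# "TOP500_201406"
```

With the package attached, something like `get(top500_name(2014, 6))` would presumably return the corresponding data frame, assuming the datasets are exported under these names.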

rank	Rank of the system based on performance of the Linpack benchmark.
system.id	A unique (for Top500 lists) system identifier.
system.name	Name of the machine.
manufacturer	System vendor.
computer	Short description of the nodes.
system.address	Machine website.
r.max	Max Linpack benchmark performance in GFLOPs.
power	Power in megawatts with machine at peak.
r.peak	Theoretical peak performance in GFLOPs.
n.max	Problem size for Linpack benchmark to achieve given r.max value.
n.half	Problem size for Linpack benchmark to achieve half of the given r.max value.
installation.site	Location of the machine.
town	Town machine is located in.
state	State machine is located in (if applicable).
country	Country machine is located in.
year	Year machine was created.
area.of.installation	Sector in which the machine is installed (for example, research, academic, or industry).
number.of.processors	Number of processors/cores.
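To show how the fields above might be used together, here is a minimal sketch that computes Linpack efficiency (r.max divided by r.peak) for each system. The data frame `toy` is a synthetic stand-in with made-up values, not real Top500 data; it only borrows the column names documented above. Whether the package stores these columns as numeric or character is not stated here, so the sketch converts them explicitly as a precaution.

```r
# Synthetic stand-in for a Top500 data frame. The values are invented
# purely for illustration and do not come from any real list.
toy <- data.frame(
  rank = 1:3,
  system.name = c("A", "B", "C"),
  r.max = c("900", "600", "300"),     # stored as character on purpose
  r.peak = c("1200", "1000", "400"),
  stringsAsFactors = FALSE
)

# Convert to numeric before doing arithmetic (harmless if already numeric).
toy$r.max <- as.numeric(toy$r.max)
toy$r.peak <- as.numeric(toy$r.peak)

# Fraction of theoretical peak actually achieved on Linpack.
toy$efficiency <- toy$r.max / toy$r.peak

toy[order(toy$rank), c("rank", "system.name", "efficiency")]
```

The same steps should carry over to a real dataset by replacing `toy` with one of the TOP500_<year><month> data frames; running `str()` on it first shows which columns actually need conversion.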