Data Management Plan

The NSTX-U Data Management Plan (DMP) describes the categories of data, from measured to analyzed, and the resources available for data management and preservation during the course of research operations. In addition, this page describes the resources available for sharing data and provides a link to the user requirements for data access. Finally, web links to the NSTX-U and PPPL computing and analysis resources are provided. Questions about the NSTX-U data management plan should be directed to the NSTX-U Head of Physics Analysis: Stan Kaye (kaye@pppl.gov).

I. Data Categories

Data from NSTX-U discharges will be obtained from a suite of diagnostics measuring a broad range of plasma characteristics. The three main categories of NSTX-U data are raw, reduced, and analyzed.


A. Raw

Raw (measured) data may take the form of voltages, emissivities, etc., and are not directly usable as input to higher-level analysis routines. The raw data will be:

    1. 0D - temporally and spatially constant information during the course of a plasma discharge, such as fixed operational settings, device/facility conditions, etc.
    2. 1D - temporally varying measurements (magnetic fluxes, neutron rates, etc.), or spatially varying data taken only at one time
    3. 2D - measurements that vary both in time and space (kinetic profiles, etc.)
    4. 3D - temporally varying 2D images (visible camera, gas puff imaging, etc.)
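
The four raw-data categories above amount to counting the independent dimensions (time plus space) over which a signal varies. The following is a minimal sketch of that counting rule, not part of the NSTX-U software; the function name and arguments are hypothetical and chosen only for illustration.

```python
def classify_raw_signal(varies_in_time: bool, spatial_dims: int) -> str:
    """Return the plan's category label ("0D" to "3D") for a raw signal.

    varies_in_time: True if the signal changes during the discharge.
    spatial_dims:   number of spatial dimensions the signal varies over
                    (0 for a single point, 1 for a profile, 2 for an image).
    Both arguments are illustrative assumptions, not NSTX-U conventions.
    """
    if spatial_dims not in (0, 1, 2):
        raise ValueError("spatial_dims must be 0, 1, or 2")
    # One dimension for time (if the signal varies in time) plus the
    # number of spatial dimensions.
    total = spatial_dims + (1 if varies_in_time else 0)
    return f"{total}D"


# Examples drawn from the list above.
print(classify_raw_signal(False, 0))  # fixed operational setting
print(classify_raw_signal(True, 0))   # neutron rate versus time
print(classify_raw_signal(True, 1))   # kinetic profile versus time
print(classify_raw_signal(True, 2))   # visible camera video
```

Under this rule a single still image (two spatial dimensions, one time) would be labeled 2D; the plan does not list that case explicitly, so that labeling is an assumption of the sketch.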

B. Reduced

Raw data will be converted to reduced data through diagnostic-specific analysis software. Reduced data will be in real physics units (e.g., temperatures, densities, etc.) and, once validated by the responsible diagnostician, can be used as input to high-level analysis codes. A listing of NSTX-U diagnostics, units for the measurements, and the person responsible for each diagnostic is provided here.

C. Analyzed

Analyzed data are validated reduced data that have been synthesized through direct analysis or through higher-level analysis codes. Analyzed data, along with some validated reduced data, form the basis for figures and physics conclusions presented in publications.

II. Data Management Resources, Storage, and Archival

A. Resources
On-site data management resources include real-time and post-experiment data reduction, standardized data acquisition architecture and storage (MDSPlus), on-call help for software and hardware issues (software and hardware engineers), coordinated hardware maintenance, upgrades and compatibility affecting computers owned by PPPL as well as those owned by collaborators, shared CPU resources with some CPUs dedicated to specific data acquisition and reduction tasks, and web-based visualization tools. Code for generating plots via web-based plotting tools is maintained on the local PPPL cluster. Outside resources include Google mail, sites, and docs (NSTX-U web pages are managed by Google) and mdsplus.org for downloading and documentation of MDSPlus tools.

B. Storage

On-site data are stored in MDSPlus, the standard architecture for data management within the magnetic fusion community. All data are stored within this architecture, with certain exceptions (such as fast camera videos, which are stored in their own repository, CAMDATA). Data storage is centrally managed and is contained in a dedicated project space. No standard format is required for the video data, but the format in use has evolved into a de facto standard. Data contributed to international databases are stored on off-site servers but are accessible through the Web.

C. Archival
Data are archived using the EPICS archiver (engineering operations data repository), and VERITAS is used for data backup for end users, with a self-help archiving system for long-term storage. Procedure ITD-003 (Nov. 2010) governs the PPPL backup policy (available on request) and includes both on- and off-site storage. Data formats include those specified by MDSPlus, NETCDF, SQL Server databases, and Plasma State files. Assistance with storage and archival is available from the Helpdesk and from the MDSPlus and Unix backup system administrators.

III. Data Access and Sharing

A. Resources
Data sharing is facilitated through Web-based visualization tools accessible to the public; common MDSPlus architecture and tools, including shared analysis code; the NTCC module library; FTP services; a common login cluster (the ability to access the main computer cluster from on- or off-site); trusted data movement mechanisms among PPPL, ORNL, GA, NERSC, MIT and ITER; common output file formats (e.g., Plasma State files from TRANSP runs, NETCDF files); a 10 Gigabyte ESNET connection to all National Labs; and GLOBUS on-line for transferring data over the internet. Data provenance is limited to maintaining histories of data calibrations, etc., through MDSPlus and keeping track of data smoothing, averaging, etc., in UFILES (for TRANSP runs).

B. Access and Sharing
All research data displayed in publications will be made digitally accessible to the public at the time of publication. This will include data displayed in charts, figures, images, etc., each of which will be identified uniquely by an Archival Resource Key (ARK). The ARKs and/or URLs for accessing the data files will be given in the publication. The data files will be stored in the Princeton University Data Repository. The underlying digital research data used to generate the displayed data will be made available through the establishment of a collaboration, whose requirements are given below (Sec. III.C).
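
For readers who need to pull an ARK out of a publication's data-availability statement, the following is a minimal sketch of one way to do so. It assumes the common `ark:/NAAN/Name` layout with a numeric Name Assigning Authority Number (NAAN); the helper names and the identifier in the example are hypothetical and do not refer to any actual NSTX-U data set.

```python
import re
from typing import NamedTuple, Optional


class Ark(NamedTuple):
    """The two main parts of an ARK (illustrative container)."""
    naan: str  # Name Assigning Authority Number
    name: str  # identifier assigned by that authority


# Assumption: the NAAN is 5 to 9 digits and the name runs until
# whitespace, "?" or "#". The slash after "ark:" is treated as optional.
_ARK_RE = re.compile(r"ark:/?(?P<naan>[0-9]{5,9})/(?P<name>[^\s?#]+)")


def parse_ark(text: str) -> Optional[Ark]:
    """Return the first ARK found in text, or None if there is none."""
    match = _ARK_RE.search(text)
    if match is None:
        return None
    return Ark(match.group("naan"), match.group("name"))


# Hypothetical URL of the kind that might appear in a publication.
ark = parse_ark("https://example.org/ark:/99999/fk4abc123")
print(ark)
```

The sketch does not strip trailing punctuation, so an ARK that ends a sentence should be trimmed before it is parsed.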

C. Requirements
The establishment of a collaboration is contingent on both identifying a point of contact with an NSTX-U researcher and reading and signing the NSTX-U Data Usage and Publication agreement. The use of data and the publication policy are governed by the NSTX-U Data Usage and Publication Form.

IV. Links to NSTX-U and PPPL Data Management Resources

The following links provide additional information for the management and analysis of NSTX-U data.


V. Digital Data

Digital data in support of publications are provided in accordance with DOE policy. This will include data displayed in charts, figures, images, etc., each of which will be identified uniquely by an Archival Resource Key (ARK). The ARKs and/or URLs for accessing the data files will be given in the publication. The data files will be stored in the Princeton University Data Repository.

Instructions for authors on how to include the ARKs in publications can be found here. Instructions for uploading readme and data files to the local PPPL repository for review, before they are uploaded to the permanent Princeton Univ. DataSpace repository, are contained in the "How To" guide. Uploading to the local PPPL repository can be done at the following link: http://pppl-dspace.pppl.gov/.