CADIS User Guide

CADIS User's Guide: How to Contribute Data to CADIS

CADIS is the Cooperative Arctic Data and Information Service of the NSF-funded Arctic Observing Network.

Upload data and write metadata using the interface reached through the “CADIS Data Portal” link listed (in gray) along the left side of the http://www.aoncadis.org site, or directly at http://aoncadis.ucar.edu. Users must have a CADIS user name and password to upload metadata and data. AON PIs may designate “metadata contacts” who may create and edit metadata for datasets, upload data to datasets, and handle administrative functions related to making metadata and data available on the CADIS portal. Metadata contacts should have their own CADIS user name and password. A new account can be created from the log-in page. Please contact CADIS User Support if you wish to designate one or more metadata contacts for your AON project, or if you otherwise need assistance.

This User’s Guide contains the following sections:

  1. Structure of the CADIS Data Collection
  2. The CADIS Metadata Profile
  3. Logging On and Navigating
  4. Entering Metadata
  5. Submitting Metadata
  6. Submitting Data
  7. Appendix
    1. More On Metadata
    2. A CADIS Metadata Record Example
    3. Writing Documentation

Structure of the CADIS Data Collection

The CADIS Data Portal is organized by AON project. Each project will have one or more associated datasets, and each dataset will contain one or more data files. Figure 1 explains the relationship between projects, datasets and nested datasets, and their metadata.


Figure 1. The CADIS data collection structure. Data files are not shown. Each dataset, including nested datasets, will have data files in a directory structure determined by the AON project person who uploads the data.

Some of the elements of the CADIS data collection are described below.

Project: Project level information (project metadata) is fixed, and pre-populated in the CADIS metadata database. Project level information reflects NSF program grant information, such as the project title and the PI name. Please contact CADIS User Support if there are errors in the project level information, or if you would like to add your AON Project.

Dataset(s): PIs can create datasets by entering metadata and uploading data under a project or projects. A dataset can belong to one or more projects, but must be uploaded to each project individually. PIs (or designated “metadata contacts”) create and edit dataset metadata.

It is up to the PI to decide what collections of data will constitute individual datasets. What constitutes a dataset can be somewhat arbitrary. A dataset can be a collection of data for a specified time period, for a specific parameter, or for a specific location. The PI has even more flexibility, in that he or she can also “nest” datasets inside other datasets.

Data file(s): Under each dataset there can be a single data file or several data files. One of these files must be a “readme” file, or dataset documentation, that includes information needed to use the data files. Recommended contents are in the Appendix under Writing Documentation.
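The project, dataset, and data file relationship described above can be sketched as a small data model. This is only an illustration, not the CADIS database schema: the class names, the attribute names, and the rule that a readme is recognized by the word "readme" in its file name are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """A dataset with its data files and any nested datasets."""
    title: str
    data_files: List[str] = field(default_factory=list)
    nested: List["Dataset"] = field(default_factory=list)

    def has_readme(self) -> bool:
        # Assumption: a readme is any file with "readme" in its name.
        return any("readme" in name.lower() for name in self.data_files)


@dataclass
class Project:
    """An AON project; its metadata is pre-populated by CADIS staff."""
    title: str
    pi_name: str
    datasets: List[Dataset] = field(default_factory=list)


def datasets_missing_readme(project: Project) -> List[str]:
    """Return titles of all datasets, nested ones included, with no readme."""
    missing = []
    stack = list(project.datasets)
    while stack:
        dataset = stack.pop()
        if not dataset.has_readme():
            missing.append(dataset.title)
        stack.extend(dataset.nested)
    return missing
```

A check like datasets_missing_readme can be run on a local copy of your files before uploading, to confirm that every dataset carries its documentation.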

The CADIS Metadata Profile

Metadata for a dataset is a little like a catalog entry for a book in a library. A book with a catalog entry can be found more easily with online searches, and libraries can share their catalogs for collections.

The metadata fields, taken together, make up the CADIS metadata profile. CADIS Metadata are compatible with (that is, they can be mapped to) other metadata standards including the NASA Global Change Master Directory’s (GCMD) Directory Interchange Format (DIF) and the International Polar Year metadata profile. The CADIS metadata profile is most like the DIF metadata, with a few extra fields. Many DIF and CADIS fields are identical, and we have drawn heavily on the pioneering work of the GCMD project for the CADIS profile and this metadata authoring tool.

See “More on Metadata” in the Appendix for general information about metadata.

Logging On and Navigating

Connect to the log-in page and enter your user name and password. Note that logging in is only required for metadata and data submission. Other functions, such as search and browse, do not require logging into the system. There are two ways to reach the metadata entry form:

  1. Click on the “Contribute [meta]data” menu selection
  2. Navigate to a dataset you have permission to edit and click on the “Contribute [meta]data” tab.

Click on either “Edit the metadata for an existing dataset” or “Create a new dataset: Enter Metadata” in the “Contribute [meta]data” area to enter the metadata entry form.

Entering Metadata

Metadata are required for each dataset submitted within a given AON project, and are kept in a relational database. You can enter metadata without uploading data, or you can enter metadata and upload data described by the metadata. Note that while all AON projects with data must have a metadata record for the data in CADIS, the data may reside elsewhere. In this case, the metadata field for Data Access holds the URL for where the data reside.

You must provide a minimum set of required metadata before uploading a data file to CADIS. The required metadata fields are listed first on the metadata entry form and are indicated with an asterisk (*). Any entries and/or changes you make will only be saved if you click on the “Save Metadata” button at the bottom of the form. You can retrieve the metadata at a later date and make changes as needed.

Please fill in the metadata fields as accurately and completely as possible. These metadata are used by several functions, including search and browse. Invalid or inaccurate metadata may prevent your datasets from being properly accessed and displayed by other CADIS system functions. Many of the fields come with dropdown menus from which to select content. When in doubt about how to fill in a field, contact CADIS User Support.

See “A CADIS Metadata Record Example” in the Appendix for one dataset’s complete metadata record. Each field is described below in the order it will be presented on the screen. (The field numbers are internal project reference numbers only.)

Field 1. Project. The project shown should be the project that funds your data collection, or that your data collection is in support of. This field is pre-populated. Please contact CADIS User Support if there is an error in the project name, if your project is not listed, or if you have any further questions. This is a required field.

Field 2. Title. This is the title of the dataset. Please make this title descriptive of the content. Put the most important words first and do not use only dates for titles. For nested datasets, please consider creating titles that include at least a portion of the parent dataset’s title. If the title doesn't fit in the entry window it is probably too long. Keep in mind that, like the title of a research paper, the title of a dataset may be used in a citation. This is a required field.

Field 3. Dataset Summary. This is a paragraph that describes the 'who, what, where, when, and why' of the dataset you are submitting with this metadata. It will appear in the online AON data catalog. Web crawlers will index this paragraph, and as a result the description provided here will assist others who are searching for data using a search engine and keyword search. This field can be typed in or simply cut and pasted from existing information such as the summary from your NSF proposal (but remember, it should describe the dataset, not the project as a whole). Beware that copying from some documents may result in embedded control characters, which may or may not be supported by CADIS. Also, consider referring to published documents or web sites when more detail is required to describe a dataset adequately. This is a required field.

Field 10. Location Keyword. Choose one or more that fit(s) best. This is a required field.

Field 11. SEARCH Discipline. This field is pre-populated. Please contact CADIS User Support if you have questions. These are used to group AON datasets at a high level. This is a required field.

Field 15. Platform Keyword. Choose the platform or platforms for the instruments that acquired these data. This is a required field.

Field 16. Instrument Name. Choose one or more instruments from the list. If the instrument is not in the list, select the last choice which is “other”. This is a required field.

Field 17. Science Keyword. Choose one or more from the NASA Global Change Master Directory list of topic areas or parameters. This is a required field.

Field 18. ISO Topic. The CADIS metadata profile includes ISO Topic in order to be compatible with international standards. We recommend you simply choose “Oceans”, “Biota”, or “Climatology/Meteorology/Atmosphere”, but you can select from the complete list if desired. This may be useful in the future for international data sharing. This is a required field.

Field 19. Metadata Contact. This is the person who will be asked when metadata questions come up for this dataset. This list contains PIs for the projects. Please contact CADIS User Support if a different name is needed. The Metadata Contact is responsible for the content of the metadata record. Generally, this will be the person filling out the form. If the responsibility shifts from the original author to another person (e.g. from an investigator to a data manager), this field should be updated to the newly responsible person. This is a required field.

Field 20. Data Center Contact. The person (or Help Desk, or data center itself) who is responsible for the distribution of the data. If data are submitted to CADIS, this will be CADIS User Support. If data are kept elsewhere, and only metadata are submitted to CADIS, it will be the person responsible for the data. This is a required field.

Field 22. Distribution Format. Choose the format of your data. Remember that ASCII is not a format in the metadata sense. You need to be more specific. This is a required field.
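Before opening the entry form, it can help to confirm that you have content for every required field described above. The following is a minimal sketch of such a check on a draft record kept on your own computer. The dictionary layout and the function name are assumptions made for this example; CADIS itself checks required fields when you click “Save Metadata”, and this sketch does not interact with the portal.

```python
# Required fields as listed in this guide (Fields 1-3, 10, 11, 15-20, 22).
REQUIRED_FIELDS = [
    "Project",
    "Title",
    "Dataset Summary",
    "Location Keyword",
    "SEARCH Discipline",
    "Platform Keyword",
    "Instrument Name",
    "Science Keyword",
    "ISO Topic",
    "Metadata Contact",
    "Data Center Contact",
    "Distribution Format",
]


def is_blank(value) -> bool:
    """True for a missing value, an empty string, or an empty keyword list."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    return len(value) == 0


def missing_required(record: dict) -> list:
    """Return the required field names that the draft record leaves blank."""
    return [name for name in REQUIRED_FIELDS if is_blank(record.get(name))]
```

For example, a draft that fills in everything except the title and the science keywords would return ["Title", "Science Keyword"].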

Fields 4 and 5. Begin and End date. Choose from the drop-down menus to provide the temporal coverage of the dataset. If hour and day of month are known, please include them.

Fields 6-9. Minimum and Maximum Latitude and Longitude. Please supply the values for spatial coverage that describe a box around your dataset in terms of minimum and maximum latitude and longitude. If the data are at one geographical location only, enter the latitude and longitude of that point. For a moving platform, use a box that bounds the track of the platform. Position to a tenth of a degree or even a degree is usually sufficient. Although not required, these fields are important for searching and properly displaying a dataset. Please fill these fields in as accurately as possible. The following guidelines from the GCMD may be useful.

Southernmost_Latitude (Minimum Latitude): The southernmost geographic latitude covered by the data. From: 0 to 90 deg for northern latitude or 0 to -90 deg for southern latitude. For example, 60 will be 60 degrees north, -60 will be 60 degrees south.

Northernmost_Latitude (Maximum Latitude): The northernmost geographic latitude covered by the data. From: 0 to 90 deg for northern latitude or 0 to -90 deg for southern latitude. For example, 60 will be 60 degrees north, -60 will be 60 degrees south.

Westernmost_Longitude (Minimum Longitude): The westernmost geographic longitude covered by the data. From: 0 to 180 deg or 0 to -180 deg. The Prime Meridian (PM) is 0 degrees, measured positive (+) eastwards of the PM and negative (-) westward of the PM. For example, 45 will be 45 degrees east and -45 will be 45 degrees west.

Easternmost_Longitude (Maximum Longitude): The easternmost geographic longitude covered by the data. From: 0 to 180 deg or 0 to -180 deg. The Prime Meridian (PM) is 0 degrees, measured positive (+) eastwards of the PM and negative (-) westward of the PM. For example, 45 will be 45 degrees east and -45 will be 45 degrees west.
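For a moving platform, the four values can be derived from the platform's recorded positions. The sketch below assumes positions are available as (latitude, longitude) pairs in decimal degrees using the sign conventions given above. The function name and the returned key names are made up for this example. It does not handle a track that crosses the 180 degree meridian, where a simple minimum and maximum would describe the wrong side of the globe.

```python
from typing import Dict, Iterable, Tuple


def bounding_box(points: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Return the latitude/longitude box that bounds a set of positions.

    Each point is (latitude, longitude) in decimal degrees, with north and
    east positive and south and west negative.
    """
    pts = list(points)
    if not pts:
        raise ValueError("at least one position is required")
    for lat, lon in pts:
        if not -90.0 <= lat <= 90.0:
            raise ValueError(f"latitude out of range: {lat}")
        if not -180.0 <= lon <= 180.0:
            raise ValueError(f"longitude out of range: {lon}")
    lats = [lat for lat, _ in pts]
    lons = [lon for _, lon in pts]
    # Limitation: tracks crossing the 180 degree meridian are not handled.
    return {
        "min_lat": min(lats),  # Southernmost_Latitude
        "max_lat": max(lats),  # Northernmost_Latitude
        "min_lon": min(lons),  # Westernmost_Longitude
        "max_lon": max(lons),  # Easternmost_Longitude
    }
```

A single fixed station passed as a one-element list yields identical minimum and maximum values, which matches the guidance for data at one geographical location.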

Field 12. Frequency. Choose from the list of temporal resolution(s) of the dataset. Choose the frequency(ies) that best approximate(s) the sampling frequency for your data.

Field 13. Spatial Type. Choose the data type(s) that most closely match(es) your data.

Field 14. Resolution. Choose from the spatial resolution ranges given by the GCMD keywords. For some datasets (gridded satellite image data, for example), resolution can be harder to define. Choose the resolution(s) that best describe(s) the data. You can include more detailed information in the documentation you provide.

Field 23. Progress. This describes the state of the data being submitted. “Planned” refers to datasets that will be collected in the future and are thus unavailable at present. “In Work” means the data are preliminary or data collection is ongoing. “Completed” refers to a dataset for which no updates or further data collection will be made.

Field 21. Data Access. This is a URL to data not at the CADIS archive. If you submit data to CADIS, leave this blank. Otherwise, this will usually be a link to a web page that conveys both how and where to get the data.

Fields 24 and 25. Related Resource. These are references (citations) to work that uses the data, link(s) to full product documentation, and/or links to related data. Insert a URL or citation, then tag it with a “Purpose” from the pull-down menu.

Field 26. Dataset Language. If the dataset (data and/or documentation) is in a language other than English, please enter the language.

Field 27. Access Restrictions. We have assumed that all AON datasets are unrestricted in access once they are uploaded to the CADIS data server. If you do not want open access, you must speak with the NSF manager about exceptions. These may include any special restrictions, legal prerequisites, limitations, and/or warnings on obtaining the dataset. Additional information and examples are available from the GCMD.

Field 28. Use Constraints. Describe here how the data may or may not be used (after access is granted) to assure the protection of privacy or intellectual property. This includes any special restrictions, legal prerequisites, terms and conditions, and/or limitations on using the dataset. Data providers may request acknowledgement of the data from users and claim no responsibility for quality and completeness of data. Additional information is available from the GCMD.

Click on the “Save Metadata” button when you are finished editing the metadata. Edits to the metadata will not be saved unless you click this button.

Submitting Metadata

After you have completed the 28-field metadata entry form, click the “Save Metadata” button at the bottom of the page. If you have omitted required information, the system will alert you by displaying a message in red text at the top of the page. Please correct all problems and resubmit by clicking the “Save Metadata” button again. Remember, you can edit the metadata at any time during this session or by re-entering the CADIS metadata authoring tool site. After submitting the metadata, you will be taken to that dataset's display.

Metadata documents each dataset to some degree, but additional information is always desirable and is usually required before a dataset can be used by a wide audience of researchers. In addition to metadata, each dataset must include a “readme” file, or dataset documentation, that includes information needed to use the data files. Recommended contents are in the Appendix under Writing Documentation.

Submitting Data

To submit data files, click on the “Contribute [meta]data” tab. You will have options to “Create a new dataset: Enter Metadata” or “Upload files to an existing dataset”. If you have already created a dataset, choose “Upload files to an existing dataset” and select the appropriate dataset from the list that appears. Once you select the appropriate dataset, the “Data Publishing: Upload Files to Collection” page will appear. Choose “Select File”, enter the files to be uploaded, and then select “Upload File” to submit your data. Select “Clear” to start over.

To create a new dataset, select “Create new dataset: Enter Metadata” and follow the steps in the Entering Metadata section. Once you have entered metadata for the dataset, you will be taken to the dataset’s display. From there, you can select “Upload files to this dataset” where you will be taken to the “Data Publishing: Upload Files to Collection” page.

Appendix

More On Metadata

Why is metadata important? The Marine Metadata Interoperability project has this to say about the role of metadata:

The scientific data collected by a computer system is that system's most important product. For a brief time, data can even stand on its own, without descriptions or context, and provide the desired results.

Eventually, though, the data will be used in other ways, by other people, or at other times. This is where metadata plays an essential role. What a person forgets about the data, or someone from another project never knew in the first place, metadata can remember and explain. Good metadata can even explain data for another computer to understand it and make use of it.

Advancing science

To be able to study marine processes over extensive space and time domains it is necessary to have access to other investigators' data. For many years, both individual investigators and larger collaborative projects have developed custom systems to manage, and make available, their marine science data. It is at best a challenge, and at worst impossible, for other investigators to query all these sources for useful data. If the investigators are lucky enough to find the data they seek they often cannot obtain enough contextual information to use the data in their research.

Comply with requirements of funding agencies and publishers

Program or project managers, journal publishers or data centers often mandate that collection of data should follow certain procedures, that data should be in certain formats and that it should be well documented. This allows each organization to more easily discover their data, since they can rely on a consistent system for describing the data.

Encourage others to re-use your data, citing your research.

If your dataset is well documented and it is available to other researchers, they will be more likely to use your data. Your research and theirs will be complemented, opportunities for collaboration will increase, and the scope of your research will broaden.

(From http://marinemetadata.org/, accessed 7/11/07, used with permission.)

A CADIS Metadata Record Example

Project level information (project metadata) is illustrated in Figure A-1. Figure A-2 shows the metadata for a dataset that falls under that project.


Figure A-1. Project level information (project metadata) is contained in the blue portion. CADIS support staff fill these fields based on NSF grant information.


Figure A-2. A dataset metadata example.

Writing Documentation

The “readme” file is a critical part of dataset documentation. It should contain enough information for a researcher who is unfamiliar with the project under which the data were acquired to use the dataset. It can begin with the same summary paragraph that is used in the metadata. Following are headings and a suggested outline for dataset documentation.

Dataset Title [This is also a metadata field.]
This field helps others find your data.

Please give your dataset a descriptive title that is less than 220 characters. It should be descriptive enough that a user presented with a list of titles can determine the general content of the dataset. For example, “Aerosols” would not be an adequate dataset title, but “Aerosol characterization and snow chemistry at Terra Nova Bay” would. So, if it can be done without making the title too long, include parameters measured, geographic location, instrument, investigator, project, and temporal coverage.

Examples:

  • National Solar Radiation Data Base Hourly Solar Data from Alaska, 1961-1990
  • Comprehensive Ocean - Atmosphere Dataset (COADS) LMRF Arctic Subset
  • Daily Precipitation Sums at Coastal and Island Russian Arctic Stations, 1940-1990
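The title guidance can be turned into a quick self-check. This is a minimal sketch under stated assumptions: the 220-character limit comes from this guide, while the rule that a title with no letters counts as "only dates" is a simplification chosen for this example. The function name is hypothetical and is not part of the CADIS portal.

```python
import re

MAX_TITLE_LENGTH = 220  # titles should be less than 220 characters


def title_problems(title: str) -> list:
    """Return a list of human-readable problems found in a dataset title."""
    problems = []
    stripped = title.strip()
    if not stripped:
        problems.append("title is empty")
        return problems
    if len(stripped) >= MAX_TITLE_LENGTH:
        problems.append("title is 220 characters or longer")
    # Assumption: a title without any letters consists only of dates or numbers.
    if not re.search(r"[A-Za-z]", stripped):
        problems.append("title contains only dates or numbers")
    return problems
```

An empty list means the title passed these basic checks; it does not judge whether the title is descriptive enough, which still needs a human reader.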

Summary Description [This is also a metadata field]
An ‘above the fold’ overview of everything someone might need to know to decide whether the dataset is something they can use, at most about ½ page long.

This is a paragraph that describes the 'who, what, where, when, and why' of the dataset you are submitting with this metadata. Also, consider referring to published documents or web sites when more detail is required to describe a dataset adequately.

Contacts
Any relevant people, with their titles, and role.

Background
Any contextual information, for example, is the dataset part of a larger experiment or collection? Were certain aspects of the data acquisition procedure notable or unusual?

Detailed Data Description
Be as descriptive as possible, with references to instrument manuals, standards, or other works where applicable.

For example:
Snow depth on sea ice measurements were acquired according to the method detailed in Chapter 3.1 (Sturm, 2009) of Field Techniques for Sea Ice Research (Eicken et al., eds., 2009).

Another example:
Sea ice optics measurement sites were selected following the guidelines set forth in Chapter 3.6, Section 3.6.3, Methods and Protocols (Perovich, 2009) of Field Techniques for Sea Ice Research (Eicken et al., eds., 2009).

References and Related Publications
These can include papers that use these data or like data. References that describe how measurements are taken are especially valuable. For example:

Sturm, M. 2009. Field Techniques for Snow Observations on Sea Ice, in Field Techniques for Sea Ice Research, edited by H. Eicken et al., University of Alaska Press, Fairbanks, 588 pp. ISBN 978-1-60223-059-0.

Perovich, D. 2009. Sea Ice Optics Measurement, in Field Techniques for Sea Ice Research, edited by H. Eicken et al., University of Alaska Press, Fairbanks, 588 pp. ISBN 978-1-60223-059-0.

Eicken, H., R. Gradinger, M. Salganek, K. Shirasawa, D. Perovich, and M. Leppäranta, eds. 2009. Field Techniques for Sea Ice Research. University of Alaska Press, Fairbanks, 588 pp. ISBN-10: 1-60223-059-5, ISBN-13: 978-1-60223-059-0.

Acknowledgments
As a rule, include the grant number or numbers and funding agencies that supported the work.

Document Information
Name the document author(s) and give the date the document was created or revised. Including the date is important.

If a revision was made, say when it was made and, briefly, what the nature of the revision was. For example, “This readme documentation file was revised on 4 Feb 2011 to add information about a new instrument used to measure snow density, and about the resulting additional data files.”
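The suggested outline can be used to start a readme file with all of the headings in place. The sketch below builds a plain-text skeleton from the headings listed in this appendix. The function name and the "[to be written]" placeholder text are choices made for this example and are not CADIS requirements.

```python
# Headings taken from the suggested outline in this appendix.
README_HEADINGS = [
    "Dataset Title",
    "Summary Description",
    "Contacts",
    "Background",
    "Detailed Data Description",
    "References and Related Publications",
    "Acknowledgments",
    "Document Information",
]

PLACEHOLDER = "[to be written]"


def readme_skeleton(title: str, summary: str = "") -> str:
    """Return a plain-text readme outline with the title and summary filled in."""
    prefilled = {
        "Dataset Title": title,
        "Summary Description": summary or PLACEHOLDER,
    }
    lines = []
    for heading in README_HEADINGS:
        lines.append(heading)
        lines.append(prefilled.get(heading, PLACEHOLDER))
        lines.append("")  # blank line between sections
    return "\n".join(lines)
```

The returned text can be saved to a file, for instance one named readme.txt, and then completed by hand. Reusing the metadata summary paragraph as the summary argument keeps the readme and the metadata record consistent.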

This document was first published on the AON CADIS Web site in March, 2011. It was written by Lisa Booker, NSIDC, with contributions from Florence Fetterer (NSIDC), Linda Cully (NCAR), and Hannah Wilcox (NCAR), as well as others on the CADIS team.