Getting Discrete Data into EnviroData: Data Ingestion Workflow#
Introduction#
EnviroData can ingest data from a variety of sources. This tutorial walks you through ingesting discretely sampled data that was analyzed by a lab into EnviroData.
Tip: EnviroData is able to handle a wide variety of data types, including continuous (timeseries) and survey data. This tutorial focuses solely on discretely sampled data and does not cover the ingestion of other supported data types.
Step 1: Transform your data into the Hatfield EDD Format#
To ingest your discretely sampled data into EnviroData following this tutorial, you must first transform your data into the Hatfield EDD format. At this step, you have a choice: either get your lab to send you properly formatted data files, or manually transform your data into the Hatfield EDD format.
Option 1-A: Get your lab to send properly formatted data#
Many labs, including ALS and BV Labs, can convert data that they analyzed into the Hatfield EDD format and send it to EnviroData for automatic ingestion. If you are unsure whether your lab can do this, please contact EnviroData Technical Support.
Once automatic ingestion of lab data has been set up in EnviroData, all automatically ingested data will show up on the Import Events page. If this has been set up, jump to Step 4: Review Imported Datasets on the Import Events Page.
Option 1-B: Manually transform your data into the Hatfield EDD format#
If you have data in a format that is not currently supported by EnviroData, you will first need to manually transform your data into the Hatfield EDD (HEDD) format. This can be done using a spreadsheet program like Microsoft Excel, or using a scripting language like R. Ultimately, the data will need to be in an Excel file with specific columns based on the type of report you are filling in: Lab, Historical, or Field.
Use this HEDD Excel file template as a starting template for your HEDD data. The file outlines the necessary columns for each report type.
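If you script the transformation rather than editing by hand, the core of it is usually a column rename. The following is a minimal Python sketch of that idea, not an official EnviroData tool. Every column name in it (`Site`, `Station`, `SampleDate`, and so on) is a hypothetical placeholder: take the real column names from the HEDD template for your report type. The sketch only reshapes rows in memory; writing the result to an Excel file would need a spreadsheet library, which is not shown here.

```python
# Hypothetical mapping from a lab's export columns to HEDD-style columns.
# Replace both sides with the real names from your lab file and the HEDD template.
COLUMN_MAP = {
    "Site": "Station",
    "Collected": "SampleDate",
    "Parameter": "Analyte",
    "Value": "Result",
    "Units": "Unit",
}


def to_hedd_rows(lab_rows, column_map=COLUMN_MAP):
    """Rename columns in each row; raise if a mapped source column is missing."""
    hedd_rows = []
    for index, row in enumerate(lab_rows, start=1):
        missing = [src for src in column_map if src not in row]
        if missing:
            raise KeyError(f"row {index} is missing columns: {missing}")
        # Columns that are not in the mapping are dropped.
        hedd_rows.append({dst: row[src] for src, dst in column_map.items()})
    return hedd_rows


def to_sheet(hedd_rows, column_map=COLUMN_MAP):
    """Return a header row followed by data rows, ready to write to one worksheet."""
    header = list(column_map.values())
    return [header] + [[row[name] for name in header] for row in hedd_rows]
```

Raising on a missing source column, rather than silently writing a blank, surfaces a mismatched lab export before the file is uploaded.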
Hints on preparing your Hatfield EDD data file
Make sure the Excel file has a single worksheet. If the file has more than one worksheet, only the first one will be imported
Identify what report type your data is and what columns you need. Note that the columns don’t have to be in a certain order
Make sure that the first row of the worksheet is a header row that contains only the column names
The file must be saved as an Excel (.xlsx or .xls) file
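The hints above can be checked before upload. The sketch below is one possible pre-upload check in Python using only the standard library; it is an illustration, not part of EnviroData. It assumes the common (transitional) Office Open XML layout, in which an .xlsx file is a zip package whose `xl/workbook.xml` lists the worksheets. The function names and the `required_columns` argument are hypothetical; the header row is passed in as a list because reading cell values is left to whatever spreadsheet tool you already use.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace used by workbook.xml inside a typical .xlsx package.
MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def worksheet_names(xlsx_path):
    """List the sheet names declared in an .xlsx workbook, in workbook order."""
    with zipfile.ZipFile(xlsx_path) as package:
        root = ET.fromstring(package.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.iter(f"{MAIN_NS}sheet")]


def check_file(path, header, required_columns):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in (".xlsx", ".xls"):
        problems.append(f"unexpected file extension: {suffix!r}")
    elif suffix == ".xlsx":  # legacy .xls is a different format, not inspected here
        names = worksheet_names(path)
        if len(names) != 1:
            problems.append(f"expected 1 worksheet, found {len(names)}: {names}")
    cleaned = [str(name).strip() for name in header]
    if any(not name for name in cleaned):
        problems.append("header row contains an empty column name")
    # Set comparison: column order does not matter.
    missing = sorted(set(required_columns) - set(cleaned))
    if missing:
        problems.append(f"missing required columns: {missing}")
    return problems
```

Returning a list of problems instead of stopping at the first one lets you fix everything in a single pass before uploading.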
Step 2: Upload your Hatfield EDD file#
Step 2a: Go to the EnviroData File area and upload your file#
Click “Files” from the left sidebar
Click “Browse All Files” button from the main page
Click “+ Add a new file” from the main page
Enter a name for the file in “File Name”
Make sure to select “Auto-detect file type”
Click “Upload” to upload the file
Click “Next”
Optional - Add Additional Metadata#
Click “+ ADD METADATA FIELD”
Fill in the “Field Name” and “String Field Value”
To add another metadata entry, click “+ ADD METADATA FIELD” again until you have entered all the metadata needed
Finish Uploading#
Click the green “Create” button
Step 3: Ingest the Uploaded Hatfield EDD File#
Click on the file you want to ingest from the “Browse Files” page
Click “Ingest Data” at the top of the page and click the green “Ingest” button
If the ingestion is successful, it will look like this
If the ingestion is unsuccessful, check the logs and recommended actions to fix the issue
To fix errors in the data, skip to Step 5
Step 4: Review Imported Datasets on the Import Events Page#
All data that has been ingested into EnviroData will show up on the Import Events page. This includes discretely sampled data that was analyzed by a lab, survey data, and continuous data.
On the Import Events page, users can:
View all import events (ingestions, re-ingestions, and rollbacks) and their result (successful, failed)
View a summary of an import event, such as number of new analytes, stations, and media types.
Filter for import events by date, file type, event type, and status (successful or failed)
Get more information for a specific import event by clicking the View import details button or the Review file button.
Navigate to the “Import Events” page from the sidebar on the left
Search for the file you want to modify using the search bar at the top right of the page, or use the filter options on the left of the main page
To view problems and fix issues with the file, click on “Review File”
Step 5: View problems and fix issues with the ingested file#
Once the file has gone through the ingestion process, all warnings and errors will be displayed on the Review File page.
Tips: The ingestion process tries its best to get your data into EnviroData. If any errors are found, you will need to fix them before the file can be ingested. If there are only warnings, the ingestion process will do its best to ingest the file, but you will need to review the file to make sure that the data is correct. It is also possible to adjust the strictness of checking for each column in your uploaded HEDD file. If you want to adjust the strictness for a column, please contact envirodata-techsupport@hatfieldgroup.com.
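The difference between errors and warnings can be pictured with a small Python sketch. This is a conceptual model only and does not reflect EnviroData's internal code or any API: the `Issue` class, its fields, and the `triage` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    severity: str  # "error", "warning", or "info"
    location: str  # hypothetical cell reference, e.g. "C14"
    message: str


def triage(issues):
    """Split issues into blocking errors and non-blocking warnings to review."""
    errors = [issue for issue in issues if issue.severity == "error"]
    to_review = [issue for issue in issues if issue.severity == "warning"]
    # Any error blocks ingestion; warnings alone do not.
    return {"can_ingest": not errors, "errors": errors, "to_review": to_review}
```

In this model a file with only warnings still ingests, but every warning stays on a list for you to review afterwards.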
The right tab of the page displays a list of ingestion errors, if any are present, as well as warnings and relevant information regarding data validation and parsing
To view the errors and/or warnings present in the data, simply click on their respective buttons. The eye icon under each button shows whether they are currently displayed
Also shown in the right tab of the page, under “Errors”, “Warnings”, and “Info”, are “Submission Logs” and “Current Changes”
“Submission Logs” allows you to view any pending logs that have been generated by the ingestion process, along with the status, location, and details of each log
“Current Changes” shows the changes that have been made to the data since the last ingestion. You can view the locations of new changes and what the changes were, and you can undo a change using the undo button
The uploaded Hatfield EDD file is displayed on the page, where you can quickly review the dataset and make changes to individual cells.
Changes made to the spreadsheet can be viewed by clicking “Current Changes” in the right tab of the page.
After making changes, click on the “Reingest” button at the top of the page to reingest the data. This will create a copy of the original file with the changes made to it.
If your reingested file still has errors, you will need to fix them using the same process and tools.
Step 6: View the ingested data#
You can now use the tools provided by EnviroData to view the newly ingested data.
Optional - Rollback Ingested Data File#
Sometimes you will find that you need to roll back an ingestion. A rollback deletes all data associated with the file, including Stations and Analytes that are not referenced by other imports. This is useful if you need to correct a mistake or error that was introduced during the file ingestion process.
Navigate to the “Import Events” page from the sidebar on the left
Search for the file you want to roll back using the search bar at the top right of the page, or use the filter options on the left of the main page
Click on “Rollback”
The Rollback event is logged in the “Import Events” page
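To make the "not referenced by other imports" rule concrete, here is a small Python sketch of the idea. It is a conceptual model under stated assumptions, not EnviroData's implementation: the `rollback` function and the dictionary layout are hypothetical, and a real rollback also removes the sample data itself.

```python
def rollback(imports, file_name):
    """Remove one import and report which stations and analytes would be deleted.

    `imports` maps a file name to {"stations": set, "analytes": set}.
    The dictionary is modified in place: the rolled-back file is removed.
    """
    removed = imports.pop(file_name)
    still_used = {"stations": set(), "analytes": set()}
    for other in imports.values():
        for kind in still_used:
            still_used[kind] |= other[kind]
    # Only items that no remaining import references are deleted.
    return {kind: removed[kind] - still_used[kind] for kind in still_used}
```

For example, if `jan.xlsx` and `feb.xlsx` both reference station `S2`, rolling back `feb.xlsx` leaves `S2` in place because `jan.xlsx` still uses it.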