calculate_guideline() gives users access to the EnviroData microservice for calculating guidelines for use on external datasets (i.e., those not stored on EnviroData).

calculate_guideline(master_dataframe, guideline, compareGuideline = TRUE)

Arguments

master_dataframe

A dataframe of water quality data containing the following columns:

  • Analyte - Analyte name as stored in the EnviroData database, as opposed to the name shown in the EnviroData UI (DisplayName). As such, analyte names from external data may need standardization to match the required format before guidelines can be calculated.

  • Site - The names of stations for which guidelines will be calculated.

  • DateTime - The date and time samples were taken. All dates must be formatted as either %Y-%m-%d or %Y-%b-%d.

  • Value - Numerical value of analyte reported from water quality analyses.

  • Unit - Units of measurement for water quality lab results.

  • pH - Lab result for pH corresponding to each sample. To be used for guideline calculations where pH is a required cofactor.

  • Hardness - Lab result for Hardness corresponding to each sample. To be used for guideline calculations where Hardness is a required cofactor.

  • Chloride - Lab result for Chloride corresponding to each sample. To be used for guideline calculations where Chloride is a required cofactor.

  • Temperature - Lab result for Temperature corresponding to each sample. To be used for guideline calculations where Temperature is a required cofactor.

  • ColorRef - Lab result for ColorRef corresponding to each sample. To be used for guideline calculations where ColorRef is a required cofactor.

Users may define these columns themselves, but are advised to use columns from the return values of get_wq_locations_list(), get_wq_locations_details() and get_wq_report_data().
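
For users who define the columns themselves, the list above can be sketched in base R as follows. The site name, values, and units are made-up placeholders, and required_cols and missing_cols are illustrative helpers that are not part of the package. Converting DateTime to a character string is an assumption based on the stated date formats; only the cofactor columns relevant to the chosen guideline are included.

```r
# Hypothetical sample data; the site name, values, and units are placeholders.
sample_times <- as.POSIXct(c("2023-05-01 10:30:00", "2023-06-01 09:15:00"), tz = "UTC")

master_dataframe <- data.frame(
  Analyte  = rep("Copper (Cu)-Dissolved_Concentration (liquid)", 2),
  Site     = "Example Site 1",
  DateTime = format(sample_times, "%Y-%m-%d"),  # one of the two stated formats
  Value    = c(0.002, 0.004),
  Unit     = "mg/L",
  pH       = c(7.1, 7.4),
  Hardness = c(150, 140),
  stringsAsFactors = FALSE
)

# Illustrative check (not part of the package): confirm the core columns exist.
required_cols <- c("Analyte", "Site", "DateTime", "Value", "Unit")
missing_cols <- setdiff(required_cols, names(master_dataframe))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```

Checking the columns locally before the request may make problems easier to diagnose than an error returned by the service.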

Note that different guidelines require data on different cofactors. Have a look at the Guidelines section of the EnviroData web app to find out which cofactors are needed for your guideline query.

guideline

A character string or vector of one or more short guideline names.

Users may define guideline names themselves, but are advised to use the Guideline column from the return value of get_guideline_details().

Value

A long-format dataframe of guideline information containing the columns from the input data frame plus these additional columns:

  • GuidelineValueString - The calculated guideline value for each record.

  • IsExceedance - Boolean indicating whether the value exceeds the calculated guideline value.

  • GuidelineUnit - The unit of measurement for the selected guideline.

  • GuidelineName - The name of the guideline used in the calculation.

  • GuidelineAnalyteName - The analyte name that is used in the guideline calculation for each record.

  • HasError - Boolean indicating whether an error occurred during the calculation.

  • ErrorDetail - Error note, if an error occurred during the calculation.

  • Analyte - The analyte name as stored in the EnviroData database.

  • DisplayName - The analyte name as shown in the EnviroData UI.

  • Site - The station that the samples were taken from.

  • DateTime - The date and time samples were taken.

  • Value - The numerical value of analyte reported from water quality analyses.

  • Unit - The units of measurement for water quality lab results.

  • pH - Lab result for pH corresponding to each sample.

  • Hardness - Lab result for Hardness corresponding to each sample.

  • Chloride - Lab result for Chloride corresponding to each sample.

  • Temperature - Lab result for Temperature corresponding to each sample.

  • ColorRef - Lab result for ColorRef corresponding to each sample.

  • DOC - Lab result for Dissolved Organic Carbon corresponding to each sample.
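
As a rough illustration of working with the returned columns, the sketch below builds a mock result by hand and filters it. The mock values are invented and do not come from EnviroData, and it assumes IsExceedance and HasError arrive as logical columns, which the service may encode differently.

```r
# Mock result with a subset of the documented columns; all values are invented
# for illustration and the logical column types are an assumption.
result <- data.frame(
  Analyte              = c("Analyte A", "Analyte B", "Analyte C"),
  Value                = c(10, 1, NA),
  GuidelineValueString = c("5", "5", NA),
  IsExceedance         = c(TRUE, FALSE, NA),
  HasError             = c(FALSE, FALSE, TRUE),
  ErrorDetail          = c(NA, NA, "placeholder error note"),
  stringsAsFactors = FALSE
)

# Records whose calculation failed; ErrorDetail holds the accompanying note.
errors <- result[result$HasError %in% TRUE, c("Analyte", "ErrorDetail")]

# Records that exceeded the calculated guideline (%in% treats NA as FALSE).
exceedances <- result[result$IsExceedance %in% TRUE, ]
```

Using %in% TRUE rather than a bare logical index keeps rows with missing flags out of the subsets instead of returning all-NA rows.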

Details

IMPORTANT: Users must obtain login credentials to the wqProcessing tenant to use this function. In addition, analyte names in external datasets must match the analyte names stored in the EnviroData database (the Analyte column, not DisplayName).

Contact the Environmental Information Specialist assigned to your project or the Data Solutions Group for more assistance in this regard.
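
One possible way to standardize external analyte names is a lookup vector, sketched below in base R. The external names ("Cu-D", "Co-T") and the name_map object are hypothetical; the target names are the ones used in the Examples section, and a real mapping would need to be confirmed against EnviroData.

```r
# External data with lab-specific analyte names (hypothetical).
external <- data.frame(
  Analyte = c("Cu-D", "Co-T", "Unknown analyte"),
  Value   = c(0.002, 0.001, 5),
  stringsAsFactors = FALSE
)

# Hypothetical lookup from external names to EnviroData analyte names.
name_map <- c(
  "Cu-D" = "Copper (Cu)-Dissolved_Concentration (liquid)",
  "Co-T" = "Cobalt (Co)-Total_Concentration (liquid)"
)

# Replace only the names that have an entry in the lookup.
matched <- external$Analyte %in% names(name_map)
external$Analyte[matched] <- unname(name_map[external$Analyte[matched]])

# Names left unmatched need manual review before calling calculate_guideline().
unmatched <- unique(external$Analyte[!matched])
```

Keeping the unmatched names in a separate vector makes it easy to pass them to whoever maintains the mapping.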

Examples


# example dates and analyte names
dates <- as.POSIXct(seq(2), origin = "1970-01-01")
analytes_1 <- c("Cobalt (Co)-Total_Concentration (liquid)", "something else")

# minimal data frame for the request
input_1 <- data.frame(DateTime=dates, Analyte=analytes_1)

# second guideline field is empty because there was no matching analyte name
calculate_guideline(input_1, "BC_AWQG_STMAX", compareGuideline=FALSE)

# this example requires cofactors
analytes_2 <- rep("Copper (Cu)-Dissolved_Concentration (liquid)", 2)
cofactor_2 <- data.frame(pH = c(7, NA), Hardness = c(150, NA), DOC = c(3, NA))
input_2 <- data.frame(DateTime=dates, Analyte=analytes_2) |> cbind(cofactor_2)

# the second guideline is empty because we were missing its cofactors
calculate_guideline(input_2, "BC_AWQG_STMAX", compareGuideline=FALSE)

# demonstrate exceedance checking (provide values this time)
input_3 <- cbind(input_2, Value=c(10, NA))
calculate_guideline(input_3, "BC_AWQG_STMAX", compareGuideline=TRUE)