Installing PhotoGEA
How Do I Install the Latest Release of PhotoGEA?
PhotoGEA is available on CRAN, so the easiest way to install the latest release is to type the following from within an R terminal:
install.packages('PhotoGEA')
There may be a short period when a new version has been released but is not yet available on CRAN. In this case, the latest release can be installed directly from the main branch of the GitHub repository by typing the following:
remotes::install_github('eloch216/PhotoGEA')
Note that this method requires the remotes package, which can be installed from within R by typing install.packages('remotes').
How Do I Install an Old Version of PhotoGEA?
All PhotoGEA releases are tagged on GitHub, and the tag names are formatted as vX.Y.Z, where X.Y.Z is the version number. Because of this, remotes::install_github can be used to install old versions from within R by supplying the appropriate tag name as the ref input argument. For example, version 1.0.0 can be installed as follows:
remotes::install_github('eloch216/PhotoGEA', ref = 'v1.0.0')
Note that this command requires the remotes package, which can be installed from within R by typing install.packages('remotes').
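After installing a particular release, it can be helpful to confirm which version actually ended up installed. The version number can be queried from within R using base R's packageVersion function (assuming PhotoGEA has been installed):

```r
# Print the version of the installed PhotoGEA package; `packageVersion` is
# provided by the `utils` package, which is attached by default in R sessions
packageVersion('PhotoGEA')
```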
How Do I Install the Development Version of PhotoGEA?
The development version of PhotoGEA contains the latest changes, although it may be a “work in progress” and subject to sudden changes without any warning. It can always be found on the unreleased branch on GitHub, and the package can be installed from this branch as follows:
remotes::install_github('eloch216/PhotoGEA', ref = 'unreleased')
Note that this command requires the remotes package, which can be installed from within R by typing install.packages('remotes').
How Do I Install From a Local Copy of the Repository?
The PhotoGEA package can be installed from a local copy of the repository by running the following command in an R session with the working directory set to the root directory of the PhotoGEA repository:
install.packages('.', repos = NULL, type = 'source')
This can be helpful for developers who wish to locally test changes to the package code.
Loading Data
What if I Have Log Files From a Licor LI-6400, PP Systems CIRAS, or Other Gas Exchange Measurement System?
At the moment, PhotoGEA can read plaintext and Excel files created by Licor LI-6800 gas exchange measurement systems, but does not have functions for reading log files from other instruments. Fortunately, there are other R packages that do, such as gasanalyzer. Using a function from another package is a little complicated, because such packages generally do not use the same data structures or variable names as PhotoGEA. So, after reading a file with another package, there is typically some extra “conversion” that needs to take place.

The following example shows how to create a wrapper for the read_6400_xls function from the gasanalyzer R package. It also provides a simple function that “detects” the file type, so a mix of Licor LI-6400 and LI-6800 log files could be used. The code in the example would replace the lines that use read_gasex_file to create gasex_exdf_list in the user guides.
As of the time this article was written (April 4, 2025), the read_6400_xls function is not available in the CRAN version of the gasanalyzer package. Instead, the package must be installed directly from its GitLab repository to get the latest version. This can be done by calling the following command from R:
remotes::install_gitlab('plantphys/gasanalyzer')
This example only processes the columns that are absolutely essential for fitting A-Ci curves. Depending on your goals, you may need to modify this code to include other columns.
# Helper function for reading 6400 Excel files. This is a wrapper for the
# read_6400_xls function from the gasanalyzer package that converts the output
# into a format usable by PhotoGEA.
#
# Debug option: If DEBUG_PRINT is set to TRUE below, the raw column names will
# be printed immediately after the file is loaded. This may be helpful if there
# are other columns (such as user constants) that also need to be renamed.
read_6400_xls_wrapper <- function(fpath) {
    # Read the contents of the file
    rawdata <- gasanalyzer::read_6400_xls(fpath)

    # Optional debug printing
    DEBUG_PRINT <- FALSE

    if (DEBUG_PRINT) {
        print(colnames(rawdata))
    }

    # Rename a few columns so they meet PhotoGEA's expectations; the "new"
    # PhotoGEA column name is on the left, and the "original" gasanalyzer
    # column name is on the right.
    new_column_names <- list(
        A = 'GasEx.A',
        Ca = 'GasEx.Ca',
        Ci = 'GasEx.Ci',
        DeltaPcham = 'Meas.DeltaPcham',
        gsw = 'GasEx.gsw',
        oxygen = 'Const.Oxygen',
        Pa = 'Meas.Pa',
        Qin = 'LeafQ.Qin',
        TleafCnd = 'GasEx.TleafCnd'
    )

    for (i in seq_along(new_column_names)) {
        gasanalyzer_name <- new_column_names[[i]]
        photogea_name <- names(new_column_names)[i]
        colnames(rawdata)[colnames(rawdata) == gasanalyzer_name] <- photogea_name
    }

    # Drop units and convert to a regular data frame
    rawdata_df <- as.data.frame(units::drop_units(rawdata))

    # Convert to an exdf object
    exdf_obj <- exdf(rawdata_df)

    # Supply units for a few columns so they meet PhotoGEA's expectations, and
    # return
    document_variables(
        exdf_obj,
        c('GasEx', 'A', 'micromol m^(-2) s^(-1)'),
        c('GasEx', 'Ca', 'micromol mol^(-1)'),
        c('GasEx', 'Ci', 'micromol mol^(-1)'),
        c('Meas', 'DeltaPcham', 'kPa'),
        c('GasEx', 'gsw', 'mol m^(-2) s^(-1)'),
        c('in', 'oxygen', 'percent'),
        c('Meas', 'Pa', 'kPa'),
        c('LeafQ', 'Qin', 'micromol m^(-2) s^(-1)'),
        c('GasEx', 'TleafCnd', 'degrees C')
    )
}
# Helper function that tries to read a file using PhotoGEA::read_gasex_file, but
# then uses `read_6400_xls_wrapper` if there is an error.
read_gasex_file_plus_6400 <- function(fpath) {
    cat(paste0(
        '\nAttempting to read `', fpath, '` using PhotoGEA::read_gasex_file\n'
    ))

    tryCatch(
        read_gasex_file(fpath),
        error = function(e) {
            cat(paste0(
                '\nAn error occurred. Attempting to read `', fpath,
                '` using gasanalyzer::read_6400_xls\n'
            ))

            read_6400_xls_wrapper(fpath)
        }
    )
}
# Load each file, storing the result in a list
gasex_exdf_list <- lapply(file_paths, function(fpath) {
    read_gasex_file_plus_6400(fpath)
})
What if My Data Has Been Reformatted to CSV?
Sometimes the contents of one or more instrument log files have been reformatted into a CSV file. This is especially common in older data sets that were compiled before the availability of tools like PhotoGEA’s read_gasex_file function, since the log files could not be read directly from R. By manually converting to CSV, it became possible to read their contents using read.csv.
Such data sets can still be analyzed in PhotoGEA, but slightly different steps must be taken. The main idea is to first read the CSV file with read.csv, creating a regular data frame. Then, by supplying units for key columns, the data frame can be converted to an exdf object and passed to other PhotoGEA functions.

The following example shows how to do this for a single data file, which can be obtained from a figshare repository. This code would replace the lines that use read_gasex_file to create gasex_data in the user guides.
# Read the original CSV
original_df <- read.csv(
    'ACi-TGlob_V1.0.csv',
    fileEncoding = 'latin1',
    stringsAsFactors = FALSE
)

# Convert to an exdf object and add units for key columns
gasex_data <- exdf(
    original_df,
    units = data.frame(
        Ci = 'micromol mol^(-1)',
        CO2S = 'micromol mol^(-1)',
        Cond = 'mol m^(-2) s^(-1)',
        PARi = 'micromol m^(-2) s^(-1)',
        Photo = 'micromol m^(-2) s^(-1)',
        Tleaf = 'degrees C'
    )
)
As shown here, it is not necessary to provide units for every column. In fact, it is not necessary to provide units for any column when creating an exdf object. So, one way to proceed is to begin without any units:
gasex_data <- exdf(
    original_df,
    units = data.frame()
)
Subsequent calls to other PhotoGEA functions (for example, fit_c3_aci) will produce error messages when required units are not available. The missing units can then be filled in as needed.
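One way to supply missing units after the fact is with PhotoGEA's document_variables function (the same function used in the LI-6400 wrapper example earlier in this article). A hedged sketch; the column names and categories here are just examples and should match your own data:

```r
# Add units (and categories) to two columns of the existing exdf object; each
# argument after the first is a character vector of (category, column name,
# units)
gasex_data <- document_variables(
    gasex_data,
    c('GasEx', 'Photo', 'micromol m^(-2) s^(-1)'),
    c('GasEx', 'Ci', 'micromol mol^(-1)')
)
```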
What if My Columns Have Different Names Than Those in a Licor LI-6800 Log File?
Each function in PhotoGEA allows the user to specify alternate names for the columns it needs, where the default names follow those used in Licor LI-6800 log files. For example, the default name for the net CO2 assimilation rate column in most functions is A. However, some instrument log files use different names; for example, the net CO2 assimilation rate is called Photo in Licor LI-6400 log files. In this case, the default name can be changed to Photo when calling a PhotoGEA function that uses this column.
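For instance, PhotoGEA's fitting functions generally accept arguments such as a_column_name for this purpose. A hedged sketch (check ?fit_c3_aci to confirm the exact argument names in your version of the package; `one_curve` is a hypothetical exdf containing a single curve):

```r
# Tell fit_c3_aci to read the net CO2 assimilation rate from the 'Photo'
# column instead of the default 'A' column
one_result <- fit_c3_aci(
    one_curve,
    a_column_name = 'Photo'
)
```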
This can be tedious, though, since you may need to override the default column names many times. An alternate approach is to rename columns in the data table after loading it from a file. This only needs to be done once at the start of a script. The example below shows one way to do this; this code would follow the creation of the exdf table called gasex_data, and shows how to rename a few Licor LI-6400 columns:
# PhotoGEA has a mechanism for specifying the column names for key inputs, but
# it is a bit clunky, so it's easier to rename them to their "standard" names;
# here we accomplish this by using a helper function
rename_col <- function(data_table, oldname, newname) {
    colnames(data_table)[colnames(data_table) == oldname] <- newname
    data_table
}
# Specify the old and new names via a list, where the old names are on the left
# and the new ones are on the right
to_rename <- list(
    CO2S = 'Ca',
    Cond = 'gsw',
    PARi = 'Qin',
    Photo = 'A',
    Tleaf = 'TleafCnd'
)

# Rename columns in `gasex_data`
for (i in seq_along(to_rename)) {
    gasex_data <-
        rename_col(gasex_data, names(to_rename)[i], to_rename[[i]])
}
Checking Data
What Should I Do When check_response_curve_data Fails?
When beginning to work with a new data set, it is fairly common to encounter problems that cause check_response_curve_data to fail. The most common issues are:
The manual log button was accidentally pressed while measuring, causing a response curve to have an extra point.
One or more “User Constant” was not updated before starting to measure a new curve, causing two curves to have the same identifying metadata.
A curve was started, but needed to be ended early, resulting in a curve with fewer points than expected.
These are exactly the kinds of issues that check_response_curve_data is designed to detect. It may feel irritating to encounter these errors, but it’s better to know about issues with extra log points or user constants early in your analysis, before they cause additional downstream issues.
In each of these cases, it is often simplest to fix the issues by manually editing the log files in Excel. This is an easy way to delete extra rows or alter the values of user constants. When altering log files, it is always a good idea to keep an “original” version in case any of the changes need to be reverted.
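Alternatively, a stray point can be removed in R rather than in Excel, for example with PhotoGEA's remove_points function (the same function used in the user guides). A hedged sketch; the column names and values here (`curve_identifier`, `obs`, the identifier 'plant_3', and point 9) are hypothetical examples, and ?remove_points describes how points are specified:

```r
# Remove the accidentally-logged 9th observation from the curve whose
# identifier is 'plant_3', leaving all other curves untouched
gasex_data <- remove_points(
    gasex_data,
    list(curve_identifier = 'plant_3', obs = 9)
)
```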
Sometimes check_response_curve_data fails for a different reason – for example, perhaps different setpoint sequences were intentionally used while measuring the curves, or perhaps you have already cleaned up your data in Excel. In these cases, please see the “What if My Response Curves Have Different Numbers of Points?” and “What if I Clean My Data in Excel?” sections of this article.
What if My Response Curves Have Different Numbers of Points?
The analysis guides use data sets where each response curve has the same number of points, making it easy to apply check_response_curve_data and organize_response_curve_data. However, data sets may sometimes have curves measured using different sequences of setpoints.
To deal with this situation, one strategy is to split the full set into groups that are expected to use the same sequences, then separately check and organize each group, and finally recombine all the groups back together. The code snippets below show two examples of how this could be accomplished.
Splitting Response Curves According to the Number of Points in Each Curve
The following code would replace the calls to check_response_curve_data and organize_response_curve_data in the user guides. This code was originally written for a set of A-Ci curves that used different numbers of recovery points. The user wished to keep the final recovery point to use for subsequent analysis.

The curves used the following sequences of CO2_r setpoint values:
16 points: 400, 300, 200, 120, 70, 30, 10, 400, 400, 400, 600, 800, 1200, 1500, 1800, 400
18 points: 400, 300, 200, 120, 70, 30, 10, 400, 400, 400, 400, 400, 600, 800, 1200, 1500, 1800, 400
19 points: 400, 300, 200, 120, 70, 30, 10, 400, 400, 400, 400, 400, 400, 600, 800, 1200, 1500, 1800, 400
# Add a new column called `curve_npts` that stores the number of points in each
# response curve
gasex_data <- do.call(rbind, by(gasex_data, gasex_data[, 'curve_identifier'], function(x) {
    x[, 'curve_npts'] <- nrow(x)
    x
}))
# Choose points to remove, depending on how many points are in the curve
pts_to_remove <- list(
    '16' = c(1, 8:9, 16),
    '18' = c(1, 8:11, 18),
    '19' = c(1, 8:12, 19)
)
# Check and process each group of curves depending on how many points are in the
# curve
gasex_exdf_list_processed <- by(gasex_data, gasex_data[, 'curve_npts'], function(x) {
    # Get the number of points in these curves
    npts <- x[1, 'curve_npts']

    # Make sure info is specified for this group of curves
    if (!as.character(npts) %in% names(pts_to_remove)) {
        stop('Points to remove were not specified for npts = `', npts, '`')
    }

    # Make sure the data meets basic requirements
    check_response_curve_data(x, 'curve_identifier', npts, 'CO2_r_sp')

    # Remove points with duplicated `CO2_r_sp` values and order by `Ci`
    organize_response_curve_data(
        x,
        'curve_identifier',
        pts_to_remove[[as.character(npts)]],
        'Ci'
    )
})
# Use `rbind` to recombine all the data
gasex_data <- do.call(rbind, gasex_exdf_list_processed)
Splitting Response Curves According to the Date They Were Measured and the Number of Points in Each Curve
This code would replace the calls to check_response_curve_data and organize_response_curve_data in the user guides. This code was originally written for a set of A-Ci curves that used different sequences of CO2_r setpoints on different days.

The curves used the following sequences of CO2_r setpoint values:
2023-03-21: 18 points: 400, 300, 200, 150, 100, 75, 50, 40, 30, 20, 10, 400, 400, 600, 800, 1000, 1200, 1500
2023-03-23: 19 points: 400, 300, 200, 150, 100, 75, 50, 40, 30, 20, 10, 400, 400, 500, 600, 800, 1000, 1200, 1500
2023-03-24: 19 points: 400, 300, 200, 150, 100, 75, 50, 40, 30, 20, 400, 400, 450, 500, 600, 800, 1000, 1200, 1500
# Add a new column called `curve_npts` that stores the number of points in each
# response curve
gasex_data <- do.call(rbind, by(gasex_data, gasex_data[, 'curve_identifier'], function(x) {
    x[, 'curve_npts'] <- nrow(x)
    x
}))
# Add a new column called `date_ymd` that stores the date formatted as
# YYYY-MM-DD
gasex_data[, 'date_ymd'] <- paste(
    substring(gasex_data[, 'date'], 1, 4),
    substring(gasex_data[, 'date'], 5, 6),
    substring(gasex_data[, 'date'], 7, 8),
    sep = '-'
)
# Add a new column called `date_ymd_npts` that combines the date and the number
# of points
gasex_data[, 'date_ymd_npts'] <-
    paste(gasex_data[, 'date_ymd'], gasex_data[, 'curve_npts'], sep = ' - ')
# Choose points to remove, depending on the date the curve was measured and the
# number of points it contains
pts_to_remove <- list(
    '2023-03-21 - 18' = c(12, 13),
    '2023-03-23 - 19' = c(12, 13),
    '2023-03-24 - 19' = c(11, 12)
)
# Check and process each group of curves depending on the date and the number of
# points
gasex_exdf_list_processed <- by(gasex_data, gasex_data[, 'date_ymd_npts'], function(x) {
    # Get the date and number of points in these curves
    date_ymd_npts <- x[1, 'date_ymd_npts']
    npts <- x[1, 'curve_npts']

    # Make sure info is specified for this group of curves
    if (!date_ymd_npts %in% names(pts_to_remove)) {
        stop('Points to remove were not specified for date_ymd_npts = `', date_ymd_npts, '`')
    }

    # Make sure the data meets basic requirements
    check_response_curve_data(x, 'curve_identifier', npts, 'CO2_r_sp')

    # Remove points with duplicated `CO2_r_sp` values and order by `Ci`
    organize_response_curve_data(
        x,
        'curve_identifier',
        pts_to_remove[[date_ymd_npts]],
        'Ci'
    )
})
# Use `rbind` to recombine all the data
gasex_data <- do.call(rbind, gasex_exdf_list_processed)
What if I Clean My Data in Excel?
In the user guides, the organize_response_curve_data and remove_points functions are used to remove recovery points and other points from sets of response curves. However, it is also possible to remove points in Excel before reading log files into R. In this case, it is necessary to make a few small alterations to the code used in the user guides.
One consideration is that after cleaning the curves in Excel, it is likely that not all curves have the same number of points or follow the same sequence of setpoint values. Because of this, the checks in check_response_curve_data are likely to fail. In this case, we recommend setting expected_npts to 0 (the default value) and error_on_failure to FALSE when calling check_response_curve_data. This will provide potentially useful information about the number of points in each curve, but won’t throw an error that would cause a script to stop running.
Another consideration is that organize_response_curve_data is not needed to remove any points from the curves. Yet, its other features, such as reordering the points and calculating average values, are still useful. In this case, we recommend setting measurement_numbers_to_remove to c() and leaving the other arguments as-is.
Putting this all together would produce something like the following code, which would replace the regular calls to check_response_curve_data and organize_response_curve_data from the user guides:
# Print info about the number of points in each curve to make sure
# `curve_identifier` is able to properly split the set into individual curves
check_response_curve_data(gasex_data, 'curve_identifier', error_on_failure = FALSE)
# Reorder by `Ci` and calculate average values of leaf temperature and Qin
gasex_data <- organize_response_curve_data(
    gasex_data,
    'curve_identifier',
    c(),
    'Ci',
    columns_to_average = c('TleafCnd', 'Qin')
)
What if My Files Don’t Have User Constants or Other Metadata?
Please see the Guide to Licor LI-6800 User Constants.
Converting and Exporting Results
How Can I Convert an exdf to a Regular Data Frame?
Each exdf object is actually a list, where one of the elements (called main_data) includes its contents as a data frame. So, an exdf object can be converted to a data frame by extracting this element. For example, if exdf_obj is an extended data frame, then the following will “convert” it to a regular data frame:
# Convert an exdf to a regular data frame
dataf <- exdf_obj$main_data
This can be helpful for passing the results from PhotoGEA functions to other R functions, especially when performing statistical analysis.
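For instance, a data frame extracted this way can be passed straight to base R's modeling functions. A minimal self-contained sketch, using toy data as a stand-in for a real table of fitting results (the `genotype` and `Vcmax_at_25` columns and all values are illustrative only):

```r
# Toy stand-in for a data frame of fitted parameter values (one row per curve)
dataf <- data.frame(
    genotype = rep(c('wt', 'mutant'), each = 4),
    Vcmax_at_25 = c(101, 98, 105, 99, 82, 85, 80, 84)
)

# One-way ANOVA testing for an effect of genotype on Vcmax, using base R
fit <- aov(Vcmax_at_25 ~ genotype, data = dataf)

summary(fit)
```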
Note: Using as.data.frame is not recommended, since it will add a row for the units and a row for the categories when converting. This is useful in a few situations (for example, when printing the contents of an exdf), but it is a nuisance in most situations because it will cause all columns to become character (even if they were originally numeric) and to include non-data values. See ?as.data.frame.exdf for more info.
How Can I Save an exdf to a CSV File?
There are several options for doing this. One is to use write.csv.exdf, which will include the units and categories for each column. Another option is to convert the exdf to a regular data frame (see above) and use write.csv. For example:
# Write an exdf to a CSV file, including units
write.csv.exdf(exdf_obj, file = 'exdf_version.csv')
# Write an exdf to a CSV file, not including units
write.csv(exdf_obj$main_data, file = 'data_frame_version.csv', row.names = FALSE)
The file created by write.csv.exdf can be read using read.csv.exdf, which will create an exdf object from the file. The file created by write.csv can be read using read.csv, which will create a data frame from the file.
Sometimes it can be helpful to only save a subset of the columns in an exdf. For example, the parameters table returned by fit_c3_aci contains many columns, some of which may not be important to all users. In this case, it is possible to specify the columns that should be saved as follows:
# Specify a subset of columns to write to a CSV file
col_to_save <- c(
    'curve_identifier',
    'Vcmax_at_25',
    'Jmax_at_25',
    'Tp_at_25',
    'RL_at_25'
)

# Save fitting results to a CSV file, including units
write.csv.exdf(
    c3_aci_results$parameters[, col_to_save, TRUE],
    file = 'parameters_subset.csv'
)

# Save fitting results to a CSV file, not including units
write.csv(
    c3_aci_results$parameters[, col_to_save],
    file = 'parameters_subset.csv',
    row.names = FALSE
)
The example shows two options for creating the file, but typically only one of these would actually be used in practice. You may need to alter the particular columns that are saved.
Fitting C3 A-Ci Curves
How Can I Exclude TPU From C3 A-Ci Fits?
To exclude TPU from a C3 A-Ci curve fit, just fix the value of Tp_at_25 to a very high number, such as 1000. If the maximum rate of triose phosphate utilization is 1000 micromol / m^2 / s, then TPU will never be the rate-limiting process for any reasonable A-Ci curve. Thus, TPU will effectively be disabled. When Tp_at_25 is not being fit, there is also no reason to fit any of the alpha parameters related to TPU (alpha_old, alpha_g, alpha_s, and alpha_t), so these should be fixed to 0.
This can be accomplished through the fit_options argument when calling fit_c3_aci. In particular, the following code could be used in place of the regular command in the C3 A-Ci user guide:
# Fit the C3 A-Ci curves, disabling TPU (so only RL, Vcmax, and J are fit)
c3_aci_results <- consolidate(by(
    gasex_data,                       # The `exdf` object containing the curves
    gasex_data[, 'curve_identifier'], # A factor used to split `gasex_data` into chunks
    fit_c3_aci,                       # The function to apply to each chunk of `gasex_data`
    Ca_atmospheric = 420,             # Additional argument passed to `fit_c3_aci`
    fit_options = list(
        Tp_at_25 = 1000,
        alpha_old = 0,
        alpha_g = 0,
        alpha_s = 0,
        alpha_t = 0
    )
))