Takes a data.frame of reported species observations and reformats it, using visit identifiers, into an object of class OrganizedBirds that can be used in further analyses with the BIRDS package.

organizeBirds(
  x,
  sppCol = "scientificName",
  idCols = c("locality", "recordedBy"),
  timeCols = c("year", "month", "day"),
  timeInVisits = "day",
  grid = NULL,
  presenceCol = NULL,
  xyCols = c("decimalLongitude", "decimalLatitude"),
  dataCRS = 4326,
  taxonRankCol = NULL,
  taxonRank = c("SPECIES", "SUBSPECIES", "VARIETY"),
  simplifySppName = FALSE,
  spOut = FALSE
)

organiseBirds(
  x,
  sppCol = "scientificName",
  idCols = c("locality", "recordedBy"),
  timeCols = c("year", "month", "day"),
  timeInVisits = "day",
  grid = NULL,
  presenceCol = NULL,
  xyCols = c("decimalLongitude", "decimalLatitude"),
  dataCRS = 4326,
  taxonRankCol = NULL,
  taxonRank = c("SPECIES", "SUBSPECIES", "VARIETY"),
  simplifySppName = FALSE,
  spOut = FALSE
)

Arguments

x

A data.frame, sf or a SpatialPointsDataFrame containing at least a column for species name, one or several columns for date of observation, one or several columns for identifying a visit and, if it is not spatial, coordinate columns.

sppCol

A character string with the name of the column holding the species names. Default is the Darwin Core standard name "scientificName".

idCols

A character vector with the names of the columns holding the information that identifies a visit. Default is the Darwin Core standard column names c("locality", "recordedBy").

timeCols

A character vector with the names for the column(s) holding the observation dates. Default is the Darwin Core standard column names c("year", "month", "day").

timeInVisits

Defines whether time is used to define visits and, if so, at which resolution. Alternatives are c("day", "month", "year", NULL). Default is "day". Regardless of this setting, time is always organised into the three columns year, month and day.

grid

Either NULL, in which case it is ignored, or an object of class SpatialPolygons or SpatialPolygonsDataFrame that identifies the spatial extent of the visits.

presenceCol

A character string with the name of the column holding the presence status. Default is NULL.

xyCols

A character vector with the names of the columns holding the coordinates of the observations. The order should be longitude (x), latitude (y). Default is the Darwin Core standard column names c("decimalLongitude", "decimalLatitude"). Only applicable to non-spatial data.frames.

dataCRS

A character string or numeric giving the CRS (Coordinate Reference System) of the data.frame. Default is 4326, which is WGS 84. This is only applicable to non-spatial data.frames, since a spatial object should already carry this information.

taxonRankCol

The name of the column containing the taxonomic rank of the observation, that is, the minimum taxonomic identification level.

taxonRank

A string or vector of strings containing the taxonomic ranks to keep. Only evaluated if taxonRankCol is not NULL.

simplifySppName

Logical. Whether to remove everything that is not part of the species name (authors, years). If TRUE, it leaves a canonical name as given by taxize::gbif_parse(), that is, a scientific name with up to 3 elements. Default is FALSE.

spOut

Logical. Whether the spatial element of the result should be an sp object (a SpatialPointsDataFrame) rather than an sf object. Default is FALSE.

Value

An object of class OrganizedBirds, with additional attributes, wrapping a spatial element: an sf object by default, or a `SpatialPointsDataFrame` if spOut = TRUE.

Details

An object of class OrganizedBirds is essentially a list containing a spatial element. After version 0.2, this element is of class sf by default; the parameter spOut is provided for backwards compatibility with sp. The function accepts input in both formats. The spatial element has its data formatted in a way that the other functions in the BIRDS package can use further on. It also has the attribute "visitCol", which indicates which column in the data.frame holds the visit identifier. The visit identifier is created by the function createVisits, which creates a unique id for each combination of the values in the defined columns.
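The idea of "a unique id for each combination of the values in the defined columns" can be illustrated with a minimal base R sketch. This is not the code used by createVisits; the toy data, the column name visitUID and the pasting approach are assumptions made only for illustration.

```r
# Toy observations with Darwin Core style column names (made-up values).
obs <- data.frame(
  locality   = c("Site A", "Site A", "Site B", "Site A"),
  recordedBy = c("Ann", "Ann", "Bob", "Ann"),
  year       = c(2020, 2020, 2020, 2020),
  month      = c(5, 5, 5, 6),
  day        = c(1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Columns whose combined values define a visit
# (here: the idCols plus the time columns, as with timeInVisits = "day").
visit_cols <- c("locality", "recordedBy", "year", "month", "day")

# Paste the values of each row into one key, then number the keys
# in order of first appearance. "visitUID" is a hypothetical column name.
key <- do.call(paste, c(obs[visit_cols], sep = "_"))
obs$visitUID <- match(key, unique(key))

obs$visitUID
```

In this sketch the first two rows share all five values and therefore get the same id, while a different observer, locality or date yields a new visit.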

The argument timeCols can be specified in different ways. If it is a named vector with the names "Year", "Month" and "Day" (letter capitalization does not matter), the element named year is used as the year column, and so on. Otherwise, if the vector has length three or more, the first element is used as the year column, the second as the month column and the third as the day column. If the vector has length one, the column is interpreted as a date column formatted as "yyyy-mm-dd".
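The rules above can be sketched in base R as follows. The helper resolve_time_cols is hypothetical and not part of BIRDS; it only mirrors the three cases described in the text and makes no claim about how the package implements them.

```r
# Hypothetical helper (not part of BIRDS) mirroring the rules described above.
resolve_time_cols <- function(timeCols) {
  nms <- names(timeCols)

  # Case 1: named vector with year, month and day (case-insensitive names).
  if (!is.null(nms) && all(c("year", "month", "day") %in% tolower(nms))) {
    names(timeCols) <- tolower(nms)
    return(list(type = "ymd",
                cols = unname(timeCols[c("year", "month", "day")])))
  }

  # Case 2: unnamed vector of length three or more, read positionally.
  if (length(timeCols) >= 3) {
    return(list(type = "ymd", cols = unname(timeCols[1:3])))
  }

  # Case 3: a single column holding dates formatted as "yyyy-mm-dd".
  if (length(timeCols) == 1) {
    return(list(type = "date", cols = unname(timeCols)))
  }

  stop("timeCols must have length 1 or at least 3")
}

# Named, in arbitrary order and capitalization.
resolve_time_cols(c(Day = "d", YEAR = "y", month = "m"))$cols

# Unnamed, interpreted as year, month, day.
resolve_time_cols(c("year", "month", "day"))$cols

# Single date column; "eventDate" is only an example column name.
resolve_time_cols("eventDate")$type
```

Naming the vector is the safer choice when the columns in the data are not already ordered as year, month, day.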

See also

createVisits to create unique visit IDs, visits to get or set the visit IDs of an object of this class, simplifySpp to simplify species names, obsData to retrieve the data.frame from an object of this class.

Examples

OB <- organizeBirds(bombusObs)