F.run.sqlFile: F.run.sqlFile

F.run.sqlFileR Documentation

F.run.sqlFile

Description

Opens, formats, and runs all SQL statements from a text file.

Usage

F.run.sqlFile(ch, sqlFile, echo = TRUE, check.drops = FALSE, ...)

Arguments

ch

A connection handle originating from a call to RODBC function odbcConnectAccess, which essentially opens a connection between R and Access, or more generally, an ODBC database.

sqlFile

Name of a text file containing one or more SQL statements. This file must exist in the directory given by sql.code.dir, which is a global variable (see GlobalVars). Variable sql.code.dir can be reset as needed with a call to GlobalVars or by reassignment in regular code. Statements in sqlFile must be separated by a semicolon ";". Comments, or lines starting with "--", are allowed.

echo

If TRUE, submitted SQL statements echo to the log file. Defaults to TRUE.

check.drops

If TRUE, the routine checks for errors following SQL DROP statements. This is done since DROP statements on non-existent tables cause an abort error. The DROP simply needs to ensure the non-existence of the table; this way, it can be created later. Set this to FALSE and these DROP errors will pass through. This behavior of DROP may be specific to Access, and thus may not occur when querying other SQL data bases (e.g., mySQL or MSSQL).

...

Additional named parameters needed for evaluation of the underlying SQL queries contained in sqlFile. This allows R to pass parameters to SQL for incorporation into things like WHERE clauses. See Details.

Details

Function F.run.sqlFile is often paired with the RODBC function sqlQuery, which reads and returns tables housed in an Access database.

Function F.run.sqlFile works by replacing variables in provided SQL statements with values passed to it via the R "..." parameter.

Value

Results of the LAST SQL statment in sqlFile. In many cases, the SQL statments in sqlFile delete, create, and populate tables in the Access database housing data. Results housed in this database are then later queried via sqlQuery or sqlFetch. One could, however, arrange the SQL statements in sqlFile so that the last statement is a query or fetch which produces the desired results. In that case, this routine will return those results.

Modifying Access-developed SQL queries for use by R-package RODBC

Report generation requires communication between the underlying R code and the Access databases housing the data. The 32-bit R-package RODBC performs the link between these two systems, allowing for SQL-type queries to be sent from R to Access. The resulting queries, once completed and housed in the Access database, can then be read into R.

Examples of original Access SQL queries that have been processed for use in R can be found in the folder named in global variable sql.code.dir. Use these as a guide when updating any new batch of SQL code.

In this package, Access SQL queries are not immediately readable by R (and RODBC). Modifications must first be made. This list attempts to enumerate the necessary changes.

  • Comments Most SQL scripts contain helpful comments. While helpful, RODBC must be made to recognize that these are in fact comments, and not developed SQL strings. Use a double hyphen (--) to demonstrate to RODBC a comment for which processing is unnecessary. For example, -- This is a comment and will not be read would communicate to RODBC that this particular line should be skipped, whereas SELECT * FROM THIS.TABLE.HERE forces RODBC to act. Note that double hyphens need only be entered on the left. Additionally, the space following the double hyphen is unnecessary. However, it makes comments more readable.

  • Quotes All quoted strings must utilize single quotes ('example'), and not double quotes ("example") to wrap text strings. Double quotes force the underlying SQL strings to be misinterpreted, and thus fail.

  • Variables Many developed SQL queries require subsetting the data to Chinook Salmon via SQL WHERE clauses. However, in most cases, if not all, the R code has been developed to allow for selection of any piscine species. This means that the WHERE clause is a variable set by the user. It also means that this variable quantity must be passed from R to Access. Within the database, chinook salmon have the character identifier '161980'. Note the single quotes here.

    In the campR package, WHERE statements for Chinook Salmon are identified via this identifier. However, because the R code has been developed to allow this value to vary, SQL code has been adapted to take variable values via RODBC and variable "R.TAXON". The appellation R.TAXON reflects the R-code name for the species variable, and is how a variable quantity is communicated to Access.

  • INTO SQL Statements Queries for use in the Platform are developed for multiple use. This means that once a query for one set of criteria is created, the code must be able to recreate a new query, based on separate criteria, without error. Practically, this means that different queries must be able to overwrite resultant tables in Access databases flawlessly. The way that Access and RODBC does this varies; this necessitates more complex changes to developed SQL code.

    Within Access, the use of the SQL keyword INTO forces a table that currently exists to be dropped. It also recreates the table into which data are to be placed. For example, suppose within Access a query generates a final table named Table_fish. Now, however, a new query is generated, also within Access, creating a new Table_fish. The new Table_fish overwrites the old Table_fish; Access does this via SQL code resembling

    SELECT * INTO Table_fish FROM Table_helper;

    where the special keyword INTO deletes out the old Table_fish and replaces it with the new version. Syntax of this nature must be modified for use in RODBC R code – the INTO statement does not work in this way. While RODBC will create the named table, it will not first delete out its previous version. This must be done explicitly.

    To work around this, use a DROP TABLE statement. Then, the use of the INTO statement will work within R. Explicitly, modifying the code block above, write

    DROP TABLE Table_fish; SELECT * INTO Table_fish FROM Table_helper; as two separate SQL queries, where the semi-colon ; demonstrates the end of a SQL query.

    One final to-do is required. Once a SQL query has been modified to include a DROP TABLE statement, an initial table must then be produced in all Access databases in which the query is to be run. Practically, the first time this particular query is run, SQL cannot drop the table Table_fish because it will not exist, not having been created yet. To work around this, from within Access, copy and paste any one of the existing tables, and rename it to the name of the final table, which in this case, is Table_fish. Note that some queries create multiple interim tables. These may also need to be modified via DROP TABLE statements. On completion, the modified SQL will be fooled into having a table to delete, and will work as intended.

  • Dates Calendar dates utilized as values within a WHERE clause must have preceding and antecedent #-signs added to them.

Structured Query Language (SQL) Queries

Communication between R and Access databases takes place via the RODBC package and the SQL query language. Queries are organized into series, with different series manipulating data housed in Access databases in different ways. Series within the campR package include the following.

  • All Catch Reproduces the functionality within the Quality-Control database button "ViewAllCatch." Function F.get.all.fish.data calls this query after first running the Build Report Criteria query series.

  • Build Report Criteria A workhorse query called frequently prior to the calling of other queries, due to its finding all trapVisitIDs between two specified dates; i.e., the min.date and max.date. This series results in table TempReportCriteria_Trapvisit in the underlying Access database.

  • Build Report Criteria Release A query similar to that of the Build Report Criteria seres, but adapted for identifying releases. This series results in table TempReportCriteria_Release in the underlying Access database.

  • By Catch The means by which by-catch is identified.

  • Efficiency Tests Compiles efficiency-test release and recapture data. This series depends on first running the Build Report Criteria series, followed by the Sample Period series. In running, the includeCatch variable is pre-set to 'yes' for all recaptured fish.

    In cases where releases temporally overlap, and for which a trapVisitID may fall within the test period of more than one release, the series adds an additional row for each release. In this case, more than one row in the final table for each trapVisitID may result. Thus, these are only unique with the addition of releaseID.

    This query series is inappropriate for programs using inclined plane and other types of traps.

    Note that this series includes recaptures with zero records only for trap samples collected during the 36-hour period following a release. This prevents zero-recapture records from being added to the table for traps not deployed until several days following release. Additionally, the series does not add release records when no recaptures occur within 36 hours of a release. If a release has no recaptures, than all visits with that releaseID will have a recapture total of 0.

    Generally, it is possible a test may be approved, leading to the release of fish, even when no traps were deployed. Currently, the series makes no accommodation for these, and so they are excluded.

  • Fishing Gaps Modifies the identifier for a trap, via its trapPositionID, when lengthy temporal gaps occur in its catch time series. The running of this query requires series Build Report Criteria and Sample Period be run first. The Fishing Gaps series only runs if a lengthy gap is found, which is currently defined as any gap in excess of 10,080 minutes, or 7 days. This value is set in the GlobalVars R function via global variable fishingGapMinutes. Trap identifiers are modified by appending a decimal to their corresponding trapPositionID.

  • Non-valid Fishing Displays target species by run. Series results include trap gear codes and half-cone fields. Currently, this series excludes adult and grilse life stages.

    Note that the series only pulls data for non-valid trap visits, where the variable includeCatchID equals 2, and for which there are target fish in the catch.

  • Not Fishing Develops a table that itemizes hours not fishing. The resulting table references table tempSamplingSummary and is built similarly. The table identifies gaps in sampling to the minute and includes both trapPositionID and trapping date. This means it can be used in conjuction with table tempSamplingSummary. The running of this query requires series Build Report Criteria and Sample Period be run first.

  • Sample Period Calculates the amount of time between starts and stops of a trapping instance, as itemized via variable trapVisitID.

  • Sum Chinook by Trap Sums all the Chinook caught via trap, i.e., variable trapVisitID. Note that species Chinook Salmon enters the series via variable R.TAXON.

  • Unmarked by Run and Lifestage Displays target species by run. This series requires each of the QryBuildReportCriteria, QrySamplePeriod, and QryNotFishing series be developed first prior to implementation. The series only pulls data for valid trap visits, where within table tempSamplingSummary, variable includeCatchID equals 1 and variable visitTypeID has a value larger than one. The table within the Access database holding the result of this series is TempSumUnmarkedByTrap_Run_Final.

  • Unmarked Chinook by Lifestage Displays target species data for life stage analysis. This series requires each of the QryBuildReportCriteria and QrySamplePeriod series be developed first.

    The series pulls catch data from valid trap visits, identified when variable includeCatch equals 'yes' or when includeCatch is non-valid, or equals 'no', but variable visitTypeID has a value greater than 1, i.e., Start trap & begin trapping. This series includes variables weight and finalRunMethod for use in life stage assignment.

    This series is different from that used in production analysis, being modified slightly for use in life stage development. This series uses two of the same output tables, but the final table name is TempSumUnmarkedByTrap_Run_Final2.

  • Weekly Effort Develops trapping effort data for graphical display. The graph displays effort by Julian week, as summarized within the Dates table in the accomapanying Access database. The graph also itemizes effort that resulted in a valid sample, effort for invalid samples not used in analyses, and times when no trapping effort was extended.

    The series depends on output tables from other SQL series. As such, series QryBuildReportCriteria and QrySamplePeriod must be run prior to the implementation of this series. Dependent tables include TempSamplingSummary, TempSamplingSummary_Final, TempNonSamplingSummary, NonTrapSample1, TempEffortSummary_a, TempEffortSummary_b, and TempEffortSummary_a1. Note that table TempEffortSummary_b currently contains the final output table.

Author(s)

WEST Inc., Connie Shannon

See Also

sqlQuery, sqlFetch

Examples

## Not run: 
#   ---- Open ODBC channel.
db <- get( "db.file", envir=.GlobalEnv )
ch <- odbcConnectAccess(db)

#   ---- Develop the hours fished and TempSamplingSummary table in the
#   ---- underlying Access database.  
F.run.sqlFile( ch, "QrySamplePeriod.sql", R.TAXON=taxon )

## End(Not run)
 

tmcd82070/CAMP_RST documentation built on April 6, 2022, 12:07 a.m.