README.md

MLWIC2: Machine Learning for Wildlife Image Classification

MLWIC2 can be used to automatically classify camera trap images or to train new models for image classification. It contains two pre-trained models: the species_model identifies 58 species and empty images, and the empty_animal model distinguishes between images with animals and those that are empty. MLWIC2 also contains Shiny apps for running the functions, which can be accessed using runShiny. Some of the steps below list a Shiny option; you can run those steps in a Shiny app by running the function provided. Note that when you use a Shiny app to select directories and files, you can only navigate using the top half of the screen, and you must scroll to the bottom of the window to find the Select button.

If you have issues, please submit them to the issues tab and do not email the authors of this package with questions. This way everyone can learn from the issue.

Step 0: Prerequisites

You need to have Anaconda Navigator installed, along with Python 3.7 (Python 3.6 or 3.5 will work just as well). If you are using a Windows computer, you will likely need to install Rtools if you don't already have it.

Step 1: Install the MLWIC2 package in R

# install devtools if you don't have it
if (!require('devtools')) install.packages('devtools')
# check error messages and ensure that devtools installed properly. 

# install MLWIC2 from github
devtools::install_github("mikeyEcology/MLWIC2") 
# This line might prompt you to update some packages. It would be wise to make these updates. 

# load this package
library(MLWIC2)

When running install_github, some users will get an error about Rcpp or Rlang. This is due to the update to R version 4. If you have issues, update R to the latest version and re-install these R packages.
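Before reinstalling packages, it can help to confirm which version of R is running. The following is a minimal sketch using only base R. The 4.0.0 threshold is taken from the note above about R version 4, and the helper name r_is_v4_or_later is hypothetical, not part of MLWIC2.

```r
# Hypothetical helper (not part of MLWIC2): TRUE if the given R version
# is 4.0.0 or later. Defaults to the version of the running session.
r_is_v4_or_later <- function(version = getRversion()) {
  numeric_version(as.character(version)) >= "4.0.0"
}

if (!r_is_v4_or_later()) {
  message("R ", as.character(getRversion()), " is older than 4.0.0. ",
          "Consider updating R, then reinstalling Rcpp and rlang ",
          "with install.packages().")
}
```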

You only need to run steps 2-3 the first time you use this package on a computer. If you have already run MLWIC on your computer, you can skip step 2.

Step 2: Set up your environment for MLWIC2 using the R function setup

Shiny option: MLWIC2::runShiny('setup')

Step 3: Download the MLWIC2_helper_files folder from this link.

Before running models on your own data, I recommend that you try running them on the example provided.

Step 4: Create a properly formatted input file using make_input

Shiny option: MLWIC2::runShiny('make_input')
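make_input is the supported way to build the input file. To illustrate the general idea of a csv that pairs file names with labels, here is a rough sketch in base R. Everything in it is an assumption for illustration: the helper build_image_table, the placeholder label of 0, the demo directory, and the choice to omit a header row. Check ?make_input for the format the package actually expects.

```r
# Hypothetical helper (not part of MLWIC2): list JPEG files in image_dir and
# pair each one with a placeholder label, giving a two-column table.
build_image_table <- function(image_dir, placeholder_label = 0) {
  files <- list.files(image_dir, pattern = "\\.(jpg|jpeg)$", ignore.case = TRUE)
  data.frame(filename = files,
             label = rep(placeholder_label, length(files)),
             stringsAsFactors = FALSE)
}

# Demonstrate on a temporary directory holding two empty stand-in files.
demo_dir <- file.path(tempdir(), "mlwic2_demo_images")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("img_001.JPG", "img_002.jpg")))

image_table <- build_image_table(demo_dir)

# Assumption: no header row. Confirm against the make_input documentation.
write.table(image_table, file.path(demo_dir, "image_labels.csv"),
            sep = ",", row.names = FALSE, col.names = FALSE)
```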

Step 5: Classify images using classify

Shiny option: MLWIC2::runShiny('classify')
classify(path_prefix = "/Users/mikeytabak/Desktop/images", # path to where your images are stored
         data_info = "/Users/mikeytabak/Desktop/image_labels.csv", # path to csv containing file names and labels
         model_dir = "/Users/mikeytabak/Desktop/MLWIC2_helper_files", # path to the helper files that you downloaded in step 3, including the name of this directory (i.e., `MLWIC2_helper_files`). Check to make sure this directory includes files like arch.py and run.py. If not, look for another folder inside this folder called `MLWIC2_helper_files`
         python_loc = "/anaconda2/bin/", # location of python on your computer
         save_predictions = "model_predictions.txt", # how you want to name the raw output file
         make_output = TRUE, # if TRUE, this will produce a csv with a more friendly output
         output_name = "MLWIC2_output.csv", # if make_output==TRUE, this will be the name of your friendly output file
         num_cores = 4 # the number of cores you want to use on your computer. Try running parallel::detectCores() to see what you have available. You might want to use something like parallel::detectCores()-1 so that you have a core left on your machine for accomplishing other tasks. 
         ) 
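Two of the comments in the call above describe checks that are easy to do before running classify: confirming that model_dir contains arch.py and run.py, and choosing num_cores so that one core stays free. The sketch below shows one way to do both in base R. The helper names choose_num_cores and missing_helper_files are hypothetical and not part of MLWIC2.

```r
# Hypothetical helper: leave one core free, but never return fewer than 1.
# parallel::detectCores() can return NA on some systems, so fall back to 1.
choose_num_cores <- function(available = parallel::detectCores()) {
  if (is.na(available)) return(1L)
  max(1L, as.integer(available) - 1L)
}

# Hypothetical helper: return the names of expected helper files that are
# missing from model_dir. arch.py and run.py are the files named above.
missing_helper_files <- function(model_dir,
                                 expected = c("arch.py", "run.py")) {
  expected[!file.exists(file.path(model_dir, expected))]
}

num_cores <- choose_num_cores()
```

If missing_helper_files(model_dir) returns a non-empty vector, look for a nested MLWIC2_helper_files folder, as the model_dir comment suggests.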

Step 6: Update the metadata of your image files using write_metadata (optional)

Shiny option: MLWIC2::runShiny('write_metadata')
write_metadata(output_file="/Users/mikeytabak/Desktop/MLWIC2_helper_files/MLWIC2_output.csv", # note that if you look at the classify command above, this is the [model_dir]/[output_name]
               model_type="species_model", # the type of model I used for classify
               exiftool_loc="/usr/local/bin", # location where exiftool is stored, you might not need to specify this. 
               show_sys_output = FALSE
               )
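Because output_file is the model_dir followed by the output_name from classify, the path can be built rather than typed out. A small sketch, reusing the example values from the calls above:

```r
# Example values copied from the classify call above.
model_dir   <- "/Users/mikeytabak/Desktop/MLWIC2_helper_files"
output_name <- "MLWIC2_output.csv"

# file.path joins the pieces with "/" on every platform.
output_file <- file.path(model_dir, output_name)

if (!file.exists(output_file)) {
  message("No output found at ", output_file,
          "; run classify with make_output = TRUE first.")
}
```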

Step 7: Train a new model to recognize species in your images using train

Shiny option: MLWIC2::runShiny('train')

If you aren't satisfied with the accuracy of the built-in models, you can train your own model using your images. The parameters are similar to those for classify, but you will want to specify some more options based on how you want to train the model.

- path_prefix is the absolute path where your images are stored.
- data_info is the absolute path to where your input file is stored. Check your output from make_input.
- model_dir is the absolute path to where you stored the MLWIC2_helper_files folder in step 3.
- num_classes is the number of species (or groups of species) you want the model to recognize.
- architecture is the DNN architecture. The options are c("alexnet", "densenet", "googlenet", "nin", "resnet", "vgg"). I recommend starting with "resnet" and setting depth=18. If you get poor accuracy with this, "densenet" is another good option.
- depth is the number of layers in the DNN. If you are using resnet, the options are c(18, 34, 50, 101, 152). If you are using densenet, the options are c(121, 161, 169, 201). Otherwise, the depth will be set automatically for you.
- batch_size is the number of images simultaneously passed to the model for training. It must be a multiple of 16. Smaller numbers will train models that are more accurate, but training will take longer. The default is 128.
- log_dir_train is the directory where the model information will be stored. This is the name you will specify in the log_dir option of the classify function when you use the trained model. Use unique names if you are training multiple models on your computer; otherwise they will be overwritten.
- retrain: if TRUE, the model you train will be a retraining of the model you specify in retrain_from. If FALSE, you are starting training from scratch. Retraining will be faster, but training from scratch will be more flexible.
- retrain_from is the name of the directory from which you want to retrain the model. If you are retraining from the species model, you would set retrain_from="species_model". If you need to stop training (e.g., you have to turn off your computer), you can set retrain_from to what you set as your log_dir_train and set num_epochs to the total number you want minus the number that have completed.
- num_epochs is the number of epochs you want to use for training. The default is 55, which is recommended for training a full model. If you need to start and stop training, you can decrease this number.
- You can read about more options by typing ?train into the console.
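Several of the rules above (allowed depths per architecture, batch_size as a multiple of 16, and the epoch arithmetic when resuming) can be checked before starting a long training run. The following is a sketch in base R. The helpers check_train_settings and remaining_epochs are hypothetical, not part of MLWIC2, and train may perform its own validation.

```r
# Hypothetical helper: check settings against the rules listed above.
# Returns a character vector of problems (empty if none were found).
check_train_settings <- function(architecture, depth = NULL, batch_size = 128) {
  architectures <- c("alexnet", "densenet", "googlenet", "nin", "resnet", "vgg")
  valid_depths <- list(resnet = c(18, 34, 50, 101, 152),
                       densenet = c(121, 161, 169, 201))
  problems <- character(0)
  if (!architecture %in% architectures) {
    problems <- c(problems, paste0("unknown architecture: ", architecture))
  }
  # Only resnet and densenet take a user-chosen depth; NULL for the others.
  allowed <- valid_depths[[architecture]]
  if (!is.null(allowed) && (is.null(depth) || !depth %in% allowed)) {
    problems <- c(problems,
                  paste0("depth for ", architecture, " must be one of ",
                         paste(allowed, collapse = ", ")))
  }
  if (batch_size <= 0 || batch_size %% 16 != 0) {
    problems <- c(problems, "batch_size must be a positive multiple of 16")
  }
  problems
}

# Epochs to request when resuming an interrupted run: total minus completed.
remaining_epochs <- function(total_epochs = 55, completed_epochs = 0) {
  max(0, total_epochs - completed_epochs)
}
```

For example, if training stopped after 20 of the default 55 epochs, remaining_epochs(55, 20) gives the num_epochs value to use when resuming from your log_dir_train.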

If you use this package in a publication, please cite our manuscript: Tabak, M. A., Norouzzadeh, M. S., Wolfson, D. W., Newton, E. J., Boughton, R. K., Ivan, J. S., … Miller, R. S. (2020). Improving the accessibility and transferability of machine learning algorithms for identification of animals in camera trap images: MLWIC2. Ecology & Evolution, 10(9): 10374-10383. doi:10.1002/ece3.6692

@article{tabakImprovingAccessibilityTransferability2020,
  title = {Improving the Accessibility and Transferability of Machine Learning Algorithms for Identification of Animals in Camera Trap Images: {{MLWIC2}}},
  shorttitle = {Improving the Accessibility and Transferability of Machine Learning Algorithms for Identification of Animals in Camera Trap Images},
  author = {Tabak, Michael A. and Norouzzadeh, Mohammad S. and Wolfson, David W. and Newton, Erica J. and Boughton, Raoul K. and Ivan, Jacob S. and Odell, Eric A. and Newkirk, Eric S. and Conrey, Reesa Y. and Stenglein, Jennifer and Iannarilli, Fabiola and Erb, John and Brook, Ryan K. and Davis, Amy J. and Lewis, Jesse and Walsh, Daniel P. and Beasley, James C. and VerCauteren, Kurt C. and Clune, Jeff and Miller, Ryan S.},
  year = {2020},
  month = mar,
  pages = {ece3.6692},
  publisher = {{Wiley}},
  doi = {10.1002/ece3.6692},
  journal = {Ecology \& Evolution},
  language = {en}
}

Disclaimer: MLWIC2 is free software that comes with no warranty. We recommend that you test the software's ability to classify images in your dataset and not assume that the reported accuracy will hold for your images. The authors of this package are not responsible for any decisions or interpretations that are made as a result of using MLWIC2.



mikeyEcology/MLWIC2 documentation built on Feb. 18, 2021, 11:46 a.m.