cloc-package | R Documentation
Counts blank lines, comment lines, and physical lines of source code in source
files/trees/archives. An R wrapper to the Perl cloc utility
(https://github.com/AlDanial/cloc) by @AlDanial.
cloc's method of operation resembles SLOCCount's:
First, create a list of files to consider. Next, attempt to determine whether or not
found files contain recognized computer language source code. Finally, for files
identified as source files, invoke language-specific routines to count the number of
source lines.
A more detailed description:
If the input file is an archive (such as a .tar.gz or .zip file),
create a temporary directory and expand the archive there using a
system call to an appropriate underlying utility (tar, bzip2, unzip,
etc.), then add this temporary directory as one of the inputs. (This
works more reliably on Unix than on Windows.)
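The archive step can be illustrated with a minimal Python sketch. It is not cloc's code: cloc shells out to tar, unzip and similar tools, whereas this sketch substitutes the standard library's `shutil.unpack_archive`. The function name `expand_if_archive`, the suffix list, and the temp-directory prefix are assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def expand_if_archive(path):
    """Return a directory to scan: the input itself, or a temp dir holding the unpacked archive."""
    path = Path(path)
    # Assumed suffix list for this sketch; cloc recognizes its own set of archive types.
    archive_suffixes = (".tar.gz", ".tgz", ".tar.bz2", ".tar", ".zip")
    if path.is_file() and path.name.endswith(archive_suffixes):
        tmp_dir = tempfile.mkdtemp(prefix="cloc_sketch_")
        # Stand-in for cloc's system call to tar/unzip/etc.
        shutil.unpack_archive(str(path), tmp_dir)
        return Path(tmp_dir)
    # Not an archive: scan the path as given.
    return path
```

The returned directory is then treated like any other input directory. The caller is responsible for deleting the temporary directory afterwards.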
Use Perl's File::Find to recursively descend the input directories and make
a list of candidate file names. Ignore binary and zero-sized files.
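A rough Python analogue of the directory walk is sketched below, using `os.walk` in place of Perl's File::Find. The binary check (a NUL byte in the first 1024 bytes) is an assumed heuristic chosen for brevity and may not match the test cloc applies; `candidate_files` and `looks_binary` are hypothetical names.

```python
import os


def looks_binary(path, probe_size=1024):
    # Assumed heuristic: a NUL byte near the start suggests a binary file.
    with open(path, "rb") as handle:
        return b"\0" in handle.read(probe_size)


def candidate_files(root):
    """Recursively collect non-empty, non-binary regular files under root."""
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue  # skip broken links and other non-regular entries
            if os.path.getsize(full) == 0:
                continue  # zero-sized files are ignored
            if looks_binary(full):
                continue  # binary files are ignored
            candidates.append(full)
    return sorted(candidates)
```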
Make sure the files in the candidate list have unique contents
(first by comparing file sizes, then, for similarly sized files,
by comparing MD5 hashes of the file contents with Perl's Digest::MD5).
For each set of identical files, remove all but the first copy, as
determined by a lexical sort. The removed files are not included in
the report.
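The two-stage uniqueness check can be sketched in Python as follows. This is an illustration only: it groups files by exact byte size (an assumption about what "similarly sized" means here), hashes only within groups that contain more than one file, and keeps the lexically first path for each distinct hash. The names `md5_of` and `unique_files` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def md5_of(path):
    # Hash the file in chunks so large files are not read in one go.
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unique_files(paths):
    """Drop duplicate-content files, keeping the lexically first copy of each."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    kept = []
    for same_size in by_size.values():
        if len(same_size) == 1:
            kept.extend(same_size)  # a unique size needs no hashing
            continue
        seen = set()
        for path in sorted(same_size):  # lexical order decides which copy survives
            digest = md5_of(path)
            if digest not in seen:
                seen.add(digest)
                kept.append(path)
    return sorted(kept)
```

Comparing sizes first is a cheap filter: a hash is computed only for files that could possibly be identical.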
Scan the candidate file list for file extensions which cloc
associates with programming languages. Files which match are classified as
containing source code for that language. Each file without an extension
is opened and its first line read to see if it is a Unix shell script
(anything that begins with #!). If it is a shell script, the file is
classified by that scripting language (if the language is
recognized). If the file does not have a recognized extension or is
not a recognized scripting language, the file is ignored.
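The classification rule can be sketched as below. The two lookup tables are deliberately tiny, hypothetical stand-ins for cloc's much larger built-in tables, and the handling of `#!/usr/bin/env` lines is an assumption of this sketch rather than a documented cloc behavior.

```python
import os

# Hypothetical lookup tables for illustration; cloc's real tables are far larger.
EXTENSION_TO_LANGUAGE = {".c": "C", ".cpp": "C++", ".py": "Python", ".r": "R", ".pl": "Perl"}
INTERPRETER_TO_LANGUAGE = {"sh": "Bourne Shell", "bash": "Bourne Again Shell",
                           "perl": "Perl", "python": "Python", "python3": "Python"}


def classify(path):
    """Return a language name, or None if the file should be ignored."""
    ext = os.path.splitext(path)[1].lower()
    if ext:
        # Files with an extension are classified by extension alone.
        return EXTENSION_TO_LANGUAGE.get(ext)
    # No extension: inspect the first line for a #! interpreter line.
    with open(path, "r", errors="replace") as handle:
        first_line = handle.readline().strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    interpreter = os.path.basename(parts[0])
    if interpreter == "env" and len(parts) > 1:
        # Assumed handling of "#!/usr/bin/env perl" style lines.
        interpreter = os.path.basename(parts[1])
    return INTERPRETER_TO_LANGUAGE.get(interpreter)
```

A return value of `None` corresponds to the "file is ignored" case above.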
All remaining files in the candidate list should now be source files for known programming languages. For each of these files:
Read the entire file into memory.
Count the number of lines (= L original).
Remove blank lines, then count again (= L non-blank).
Loop over the comment filters defined for this language. (For
example, C++ has two filters: (1) remove lines that start with
optional whitespace followed by // and (2) remove text between
/* and */.) Apply each filter to the code to remove comments.
Count the left over lines (= L code).
Save the counts for this language:
blank lines = L original - L non-blank
comment lines = L non-blank - L code
code lines = L code
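The per-file arithmetic can be made concrete with a short Python sketch that applies the two C++ filters described above. It is a simplified illustration, not cloc's implementation: the regular expressions are assumptions, and comment markers inside string literals are not handled. The function name `count_cpp_lines` is hypothetical.

```python
import re


def count_cpp_lines(source):
    """Count blank, comment, and code lines of C++ text using two naive filters."""
    original = source.splitlines()                           # L original
    non_blank = [line for line in original if line.strip()]  # L non-blank

    text = "\n".join(non_blank)
    # Filter 1: remove lines that start with optional whitespace followed by //
    text = re.sub(r"(?m)^[ \t]*//.*$", "", text)
    # Filter 2: remove text between /* and */ (may span several lines)
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)

    # Lines emptied by the filters no longer count as code.
    code = [line for line in text.splitlines() if line.strip()]  # L code
    return {
        "blank": len(original) - len(non_blank),
        "comment": len(non_blank) - len(code),
        "code": len(code),
    }
```

Note that a line such as `return 0; // trailing` still counts as code, because filter 1 only removes lines that begin with `//`.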
Bob Rudis (bob@rud.is)