Glossary

+

+1
An upvote. FIXME

A

A/B testing
FIXME
Absolute path
A path that points to the same location in the filesystem regardless of where it’s evaluated. An absolute path is the equivalent of latitude and longitude in geography.
Absolute row number
The sequential index of a row in a table, regardless of which section of the table is being displayed.
Aggregation
To combine many values into one, e.g., by summing a set of numbers or concatenating a set of strings.
Aggregation function
A function that combines many values into one, such as sum or max.
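As a minimal sketch of aggregation, written in Python purely for illustration (the variable names are hypothetical), each call below reduces many values to one:

```python
values = [3, 1, 4, 1, 5]
words = ["a", "b", "c"]

total = sum(values)        # many numbers combined into one number
largest = max(values)      # many numbers reduced to the largest one
joined = "".join(words)    # many strings concatenated into one string
```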
Aliasing
To have two or more references to the same thing, such as a data structure in memory or a file on disk.
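A small Python sketch of aliasing, assuming a mutable list (names are hypothetical): both names refer to one object, so a change made through either is visible through the other.

```python
original = [1, 2, 3]
alias = original                 # a second reference to the same list, not a copy

alias.append(4)                  # modify the list through one name...
same_object = alias is original  # ...and both names still point at one object
```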
Anchor tag
FIXME
Anonymous function
A function that has not been assigned a name. Anonymous functions are usually quite short, and are usually defined where they are used, e.g., as callbacks.
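For illustration, a Python sketch in which an anonymous function is defined exactly where it is used, as the sort key (the data is made up):

```python
words = ["banana", "fig", "cherry"]

# The lambda has no name; it exists only as the key argument.
by_length = sorted(words, key=lambda w: len(w))
```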
Application Programming Interface
A set of functions and procedures provided by one software library or web service through which another application can communicate with it. An API is not the code, the database, or the server: it’s the access point.
Argument
A value passed into a function. Some authors use the term as a synonym for parameter and some do not; it’s all very confusing.
ASCII
A standard way to represent the characters commonly used in the Western European languages as 7- or 8-bit integers, now superseded by Unicode.
Attribute
A name-value pair associated with an object, used to store metadata about the object such as an array’s dimensions.

B

Back end
FIXME
Backward-compatible
Describes software that can be used in the same way as earlier versions of itself without problems.
Base R
The basic functions making up the R language. The base packages can be found in src/library and are not updated outside of R; their version numbers follow R version numbering. Base packages are installed and loaded with R, while priority packages are installed with base R but must be loaded prior to use.
Binary
FIXME
Binding
FIXME
Branch
A snapshot of a version of a Git repository. Multiple branches can capture multiple versions of the same repository.
Breadcrumbs
A set of supplementary navigational links included in many websites, usually placed at the top of the page. Breadcrumbs show users where the current page lies in the website; the term comes from a fairy tale in which children left a trail of breadcrumbs behind themselves so that they could find their way home.

C

Call stack
A data structure that stores information about the subroutines that are currently active in a running program. The lobstr package provides cst() to visualize a call stack.
Call to action
FIXME
Cascading Style Sheets
A way to control the appearance of HTML. CSS is typically used to specify fonts, colors, and layout.
Catch (exception)
To accept responsibility for handling an error or other unexpected event. R prefers “handling a condition” to “catching an exception”.
Character encoding
A specification of how characters are stored as bytes. The most commonly-used encoding today is UTF-8.
Cherry picking (in Git)
FIXME
Class
FIXME
Closure
A set of variables defined in the same scope whose existence has been preserved after that scope has ended. Closures are one of the trickiest ideas in programming.
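One way to see a closure at work, sketched in Python with hypothetical names: the variable count belongs to make_counter, yet it survives after make_counter has returned because increment still refers to it.

```python
def make_counter():
    count = 0                # defined in make_counter's scope

    def increment():
        nonlocal count       # refers to the preserved variable
        count += 1
        return count

    return increment         # make_counter's scope ends here


counter = make_counter()
first = counter()            # count is still alive: 1
second = counter()           # and keeps its state: 2
```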
Code hook
FIXME
Coercion
See type coercion.
Comma-Separated Values
A text format for tabular data in which each record is one row and fields are separated by commas. There are many minor variations, particularly around quoting of strings.
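A short Python sketch of reading CSV text, using the standard csv module on a made-up two-row example. It shows why quoting matters: the quoted field contains a comma but is still read as one field.

```python
import csv
import io

text = 'name,quote\nAda,"Hello, world"\n'

# Each record becomes one row; fields are split on unquoted commas.
rows = list(csv.reader(io.StringIO(text)))
```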
Condition
An error or other unexpected event that disrupts the normal flow of control.
Constant
FIXME
Constructor
A function that creates an object of a particular class. In the S3 object system, constructors are a convention rather than a requirement.
Copy-on-modify
The practice of creating a new copy of aliased data whenever there is an attempt to modify it, so that each reference behaves as though it were the only one.
CRAN
The Comprehensive R Archive Network is a public repository of R packages.
Creative Commons licenses
A set of licenses that can be applied to published work. Each license is formed by concatenating one or more of -BY (Attribution): users must cite the original source; -SA (ShareAlike): users must share their own work under a similar license; -NC (NonCommercial): work may not be used for commercial purposes without the creator’s permission; -ND (NoDerivatives): no derivative works (e.g., translations) can be created without the creator’s permission. Thus, CC-BY-NC means “users must give attribution and cannot use commercially without permission”. The term CC-0 (zero, not letter ‘O’) is sometimes used to mean “no restrictions”, i.e., the work is in the public domain.
CSS selector
FIXME

D

Data frame
A two-dimensional data structure for storing tabular data in memory. Rows represent records and columns represent variables.
Diff (in Git)
FIXME
Docker
FIXME.
Double
Short for “double-precision floating-point number”, meaning a 64-bit numeric value with a fractional part and an exponent.
Double square brackets
An index enclosed in [[...]], used to return a single value of the underlying type.
Dynamic scoping
FIXME.

E

Eager evaluation
FIXME.
Empty vector
A vector that contains no elements. Empty vectors have a type such as logical or character, and are not the same as null.
Environment
A structure that stores a set of variable names and the values they refer to.
Error handling
What a program does to detect and correct for errors. Examples include printing a message and using a default configuration if the user-specified configuration can’t be found.
Escape sequence
A sequence of characters used to represent some other character that would otherwise have a special meaning. For example, the escape sequence \" is used to represent a double-quote character inside a double-quoted string.
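The same idea sketched in Python, whose string literals use the same \" convention (the strings are made up for illustration):

```python
quoted = "She said \"hi\""     # \" is a literal double quote, not the end of the string
two_lines = "first\nsecond"    # \n stands for a newline character
```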
Evaluation
The process of taking an expression such as 1+2*3/4 and turning it into a single irreducible value.
Exception
An object that stores information about an error or other unusual event in a program. One part of a program will create and raise an exception to signal that something unexpected has happened; another part will catch it.
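A minimal Python sketch of the raise-then-catch pattern described above. The function parse_age and its message are hypothetical, chosen only to illustrate the flow.

```python
def parse_age(text):
    # One part of the program creates and raises the exception...
    if not text.isdigit():
        raise ValueError(f"not a whole number: {text!r}")
    return int(text)


try:
    age = parse_age("forty")
except ValueError as err:
    # ...and another part catches it and decides what to do.
    age = None
    message = str(err)
```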

F

Falsy
A horrible neologism meaning “equivalent to false”.
Feature branch
A branch within a Git repository containing commits dedicated to a specific feature, e.g., a bug fix or a new function. This branch can be merged into another branch.
Field (database)
FIXME
Filter
To choose a set of records (i.e., rows of a table) based on the values they contain.
Fork
A copy of one person’s Git repository that lives in another person’s GitHub account. Changes to the content of a fork can be submitted to the upstream repository via a pull request.
Front end
FIXME
Fully-qualified name
An unambiguous name of the form package::thing.
Functional programming
A style of programming in which data is transformed through successive application of functions, rather than by using control structures such as loops.

G

Generic function
A collection of functions with similar purpose, each operating on a different class of data.
Git
A version control tool to record and manage changes to a project.
GitHub
A cloud-based platform built around Git that allows you to save versions of your project online and collaborate with other Git users.
Global environment
The environment that holds top-level definitions in R, e.g., those written directly in the interpreter.
Global installation
Installing a package in a location where it can be accessed by all users and projects.
Global variable
A variable defined outside any particular function, which is therefore visible to all functions.
GNU General Public License
A license (usually abbreviated GPL) that allows people to re-use software as long as they distribute the source of their changes.
Group
To divide data into subsets according to some criteria while leaving records in a single structure.

H

Handle (condition)
To accept responsibility for handling an error or other unexpected event. R prefers “handling a condition” to “catching an exception”.
Header row
If present, the first row of a CSV file that defines column names (but tragically, not their data types or units).
Heterogeneous
Having mixed type. For example, a list can contain a mix of numbers, character strings, and values of other types.
Higher-order function
A function that operates on other functions. For example, the higher-order function map executes a given function once on each value in a list. Higher-order functions are heavily used in functional programming.
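As an illustrative Python sketch, apply_to_each below is a hypothetical hand-written higher-order function, shown next to the built-in map, which does the same job:

```python
def apply_to_each(func, values):
    # Takes a function as an argument and calls it once per value.
    return [func(v) for v in values]


squares = apply_to_each(lambda x: x * x, [1, 2, 3])
lengths = list(map(len, ["a", "bb", "ccc"]))   # built-in equivalent
```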
Homogeneous
Having a single type. For example, a vector must be homogeneous: its values must all be numeric, logical, etc.
Hook
FIXME

I

Instance
FIXME
Integrated Development Environment
An application that helps programmers develop software. IDEs typically have a built-in editor, a console to execute code immediately, and browsers for exploring data structures in memory and files on disk.
Interactive rebase (in Git)
FIXME

J

JSON
A way to represent data by combining basic values like numbers and character strings in lists and name/value structures. The acronym stands for “JavaScript Object Notation”; unlike better-defined standards like XML, it is unencumbered by a syntax for comments or ways to define a schema.
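A small Python sketch using the standard json module on a made-up record, showing basic values combined in a list and a name/value structure:

```python
import json

text = '{"name": "Ada", "scores": [90, 85], "active": true}'

record = json.loads(text)                        # JSON text to Python values
round_trip = json.dumps(record, sort_keys=True)  # and back to JSON text
```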

K

knitr
FIXME.

L

Lazy evaluation
Delaying evaluation of an expression until the value is actually needed (or at least until after the point where it is first encountered).
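One form of lazy evaluation can be sketched with a Python generator (this is an analogy, not how R evaluates function arguments; all names are hypothetical). The log list records when each value is actually computed.

```python
log = []


def squares(values):
    for v in values:
        log.append(v)        # record the moment of actual computation
        yield v * v


lazy = squares([1, 2, 3])    # nothing has been computed yet
before = list(log)
first = next(lazy)           # only now is the first value computed
after = list(log)
```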
Lexical scoping
FIXME.
LGTM (Looks Good to Me)
FIXME
Library
FIXME
List
A vector that can contain values of many different types.
Literate programming
A programming paradigm that mixes prose and code.
Local installation
Placing a package inside a particular project so that it is only accessible within that project.
Local variable
A variable defined inside a function which is only visible within that function.
Logical indexing
To index a vector or other structure with a vector of Booleans, keeping only the values that correspond to true values. Also referred to as masking.
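Plain Python lists have no built-in logical indexing, so this sketch imitates it with a comprehension (the values and the threshold are made up):

```python
values = [10, 25, 3, 40]

mask = [v > 9 for v in values]   # one Boolean per element
kept = [v for v, keep in zip(values, mask) if keep]   # keep where mask is True
```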

M

Make
FIXME
Makefile
FIXME
Markdown
A markup language with a simple syntax intended as an easier-to-write alternative to HTML. Markdown is often used for README files, and is the basis for R Markdown.
Masking
FIXME
Master branch
A dedicated, permanent, central branch which should contain a “ready product”. A new feature is developed on a separate branch to avoid breaking the main code, and is merged into the master branch once it is ready.
Merge (data)
FIXME
Merge (Git)
Merging branches in Git combines the development histories of two branches into one. If both branches have changed the same parts of a file, a conflict will occur, and this must be resolved before the merge can be completed.
Method
An implementation of a generic function that handles objects of a specific class.
MIT License
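A license that allows people to re-use software with almost no restrictions: the only condition is that the copyright and license notice be included in copies of the software.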
Module
FIXME
Mutation
Changing data in place, such as modifying an element of an array or adding a record to a database.

N

NA
A special value used to represent data that is not available.
Name collision
The ambiguity that arises when two or more things in a program that have the same name are active at the same time.
Namespace
FIXME
Native app
FIXME
Negative selection
To specify the elements of a vector or other data structure that aren’t desired by negating their indices.
Non-standard evaluation
FIXME
NoSQL database
Any database that doesn’t use the relational model. The awkward name comes from the fact that such databases don’t use SQL as a query language.
Null
A special value used to represent a missing object. Null is not the same as NA, and neither is the same as an empty vector.

O

Object-oriented programming
FIXME
Observation
FIXME

P

Package
A collection of code, data, and documentation that can be distributed and re-used. Also referred to in some languages as a library or module.
Package manager
A program that does its best to keep track of the bits and bobs of software installed on a computer and their dependencies on one another.
Parameter
A variable whose value is passed into a function when the function is called. Some writers distinguish parameters (the variables) from arguments (the values passed in), but others use the terms in the opposite sense. It’s all very confusing.
Parse
To translate the text of a program or web page into a data structure in memory that the program can then manipulate.
Peanuts
An American comic strip by Charles M. Schulz which has inspired the names of R versions.
Permalink
Short for “permanent link”, the full URL that you see and use for a post, page, or a site’s content. An example could be https://www.mysite.com/category/post-name.
Pipe operator
The %>% operator, used to make the output of one function the input of the next.
Prefix operator
FIXME
Production code
Software that is delivered to an end user. The term is used to distinguish such code from test code, deployment infrastructure, and everything else that programmers write along the way.
Pseudo-random number
A value generated in a repeatable way that resembles the true randomness of the universe well enough to fool merely mortal observers.
Pseudo-random number generator
A function that can generate pseudo-random numbers.
Pull indexing
Vectorized indexing in which the value at location i in the index vector specifies which element of the source vector is being pulled into that location in the result vector, i.e., result[i] = source[index[i]].
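The formula result[i] = source[index[i]] can be sketched in Python as follows. Note that Python indices start at 0, whereas R's start at 1; the data is made up.

```python
source = ["a", "b", "c", "d"]
index = [2, 0, 0, 3]

# result[i] = source[index[i]]: each index entry pulls one source element.
result = [source[i] for i in index]
```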
Pull request
A request to merge a new feature or correction created on a user’s fork of a Git repository into the upstream repository. The upstream repository’s maintainer is notified of the change and can review it, make or suggest changes, and potentially merge it.
Push indexing
Vectorized indexing in which the value at location i in the index vector specifies an element of the result vector that gets the corresponding element of the source vector, i.e., result[index[i]] = source[i]. Push indexing can easily produce gaps and collisions.
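A Python sketch of result[index[i]] = source[i] using zero-based indices and made-up data. It is deliberately constructed to show both a collision (two source elements pushed to position 0) and gaps (positions 1 and 2 never filled).

```python
source = ["a", "b", "c"]
index = [0, 3, 0]

result = [None] * 4                  # None marks a slot nothing was pushed into
for i, target in enumerate(index):
    result[target] = source[i]       # later pushes overwrite earlier ones
```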

Q

Quosure
A data structure containing an unevaluated expression and its environment.
Quoting function
A function that is passed expressions rather than the values of those expressions.

R

R Consortium
FIXME
R Foundation
A non-profit founded by the R development core team providing support for R. It is a member of the R Consortium.
R hub
A free platform available to check an R package on several different platforms in preparation for the CRAN submission process.
R Markdown
A dialect of Markdown that allows authors to mix prose and code (usually written in R) in a single document. Cf. literate programming.
Raise
To signal that something unexpected or unusual has happened in a program by creating an exception and handing it to the error-handling system, which then tries to find a point in the program that will catch it.
Reactive programming
A style of programming in which actions are triggered by external events.
Reactive variable
A variable whose value is automatically updated when some other value or values change. Reactive variables are used extensively in Shiny.
Read-eval-print loop
An interactive program that reads a command typed in by a user, executes it, prints the result, and then waits patiently for the next command. REPLs are often used to explore new ideas or for debugging.
Rebase
FIXME
Record (database)
FIXME
Recycle
To re-use values from a shorter vector in order to generate a sequence of the same length as a longer one.
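R recycles automatically in vectorized operations; Python does not, so this sketch spells the step out with itertools (the vectors are made up):

```python
from itertools import cycle, islice

longer = [1, 2, 3, 4, 5, 6]
shorter = [10, 20]

# Repeat the shorter vector's values until it matches the longer one's length.
recycled = list(islice(cycle(shorter), len(longer)))
sums = [a + b for a, b in zip(longer, recycled)]
```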
Refactor (code)
FIXME
Refactor (R function)
FIXME
Regular expression
A pattern for matching text, written as text itself. Regular expressions are sometimes called “regexp”, “regex”, or “RE”, and are as powerful as they are cryptic.
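A brief Python sketch using the standard re module. The pattern, which matches dates written as four digits, two digits, two digits, and the sample text are both made up for illustration.

```python
import re

pattern = r"\d{4}-\d{2}-\d{2}"       # the pattern is itself just text
text = "Released 2021-03-15, patched 2021-04-02."

dates = re.findall(pattern, text)    # every substring that matches
```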
Relational database
A database that organizes information into tables, each of which has a fixed set of named fields (shown as columns) and a variable number of records (shown as rows).
Relative path
A path whose destination is interpreted relative to some other location, such as the current directory. A relative path is the equivalent of giving directions using terms like “straight” and “left”.
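To make the difference concrete, this Python sketch resolves a relative path against a hypothetical current directory using POSIX path rules (both paths are invented):

```python
import posixpath

current = "/home/ada/project"        # assumed current directory
relative = "../data/table.csv"       # meaning depends on where we start

# Interpreting the relative path from the current directory gives an absolute one.
absolute = posixpath.normpath(posixpath.join(current, relative))
```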
Relative row number
The index of a row in a displayed portion of a table, which may or may not be the same as the absolute row number within the table.
Repository
A place where a version control system stores the files that make up a project and the metadata that describes their history.
Reprex
A reproducible example. When asking questions about coding problems online or filing issues on GitHub, you should always include a reprex so others can reproduce your problem and help. The reprex package can help!
Root directory
The directory that contains everything else, directly or indirectly. The root directory is written / (a bare forward slash).

S

S
A language originally developed at Bell Labs for data analysis, statistical modeling, and graphics. R is a dialect of S.
S3
An informal framework for object-oriented programming in R, built on generic functions and a class attribute attached to ordinary objects.
S4
A more formal framework for object-oriented programming in R than S3, with explicit class definitions and typed slots.
Scalar
A single value of a particular type, such as 1 or “a”. Scalars don’t really exist in R; values that appear to be scalars are actually vectors of unit length.
Schema
A specification of the format of a dataset, including the name, format, and content of each table.
Scope
The portion of a program within which a definition can be seen and used. Cf. closure, global variable, and local variable.
Script
Originally, a program written in a language too usable for “real” programmers to take seriously; the term is now synonymous with program.
Seed
A value used to initialize a pseudo-random number generator.
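A Python sketch of why seeds matter, using the standard random module (the seed value 42 is arbitrary): two generators started from the same seed produce the same sequence.

```python
import random

first = random.Random(42)     # two independent generators,
second = random.Random(42)    # initialized with the same seed

first_run = [first.random() for _ in range(3)]
second_run = [second.random() for _ in range(3)]   # identical to first_run
```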
Select
To choose entire columns from a table by name or location.
Shiny
FIXME
Signal (a condition)
A way of indicating that something has gone wrong in a program, or that some other unexpected event has occurred. R prefers “signalling a condition” to “raising an exception”.
Single square brackets
An index enclosed in [...], used to select a structure from another structure.
Singleton
A set with only one element, or a class with only one instance.
Slug
An abbreviated portion of a page’s URL that uniquely identifies it. In the example https://www.mysite.com/category/post-name, the slug is post-name.
SQL
The language used for writing queries for a relational database. The term was originally an acronym for Structured Query Language.
Squash (in Git)
FIXME
Stack frame
FIXME
Stash (in Git)
FIXME
String
A block of text in a program. The term is short for “character string”.
String interpolation
The process of inserting text corresponding to specified values into a string, usually to make output human-readable.
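In Python, one common way to do this is an f-string; the sketch below uses made-up values:

```python
name = "Ada"
count = 3

# The values of name and count are inserted where the braces appear.
message = f"{name} has {count} new messages"
```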

T

Table
A set of records in a relational database or observations in a data frame. Tables are usually displayed as rows (each of which represents one record or observation) and columns (each of which represents a field or variable).
Tibble
A modern replacement for R’s data frame, which stores tabular data in columns and rows, defined and used in the tidyverse.
Tidy data
Tabular data that satisfies three conditions that facilitate initial cleaning, and later exploration and analysis: (1) each variable forms a column, (2) each observation forms a row, and (3) each type of observational unit forms a table.
tidymodels
A collection of R packages for modeling and statistical analysis designed with a shared philosophy.
Tidyverse
A collection of R packages for operating on tabular data in consistent ways.
Truthy
A truly Orwellian neologism meaning “not equivalent to false”. Cf. falsy, but only if you are able to set aside your respect for the English language.
Type coercion
FIXME

U

Unicode
A standard that defines numeric codes for many thousands of characters and symbols. Unicode does not define how those numbers are stored; that is done by standards like UTF-8.
Unit test
A test that exercises one property or expected behavior of a system. FIXME: provide example.
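One possible example, sketched with Python's standard unittest module rather than an R testing package. The function mean and the test class are hypothetical; the test exercises a single expected behavior.

```python
import unittest


def mean(values):
    return sum(values) / len(values)


class TestMean(unittest.TestCase):
    def test_mean_of_three_values(self):
        # One test, one expected behavior.
        self.assertEqual(mean([1, 2, 3]), 2)


# Run the test case programmatically and keep the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMean)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```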
Upstream repository
FIXME
UTF-8
A way to store the numeric codes representing Unicode characters in memory that is backward-compatible with the older ASCII standard.
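A short Python sketch of the relationship between a Unicode code and its UTF-8 bytes (the characters are chosen only as examples):

```python
code_point = ord("\u00e9")                  # Unicode's numeric code for e-acute
accented = "\u00e9".encode("utf-8")         # stored as two bytes in UTF-8
plain = "A".encode("utf-8")                 # stored as one byte
same_as_ascii = plain == "A".encode("ascii")   # backward-compatible with ASCII
```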

V

Variable (data)
FIXME
Variable (program)
A name in a program that has some data associated with it. A variable’s value can be changed after definition.
Variable arguments
In a function, the ability to take any number of arguments. R uses ... to capture the “extra” arguments.
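Python's counterpart to R's ... is *args; this sketch uses a hypothetical function total to show the idea:

```python
def total(*values):
    # values collects however many arguments the caller supplies.
    return sum(values)


none_given = total()
three_given = total(1, 2, 3)
```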
Vector
A sequence of values, usually of homogeneous type. Vectors are the fundamental data structure in R; a scalar is just a vector with exactly one element.
Vectorize
To write code so that operations are performed on entire vectors, rather than element-by-element within loops.
Version control system
A system for managing changes made to software during its development.
Vignette
A long-form guide used to provide details of a package beyond the README.md or function documentation.

W

Whitespace
The space, newline, carriage return, and horizontal and vertical tab characters that take up space but don’t create a visible mark. The name comes from their appearance on a printed page in the era of typewriters.

X

XML
A set of rules for defining HTML-like tags and using them to format documents (typically data). XML achieved license plate popularity in the early 2000s, but its complexity led many programmers to adopt JSON instead.

Y

YAML
FIXME