Package 'goldfish'

Title: Statistical Network Models for Dynamic Network Data
Description: Tools for fitting statistical network models to dynamic network data. Can be used for fitting both dynamic network actor models ('DyNAMs') and relational event models ('REMs'). Stadtfeld, Hollway, and Block (2017a) <doi:10.1177/0081175017709295>, Stadtfeld, Hollway, and Block (2017b) <doi:10.1177/0081175017733457>, Stadtfeld and Block (2017) <doi:10.15195/v4.a14>, Hoffman et al. (2020) <doi:10.1017/nws.2020.3>.
Authors: James Hollway [aut, dtc] (IHEID, ORCID: <https://orcid.org/0000-0002-8361-9647>), Christoph Stadtfeld [aut, dtc], Marion Hoffman [aut], Alvaro Uzaheta [cre, aut] (ORCID: <https://orcid.org/0000-0003-4367-3670>), Mirko Reul [ctb], Timon Elmer [ctb], Kieran Mepham [ctb], Per Block [ctb], Xiaolei Zhang [ctb], Weigutian Ou [ctb], Emily Garvin [ctb], Siwei Zhang [ctb], Mabel Wylie [ctb]
Maintainer: Alvaro Uzaheta <[email protected]>
License: GPL (>= 3)
Version: 1.7.0
Built: 2026-05-10 08:34:39 UTC
Source: https://github.com/stocnet/goldfish

Help Index


goldfish package

Description

The goldfish project is an R package for fitting statistical network models (such as DyNAM and REM) to dynamic network data.

Details

The goldfish package in R allows the study of time-stamped network data using a variety of models. In particular, it implements different types of Dynamic Network Actor Models (DyNAMs), a class of models tailored to the study of actor-oriented network processes through time. Goldfish also implements different versions of the tie-oriented Relational Event Model by Carter Butts.

Author(s)

Maintainer: Alvaro Uzaheta [email protected] (ORCID)

Authors:

  • James Hollway [data contributor]

  • Christoph Stadtfeld [data contributor]

  • Marion Hoffman

Other contributors:

  • Mirko Reul [contributor]

  • Timon Elmer [contributor]

  • Kieran Mepham [contributor]

  • Per Block [contributor]

  • Xiaolei Zhang [contributor]

  • Weigutian Ou [contributor]

  • Emily Garvin [contributor]

  • Siwei Zhang [contributor]

  • Mabel Wylie [contributor]

References

Stadtfeld, C. (2012). Events in Social Networks: A Stochastic Actor-oriented Framework for Dynamic Event Processes in Social Networks. KIT Scientific Publishing. doi:10.5445/KSP/1000025407

Stadtfeld, C., and Block, P. (2017). Interactions, Actors, and Time: Dynamic Network Actor Models for Relational Events. Sociological Science 4 (1), 318-52. doi:10.15195/v4.a14

Stadtfeld, C., Hollway, J., and Block, P. (2017). Dynamic Network Actor Models: Investigating Coordination Ties Through Time. Sociological Methodology 47 (1). doi:10.1177/0081175017709295

Hollway, J. (2020). Network embeddedness and the rate of international water cooperation and conflict. In Networks in Water Governance, edited by Manuel Fischer and Karin Ingold. London: Palgrave, pp. 87-113.

Hoffman, M., Block P., Elmer T., and Stadtfeld C. (2020). A model for the dynamics of face-to-face interactions in social groups. Network Science, 8(S1), S4-S25. doi:10.1017/nws.2020.3

See Also

estimate


Estimate a model

Description

Estimates parameters for a dynamic network model via maximum likelihood, implementing the iterative Newton-Raphson procedure as described in Stadtfeld and Block (2017).

Usage

estimate_dynam(
  x,
  sub_model = c("choice", "rate", "choice_coordination"),
  data = NULL,
  control_estimation = set_estimation_opt(),
  control_preprocessing = set_preprocessing_opt(),
  preprocessing_init = NULL,
  preprocessing_only = FALSE,
  progress = getOption("progress"),
  verbose = getOption("verbose")
)

estimate_dynami(
  x,
  sub_model = c("choice", "rate"),
  data = NULL,
  control_estimation = set_estimation_opt(),
  control_preprocessing = set_preprocessing_opt(),
  preprocessing_init = NULL,
  preprocessing_only = FALSE,
  progress = getOption("progress"),
  verbose = getOption("verbose")
)

estimate_rem(
  x,
  data = NULL,
  control_estimation = set_estimation_opt(),
  control_preprocessing = set_preprocessing_opt(),
  preprocessing_init = NULL,
  preprocessing_only = FALSE,
  progress = getOption("progress"),
  verbose = getOption("verbose")
)

Arguments

x

a formula that defines at the left-hand side the dependent network (see make_dependent_events()) and at the right-hand side the effects and the variables for which the effects are expected to occur (see vignette("goldfishEffects")).

sub_model

A character string specifying the sub-model to be estimated. It can be "rate" to model the waiting times between events, "choice" to model the choice of the receiver, or "choice_coordination" to model coordination ties. See details.

choice

A multinomial receiver choice model with estimate_dynam() (Stadtfeld and Block, 2017), or a multinomial group choice model with estimate_dynami() (Hoffman et al., 2020).

choice_coordination

A multinomial-multinomial model for coordination ties with estimate_dynam() (Stadtfeld, Hollway and Block, 2017).

rate

An individual activity rates model with estimate_dynam() (Stadtfeld and Block, 2017), or two jointly estimated rate models, one for individuals joining groups and one for individuals leaving groups, with estimate_dynami() (Hoffman et al., 2020).

data

a data.goldfish object created with make_data(). It is an environment that contains the nodesets, networks, attributes and dependent events objects. Defaults to NULL.

control_estimation

An object of class control_estimation.goldfish (typically created by set_estimation_opt()), specifying parameters for the estimation algorithm.

control_preprocessing

An object of class control_preprocessing.goldfish (typically created by set_preprocessing_opt()), specifying parameters for data preprocessing. This is only used if preprocessing_init is not a preprocessed.goldfish object or NULL.

preprocessing_init

an optional preprocessed object of class preprocessed.goldfish from a previous estimation. When it is provided, the function skips the preprocessing of the effects that are already present in the object and only preprocesses the new effects. Defaults to NULL.

preprocessing_only

logical. If TRUE, the function only runs the preprocessing stage and returns an object of class preprocessed.goldfish. Defaults to FALSE.

progress

logical indicating whether to print minimal output to the console on the progress of the preprocessing and estimation processes.

verbose

logical indicating whether to print very detailed intermediate results of the iterative Newton-Raphson procedure; this slows down the routine significantly.

Details

Missing data is handled during the preprocessing stage of the data. The specific imputation strategy depends on the type of data:

  • Network Data: Missing values in the initial network structure or in linked events that update network ties are imputed with a value of zero (0). This explicitly assumes the absence of a tie or event.

  • Attribute Covariates:

    • Initial Values: Missing values for the initial state of an attribute covariate are replaced by the mean value of that attribute across all actors.

    • During Event Updates (via linked events):

      • Using replace: If a linked event uses the replace variable to specify a new attribute value and that value is missing, the missing value is replaced by the mean of the attribute, excluding the node being updated, at the moment of the event.

      • Using increment: If a linked event uses the increment variable to specify a change in attribute value and that increment is missing, the missing value is imputed with a value of zero (0). This assumes no change occurred.

Value

Returns an object of class "result.goldfish" when preprocessing_only = FALSE, or a preprocessed statistics object of class "preprocessed.goldfish" when preprocessing_only = TRUE.

An object of class "result.goldfish" is a list including:

parameters

a numeric vector with the coefficient estimates.

standardErrors

a numeric vector with the standard errors of the coefficient estimates.

logLikelihood

the log-likelihood of the estimated model

finalScore

a vector with the final score reached by the parameters during estimation.

finalInformationMatrix

a matrix with the final values of the negative Fisher information matrix. The inverse of this matrix gives the variance-covariance matrix for the parameter estimates.

convergence

a list with two elements. The first element (isConverged) is a logical value that indicates the convergence of the model. The second element (maxAbsScore) reports the final maximum absolute score in the final iteration.

nIterations

an integer with the total number of iterations performed during the estimation process.

nEvents

an integer reporting the number of events considered in the model.

names

a matrix with a description of the effects used for model fitting. It includes the name of the object used to calculate the effects and additional parameter description.

formula

a formula with the information of the model fitted.

model

a character value of the model type.

sub_model

a character value of the sub_model type.

rightCensored

a logical value indicating if the estimation process considered right-censored events. It is only considered for estimate_dynam(x, sub_model = "rate") or the REM (estimate_rem()), when the model includes the intercept.
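
As a minimal sketch of how finalInformationMatrix relates to standardErrors, the base R snippet below inverts a small information matrix and takes the square root of the diagonal. The matrix, the estimates and all object names are made-up illustrations, not output from goldfish, and the sketch assumes the stored matrix is positive definite.

```r
# Hypothetical 2 x 2 information matrix for two effects (illustrative values).
info <- matrix(
  c(4, 1,
    1, 2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("inertia", "recip"), c("inertia", "recip"))
)

# The inverse of the information matrix is the variance-covariance matrix.
vcov_hat <- solve(info)

# Standard errors are the square roots of its diagonal.
se <- sqrt(diag(vcov_hat))

# Hypothetical coefficient estimates, with Wald z statistics and p-values.
est <- c(inertia = 1.2, recip = 0.5)
z <- est / se
p_value <- 2 * pnorm(abs(z), lower.tail = FALSE)
```

With a fitted model, the same computation would start from the finalInformationMatrix element instead of the toy matrix.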

Models

The following models are currently implemented:

DyNAM

Dynamic Network Actor Models, estimate_dynam(), modeling a sequence of relational events as an actor-oriented process (Stadtfeld, Hollway and Block, 2017 and Stadtfeld and Block, 2017)

DyNAMi

Dynamic Network Actor Models for interactions, estimate_dynami(), modeling face-to-face interactions as an actor-oriented process (Hoffman et al., 2020)

REM

Relational Event Model, estimate_rem(), modeling a sequence of relational events as a tie-oriented process (Butts, 2008).

DyNAM

The actor-oriented models that the goldfish package implements, estimate_dynam(), have been called Dynamic Network Actor Models (DyNAMs). The model is a two-step process. In the first step, the waiting time until an actor i initiates the next relational event is modeled (sub_model = "rate") by an exponential distribution depending on the actor activity rate. In the second step, the conditional probability of i choosing j as the event receiver is modeled (sub_model = "choice") by a multinomial probability distribution with a linear predictor. These two steps are assumed to be conditionally independent given the process state (Stadtfeld, 2012). Because of this assumption, it is possible to estimate these components by different calls of the estimate_dynam() function.
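
The two steps can be illustrated with a small base R sketch. It assumes a log-linear actor rate and a softmax form for the multinomial choice, which is the usual reading of "exponential distribution" and "multinomial probability distribution with a linear predictor". All statistics, parameter values and object names are hypothetical and none of this is goldfish API.

```r
# Hypothetical rate statistic for three actors and rate parameters.
rate_stat <- c(0, 1, 2)
beta_rate <- c(intercept = -2, effect = 0.5)
rates <- exp(beta_rate[["intercept"]] + beta_rate[["effect"]] * rate_stat)

# Step 1 (rate): with independent exponential waiting times, the time to the
# next event is exponential with the summed rate, and actor i is the sender
# with probability rate_i / sum(rates).
total_rate <- sum(rates)
expected_wait <- 1 / total_rate
p_sender <- rates / total_rate

# Step 2 (choice): given sender i = 1, the receiver j is drawn from a
# multinomial distribution over the other actors with a linear predictor.
choice_stat <- c(1, 0)   # hypothetical statistic for candidates j = 2, 3
beta_choice <- 0.8
utility <- beta_choice * choice_stat
p_receiver <- exp(utility) / sum(exp(utility))
```

Because the two steps are conditionally independent, the rate parameters and the choice parameter in this sketch could be estimated from separate likelihoods.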

Waiting times

When the DyNAM rate model (estimate_dynam(x, sub_model = "rate")) is used to estimate the first-step component of the process, or when the REM (estimate_rem(x)) is used, it is important to add a time intercept to model the waiting times between events. In this way, the algorithm considers the right-censored intervals in the estimation process.

If the intercept is not included in the formula, the model reflects the likelihood of an event being the next in the sequence. This specification is useful when the researcher does not have access to the exact inter-event times. In this ordinal case, the likelihood of an event is merely a multinomial probability (Butts, 2008).
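
The difference between the timed and the ordinal specification can be sketched in base R for a single interval. The rates, the sender and the waiting time below are made-up values, and the formulas are the textbook exponential waiting-time likelihood rather than code taken from goldfish.

```r
rates <- c(0.2, 0.5, 0.3)   # hypothetical actor rates during the interval
sender <- 2                 # actor who initiates the observed event
dt <- 1.5                   # waiting time since the previous event

# With an intercept (timed case): the event contributes the log rate of the
# sender minus the total rate integrated over the waiting time.
loglik_timed <- log(rates[sender]) - sum(rates) * dt

# Right-censored interval: no dependent event happens during dt, so only the
# survival term contributes.
loglik_censored <- -sum(rates) * dt

# Without an intercept (ordinal case): only the order matters, so the
# contribution is the log of a multinomial probability.
loglik_ordinal <- log(rates[sender] / sum(rates))
```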

References

Butts C. (2008). A Relational Event Framework for Social Action. Sociological Methodology 38 (1). doi:10.1111/j.1467-9531.2008.00203.x

Hoffman, M., Block P., Elmer T., and Stadtfeld C. (2020). A model for the dynamics of face-to-face interactions in social groups. Network Science, 8(S1), S4-S25. doi:10.1017/nws.2020.3

Stadtfeld, C. (2012). Events in Social Networks: A Stochastic Actor-oriented Framework for Dynamic Event Processes in Social Networks. KIT Scientific Publishing. doi:10.5445/KSP/1000025407

Stadtfeld, C., and Block, P. (2017). Interactions, Actors, and Time: Dynamic Network Actor Models for Relational Events. Sociological Science 4 (1), 318-52. doi:10.15195/v4.a14

Stadtfeld, C., Hollway, J., and Block, P. (2017). Dynamic Network Actor Models: Investigating Coordination Ties Through Time. Sociological Methodology 47 (1). doi:10.1177/0081175017709295

See Also

make_dependent_events(), make_global_attribute(), make_network(), make_nodes(), link_events()

Examples

# A DyNAM modeling rate and choice steps
data("Social_Evolution")
callNetwork <- make_network(nodes = actors, directed = TRUE)
callNetwork <- link_events(
  x = callNetwork, change_event = calls,
  nodes = actors
)
callsDependent <- make_dependent_events(
  events = calls, nodes = actors,
  default_network = callNetwork
)



socialEvData <- make_data(callsDependent, callNetwork, calls, actors)

mod01 <- estimate_dynam(callsDependent ~ inertia + recip + trans,
  sub_model = "choice",
  data = socialEvData,
  control_estimation = set_estimation_opt(engine = "gather_compute")
)
summary(mod01)

# An individual activity rates model
mod02 <- estimate_dynam(callsDependent ~ 1 + node_trans + indeg + outdeg,
  sub_model = "rate",
  data = socialEvData,
  control_estimation = set_estimation_opt(engine = "gather_compute")
)
summary(mod02)

# A REM

mod03 <- estimate_rem(
  callsDependent ~ 1 + node_trans(callNetwork, type = "ego") +
    indeg(callNetwork, type = "ego") + outdeg(callNetwork, type = "ego") +
    inertia + recip + trans,
  data = socialEvData,
  control_estimation = set_estimation_opt(engine = "gather_compute")
)
summary(mod03)



# A multinomial-multinomial choice model for coordination ties
data("Fisheries_Treaties_6070")
states <- make_nodes(states)
states <- link_events(states, sovchanges, attribute = "present")
states <- link_events(states, regchanges, attribute = "regime")
states <- link_events(states, gdpchanges, attribute = "gdp")

bilatnet <- make_network(bilatnet, nodes = states, directed = FALSE)
bilatnet <- link_events(bilatnet, bilatchanges, nodes = states)

contignet <- make_network(contignet, nodes = states, directed = FALSE)
contignet <- link_events(contignet, contigchanges, nodes = states)

createBilat <- make_dependent_events(
  events = bilatchanges[bilatchanges$increment == 1, ],
  nodes = states, default_network = bilatnet
)

fisheriesData <- make_data(
  createBilat, contignet, bilatnet, contigchanges, bilatchanges,
  states, sovchanges, regchanges, gdpchanges
)
partnerModel <- estimate_dynam(
  createBilat ~
    inertia(bilatnet) +
    indeg(bilatnet, ignore_repetitions = TRUE) +
    trans(bilatnet, ignore_repetitions = TRUE) +
    tie(contignet) +
    alter(states$regime) +
    diff(states$regime) +
    alter(states$gdp) +
    diff(states$gdp),
  sub_model = "choice_coordination",
  data = fisheriesData,
  control_estimation =
    set_estimation_opt(
      initial_damping = 40, max_iterations = 30,
      engine = "default"
    )
)
summary(partnerModel)

Diagnostic functions

Description

Provide diagnostic functions for an object of class result.goldfish. examine_outliers() helps to identify outlier events. examine_changepoints() helps to identify where a change point occurs in the event sequence, using the log-likelihood.

Usage

examine_outliers(
  x,
  method = c("Hampel", "IQR", "Top"),
  parameter = 3,
  window = NULL
)

examine_changepoints(
  x,
  moment = c("mean", "variance"),
  method = c("PELT", "AMOC", "BinSeg"),
  window = NULL,
  ...
)

Arguments

x

an object of class result.goldfish output from an estimate call.

method

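For examine_outliers(), choice of "Hampel", "IQR" or "Top"; the default value is "Hampel". For examine_changepoints(), choice of "AMOC", "PELT" or "BinSeg"; the default value is "PELT". For a detailed description of the change point methods see cpt.mean or cpt.var.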

parameter

A number whose meaning depends on method: for "Top", the number of absolute outliers to identify; for "Hampel", the threshold for the Hampel filter, i.e. parameter * MAD; for "IQR", the threshold beyond the interquartile range, halved, i.e. parameter/2 * IQR.

window

The window half-width for the Hampel filter. By default it is half the width of the event sequence.

moment

character argument to choose between "mean" or "variance". See section Change point for details.

...

additional arguments to be passed to the functions in the changepoint package.

Value

NULL if neither outliers nor change points are identified. Otherwise, an object of class ggplot from a call of ggplot2::ggplot(), which can be modified using the ggplot2 syntax.

Outliers

examine_outliers() creates a plot with the log-likelihood of the events on the y-axis and the event index on the x-axis, identifying outlying observations with labels indicating the sender and recipient.
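
The thresholds described for the parameter argument can be sketched in base R. The snippet applies the Hampel and IQR rules to the whole sequence at once (no moving window) on made-up per-event log-likelihood values. It illustrates the rules as worded in this help page and is not the package's internal code.

```r
# Hypothetical per-event log-likelihood contributions.
loglik <- c(-1.0, -1.2, -0.9, -1.1, -6.0, -1.0, -1.3)
parameter <- 3

# Hampel rule: flag values further than parameter * MAD from the median.
hampel_out <- abs(loglik - median(loglik)) > parameter * mad(loglik)

# IQR rule: flag values beyond the quartiles by more than parameter / 2 * IQR.
q <- unname(quantile(loglik, c(0.25, 0.75)))
iqr_out <- loglik < q[1] - parameter / 2 * IQR(loglik) |
  loglik > q[2] + parameter / 2 * IQR(loglik)

which(hampel_out)
which(iqr_out)
```

In this toy sequence both rules flag only the fifth event, whose log-likelihood is far below the rest.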

Change point

The parameter moment controls which method from the package changepoint is used:

"mean"

It uses the cpt.mean function to investigate optimal positioning and (potentially) number of change points for the log-likelihood of the events in mean.

"variance"

It uses the cpt.var function to investigate optimal positioning and (potentially) number of change points for the log-likelihood of the events in variance.

The function call creates a plot with the log-likelihood of the events in the y-axis and the event index in the x-axis, highlighting the change point sections identified by the method.
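
To give an intuition for a change in mean, the base R sketch below finds the single split of a made-up log-likelihood sequence that minimises the within-segment sum of squares. This is a simplified stand-in for the "at most one change" idea only. The changepoint package uses penalised likelihood criteria, so its results can differ, and the function and object names here are hypothetical.

```r
# Hypothetical per-event log-likelihood values with a shift after event 4.
loglik <- c(-1.0, -1.1, -0.9, -1.0, -3.0, -3.2, -2.9, -3.1)
n <- length(loglik)

# Sum of squared deviations from the segment mean.
sse <- function(v) sum((v - mean(v))^2)

# Cost of splitting the sequence after event k, for every possible k.
split_cost <- vapply(
  seq_len(n - 1),
  function(k) sse(loglik[1:k]) + sse(loglik[(k + 1):n]),
  numeric(1)
)

# Best single change point: the mean shifts after this event index.
change_after <- which.min(split_cost)
```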

Examples

# A multinomial receiver choice model
data("Social_Evolution")
callNetwork <- make_network(nodes = actors, directed = TRUE)
callNetwork <- link_events(
  x = callNetwork, change_event = calls,
  nodes = actors
)
callsDependent <- make_dependent_events(
  events = calls, nodes = actors,
  default_network = callNetwork
)

socialEvolutionData <- make_data(callsDependent, callNetwork, calls, actors)
mod01 <- estimate_dynam(
  callsDependent ~ inertia + recip + trans,
  sub_model = "choice",
  data = socialEvolutionData,
  control_estimation = set_estimation_opt(
    return_interval_loglik = TRUE,
    engine = "default_c"
  )
)

examine_outliers(mod01)

examine_changepoints(mod01)

International bilateral fisheries treaties (1960-1970)

Description

An abbreviated version of the international fisheries agreements dataset, including only bilateral agreements, fewer variables, and ranging only between 1960 and 1970 inclusive. This data set is only meant for testing, and not for inference. It provides an example of an undirected, weighted (by integer/increment) network, with composition change and both monadic and dyadic covariates. Monadic variables include the dates states gain or lose sovereign status, their polity score, and their GDP. Dyadic variables include bilateral fisheries agreements between states, and states' contiguity with one another over time.

Usage

data(Fisheries_Treaties_6070)

bilatchanges

bilatnet

contigchanges

contignet

gdpchanges

regchanges

sovchanges

states

Format

The data includes several dataframes: states (154 rows, 4 columns, monadic), sovchanges (62 rows, 3 columns, monadic), regchanges (145 rows, 3 columns, monadic), gdpchanges (979 rows, 3 columns, monadic), bilatchanges (77 rows, 4 columns, dyadic), contigchanges (139 rows, 4 columns, dyadic). See below for variables and formats.

Object                  Description                        Format
states$label            Node identifier labels             character
states$present          Node present in dataset            boolean
states$regime           Placeholder for regime variable    numeric (NA)
states$gdp              Placeholder for GDP variable       numeric (NA)
sovchanges$time         Date of state sovereignty update   POSIXct
sovchanges$node         Node for state sovereignty update  integer
sovchanges$replace      State sovereignty update           boolean
regchanges$time         Date of regime update              POSIXct
regchanges$node         Node for regime update             integer
regchanges$replace      Regime update                      integer (-10 to 10)
gdpchanges$time         Date of GDP update                 POSIXct
gdpchanges$node         Node for GDP update                integer
gdpchanges$replace      GDP update                         numeric
bilatchanges$time       Date of bilateral change           POSIXct
bilatchanges$sender     First bilateral change node        integer
bilatchanges$receiver   Second bilateral change node       integer
bilatchanges$increment  Create or dissolve tie             numeric (-1 or 1)
contigchanges$time      Date of contiguity change          POSIXct
contigchanges$sender    First contiguity change node       integer
contigchanges$receiver  Second contiguity change node      integer
contigchanges$replace   New contiguity value               numeric

An object of class data.frame with 77 rows and 4 columns.

An object of class matrix (inherits from array) with 154 rows and 154 columns.

An object of class data.frame with 139 rows and 4 columns.

An object of class matrix (inherits from array) with 154 rows and 154 columns.

An object of class data.frame with 979 rows and 3 columns.

An object of class data.frame with 145 rows and 3 columns.

An object of class data.frame with 62 rows and 3 columns.

An object of class data.frame with 154 rows and 4 columns.

References

Hollway, James, and Johan Koskinen. 2016. Multilevel Embeddedness: The Case of the Global Fisheries Governance Complex. Social Networks, 44: 281-94. doi:10.1016/j.socnet.2015.03.001.

Hollway, James, and Johan H Koskinen. 2016. Multilevel Bilateralism and Multilateralism: States' Bilateral and Multilateral Fisheries Treaties and Their Secretariats. In Multilevel Network Analysis for the Social Sciences, edited by Emmanuel Lazega and Tom A B Snijders, 315-32. Cham: Springer International Publishing. doi:10.1007/978-3-319-24520-1_13.


Gather model data from a formula

Description

Gather the preprocessed data from a formula given a model and sub-model, where the output corresponds to the data structure used by the engine gather_compute; see estimate.

Usage

gather_model_data(
  formula,
  model = c("DyNAM", "REM"),
  sub_model = c("choice", "choice_coordination", "rate"),
  data = NULL,
  control_preprocessing = set_preprocessing_opt(),
  progress = getOption("progress")
)

Arguments

formula

a formula object that defines at the left-hand side the dependent network (see make_dependent_events()) and at the right-hand side the effects and the variables for which the effects are expected to occur (see vignette("goldfishEffects")).

model

a character string defining the model type. Current options include "DyNAM", "DyNAMi" or "REM"

DyNAM

Dynamic Network Actor Models (Stadtfeld, Hollway and Block, 2017 and Stadtfeld and Block, 2017)

DyNAMi

Dynamic Network Actor Models for interactions (Hoffman et al., 2020)

REM

Relational Event Model (Butts, 2008)

sub_model

A character string specifying the sub-model to be estimated. It can be "rate" to model the waiting times between events, "choice" to model the choice of the receiver, or "choice_coordination" to model coordination ties. See details.

choice

A multinomial receiver choice model with estimate_dynam() (Stadtfeld and Block, 2017), or a multinomial group choice model with estimate_dynami() (Hoffman et al., 2020).

choice_coordination

A multinomial-multinomial model for coordination ties with estimate_dynam() (Stadtfeld, Hollway and Block, 2017).

rate

An individual activity rates model with estimate_dynam() (Stadtfeld and Block, 2017), or two jointly estimated rate models, one for individuals joining groups and one for individuals leaving groups, with estimate_dynami() (Hoffman et al., 2020).

data

a data.goldfish object created with make_data(). It is an environment that contains the nodesets, networks, attributes and dependent events objects. Defaults to NULL.

control_preprocessing

An object of class "preprocessing_options.goldfish", usually the result of a call to set_preprocessing_opt(). This object contains parameters that control the data preprocessing. See set_preprocessing_opt() for details on the available parameters.

progress

logical indicating whether to print minimal output to the console on the progress of the preprocessing and estimation processes.

Details

The output differs from that of estimate_dynam(), estimate_rem() and estimate_dynami() with preprocessing_only = TRUE in its memory requirements. gather_model_data() produces a list whose first element is a matrix that can have up to the number of events times the number of actors rows and the number of effects columns. For medium to large datasets with thousands of events and thousands of actors, the RAM requirements are large and errors may be produced due to a lack of memory. The advantage of the data structure is that it can be adapted to estimate the models (or extensions of them) using standard packages for generalized linear models (or any other model) that use tabular data as input.

Value

a list object including:

stat_all_events

a matrix. The number of rows can be up to the number of events times the number of actors (squared number of actors for the REM). Right-censored events are included when the model has an intercept. The number of columns is the number of effects in the model. Every row contains the effect statistics at the time of the event for one actor in the choice set or the sender set.

n_candidates

a numeric vector with the number of rows related to each event. Its length corresponds to the number of events plus right-censored events, if any.

selected

a numeric vector with the position of the selected actor (choice model), sender actor (rate model), or active dyad (choice-coordination model, REM model). Indexing starts at 1 for each event.

sender, receiver

a character vector with the label of the sender/receiver actor. For right-censored events the receiver value is not meaningful.

has_intercept

a logical value indicating if the model has an intercept.

namesEffects

a character vector with a short name of the effect. It includes the name of the object used to calculate the effects and modifiers of the effect, e.g., the type of effect, weighted effect.

effectDescription

a character matrix with the description of the effects. It includes the name of the object used to calculate the effects and additional information of the effect, e.g., the type of effect, weighted effect, transformation function, window length.

If the model has an intercept and the sub_model is rate or model is REM, additional elements are included:

timespan

a numeric vector with the time span between events, including right-censored events.

isDependent

a logical vector indicating if the event is dependent or right-censored.
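
The elements described above can be turned into a long-format table for use with other modelling tools. The base R sketch below builds a toy list with the same element names (stat_all_events, n_candidates, selected), marks the chosen row of each event, and evaluates a multinomial log-likelihood at an arbitrary parameter vector. The values are invented and the list is a stand-in, not real gather_model_data() output.

```r
# Toy stand-in: 2 events, 3 candidates each, 2 effects (made-up values).
gathered <- list(
  stat_all_events = matrix(
    c(1, 0,
      0, 1,
      0, 0,
      1, 1,
      0, 0,
      1, 0),
    ncol = 2, byrow = TRUE,
    dimnames = list(NULL, c("inertia", "trans"))
  ),
  n_candidates = c(3, 3),
  selected = c(1, 2)
)

# Event identifier for every row, and row position within its event
# (1-based, matching the indexing of `selected`).
event_id <- rep(seq_along(gathered$n_candidates), times = gathered$n_candidates)
position <- sequence(gathered$n_candidates)

# Long format: one row per candidate, with a 0/1 indicator of the choice.
long <- data.frame(
  event = event_id,
  chosen = as.integer(position == gathered$selected[event_id]),
  gathered$stat_all_events
)

# Multinomial log-likelihood at a hypothetical parameter vector.
beta <- c(inertia = 0.5, trans = 0.2)
utility <- drop(gathered$stat_all_events %*% beta)
log_denominator <- log(tapply(exp(utility), event_id, sum))
loglik <- sum(utility[long$chosen == 1]) - sum(log_denominator)
```

A table such as long, with one stratum per event, is the usual input shape for conditional-logit style routines.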

Examples

data("Fisheries_Treaties_6070")
states <- make_nodes(states)
states <- link_events(states, sovchanges, attribute = "present")
states <- link_events(states, regchanges, attribute = "regime")
states <- link_events(states, gdpchanges, attribute = "gdp")

bilatnet <- make_network(bilatnet, nodes = states, directed = FALSE)
bilatnet <- link_events(bilatnet, bilatchanges, nodes = states)

createBilat <- make_dependent_events(
  events = bilatchanges[bilatchanges$increment == 1, ],
  nodes = states, default_network = bilatnet
)

fisheriesData <- make_data(createBilat)

gatheredData <- gather_model_data(
  createBilat ~ inertia(bilatnet) + trans(bilatnet) + tie(contignet),
  model = "DyNAM", sub_model = "choice_coordination",
  data = fisheriesData
)

Extract log-likelihood from a fitted model object

Description

This function extracts the log-likelihood from the output of an estimate call. The extracted log-likelihood corresponds to the value in the last iteration of the estimate call; users should check convergence of the Gauss/Fisher scoring method before using the log-likelihood statistic to compare models.

Usage

## S3 method for class 'result.goldfish'
logLik(object, ..., avgPerEvent = FALSE)

Arguments

object

an object of class result.goldfish output from an estimate call with a fitted model.

...

additional arguments to be passed.

avgPerEvent

a logical value indicating whether the average likelihood per event should be calculated.

Details

Users might use stats::AIC() and stats::BIC() to compute the Information Criteria from one or several fitted model objects. An information criterion could be used to compare models with respect to their predictive power.

Alternatively, lmtest::lrtest() can be used to compare models via asymptotic likelihood ratio tests. The test is designed to compare nested models, i.e., models where the specification of one contains a subset of the predictor variables that define the other.
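
The comparisons described here can be sketched with base R alone. The snippet builds two logLik objects by hand, with the df and nobs attributes that such objects carry, and computes AIC, BIC and a likelihood ratio test. The log-likelihood values, degrees of freedom and object names are made up; with fitted models the objects would come from logLik() instead.

```r
# Hand-built stand-ins for logLik() output of two nested models.
ll_small <- structure(-120, df = 2, nobs = 100, class = "logLik")
ll_large <- structure(-115, df = 4, nobs = 100, class = "logLik")

# Information criteria: smaller values indicate the preferred model.
aic <- c(small = AIC(ll_small), large = AIC(ll_large))
bic <- c(small = BIC(ll_small), large = BIC(ll_large))

# Likelihood ratio test for the nested pair, computed by hand.
lr_stat <- 2 * (as.numeric(ll_large) - as.numeric(ll_small))
lr_df <- attr(ll_large, "df") - attr(ll_small, "df")
lr_p <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
```

Both models should have converged before their log-likelihoods are compared in this way.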

Value

Returns an object of class logLik when avgPerEvent = FALSE. This is a number with the extracted log-likelihood from the fitted model, and with the following attributes:

df

degrees of freedom with the number of estimated parameters in the model

nobs

the number of observations used in estimation. In general, it corresponds to the number of dependent events used in estimation. For sub_model = "rate" or a REM with intercept, it corresponds to the number of dependent events plus right-censored events due to exogenous or endogenous changes.

When avgPerEvent = TRUE, the function returns a number with the average log-likelihood per event. The total number of events depends on the presence of right-censored events in a similar way that the attribute nobs is computed when avgPerEvent = FALSE.


Create a data object for goldfish models

Description

This function creates a new object of class data.goldfish and populates it with the provided R objects and their linked objects, as specified by attributes common in the 'goldfish' package. This is useful for creating a self-contained data context for estimate_dynam(), estimate_rem(), estimate_dynami() and gather_model_data().

Usage

make_data(..., parent_env = parent.frame())

make_data_goldfish(..., parent_env = parent.frame())

Arguments

...

Objects to be included in the data environment. These objects will be copied by their given argument names.

parent_env

The parent environment for the new data environment. Also, the environment from which linked objects (not explicitly provided in ...) will be searched. Defaults to parent.frame().

Details

The function recursively searches for linked objects:

  • For a nodes.goldfish object: Events that modify its nodal attributes.

  • For a network.goldfish object: Events that modify its structure, and the nodes.goldfish object(s) that define its nodes.

  • For a dependent.goldfish object: The network.goldfish object and nodes.goldfish object(s) defining its events' scope.

Linked objects are searched for in the parent_env (defaults to the calling environment) and the enclosing frames of the parent_env environment (see base::get(), base::exists()).
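
The idea of a data environment with a parent environment can be sketched in base R. The helper collect_objects() below is hypothetical and much simpler than make_data(): it copies the supplied objects under their argument names and does not search recursively for linked objects. It only shows how objects that are not copied remain reachable through the parent environment.

```r
# Hypothetical helper: copy objects into a new environment by argument name.
collect_objects <- function(..., parent_env = parent.frame()) {
  # Recover the argument expressions to use them as object names.
  obj_names <- vapply(as.list(substitute(list(...)))[-1], deparse, character(1))
  env <- new.env(parent = parent_env)
  objs <- list(...)
  for (i in seq_along(objs)) {
    assign(obj_names[[i]], objs[[i]], envir = env)
  }
  env
}

actors_toy <- data.frame(label = c("a", "b"))
calls_toy <- data.frame(sender = "a", receiver = "b")
other_toy <- 1

data_env <- collect_objects(actors_toy, calls_toy)

# Objects copied into the environment itself.
copied <- sort(ls(data_env))

# `other_toy` was not copied, but lookups fall back to the parent environment.
found_locally <- exists("other_toy", envir = data_env, inherits = FALSE)
found_via_parent <- exists("other_toy", envir = data_env)
```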

Value

An environment of class data.goldfish containing the specified objects and their resolved dependencies.

Examples

data("Social_Evolution")
callNetwork <- make_network(nodes = actors, directed = TRUE)
callNetwork <- link_events(
  x = callNetwork, change_event = calls,
  nodes = actors
)
callsDependent <- make_dependent_events(
  events = calls, nodes = actors,
  default_network = callNetwork
)
socialEvolutionData <- make_data(
  callNetwork, callsDependent, actors
)

data("Fisheries_Treaties_6070")
states <- make_nodes(states)
states <- link_events(states, sovchanges, attribute = "present")
states <- link_events(states, regchanges, attribute = "regime")
states <- link_events(states, gdpchanges, attribute = "gdp")

bilatnet <- make_network(bilatnet, nodes = states, directed = FALSE)
bilatnet <- link_events(bilatnet, bilatchanges, nodes = states)

contignet <- make_network(contignet, nodes = states, directed = FALSE)
contignet <- link_events(contignet, contigchanges, nodes = states)

createBilat <- make_dependent_events(
  events = bilatchanges[bilatchanges$increment == 1, ],
  nodes = states, default_network = bilatnet
)

fisheriesData <- make_data(
  bilatnet, createBilat, states,
  contignet, sovchanges, regchanges, gdpchanges
)

Define dependent events for a model

Description

The final step in defining the data objects is to identify the dependent events.

Usage

make_dependent_events(
  events,
  nodes,
  nodes2 = NULL,
  default_network = NULL,
  envir = environment()
)

make_dependent_events_goldfish(
  events,
  nodes,
  nodes2 = NULL,
  default_network = NULL,
  envir = environment()
)

Arguments

events

a data frame containing the event list that should be considered as a dependent variable in models.

nodes

a data frame or a nodes.goldfish object containing the nodes used in the event list.

nodes2

a second nodeset, for the case that the events occur in a two-mode network.

default_network

the name of a network.goldfish object.

envir

An environment object where the nodes-set and default network objects are defined. The default value is environment().

Details

Before this step is performed, we have to define the nodeset (make_nodes()) and the network (make_network()), and link the event list to the network (link_events()).

During the definition of the dependent events, some checks are done to ensure consistency with the default network and the nodeset. In particular, the labels of the nodes in the events are checked against the node labels in the network and the nodeset.

It is possible to define as dependent events a different set of events from the ones linked to the default network. This is useful to model different types of events where the event dynamic is driven by different effects or its weight differs. Fisheries_Treaties_6070 provides an example: the relational events modeled are fisheries treaties between countries. The bilatchanges data frame contains information on the creation and dissolution of treaties. vignette("teaching2") shows how to model just the creation of treaties conditional on creation and dissolution.
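The subsetting idea described above can be sketched with base R alone. The toy data frame is hypothetical (its values are not taken from Fisheries_Treaties_6070); it only assumes an increment column in which 1 marks a creation and -1 a dissolution, mirroring how bilatchanges is filtered in the package examples.

```r
# Hypothetical event list: increment = 1 creates a tie, -1 dissolves it.
toy_changes <- data.frame(
  time = c(1, 2, 3, 4, 5),
  sender = c("A", "B", "A", "C", "B"),
  receiver = c("B", "C", "B", "A", "C"),
  increment = c(1, 1, -1, 1, -1)
)

# Keep only creation events as candidate dependent events.
toy_creations <- toy_changes[toy_changes$increment == 1, ]
nrow(toy_creations)
```

In such a setup the full event list would still be linked to the default network with link_events(), while only the filtered rows would be passed to the events argument of make_dependent_events().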

Value

an object with additional class dependent.goldfish with attributes:

nodes

a character vector with the names of the nodes set that define the dimensions of the default_network. nodes and nodes2 arguments.

default_network

A character value with the name of the network object when this is present. default_network argument.

type

A character value that can take values monadic or dyadic depending on the arguments used during the definition.

The object can be modified using methods for data frames.

See Also

make_nodes(), make_network(), link_events()

Examples

actors <- data.frame(
  actor = 1:5, label = paste("Actor", 1:5),
  present = TRUE, gender = sample.int(2, 5, replace = TRUE)
)
actors <- make_nodes(nodes = actors)
calls <- data.frame(
  time = c(12, 27, 45, 56, 66, 68, 87),
  sender = paste("Actor", c(1, 3, 5, 2, 3, 4, 2)),
  receiver = paste("Actor", c(4, 2, 3, 5, 1, 2, 5)), increment = rep(1, 7)
)
callNetwork <- make_network(nodes = actors)
callNetwork <- link_events(
  x = callNetwork, change_events = calls, nodes = actors
)

# Defining the dependent events:
callDependent <- make_dependent_events(
  events = calls, nodes = actors, default_network = callNetwork
)

Define a global time-varying attribute

Description

This function defines a global attribute of the nodeset (i.e., a variable that is identical for each node but changes over time).

Usage

make_global_attribute(global)

make_global_attribute_goldfish(global)

Arguments

global

a data frame containing all the values this global attribute takes over time.

Details

For instance, seasonal climate changes could be defined as a changing global attribute. This global attribute can then be linked to the nodeset by using link_events().

Value

an object of class global.goldfish

Examples

seasons <- make_global_attribute(data.frame(time = 1:12, replace = 1:12))

To define the second mode of a DyNAM-i model

Description

This function creates all objects necessary for the estimation of a DyNAM-i model (model = "DyNAMi") from dyadic interaction records and an actor set. It first creates a nodeset for the second mode of the interaction network that will be modeled, i.e. the set of interaction groups, and an event list that indicates when groups are present or not through time. It then creates a list of interaction events between actors and groups, in which an actor either joins or leaves a group. This list is decomposed into a list of dependent events (that should be modeled) and a list of exogenous events (that should not be modeled). For example, when an actor leaves a group and joins her own singleton group, only the leaving event is modeled but not the joining one, and vice versa when an actor belonging to a singleton group joins another group.

Usage

make_groups_interaction(
  records,
  actors,
  seed_randomization,
  progress = getOption("progress")
)

Arguments

records

an object of class data.frame in which each row has the form nodeA, nodeB, Start, End, where nodeA and nodeB indicate the actors involved in a dyadic interaction, and Start and End indicate the starting and ending times of their interaction.

actors

an object of class nodes.goldfish that defines the actors interacting (labels in records and actors should be identical).

seed_randomization

an integer used whenever there should be some random choice to be made.

progress

logical, whether detailed information of intermediate steps should be printed in the console.

Details

It is important to notice that sometimes random decisions have to be made regarding who joined or left a group, for example when two actors start interacting but we do not know who initiated the interaction. To test the robustness of such a procedure, one can use different randomization seeds and run the model several times.
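Before building the groups, it can help to verify that the records have the expected layout. The sketch below uses hypothetical data and assumes the column names NodeA, NodeB, Start and End as they appear in the rfid and video data frames of RFID_Validity_Study; the checks are illustrative and are not part of the package.

```r
# Hypothetical actors and dyadic interaction records.
toy_actors <- data.frame(label = c("A", "B", "C"), present = TRUE)
toy_records <- data.frame(
  NodeA = c("A", "A", "B"),
  NodeB = c("B", "C", "C"),
  Start = c(10, 40, 55),
  End = c(30, 50, 90)
)

# Every interaction should end no earlier than it starts.
valid_times <- all(toy_records$Start <= toy_records$End)

# Every label in the records should appear in the actor set.
known_labels <- all(
  c(toy_records$NodeA, toy_records$NodeB) %in% toy_actors$label
)

# Seeds one might loop over to assess sensitivity to the random decisions.
seeds <- c(1, 2, 3)
```

With goldfish loaded and the actors passed through make_nodes(), each seed could then be supplied as seed_randomization in a separate call, and the resulting model fits compared.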

Value

a list with the following data frames

interaction.updates

containing all joining and leaving events

groups

containing the nodeset corresponding to interaction groups (the second mode of the network)

dependent.events

for the events that should be modeled

exogenous.events

that are not modeled (for example when an actor leaves a group and joins its own singleton group, only the leaving event is modeled but not the joining event)

composition.changes

that is an events list that should be attached to the groups nodeset to indicate when a group is present or not


Defining a network with dynamic events

Description

The function defines a network object either from a nodeset or from a matrix (sociomatrix or adjacency matrix). If a matrix is used as input, make_network() returns a network filled with the same values as the ones present in the provided matrix. If the nodeset is the only argument, make_network() returns an empty network with the number of columns and rows corresponding to the size of the nodeset. These networks are static, but they can be turned into dynamic networks by linking dynamic events to the network object using link_events().

Usage

make_network(
  matrix = NULL,
  nodes,
  nodes2 = NULL,
  directed = TRUE,
  envir = environment()
)

make_network_goldfish(
  matrix = NULL,
  nodes,
  nodes2 = NULL,
  directed = TRUE,
  envir = environment()
)

Arguments

matrix

An initial matrix (optional), an object of class matrix.

nodes

A node-set (see make_nodes()).

nodes2

A second optional node-set for the definition of two-mode networks.

directed

A logical value indicating whether the network is directed.

envir

An environment object where the nodes-set objects are defined. The default value is environment().

Details

If a matrix is used as input, its dimension names must be a subset of the nodes in the nodeset as defined with make_nodes(), and the order of the labels in rows and columns must correspond to the order of node labels in the nodeset. The matrix can be directed or undirected (as specified with the directed argument).
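The requirement on dimension names can be checked before calling make_network(). The following is a minimal base R sketch for a one-mode network; the objects toy_nodes and toy_matrix and the check itself are hypothetical and only illustrate the rule stated above.

```r
# Hypothetical node set with three labelled nodes.
toy_nodes <- data.frame(label = c("A", "B", "C"))

# Initial matrix whose dimension names follow the order of the node labels.
toy_matrix <- matrix(
  c(0, 1, 0,
    0, 0, 1,
    1, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(toy_nodes$label, toy_nodes$label)
)

# Labels must come from the node set and keep its order.
labels_ok <- all(rownames(toy_matrix) %in% toy_nodes$label) &&
  identical(rownames(toy_matrix), colnames(toy_matrix)) &&
  !is.unsorted(match(rownames(toy_matrix), toy_nodes$label))
```

If labels_ok is TRUE, the matrix and node set could be passed on as make_network(toy_matrix, nodes = toy_nodes) once the node set has been processed with make_nodes().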

If the network is updated over time (e.g., a new wave of friendship data is collected), these changes can be added with link_events(), similar to linking changing attribute events to a nodeset. This time, the user needs to provide the network and the associated nodeset. If no matrix is provided, goldfish only considers the nodeset and assumes the initial state to be empty (i.e., a matrix containing only 0s).

Value

an object with additional class network.goldfish with attributes:

nodes

a character vector with the names of the nodes set objects used during the definition. nodes and nodes2 arguments.

directed

Logical value indicating whether the network is directed. directed argument.

events

An empty character vector. link_events() is used to link event data frames.

The object can be modified using methods for matrix.

See Also

make_nodes(), link_events()

Examples

# If no initial matrix is provided
data("Social_Evolution")
callNetwork <- make_network(nodes = actors)

# If an initial matrix is provided
data("Fisheries_Treaties_6070")
bilatnet <- make_network(bilatnet, nodes = states, directed = FALSE)

Defining a node set with (dynamic) node attributes.

Description

The make_nodes() function processes and checks the data.frame passed to the nodes argument. This is a recommended step before the definition of the network.

Usage

make_nodes(nodes)

make_nodes_goldfish(nodes)

Arguments

nodes

a data.frame object with the nodes attributes with the following reserved names

label

character variable containing the nodes labels (mandatory)

present

logical variable indicating if the respective node is present at the first time-point (optional)

Details

Additional variables in the nodes data frame object are considered as the initial values of the nodes attributes. Those variables must be of class numeric, character, or logical.

It is important that the initial definition of the node set contains all the nodes that could be potential senders or receivers of events. If not all nodes are available at all times, the present variable can be used to define compositional changes. The initial node set would therefore contain all potential sender and receiver nodes, and the variable present would indicate which nodes are present at the beginning as senders or receivers. Using link_events(), it is possible to link events that change the composition of available nodes over time, see vignette("teaching2").

For the attributes in the nodeset to become dynamic, they can be linked to dynamic event-list data frames in the initial state object by using link_events(). A new call of link_events() is required for each attribute that is dynamic.

Objects of class tibble::tbl_df from the tibble package, frequently used in the tidyverse ecosystem, and objects from the data.table package will produce errors in later steps for goldfish. The current implementation of goldfish relies on the subsetting behavior of data frame objects. The previously mentioned object classes change this behavior, producing errors.
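One way to avoid the class issue is to convert the object back to a plain data frame before calling make_nodes(). The sketch below is an assumption about a reasonable workflow, not a documented goldfish requirement; the tibble class tag is added by hand as a stand-in so that the code runs without the tibble package.

```r
# Hypothetical node data, tagged with the tibble classes to mimic
# the output of a tidyverse pipeline.
raw_nodes <- data.frame(
  label = paste("Actor", 1:3),
  present = c(TRUE, TRUE, FALSE),
  gender = c(1, 2, 1)
)
class(raw_nodes) <- c("tbl_df", "tbl", "data.frame")

# Convert back to a plain data frame and make sure labels are character.
plain_nodes <- as.data.frame(raw_nodes)
plain_nodes$label <- as.character(plain_nodes$label)
class(plain_nodes)
```

The resulting plain_nodes object keeps the reserved columns label and present and could then be passed to make_nodes().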

Value

an object with an additional class nodes.goldfish with attributes:

events

An empty character vector. link_events() is used to link event data frames.

dynamic_attributes

An empty character vector. link_events() is used to link event data frames and their related attribute.

The object can be modified using methods for data frames.

See Also

make_network(), link_events()

Examples

nodesAttr <- data.frame(
  label = paste("Actor", 1:5),
  present = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  gender = c(1, 2, 1, 1, 2)
)
nodesAttr <- make_nodes(nodes = nodesAttr)

# Social evolution nodes definition
data("Social_Evolution")
actors <- make_nodes(actors)

# Fisheries treaties nodes definition
data("Fisheries_Treaties_6070")
states <- make_nodes(states)

RFID Validity dataset

Description

Dataset collected at ETH Zürich by Timon Elmer and colleagues in order to test the accuracy of Radio Frequency Identification (RFID) badges for measuring social interactions. Social interactions of 11 individuals (from the university staff) were recorded with RFID badges in an informal setting. They were then compared to the interactions observed by two confederates who watched the video recording of the event. The RFID data went through the data processing procedure detailed in the original article. See Elmer et al. (2019) for more details, and the OSF platform for all details on the dataset.

Usage

data(RFID_Validity_Study)

rfid

video

known.before

participants

Format

3 dataframes:

  • participants (11 rows, 7 columns): attributes of the experiment's participants

  • rfid (1011 rows, 4 columns): dyadic interactions detected by the RFID badges (after data processing)

  • video (219 rows, 4 columns): dyadic interactions detected by the video rating

and one network:

  • known.before (11 rows, 11 columns): network of previous acquaintances

See below for variables and formats.

Object                Description                                                              Format
participants$actor    Identifier of the actor                                                  integer
participants$label    (Anonymized) name                                                        Factor
participants$present  Presence of the actor (all actors are present)                           logical
participants$age      Actor's age                                                              integer
participants$gender   Actor's gender (0: male, 1: female)                                      integer
participants$group    Actor's group affiliation (groups have distinct ids)                     integer
participants$level    Actor's seniority (1: MSc student, 2: PhD student, 3: PostDoc, 4: Prof)  integer
rfid$NodeA            Identifier for the first actor                                           chr
rfid$NodeB            Identifier for the second actor                                          chr
rfid$Start            Time of the beginning of the dyadic interaction                          integer
rfid$End              Time of the end of the dyadic interaction                                integer
video$NodeA           Identifier for the first actor                                           chr
video$NodeB           Identifier for the second actor                                          chr
video$Start           Time of the beginning of the dyadic interaction                          integer
video$End             Time of the end of the dyadic interaction                                integer

An object of class data.frame with 1011 rows and 4 columns.

An object of class data.frame with 219 rows and 4 columns.

An object of class matrix (inherits from array) with 11 rows and 11 columns.

An object of class data.frame with 11 rows and 7 columns.

Source

https://osf.io/rrhxe/

References

Elmer, T., Chaitanya, K., Purwar, P., & Stadtfeld, C. (2019). The validity of RFID badges measuring face-to-face interactions. Behavior research methods, 1-19. doi:10.3758/s13428-018-1180-y


Control Parameters for Estimation

Description

Specifies control parameters for the model estimation process in estimate_dynam(), estimate_rem() and estimate_dynami().

Usage

set_estimation_opt(
  initial_parameters = NULL,
  fixed_parameters = NULL,
  max_iterations = 20,
  convergence_criterion = 0.001,
  initial_damping = NULL,
  damping_increase_factor = 2,
  damping_decrease_factor = 3,
  return_interval_loglik = FALSE,
  return_probabilities = FALSE,
  engine = c("default_c", "default", "gather_compute")
)

Arguments

initial_parameters

A numeric vector. It includes initial parameter values used to initialize the estimation process. Default is NULL, which means parameters are initialized at zero, except for the rate intercept when present.

fixed_parameters

A numeric vector of the same length as the number of parameters to be estimated in the model. NA values indicate parameters to be estimated, while numeric values indicate parameters to be fixed at the given value. For example, if the vector is c(2, NA) then the first component of the parameter is fixed to 2 during the estimation process. Default is NULL (all parameters are estimated).

max_iterations

An integer. The maximum number of iterations in the Gauss-Fisher scoring algorithm. Default is 20.

convergence_criterion

A numeric value. The convergence criterion for the estimation. The algorithm stops if the sum of absolute scores is smaller than this value. Default is 0.001.

initial_damping

A numeric value. The initial damping factor for the Gauss-Fisher scoring algorithm. Default is NULL, which allows estimate_dynam(), estimate_rem() and estimate_dynami() to set a context-dependent default (e.g., 30 or 10 based on whether the model has windowed effects). If set, this value is used directly.

damping_increase_factor

A numeric value. Factor by which damping is increased when improvements in the estimation are found. Must be >= 1. Default is 2.

damping_decrease_factor

A numeric value. Factor by which damping is decreased when no improvements in the estimation are found. Must be >= 1. Default is 3.

return_interval_loglik

A logical value. Whether to keep and return the log-likelihood for each event. Default is FALSE.

return_probabilities

A logical value. Whether to keep and return the probabilities for all alternatives for each event. Default is FALSE.

  • When subModel = "choice" the probabilities correspond to all actors in the choice set present at the time of the event.

  • When model = "REM" the probabilities correspond to all dyads present at the time of the event.

engine

A character string specifying the estimation engine. Options are:

default_c

C++ based implementation using RcppEigen and RcppParallel.

default

R-based implementation.

gather_compute

C++ based implementation with a different data structure that reduces the computation time but can increase the memory usage.

Default is "default_c".

Details

The damping factor arguments control the step size at each iteration of the Newton-Raphson algorithm. They have a bigger impact in the first iterations of the algorithm and will decrease by half after each iteration. In particular, the increase factor is the one that is expected to play a role during the first iterations, where it is easier to improve the log-likelihood. In scenarios where the model is fit to a large dataset, for example when the number of actors in the system is large, the damping_increase_factor and damping_decrease_factor arguments can be increased from the default values (e.g., to 4 or 6) to speed up the estimation process with large changes in the coefficients. However, this can have the opposite effect in small datasets, producing large changes in the coefficients that would require more iterations (in a similar vein to the step size parameter in gradient descent).
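As a hedged illustration of the arguments discussed above, the sketch below builds a fixed_parameters vector and, only when an installed goldfish exports set_estimation_opt(), creates a control object with larger damping factors. The values 4 and 6 are taken from the example range in the text; which factor gets which value, and the two-parameter model, are assumptions made for illustration.

```r
# fixed_parameters: numeric values are held fixed, NA values are estimated.
fixed <- c(2, NA)
estimated_positions <- which(is.na(fixed))

# Larger damping factors, as the text suggests for large datasets.
# The call is skipped when goldfish (or this function) is unavailable.
if (requireNamespace("goldfish", quietly = TRUE) &&
    "set_estimation_opt" %in% getNamespaceExports("goldfish")) {
  est_ctrl_large <- goldfish::set_estimation_opt(
    fixed_parameters = fixed,
    damping_increase_factor = 4,
    damping_decrease_factor = 6
  )
}
```

Here estimated_positions identifies the second parameter as the only one left free, matching the c(2, NA) example in the description of fixed_parameters.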

Value

An object of class estimation_opt.goldfish (a list object), where the components values are the default values or the values provided to the function. The list object has the following components:

initial_parameters

Initial parameter values used during the estimation process.

fixed_parameters

Values for parameters fixed during the estimation process.

max_iterations

Maximum number of iterations in the estimation process.

convergence_criterion

Convergence criterion for the estimation process.

initial_damping

Initial damping factor for the estimation process.

damping_increase_factor

Factor by which damping is increased when improvements in the estimation are found.

damping_decrease_factor

Factor by which damping is decreased when no improvements in the estimation are found.

return_interval_loglik

Logical value indicating whether to return the log-likelihood for each event.

engine

Estimation engine used in the estimation process.

Examples

est_ctrl <- set_estimation_opt(
  max_iterations = 50,
  convergence_criterion = 1e-4
)

Control Parameters for Preprocessing

Description

Specifies control parameters for the data preprocessing stage, used by estimate_dynam(), estimate_rem() and estimate_dynami() (when preprocessingInit is not a preprocessed.goldfish object) and gather_model_data().

Usage

set_preprocessing_opt(
  start_time = NULL,
  end_time = NULL,
  opportunities_list = NULL
)

Arguments

start_time

A numerical value or a date-time character string (parsable by as.POSIXct) indicating the starting time from which events are considered for likelihood computation. All the events that happen before start_time are used to compute the initial values of the effect statistics in the model. Setting this parameter is useful when the model has windowed effects or effects that depend on the previous order of events (e.g., trans() and cycle() when the history argument is set to sequential or consecutive), as they are initialized with empty values. Default is NULL (start from the first event).

end_time

A numerical value or a date-time character string (parsable by as.POSIXct) indicating the end time after which events are no longer considered for likelihood computation. The preprocessing stage won't stop at this time and will continue processing events after this time. Default is NULL (end with the last event).

opportunities_list

A list object. For choice models, this list specifies, for each dependent event, the set of available nodes in the choice set. The list should have the same length as the number of events in the dependent events objects created with make_dependent_events(). Default is NULL, so the choice set is the set of all nodes present at the time of the event.
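The shapes of these arguments can be sketched in base R. The example converts a date-time string to its numeric form and builds a list with one element per dependent event. The toy events, the rule "every node except the sender", the use of node labels as list elements, and tz = "UTC" are all assumptions for illustration; the source only states that the list must have one entry per dependent event.

```r
# A date-time string and its numeric equivalent (seconds since 1970-01-01,
# interpreted here in UTC, which is an assumption).
start_chr <- "2000-01-01 00:00:00"
start_num <- as.numeric(as.POSIXct(start_chr, tz = "UTC"))

# Hypothetical dependent events and node labels.
toy_events <- data.frame(
  time = c(5, 9, 14),
  sender = c("A", "B", "A"),
  receiver = c("B", "C", "C")
)
all_nodes <- c("A", "B", "C", "D")

# One element per dependent event: here, every node except the sender.
toy_opportunities <- lapply(
  toy_events$sender,
  function(s) setdiff(all_nodes, s)
)

# The list length has to match the number of dependent events.
length(toy_opportunities) == nrow(toy_events)
```

A list built this way could then be supplied as opportunities_list, provided its element format matches what the installed goldfish version expects.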

Value

An object of class preprocessing_opt.goldfish (a list object), where the component values are the default values or the values provided to the function. The list object has the following components:

start_time

Value from start_time argument.

end_time

Value from end_time argument.

opportunities_list

Value from opportunities_list argument.

Examples

prep_ctrl <- set_preprocessing_opt(
  start_time = "2000-01-01 00:00:00",
  end_time = "2000-12-31 23:59:59"
)

Social evolution of a university dormitory cohort

Description

An abbreviated version of the MIT Reality Commons Social Evolution dataset, spanning a reduced time period and with fewer variables. Dyadic variables include binary friendships at time of survey, and time-stamped phone call occurrences. Individual variables include the floor of the dormitory on which the student resides, and the grade type of each student including freshmen, sophomore, junior, senior, or graduate tutors.

Usage

data(Social_Evolution)

actors

calls

friendship

Format

3 dataframes: actors (84 rows, 4 columns), calls (439 rows, 4 columns), friendship (766 rows, 4 columns). See below for variables and formats.

Object               Description                               Format
actors$label         Actor identifier labels                   character
actors$present       Actor present in dataset                  boolean
actors$floor         Floor of residence actor lives on         numeric (1-9)
actors$gradeType     Degree level                              numeric (1-5)
calls$time           Time and date of call                     numeric from POSIXct
calls$sender         Initiator of phone call                   character
calls$receiver       Recipient of phone call                   character
calls$increment      Indicates call number increment (all 1s)  numeric (1)
friendship$time      Time and date of friend nomination        numeric from POSIXct
friendship$sender    Nominator of friendship                   character
friendship$receiver  Nominee of friendship                     character
friendship$replace   Indicates friendship value at $time       numeric

An object of class data.frame with 84 rows and 4 columns.

An object of class data.frame with 439 rows and 4 columns.

An object of class data.frame with 766 rows and 4 columns.

References

A. Madan, M. Cebrian, S. Moturu, K. Farrahi, A. Pentland (2012). Sensing the 'Health State' of a Community. Pervasive Computing. 11, 4, pp. 36-45. doi:10.1109/MPRV.2011.79.


Methods to update a nodes or network object

Description

Methods to create a data frame from an object of class nodes.goldfish (see make_nodes()) or a matrix from an object of class network.goldfish (see make_network()) with the attributes or the network ties updated according to the events linked to the object using the link_events() function.

Usage

## S3 method for class 'nodes.goldfish'
as.data.frame(x, ..., time = -Inf, startTime = -Inf, envir = new.env())

## S3 method for class 'network.goldfish'
as.matrix(x, ..., time = -Inf, startTime = -Inf)

Arguments

x

an object of class nodes.goldfish for as.data.frame() method or network.goldfish for as.matrix() method.

...

No further arguments are required.

time

a numeric value or a calendar date value (see as.Date()) to update the state of the object x until this time value (event time < time).

startTime

a numeric or as.Date format value; prior events are disregarded.

envir

an environment where the nodes and linked events objects are available.

Value

The respective object updated according to the events linked to it. For a nodes.goldfish object, the attributes are updated according to the events linked to them. For a network.goldfish object, the network ties are updated according to the events linked to it.
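The update rule (apply every linked event with event time < time) can be illustrated with a small base R sketch. This is not the goldfish implementation; the helper state_at(), the toy data, and the event columns time, node and replace are assumptions chosen to show the strict inequality.

```r
# Hypothetical initial attribute values and replace-type events.
toy_nodes <- data.frame(label = c("A", "B"), gdp = c(1.0, 2.0))
toy_events <- data.frame(
  time = c(3, 7, 9),
  node = c("A", "B", "A"),
  replace = c(1.5, 2.5, 1.8)
)

# Apply, in time order, every event whose time is strictly before `time`.
state_at <- function(nodes, events, time) {
  past <- events[events$time < time, ]
  past <- past[order(past$time), ]
  for (i in seq_len(nrow(past))) {
    nodes$gdp[nodes$label == past$node[i]] <- past$replace[i]
  }
  nodes
}

state_before_9 <- state_at(toy_nodes, toy_events, time = 9)
state_initial <- state_at(toy_nodes, toy_events, time = -Inf)
```

In this sketch the event at time 9 is not applied when time = 9, and the default time = -Inf leaves the initial values unchanged.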

See Also

make_network(), make_nodes(), link_events()

Examples

data("Fisheries_Treaties_6070")
states <- make_nodes(states)
states <- link_events(states, sovchanges, attribute = "present")
states <- link_events(states, regchanges, attribute = "regime")
states <- link_events(states, gdpchanges, attribute = "gdp")

bilatnet <- make_network(bilatnet, nodes = states, directed = FALSE)
bilatnet <- link_events(bilatnet, bilatchanges, nodes = states)

updateStates <- as.data.frame(
  states,
  time = as.numeric(as.POSIXct("1965-12-31"))
)


updateNet <- as.matrix(bilatnet, time = as.numeric(as.POSIXct("1965-12-31")))