Title: | Exponential Random Partition Models |
---|---|
Description: | Simulates and estimates the Exponential Random Partition Model presented in Hoffman, Block, and Snijders (2023) <doi:10.1177/00811750221145166>. It can also be used to estimate models for longitudinal partitions, following the model proposed in Hoffman and Chabot (2023) <doi:10.1016/j.socnet.2023.04.002>. The model is an exponential family distribution on the space of partitions (sets of non-overlapping groups) and is named in reference to the Exponential Random Graph Models (ERGM) for networks. |
Authors: | Marion Hoffman [cre, aut, cph] , Alexandra Amani [aut], Nico Keiser [aut] |
Maintainer: | Marion Hoffman <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.2.0.9000 |
Built: | 2025-01-03 04:58:58 UTC |
Source: | https://github.com/stocnet/erpm |
Function to calculate the number of partitions with groups of sizes between smin and smax
Bell_constraints(n, smin, smax)
n |
number of nodes |
smin |
minimum group size possible in the partition |
smax |
maximum group size possible in the partition |
a numeric
n <- 6
size_min <- 2
size_max <- 4
Bell_constraints(n, size_min, size_max)
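The count can be understood through a recursion on the group that contains one fixed node: if that group has size k, its other k - 1 members can be chosen in choose(n - 1, k - 1) ways, and the remaining n - k nodes are then partitioned under the same size constraints. The following base R sketch illustrates this idea. `count_restricted_partitions` is a hypothetical helper written for this illustration, not the package's implementation of `Bell_constraints`.

```r
# Hypothetical helper (not part of the package): counts the partitions of n
# nodes whose groups all have a size between smin and smax.
count_restricted_partitions <- function(n, smin, smax) {
  stopifnot(smin >= 1, smin <= smax)
  counts <- numeric(n + 1)  # counts[m + 1] = number of valid partitions of m nodes
  counts[1] <- 1            # zero nodes: exactly one (empty) partition
  for (m in seq_len(n)) {
    total <- 0
    for (k in smin:smax) {
      if (k > m) break
      # choose the k - 1 companions of a fixed node, then partition the rest
      total <- total + choose(m - 1, k - 1) * counts[m - k + 1]
    }
    counts[m + 1] <- total
  }
  counts[n + 1]
}

count_restricted_partitions(6, 2, 4)  # 15 + 10 + 15 = 40
count_restricted_partitions(6, 1, 6)  # no restriction: the Bell number B(6) = 203
```

For n = 6 with sizes between 2 and 4, the three admissible size structures are 2-2-2 (15 partitions), 3-3 (10) and 4-2 (15).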
Recursive function to calculate the denominator for the model with a single statistic for the number of groups and a given parameter value. The set of possible partitions can be restricted to partitions with groups of a certain size.
calculate_denominator_Dirichlet_restricted(n, smin, smax, alpha, results)
n |
number of nodes |
smin |
minimum size for a group |
smax |
maximum size for a group |
alpha |
parameter value |
results |
a list |
a numeric
Calculate the probability of observing a partition with a given number of groups for a model with a single statistic for the number of groups and a given parameter value. The set of possible partitions can be restricted to partitions with groups of a certain size.
calculate_proba_Dirichlet_restricted(alpha, stat, n, smin, smax)
alpha |
parameter value |
stat |
observed stat (number of groups) |
n |
number of nodes |
smin |
minimum size for a group |
smax |
maximum size for a group |
a numeric
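For a model whose only statistic is the number of groups, the denominator is the sum of exp(alpha * number of groups) over all allowed partitions. A minimal sketch of how such a sum can be computed recursively is given below: each group contributes a factor exp(alpha), and the recursion runs over the size of the group containing a fixed node. The helper names `restricted_denominator` and `partition_probability` are hypothetical, and reading the probability as that of one specific partition with `stat` groups is an assumption, not a statement about the package's internals.

```r
# Hypothetical helper: sum of exp(alpha * number of groups) over all partitions
# of n nodes whose groups have sizes between smin and smax.
restricted_denominator <- function(n, smin, smax, alpha) {
  stopifnot(smin >= 1, smin <= smax)
  z <- numeric(n + 1)  # z[m + 1] = denominator for m nodes
  z[1] <- 1
  for (m in seq_len(n)) {
    k <- seq_len(min(smax, m))
    k <- k[k >= smin]  # admissible sizes for the group of a fixed node
    z[m + 1] <- sum(choose(m - 1, k - 1) * exp(alpha) * z[m - k + 1])
  }
  z[n + 1]
}

# Assumed interpretation: probability of one given partition with `stat` groups
partition_probability <- function(alpha, stat, n, smin, smax) {
  exp(alpha * stat) / restricted_denominator(n, smin, smax, alpha)
}

restricted_denominator(6, 2, 4, alpha = 0)      # 40, the number of allowed partitions
partition_probability(0, stat = 3, 6, 2, 4)     # 1 / 40 under the uniform model
```

With alpha = 0 every allowed partition has the same weight, so the denominator reduces to the number of allowed partitions.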
Function to determine whether a partition contains the allowed group sizes
check_sizes(partition, sizes.allowed, numgroups.allowed)
partition |
observed partition |
sizes.allowed |
vector containing possible group sizes in the partition |
numgroups.allowed |
vector containing possible number of groups in the partition |
boolean
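The check can be pictured as tabulating the group sizes of the partition and comparing them with the allowed values. The sketch below assumes that the partition is a vector of group labels and that both constraints must hold; `partition_has_allowed_sizes` is a hypothetical name, and the package's handling of `NULL` or missing values may differ.

```r
# Hypothetical helper: TRUE if every group size and the number of groups
# are among the allowed values.
partition_has_allowed_sizes <- function(partition, sizes.allowed, numgroups.allowed) {
  group_sizes <- as.vector(table(partition))  # one entry per group
  all(group_sizes %in% sizes.allowed) &&
    length(group_sizes) %in% numgroups.allowed
}

p <- c(1, 1, 2, 2, 2, 3)  # group sizes 2, 3 and 1
partition_has_allowed_sizes(p, sizes.allowed = 1:3, numgroups.allowed = 1:6)  # TRUE
partition_has_allowed_sizes(p, sizes.allowed = 2:3, numgroups.allowed = 1:6)  # FALSE (singleton)
```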
Recursive function to compute the average size of a random partition for a given number of nodes
compute_averagesize(num.nodes)
num.nodes |
number of nodes |
a numeric
n <- 6
compute_averagesize(n)
Recursive function to compute the value of the denominator for the model with a single statistic which is the number of groups
compute_numgroups_denominator(num.nodes, alpha)
num.nodes |
number of nodes |
alpha |
parameter value |
a numeric
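Without size restrictions, the number of partitions of n nodes into k groups is the Stirling number of the second kind S(n, k), so the denominator can be written as the sum over k of S(n, k) * exp(alpha * k). The sketch below computes it that way and then maximises the resulting one-parameter log-likelihood with `optimize`. The helpers `stirling2_row`, `numgroups_denominator` and `log_likelihood` are hypothetical names for this illustration; the package's own function uses a recursion whose details are not shown here.

```r
# Hypothetical helper: Stirling numbers of the second kind S(n, 1), ..., S(n, n),
# built with S(m, k) = k * S(m - 1, k) + S(m - 1, k - 1).
stirling2_row <- function(n) {
  row <- 1  # S(1, 1)
  for (m in seq_len(n)[-1]) {
    row <- c(row, 0) * seq_len(m) + c(0, row)
  }
  row
}

# Denominator of the model with the number of groups as only statistic
numgroups_denominator <- function(n, alpha) {
  k <- seq_len(n)
  sum(stirling2_row(n) * exp(alpha * k))
}

numgroups_denominator(6, 0)  # 203, the Bell number B(6)

# Log-likelihood of alpha for an observed partition with `observed_groups` groups
log_likelihood <- function(alpha, n, observed_groups) {
  alpha * observed_groups - log(numgroups_denominator(n, alpha))
}
fit <- optimize(log_likelihood, interval = c(-5, 5),
                n = 6, observed_groups = 3, maximum = TRUE)
fit$maximum  # ML estimate of alpha for 3 groups among 6 nodes
```

At the maximum, the expected number of groups under the model matches the observed number, which is the usual moment condition for exponential family models.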
Function that computes the statistic vector for a given partition and a given model
computeStatistics(partition, nodes, effects, objects)
partition |
vector, A partition |
nodes |
data frame, Node set |
effects |
list with a vector "names", and a vector "objects", Effects/sufficient statistics |
objects |
list with a vector "name", and a vector "object", Objects used for statistics calculation |
the statistics
Function that computes the statistic vector for given (multiple) partitions and a given model
computeStatistics_multiple( partitions, presence.tables, nodes, effects, objects, single.obs = NULL )
partitions |
Observed partitions |
presence.tables |
matrix indicating which nodes were present at each observation |
nodes |
Node set (data frame) |
effects |
Effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
Objects used for statistics calculation (list with a vector "name", and a vector "object") |
single.obs |
NULL by default |
A list
This function computes the correlation between the group averages of the two attributes.
correlation_between(partition, attribute1, attribute2)
partition |
A partition (vector) |
attribute1 |
A vector containing the values of the first attribute |
attribute2 |
A vector containing the values of the second attribute |
A number corresponding to the correlation coefficient
p <- c(1,2,2,3,3,4,4,4,5)
at <- c(3,5,23,2,1,0,3,9,2)
at2 <- c(3,5,20,2,1,0,0,9,0)
correlation_between(p,at,at2)
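One way to read this statistic is as a two-step computation: average each attribute within every group, then correlate the two vectors of group averages. The sketch below follows that reading with base R. `between_group_correlation` is a hypothetical helper, and treating each group as a single data point (rather than weighting groups by their size) is an assumption.

```r
# Hypothetical helper: correlation between the group means of two attributes,
# with each group counted once.
between_group_correlation <- function(partition, attribute1, attribute2) {
  mean1 <- as.vector(tapply(attribute1, partition, mean))
  mean2 <- as.vector(tapply(attribute2, partition, mean))
  cor(mean1, mean2)
}

p   <- c(1, 2, 2, 3, 3, 4, 4, 4, 5)
at  <- c(3, 5, 23, 2, 1, 0, 3, 9, 2)
at2 <- c(3, 5, 20, 2, 1, 0, 0, 9, 0)

tapply(at, p, mean)   # group means of the first attribute: 3, 14, 1.5, 4, 2
between_group_correlation(p, at, at2)
```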
This function computes the correlation between an attribute and the size of the groups.
correlation_with_size(partition, attribute, categorical)
partition |
A partition (vector) |
attribute |
A vector containing the values of the attribute |
categorical |
A Boolean (True or False) indicating if the attribute is categorical |
A number corresponding to the correlation coefficient if the attribute is numerical or the correlation ratio if the attribute is categorical.
p <- c(1,2,2,3,3,4,4,4,5)
at <- c(3,5,23,2,1,0,3,9,2)
correlation_with_size(p,at,categorical=FALSE)
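For a numerical attribute, the computation can be sketched as follows: attach to every node the size of the group it belongs to, then correlate the attribute with that size. This sketch assumes the correlation is taken at the level of individual nodes; `size_of_own_group` is a hypothetical helper, and the categorical case (correlation ratio) is not covered here.

```r
p  <- c(1, 2, 2, 3, 3, 4, 4, 4, 5)
at <- c(3, 5, 23, 2, 1, 0, 3, 9, 2)

# Hypothetical helper: size of the group each node belongs to
size_of_own_group <- function(partition) {
  group_sizes <- table(partition)
  as.vector(group_sizes[as.character(partition)])
}

own_size <- size_of_own_group(p)  # 1 2 2 2 2 3 3 3 1
cor(at, own_size)
```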
This function computes the correlation between the two attributes for individuals in the same group.
correlation_within(partition, attribute1, attribute2, group)
partition |
A partition (vector) |
attribute1 |
A vector containing the values of the first attribute |
attribute2 |
A vector containing the values of the second attribute |
group |
A number indicating the selected group |
A number corresponding to the correlation coefficient
p <- c(1,2,2,3,3,4,4,4,5)
at <- c(3,5,23,2,1,0,3,9,2)
at2 <- c(3,5,20,2,1,0,0,9,0)
correlation_within(p,at,at2,4)
Function to count the number of partitions with a certain group size structure, for all possible group size structures. To be used after calling the "find_all_partitions" function.
count_classes(allpartitions)
allpartitions |
matrix containing all possible partitions for a nodeset |
integer (number of partitions with different group structures)
# find partitions first
n <- 6
all_partitions <- find_all_partitions(n)
# count classes
counts_partition_classes <- count_classes(all_partitions)
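To make the idea of a "group size structure" concrete, the sketch below enumerates all partitions of a small node set and tabulates them by their sorted group sizes. It is a self-contained illustration in base R: `enumerate_partitions` and `size_structure` are hypothetical helpers, and storing one partition per row is an assumption about the layout, not a description of what `find_all_partitions` returns.

```r
# Hypothetical helper: all partitions of n nodes as rows of a matrix.
# Node 1 gets label 1; each later node gets a label of at most
# (largest label used so far) + 1, so every partition appears exactly once.
enumerate_partitions <- function(n) {
  parts <- matrix(1L, nrow = 1, ncol = 1)
  for (i in seq_len(n)[-1]) {
    extended <- list()
    for (r in seq_len(nrow(parts))) {
      prefix <- parts[r, ]
      for (label in seq_len(max(prefix) + 1L)) {
        extended[[length(extended) + 1L]] <- c(prefix, label)
      }
    }
    parts <- do.call(rbind, extended)
  }
  parts
}

# Hypothetical helper: group sizes in decreasing order, e.g. "2-1-1"
size_structure <- function(partition) {
  paste(sort(as.vector(table(partition)), decreasing = TRUE), collapse = "-")
}

all_parts <- enumerate_partitions(4)  # 15 partitions of 4 nodes
structure_counts <- table(apply(all_parts, 1, size_structure))
structure_counts
```

For 4 nodes this gives one partition each for the structures 4 and 1-1-1-1, four for 3-1, three for 2-2 and six for 2-1-1.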
This function tests a partition statistic against a "conditional uniform partition" null hypothesis: it compares a statistic computed on an observed partition with the same statistic computed on a set of permuted partitions (partitions with the same group structure as the observed partition, with nodes being permuted).
CUP(observation, fun, permutations = NULL, num.permutations = 1000)
observation |
A vector giving the observed partition |
fun |
A function used to compute a given partition statistic |
permutations |
A matrix whose rows contain partitions which are permutations of the observed partition. This argument is NULL by default (in that case, the permutations are created automatically). |
num.permutations |
An integer indicating the number of permutations to generate, if they are not already given. 1000 permutations are generated by default. |
This test is similar to Conditional Uniform Graph (CUG) tests for networks, transposed here to Conditional Uniform Partition (CUP) tests.
The value of the statistic calculated for the observed partition, the mean value of the statistic among permuted partitions, the standard deviation of the statistic among permuted partitions, the proportion of permutations below the observed statistic, the proportion of permutations above the observed statistic, the lower boundary of the 95% CI, the upper boundary of the 95% CI
p <- c(1,2,2,3,3,4,4,4,5)
at <- c(0,1,1,1,1,0,0,0,0)
CUP(p,fun=function(x){same_pairs(x,at,'avg_pergroup')})
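The logic of the test can be sketched in a few lines of base R: shuffling the entries of the partition vector keeps the group sizes fixed while reassigning nodes, and the statistic is recomputed on each shuffled partition. The sketch below is illustrative only. `cup_sketch` and `same_attribute_pairs` are hypothetical helpers, and using empirical quantiles for the 95% interval is an assumption, since the way the package computes its interval is not documented here.

```r
# Hypothetical helper: permutation test of a partition statistic
cup_sketch <- function(observation, fun, num.permutations = 1000) {
  observed <- fun(observation)
  # sample() permutes the labels over the nodes, preserving the group sizes
  permuted <- replicate(num.permutations, fun(sample(observation)))
  c(observed   = observed,
    mean       = mean(permuted),
    sd         = sd(permuted),
    prop_below = mean(permuted < observed),
    prop_above = mean(permuted > observed),
    ci_lower   = unname(quantile(permuted, 0.025)),
    ci_upper   = unname(quantile(permuted, 0.975)))
}

p  <- c(1, 2, 2, 3, 3, 4, 4, 4, 5)
at <- c(0, 1, 1, 1, 1, 0, 0, 0, 0)

# Illustrative statistic: number of same-group pairs with the same attribute value
same_attribute_pairs <- function(partition) {
  same_group <- outer(partition, partition, "==")
  same_value <- outer(at, at, "==")
  sum(same_group & same_value & upper.tri(same_group))
}

set.seed(1)
result <- cup_sketch(p, same_attribute_pairs)
result
```

In this example all five same-group pairs share the same attribute value, so the observed statistic sits at its maximum and no permutation can exceed it.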
Function to sample the model with a Markov chain (multiple partitions procedure).
draw_Metropolis_multiple( theta, first.partitions, presence.tables, nodes, effects, objects, burnin, thining, num.steps, neighborhood = c(0.7, 0.3, 0), numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, return.all.partitions = FALSE, verbose = FALSE )
theta |
model parameters |
first.partitions |
starting partitions for the Markov chain |
presence.tables |
matrix indicating which actors were present for each observation (mandatory) |
nodes |
node set (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between samples |
num.steps |
number of samples |
neighborhood |
way of choosing partitions: probability vector (2 actors swap, merge/division, single actor move, single pair move, 2 pairs swap, 2 groups reshuffle). Defaults to c(0.7, 0.3, 0). |
numgroups.allowed |
vector containing the number of groups allowed in the partition (currently, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
vector of group sizes allowed in sampling (currently, it only works for vectors like size_min:size_max) |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (currently, it only works for vectors like size_min:size_max) |
return.all.partitions |
option to return the sampled partitions on top of their statistics (for GOF). Defaults to FALSE. |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
A list
# define an arbitrary set of n = 6 nodes with attributes, and an arbitrary covariate matrix
n <- 6
nodes <- data.frame(label = c("A","B","C","D","E","F"),
                    gender = c(1,1,2,1,2,2),
                    age = c(20,22,25,30,30,31))
friendship <- matrix(c(0, 1, 1, 1, 0, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 0, 0,
                       0, 1, 1, 0, 0, 1,
                       0, 0, 0, 0, 1, 0), 6, 6, TRUE)

# specify whether nodes are present at different points of time
presence.tables <- matrix(c(1, 1, 1, 1, 1, 1,
                            0, 1, 1, 1, 1, 1,
                            1, 0, 1, 1, 1, 1), 6, 3)

# choose effects to be included in the estimated model
effects_multiple <- list(names = c("num_groups","same","diff","tie","inertia_1"),
                         objects = c("partitions","gender","age","friendship","partitions"),
                         objects2 = c("","","","",""))
objects_multiple <- list()
objects_multiple[[1]] <- list(name = "friendship", object = friendship)

# set parameter values for each of these effects
parameters <- c(-0.2,0.2,-0.1,0.5,1)

# set a starting point for the simulation
first.partitions <- matrix(c(1, 1, 2, 2, 2, 3,
                             NA, 1, 1, 2, 2, 2,
                             1, NA, 2, 3, 3, 1), 6, 3)

# generate the simulated sample
nsteps <- 50
sample <- draw_Metropolis_multiple(theta = parameters,
                                   first.partitions = first.partitions,
                                   nodes = nodes,
                                   presence.tables = presence.tables,
                                   effects = effects_multiple,
                                   objects = objects_multiple,
                                   burnin = 100,
                                   thining = 100,
                                   num.steps = nsteps,
                                   neighborhood = c(0,1,0),
                                   numgroups.allowed = 1:n,
                                   numgroups.simulated = 1:n,
                                   sizes.allowed = 1:n,
                                   sizes.simulated = 1:n,
                                   return.all.partitions = TRUE)
Function to sample the model with a Markov chain (single partition procedure).
draw_Metropolis_single( theta, first.partition, nodes, effects, objects, burnin, thining, num.steps, neighborhood = c(0.7, 0.3, 0), numgroups.allowed = NULL, numgroups.simulated = NULL, sizes.allowed = NULL, sizes.simulated = NULL, return.all.partitions = FALSE )
theta |
model parameters |
first.partition |
starting partition for the Markov chain |
nodes |
nodeset (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between samples |
num.steps |
number of samples |
neighborhood |
way of choosing partitions: probability vector (2 actors swap, merge/division, single actor move, single pair move, 2 pairs swap, 2 groups reshuffle). Defaults to c(0.7, 0.3, 0). |
numgroups.allowed |
vector containing the number of groups allowed in the partition (currently, it only works with vectors like num_min:num_max). Defaults to NULL. |
numgroups.simulated |
vector containing the number of groups simulated. Defaults to NULL. |
sizes.allowed |
vector of group sizes allowed in sampling (currently, it only works for vectors like size_min:size_max). Defaults to NULL. |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (currently, it only works for vectors like size_min:size_max). Defaults to NULL. |
return.all.partitions |
option to return the sampled partitions on top of their statistics (for GOF). Defaults to FALSE. |
A list
# define an arbitrary set of n = 6 nodes with attributes, and an arbitrary covariate matrix
n <- 6
nodes <- data.frame(label = c("A","B","C","D","E","F"),
                    gender = c(1,1,2,1,2,2),
                    age = c(20,22,25,30,30,31))
friendship <- matrix(c(0, 1, 1, 1, 0, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 0, 0,
                       0, 1, 1, 0, 0, 1,
                       0, 0, 0, 0, 1, 0), 6, 6, TRUE)

# choose the effects to be included (see manual for all effect names)
effects <- list(names = c("num_groups","same","diff","tie"),
                objects = c("partition","gender","age","friendship"))
objects <- list()
objects[[1]] <- list(name = "friendship", object = friendship)

# set parameter values for each of these effects
parameters <- c(-0.2, 0.2, -0.1, 0.5)

# generate simulated sample, by setting the desired additional parameters for the
# Metropolis sampler and choosing a starting point for the chain (first.partition)
nsteps <- 100
sample <- draw_Metropolis_single(theta = parameters,
                                 first.partition = c(1,1,2,2,3,3),
                                 nodes = nodes,
                                 effects = effects,
                                 objects = objects,
                                 burnin = 100,
                                 thining = 10,
                                 num.steps = nsteps,
                                 neighborhood = c(0,1,0),
                                 numgroups.allowed = 1:n,
                                 numgroups.simulated = 1:n,
                                 sizes.allowed = 1:n,
                                 sizes.simulated = 1:n,
                                 return.all.partitions = TRUE)

# or: simulate an estimated model
# (this requires `estimation`, the output of a previous call to estimate_ERPM)
partition <- c(1,1,2,2,2,3) # the partition already defined for the (previous) estimation
nsimulations <- 1000
simulations <- draw_Metropolis_single(theta = estimation$results$est,
                                      first.partition = partition,
                                      nodes = nodes,
                                      effects = effects,
                                      objects = objects,
                                      burnin = 100,
                                      thining = 20,
                                      num.steps = nsimulations,
                                      neighborhood = c(0,1,0),
                                      sizes.allowed = 1:n,
                                      sizes.simulated = 1:n,
                                      return.all.partitions = TRUE)
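To illustrate how the burn-in, thinning and number of samples interact in such a sampler, here is a minimal Metropolis sketch for the simplest model, with the number of groups as the only statistic. It is not the package's algorithm: the function `sample_numgroups_model` and its argument `thinning` are hypothetical, and the proposal used here (remove one random node, then place it uniformly in one of the remaining groups or in a new singleton) is a swapped-in single-node move chosen because it is symmetric, so the acceptance ratio only involves the change in the statistic.

```r
# Hypothetical sampler (illustration only): target probability proportional to
# exp(alpha * number of groups) over all partitions of the nodes.
sample_numgroups_model <- function(alpha, first.partition, burnin, thinning, num.steps) {
  relabel <- function(x) match(x, unique(x))  # labels 1..K in order of appearance
  current <- relabel(first.partition)
  n <- length(current)
  draws <- matrix(NA_integer_, nrow = num.steps, ncol = n)
  for (step in seq_len(burnin + thinning * num.steps)) {
    # Symmetric proposal: move one random node to a remaining group or a new singleton
    i <- sample.int(n, 1)
    choices <- c(unique(current[-i]), max(current) + 1L)
    proposal <- current
    proposal[i] <- choices[sample.int(length(choices), 1)]
    proposal <- relabel(proposal)
    # Metropolis acceptance: after relabelling, max() is the number of groups
    log_ratio <- alpha * (max(proposal) - max(current))
    if (log(runif(1)) < log_ratio) current <- proposal
    # Keep one partition every `thinning` steps once the burn-in is over
    if (step > burnin && (step - burnin) %% thinning == 0) {
      draws[(step - burnin) / thinning, ] <- current
    }
  }
  draws
}

set.seed(1)
draws <- sample_numgroups_model(alpha = -0.2, first.partition = c(1, 1, 2, 2, 3, 3),
                                burnin = 100, thinning = 10, num.steps = 100)
table(apply(draws, 1, max))  # distribution of the number of groups in the sample
```

The chain therefore runs for burnin + thinning * num.steps steps in total and returns num.steps partitions, one per row.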
Function to estimate a given model for a given observed partition. All options of the algorithm can be specified here.
estimate_ERPM( partition, nodes, objects, effects, startingestimates, gainfactor = 0.1, a.scaling = 0.8, r.truncation.p1 = -1, r.truncation.p2 = -1, burnin = 30, thining = 10, length.p1 = 100, min.iter.p2 = NULL, max.iter.p2 = NULL, multiplication.iter.p2 = 100, num.steps.p2 = 6, length.p3 = 1000, neighborhood = c(0.7, 0.3, 0), fixed.estimates = NULL, numgroups.allowed = NULL, numgroups.simulated = NULL, sizes.allowed = NULL, sizes.simulated = NULL, double.averaging = FALSE, inv.zcov = NULL, inv.scaling = NULL, parallel = FALSE, parallel2 = FALSE, cpus = 1, verbose = FALSE )
partition |
observed partition |
nodes |
nodeset (data frame) |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
startingestimates |
first guess for the model parameters |
gainfactor |
numeric used to decrease the size of steps made in the Newton optimization |
a.scaling |
numeric used to reduce the influence of non-diagonal elements in the scaling matrix (for stability) |
r.truncation.p1 |
numeric used to limit extreme values in the covariance matrix (for stability) |
r.truncation.p2 |
numeric used to limit extreme values in the covariance matrix (for stability) |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between samples |
length.p1 |
number of samples in phase 1 |
min.iter.p2 |
minimum number of sub-steps in phase 2 |
max.iter.p2 |
maximum number of sub-steps in phase 2 |
multiplication.iter.p2 |
value for the lengths of sub-steps in phase 2 (multiplied by 2.52^k) |
num.steps.p2 |
number of optimisation steps in phase 2 |
length.p3 |
number of samples in phase 3 |
neighborhood |
way of choosing partitions: probability vector (actors swap, merge/division, single actor move) |
fixed.estimates |
if some parameters are fixed, list with as many elements as effects, these elements equal a fixed value if needed, or NULL if they should be estimated |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (currently, it only works for vectors like size_min:size_max) |
double.averaging |
option to average the statistics sampled in each sub-step of phase 2 |
inv.zcov |
initial value of the inverted covariance matrix (if a phase 3 was run before) to bypass the phase 1 |
inv.scaling |
initial value of the inverted scaling matrix (if a phase 3 was run before) to bypass the phase 1 |
parallel |
whether the phase 1 and 3 should be parallelized |
parallel2 |
whether there should be several phases 2 run in parallel |
cpus |
how many cores can be used |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
A list with the outputs of the three different phases of the algorithm
# define an arbitrary set of n = 6 nodes with attributes, and an arbitrary covariate matrix
n <- 6
nodes <- data.frame(label = c("A","B","C","D","E","F"),
                    gender = c(1,1,2,1,2,2),
                    age = c(20,22,25,30,30,31))
friendship <- matrix(c(0, 1, 1, 1, 0, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 0, 0,
                       0, 1, 1, 0, 0, 1,
                       0, 0, 0, 0, 1, 0), 6, 6, TRUE)

# choose the effects to be included (see manual for all effect names)
effects <- list(names = c("num_groups","same","diff","tie"),
                objects = c("partition","gender","age","friendship"))
objects <- list()
objects[[1]] <- list(name = "friendship", object = friendship)

# define observed partition
partition <- c(1,1,2,2,2,3)

# estimate
startingestimates <- c(-2,0,0,0)
estimation <- estimate_ERPM(partition,
                            nodes,
                            objects,
                            effects,
                            startingestimates = startingestimates,
                            burnin = 100,
                            thining = 20,
                            length.p1 = 500,             # number of samples in phase 1
                            multiplication.iter.p2 = 20, # factor for the number of iterations in phase 2 subphases
                            num.steps.p2 = 4,            # number of phase 2 subphases
                            length.p3 = 1000)            # number of samples in phase 3

# get results table
estimation
Function to estimate the log likelihood of a model for an observed partition
estimate_logL( partition, nodes, effects, objects, theta, theta_0, M, num.steps, burnin, thining, neighborhoods = c(0.7, 0.3, 0), numgroups.allowed = NULL, numgroups.simulated = NULL, sizes.allowed = NULL, sizes.simulated = NULL, logL_0 = NULL, parallel = FALSE, cpus = 1, verbose = FALSE )
partition |
observed partition |
nodes |
node set (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
theta |
estimated model parameters |
theta_0 |
model parameters if all effects other than "num_groups" are fixed to 0 (basic Dirichlet partition model) |
M |
number of steps in the path-sampling algorithm |
num.steps |
number of samples in each step |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between samples |
neighborhoods |
way of choosing partitions (probability vector). Defaults to c(0.7, 0.3, 0). |
numgroups.allowed |
vector containing the number of groups allowed in the partition (currently, it only works with vectors like num_min:num_max). Defaults to NULL. |
numgroups.simulated |
vector containing the number of groups simulated. Defaults to NULL. |
sizes.allowed |
vector of group sizes allowed in sampling (currently, it only works for vectors like size_min:size_max). Defaults to NULL. |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (currently, it only works for vectors like size_min:size_max). Defaults to NULL. |
logL_0 |
if known, the value of the log likelihood of the basic Dirichlet model. Defaults to NULL. |
parallel |
logical indicating whether the code should be run in parallel. Defaults to FALSE. |
cpus |
number of CPUs required for the parallelization. Defaults to 1. |
verbose |
logical: should the current step of the algorithm be printed? Defaults to FALSE. |
List with the log likelihood, AIC, lambda and the draws
# estimate the log-likelihood and AIC of an estimated model (e.g. useful to compare two models)

# define an arbitrary set of n = 6 nodes with attributes, and an arbitrary covariate matrix
n <- 6
nodes <- data.frame(label = c("A","B","C","D","E","F"),
                    gender = c(1,1,2,1,2,2),
                    age = c(20,22,25,30,30,31))
friendship <- matrix(c(0, 1, 1, 1, 0, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 0, 0,
                       0, 1, 1, 0, 0, 1,
                       0, 0, 0, 0, 1, 0), 6, 6, TRUE)

# choose the effects to be included (see manual for all effect names)
effects <- list(names = c("num_groups","same","diff","tie"),
                objects = c("partition","gender","age","friendship"))
objects <- list()
objects[[1]] <- list(name = "friendship", object = friendship)

# define observed partition
partition <- c(1,1,2,2,2,3)

# (an exemplary estimation is internally stored in order to save time)

# first: estimate the ML estimates of a simple model with only one parameter
# for number of groups (this parameter should be in the model!)
likelihood_function <- function(x){ exp(x*max(partition)) / compute_numgroups_denominator(n,x)}
curve(likelihood_function, from=-2, to=0)
parameter_base <- optimize(likelihood_function, interval=c(-2, 0), maximum=TRUE)
parameters_basemodel <- c(parameter_base$maximum,0,0,0)

# estimate logL and AIC
logL_AIC <- estimate_logL(partition,
                          nodes,
                          effects,
                          objects,
                          theta = estimation$results$est,
                          theta_0 = parameters_basemodel,
                          M = 3,
                          num.steps = 200,
                          burnin = 100,
                          thining = 20)
logL_AIC$logL
logL_AIC$AIC
Function to estimate a given model for given observed (multiple) partitions. All options of the algorithm can be specified here.
estimate_multipleERPM( partitions, presence.tables, nodes, objects, effects, startingestimates, gainfactor = 0.1, a.scaling = 0.8, r.truncation.p1 = -1, r.truncation.p2 = -1, burnin = 30, thining = 10, length.p1 = 100, min.iter.p2 = NULL, max.iter.p2 = NULL, multiplication.iter.p2 = 200, num.steps.p2 = 6, length.p3 = 1000, neighborhood = c(0.7, 0.3, 0), fixed.estimates = NULL, numgroups.allowed = NULL, numgroups.simulated = NULL, sizes.allowed = NULL, sizes.simulated = NULL, double.averaging = FALSE, inv.zcov = NULL, inv.scaling = NULL, parallel = FALSE, parallel2 = FALSE, cpus = 1, verbose = FALSE )
partitions |
observed partitions |
presence.tables |
data frame indicating at which times nodes are present in the partitions |
nodes |
nodeset (data frame) |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
startingestimates |
first guess for the model parameters |
gainfactor |
numeric used to decrease the size of steps made in the Newton optimization |
a.scaling |
numeric used to reduce the influence of non-diagonal elements in the scaling matrix (for stability) |
r.truncation.p1 |
numeric used to limit extreme values in the covariance matrix (for stability) |
r.truncation.p2 |
numeric used to limit extreme values in the covariance matrix (for stability) |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between sampling |
length.p1 |
number of samples in phase 1 |
min.iter.p2 |
minimum number of sub-steps in phase 2 |
max.iter.p2 |
maximum number of sub-steps in phase 2 |
multiplication.iter.p2 |
value for the lengths of sub-steps in phase 2 (multiplied by 2.52^k) |
num.steps.p2 |
number of optimisation steps in phase 2 |
length.p3 |
number of samples in phase 3 |
neighborhood |
way of choosing partitions: probability vector (actors swap, merge/division, single actor move) |
fixed.estimates |
if some parameters are fixed, list with as many elements as effects, these elements equal a fixed value if needed, or NULL if they should be estimated |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
double.averaging |
option to average the statistics sampled in each sub-step of phase 2 |
inv.zcov |
initial value of the inverted covariance matrix (if a phase 3 was run before) to bypass the phase 1 |
inv.scaling |
initial value of the inverted scaling matrix (if a phase 3 was run before) to bypass the phase 1 |
parallel |
whether the phase 1 and 3 should be parallelized |
parallel2 |
whether there should be several phases 2 run in parallel |
cpus |
how many cores can be used |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
A list with the outputs of the three different phases of the algorithm
# define an arbitrary set of n = 6 nodes with attributes, and an arbitrary covariate matrix
n <- 6
nodes <- data.frame(label = c("A","B","C","D","E","F"),
                    gender = c(1,1,2,1,2,2),
                    age = c(20,22,25,30,30,31))
friendship <- matrix(c(0, 1, 1, 1, 0, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 1, 0,
                       1, 0, 0, 0, 0, 0,
                       0, 1, 1, 0, 0, 1,
                       0, 0, 0, 0, 1, 0), 6, 6, TRUE)

# specify whether nodes are present at different points of time
presence.tables <- matrix(c(1, 1, 1, 1, 1, 1,
                            0, 1, 1, 1, 1, 1,
                            1, 0, 1, 1, 1, 1), 6, 3)

# choose effects to be included in the estimated model
effects_multiple <- list(names = c("num_groups","same","diff","tie","inertia_1"),
                         objects = c("partitions","gender","age","friendship","partitions"),
                         objects2 = c("","","","",""))
objects_multiple <- list()
objects_multiple[[1]] <- list(name = "friendship", object = friendship)

# define the observation
partitions <- matrix(c(1, 1, 2, 2, 2, 3,
                       NA, 1, 1, 2, 2, 2,
                       1, NA, 2, 3, 3, 1), 6, 3)

# estimate
startingestimates <- c(-2,0,0,0,0)
estimation <- estimate_multipleERPM(partitions, presence.tables, nodes,
                                    objects_multiple, effects_multiple,
                                    startingestimates = startingestimates,
                                    burnin = 100, thining = 50, gainfactor = 0.6,
                                    length.p1 = 200, multiplication.iter.p2 = 20,
                                    num.steps.p2 = 4, length.p3 = 1000)

# get results table
estimation
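In the example above, a node that is absent at a given time (a 0 in presence.tables) has NA in the same cell of partitions. The following base R sketch checks that the two objects agree. It assumes this NA convention holds in general, which the manual does not state explicitly, and the check itself is not part of the package.

```r
# Same objects as in the example above
presence.tables <- matrix(c(1, 1, 1, 1, 1, 1,
                            0, 1, 1, 1, 1, 1,
                            1, 0, 1, 1, 1, 1), 6, 3)
partitions <- matrix(c(1, 1, 2, 2, 2, 3,
                       NA, 1, 1, 2, 2, 2,
                       1, NA, 2, 3, 3, 1), 6, 3)

# Assumed convention: absent nodes (presence 0) have NA group membership
absent <- presence.tables == 0
consistent <- all(is.na(partitions) == absent)
consistent
```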
This function finds the best estimate for a model only including the statistics of number of groups. It does a grid search for a vector of potential parameters, for all numbers of groups.
exactestimates_numgroups(num.nodes, pmin, pmax, pinc)
num.nodes |
number of nodes |
pmin |
lowest parameter value |
pmax |
highest parameter value |
pinc |
increment between different parameter values |
a list
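To illustrate the idea of a grid search for this one-parameter model, here is a base R sketch. It relies on the fact that, without size restrictions, the normalizing constant is the sum over k of S(n, k) exp(alpha k), where S(n, k) are Stirling numbers of the second kind. The helper names (stirling2_row, loglik_numgroups) and the grid are hypothetical, and this is not necessarily how exactestimates_numgroups is implemented.

```r
# Stirling numbers of the second kind S(n, k) for k = 1..n,
# via the recurrence S(m, k) = k * S(m - 1, k) + S(m - 1, k - 1)
stirling2_row <- function(n) {
  row <- 1                          # S(1, 1) = 1
  if (n > 1) {
    for (m in 2:n) {
      k <- seq_len(m)
      row <- k * c(row, 0) + c(0, row)
    }
  }
  row
}

# Log-likelihood of one partition with m.obs groups among n nodes
loglik_numgroups <- function(alpha, m.obs, n) {
  k <- seq_len(n)
  alpha * m.obs - log(sum(stirling2_row(n) * exp(alpha * k)))
}

# Grid search over candidate parameter values (hypothetical grid)
grid <- seq(-2, 2, by = 0.01)
ll <- sapply(grid, loglik_numgroups, m.obs = 3, n = 6)
alpha_hat <- grid[which.max(ll)]
alpha_hat
```

At the maximum, the expected number of groups under the model matches the observed number, which gives a simple way to sanity-check the grid result.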
Function to enumerate all possible partitions for a given n
find_all_partitions(n)
n |
number of nodes |
matrix where each line corresponds to a possible partition
n <- 6
all_partitions <- find_all_partitions(n)
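As an illustration of what such an enumeration involves, the base R sketch below lists all partitions as restricted growth strings, one partition per row. The function name enumerate_partitions is hypothetical and the approach is only one possible way to do this. It is not claimed to match the internals or the row order of find_all_partitions.

```r
# Enumerate all set partitions of n nodes as restricted growth strings:
# node 1 is in group 1, and each further node either joins an existing
# group or opens the next new one.
enumerate_partitions <- function(n) {
  parts <- matrix(1L, nrow = 1, ncol = 1)
  if (n > 1) {
    for (i in 2:n) {
      extended <- lapply(seq_len(nrow(parts)), function(r) {
        prefix <- parts[r, ]
        labels <- seq_len(max(prefix) + 1L)
        # repeat the prefix once per admissible label for node i
        cbind(matrix(prefix, nrow = length(labels), ncol = i - 1, byrow = TRUE),
              labels)
      })
      parts <- do.call(rbind, extended)
    }
  }
  unname(parts)
}

all_p <- enumerate_partitions(4)
nrow(all_p)   # number of partitions of 4 nodes
```

The number of rows grows as the Bell numbers (203 partitions for n = 6), so full enumeration is only practical for small n.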
Function that can be used to find a good length for the burn-in of the Markov chain for a given model and different sets of transitions in the chain (the neighborhoods). For each neighborhood, it draws a chain and calculates the mean statistics for different burn-ins.
gridsearch_burnin_single( partition, theta, nodes, effects, objects, num.steps, neighborhoods, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, parallel = FALSE, cpus = 1 )
partition |
A partition (vector) |
theta |
Initial model parameters |
nodes |
Node set (data frame) |
effects |
Effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
Objects used for statistics calculation (list with a vector "name", and a vector "object") |
num.steps |
Number of samples wanted |
neighborhoods |
List of probability vectors (proba actors swap, proba merge/division, proba single actor move) |
numgroups.allowed |
Vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
Vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
Vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
parallel |
Logical: whether to run the different neighborhoods in parallel (default FALSE) |
cpus |
Number of cpus to use if parallel = TRUE (default 1) |
all simulations
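The logic of comparing mean statistics across burn-ins can be sketched in base R on a toy chain. The chain below is a simple autoregressive process started far from its stationary mean. It is NOT drawn from an ERPM and only stands in for a sequence of sampled statistics. All names are hypothetical.

```r
set.seed(1)

# Toy chain of a statistic (not an ERPM sample): AR(1) process
# started far from its stationary mean of 0
num.steps <- 2000
chain <- numeric(num.steps)
chain[1] <- 10
for (t in 2:num.steps) {
  chain[t] <- 0.95 * chain[t - 1] + rnorm(1)
}

# Mean of the statistic after discarding the first b draws
burnins <- c(0, 50, 100, 200, 500)
means <- sapply(burnins, function(b) mean(chain[(b + 1):num.steps]))
data.frame(burnin = burnins, mean = means)
```

A burn-in is typically judged long enough once the mean no longer changes systematically when more initial draws are discarded.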
Function that simulates the Markov chain for a given model and several sets of transitions (the neighborhoods), for multiple partitions. For each neighborhood, it calculates the autocorrelation of statistics for different thinning values and the average statistics for different burn-ins. The best neighborhood can then be selected along with good values for burn-in and thinning.
gridsearch_burninthining_multiple( partitions, presence.tables, theta, nodes, effects, objects, num.steps, neighborhoods, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, max.thining, parallel = FALSE, cpus = 1 )
partitions |
Observed partitions |
presence.tables |
Presence of nodes |
theta |
Initial model parameters |
nodes |
Node set (data frame) |
effects |
Effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
Objects used for statistics calculation (list with a vector "name", and a vector "object") |
num.steps |
Number of samples wanted |
neighborhoods |
List of probability vectors (proba actors swap, proba merge/division, proba single actor move) |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
Vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
Vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
max.thining |
Maximal value of the thinning to be tested |
parallel |
Logical: whether to run the different neighborhoods in parallel (default FALSE) |
cpus |
Number of cpus to use if parallel = TRUE (default 1) |
list
Function that simulates the Markov chain for a given model and several sets of transitions (the neighborhoods), for a single partition. For each neighborhood, it calculates the autocorrelation of statistics for different thinning values and the average statistics for different burn-ins. The best neighborhood can then be selected along with good values for burn-in and thinning.
gridsearch_burninthining_single( partition, theta, nodes, effects, objects, num.steps, neighborhoods, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, max.thining, parallel = FALSE, cpus = 1 )
partition |
A partition (vector) |
theta |
Initial model parameters |
nodes |
Node set (data frame) |
effects |
Effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
Objects used for statistics calculation (list with a vector "name", and a vector "object") |
num.steps |
Number of samples wanted |
neighborhoods |
List of probability vectors (proba actors swap, proba merge/division, proba single actor move) |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
Vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
Vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
max.thining |
Maximal value of the thinning to be tested |
parallel |
Logical: whether to run the different neighborhoods in parallel (default FALSE) |
cpus |
Number of cpus to use if parallel = TRUE (default 1) |
list
Function that can be used to find a good length for the thinning of the Markov chain for a given model and different sets of transitions in the chain (the neighborhoods). For each neighborhood, it draws a chain and calculates the autocorrelation of statistics for different thinning values.
gridsearch_thining_single( partition, theta, nodes, effects, objects, num.steps, neighborhoods, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, burnin, max.thining, parallel = FALSE, cpus = 1 )
partition |
A partition (vector) |
theta |
Initial model parameters |
nodes |
Node set (data frame) |
effects |
Effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
Objects used for statistics calculation (list with a vector "name", and a vector "object") |
num.steps |
Number of samples wanted |
neighborhoods |
List of probability vectors (proba actors swap, proba merge/division, proba single actor move) |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
Vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
Vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
burnin |
Length of the burn-in period |
max.thining |
Maximal value of the thinning to be tested |
parallel |
Logical: whether to run the different neighborhoods in parallel (default FALSE) |
cpus |
Number of cpus to use if parallel = TRUE (default 1) |
all simulations
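The effect of thinning on autocorrelation can be sketched in base R with stats::acf. The chain here is a toy autoregressive series, NOT an ERPM sample, and the variable names are hypothetical. It only illustrates the kind of comparison described above.

```r
set.seed(1)

# Toy autocorrelated chain (not an ERPM sample)
chain <- as.numeric(arima.sim(model = list(ar = 0.9), n = 5000))

# Lag-1 autocorrelation of the chain when keeping every k-th draw
thin_values <- c(1, 5, 10, 20)
autocor <- sapply(thin_values, function(k) {
  kept <- chain[seq(1, length(chain), by = k)]
  acf(kept, lag.max = 1, plot = FALSE)$acf[2]
})
data.frame(thinning = thin_values, autocorrelation = autocor)
```

A thinning value is usually considered adequate once the autocorrelation between retained draws is close to zero.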
This function computes the average or the standard deviation of the size of groups in a partition.
group_size(partition, stat)
partition |
A partition (vector) |
stat |
The statistic to compute : 'avg' for average and 'sd' for standard deviation |
A number corresponding to the average or the standard deviation of the group sizes, depending on stat.
p <- c(1,2,2,3,3,4,4,4,5)
group_size(p,'avg')
group_size(p,'sd')
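For reference, the same quantities can be computed by hand in base R, as sketched below. Note that sd() uses the sample (n - 1) denominator. Whether group_size uses the same convention is an assumption, not something the manual states.

```r
p <- c(1, 2, 2, 3, 3, 4, 4, 4, 5)

# Number of nodes in each group
sizes <- as.vector(table(p))   # 1 2 2 3 1

avg_size <- mean(sizes)        # average group size
sd_size  <- sd(sizes)          # sample standard deviation of group sizes
```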
This function computes the intraclass correlation of attributes for 2 randomly drawn individuals in the same group.
icc(partition, attribute)
partition |
A partition |
attribute |
A vector containing the values of the attribute |
A number corresponding to the ICC
p <- c(1,2,2,3,3,4,4,4,5)
at <- c(3,5,23,2,1,0,3,9,2)
icc(p, at)
This function computes the total number of individuals being in a category of an attribute in a partition. It also computes the sum of the proportion in each group of individuals being in a category.
number_categories(partition, attribute, stat, category)
partition |
A partition (vector) |
attribute |
A vector containing the values of the attribute |
stat |
The statistic to compute : 'avg' for the sum of proportion per group and 'sum' for the total number |
category |
The category to consider or category = 'all' if all categories have to be considered |
The statistic chosen in stat depending on the value of category. If category = 'all', returns a vector.
p <- c(1,2,2,3,3,4,4,4,5)
at <- c(1,0,0,0,1,1,0,0,1)
number_categories(p,at,'avg','all')
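The two statistics described above can be sketched in base R as follows: the total count per category, and the per-group proportion of each category summed over groups. This is one reading of the description and is not guaranteed to reproduce the output format of number_categories.

```r
p  <- c(1, 2, 2, 3, 3, 4, 4, 4, 5)
at <- c(1, 0, 0, 0, 1, 1, 0, 0, 1)

# Total number of individuals in each category
totals <- table(at)

# Group-by-category counts, turned into row-wise (per-group) proportions
counts <- table(group = p, category = at)
props  <- prop.table(counts, margin = 1)

# For each category, the sum over groups of its within-group proportion
sum_props <- colSums(props)
sum_props
```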
This function computes the number of ties.
number_ties(partition, dyadic_attribute, stat)
partition |
A partition (vector) |
dyadic_attribute |
A matrix containing the values of the attribute |
stat |
The statistic to compute: 'avg_pergroup' for the average number of ties per group, 'sum_pergroup' for the sum over groups, 'sum_perind' and 'avg_perind' for the sum and the average of the number of ties each individual has in its group. |
The statistic chosen in stat
p <- c(1,2,2,3,3,4)
v <- c(0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0)
at <- matrix(v,6,6, byrow = TRUE)
number_ties(p,at,'avg_pergroup')
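To make the notion of within-group ties concrete, the base R sketch below counts ties between nodes that share a group. It assumes a symmetric (undirected) tie matrix and counts each tie once. How number_ties treats direction and averaging is not specified here, so the sketch should not be read as its definition.

```r
p <- c(1, 2, 2, 3, 3, 4)
v <- c(0,0,0,0,0,1, 0,0,1,0,0,0, 0,1,0,0,0,0,
       0,0,0,0,0,0, 0,0,0,0,0,0, 1,0,0,0,0,0)
at <- matrix(v, 6, 6, byrow = TRUE)

# TRUE where two distinct nodes belong to the same group
same_group <- outer(p, p, "==") & !diag(length(p))

# Each undirected within-group tie appears twice in a symmetric matrix
ties_within <- sum(at * same_group) / 2
ties_within
```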
Function to relabel the group ids so that no id is skipped and the ids follow the order of first appearance. For example, [2 1 1 4 2] becomes [1 2 2 3 1].
order_groupids(partition)
partition |
observed partition |
a vector (partition)
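The relabeling described for order_groupids can be reproduced in base R with match() and unique(), as sketched below. The function name relabel_first_appearance is hypothetical, and how the package handles NA entries is not covered by this sketch.

```r
# Relabel group ids by order of first appearance, without gaps
relabel_first_appearance <- function(partition) {
  match(partition, unique(partition))
}

relabel_first_appearance(c(2, 1, 1, 4, 2))   # 1 2 2 3 1
```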
These are exemplary outcome objects for the ERPM package and can be used in order not to run all preceding functions and thus save time. The following products are provided:
estimation
A results object created by the function estimate_ERPM().
Core function for Phase 1
phase1( startingestimates, inv.zcov, inv.scaling, z.phase1, z.obs, nodes, effects, objects, r.truncation.p1, length.p1, fixed.estimates, verbose = FALSE )
startingestimates |
vector containing initial parameter values |
inv.zcov |
inverted covariance matrix |
inv.scaling |
scaling matrix |
z.phase1 |
statistics retrieved from phase 1 |
z.obs |
observed statistics |
nodes |
node set (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
r.truncation.p1 |
numeric used to limit extreme values in the covariance matrix (for stability) |
length.p1 |
number of samples in phase 1 |
fixed.estimates |
if some parameters are fixed, list with as many elements as effects, these elements equal a fixed value if needed, or NULL if they should be estimated |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
estimated parameters after phase 1
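The fixed.estimates argument described above is a list with one element per effect, where NULL means "estimate this parameter". A small base R sketch of building such a list follows. The effect names are taken from the examples in this manual, and fixing the third parameter at 0 is an arbitrary illustration.

```r
effect_names <- c("num_groups", "same", "diff", "tie")

# Start from a list of NULLs, one element per effect
fixed.estimates <- vector("list", length(effect_names))

# Fix the third parameter ("diff") at 0; the others stay NULL (estimated)
fixed.estimates[[3]] <- 0

length(fixed.estimates)            # 4
sapply(fixed.estimates, is.null)   # TRUE TRUE FALSE TRUE
```

In R, assigning NULL with `[[<-` removes a list element, so to reset an entry while keeping the list length, use `fixed.estimates[3] <- list(NULL)`.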
Function to plot the average size of a random partition depending on the number of nodes
plot_averagesizes(nmin, nmax, ninc)
nmin |
minimum number of nodes |
nmax |
maximum number of nodes |
ninc |
increment between the different number of nodes |
a vector
Function to plot the log-likelihood of the model with a single statistic (number of groups) depending on the parameter value for this statistic
plot_numgroups_likelihood(m.obs, num.nodes, pmin, pmax, pinc)
m.obs |
observed number of groups |
num.nodes |
number of nodes |
pmin |
lowest parameter value |
pmax |
highest parameter value |
pinc |
increment between different parameter values |
a vector
This function plots the groups of a partition.
plot_partition( partition, title = NULL, group.color = NULL, attribute.color = NULL, attribute.shape = NULL )
partition |
A partition (vector) |
title |
Character, the title of the plot (default=NULL) |
group.color |
A vector with the colors of the groups (default=NULL) |
attribute.color |
A vector, attribute to represent with colors (default=NULL) |
attribute.shape |
A vector, attribute to represent with shapes (default=NULL) |
A plot of the partition
p <- c(1,1,1,2,2,2,2,3,3,3,4,4,4,4,4,4)
attr1 <- c(1,0,0,1,0,0,1,0,1,0,1,1,1,1,1,2)
attr2 <- c(1,1,1,1,0,0,3,0,1,0,1,1,1,1,1,2)
plot_partition(p,attribute.color = attr1, attribute.shape = attr2)
Print results of Bayesian estimation (beta version)
## S3 method for class 'results.bayesian.erpm'
print(x, ...)
x |
output of the Bayesian estimate function |
... |
For internal use only. |
a data frame
Print estimation results
## S3 method for class 'results.list.erpm'
print(x, ...)
x |
output of the estimate function |
... |
For internal use only. |
a data frame
Print results of estimation of phase 3
## S3 method for class 'results.p3.erpm'
print(x, ...)
x |
output of the estimate function |
... |
For internal use only. |
a data frame
This function computes the proportion of individuals not joining others.
proportion_isolate(partition)
partition |
A partition (vector) |
A number corresponding to the proportion of individuals who are alone in their group.
p <- c(1,2,2,3,3,4,4,4,5)
proportion_isolate(p)
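The proportion of isolates can also be worked out by hand in base R, as in the sketch below. It assumes an isolate is a node that forms a group of size 1, which is how the description above reads.

```r
p <- c(1, 2, 2, 3, 3, 4, 4, 4, 5)

# Group sizes; groups of size 1 are isolates
sizes <- table(p)

# Each singleton group contributes exactly one isolated node
prop_alone <- sum(sizes == 1) / length(p)
prop_alone   # 2 singleton groups out of 9 nodes
```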
This function computes the sum or the average range of an attribute for groups in a partition.
range_attribute(partition, attribute, stat)
partition |
A partition (vector) |
attribute |
A vector containing the values of the attribute |
stat |
The statistic to compute : 'avg_pergroup' for the average per group and 'sum_pergroup' for the sum of the ranges |
The statistic chosen in stat
p <- c(1,2,2,3,3,4,4,4,5)
at <- c(3,5,23,2,1,0,3,9,2)
range_attribute(p,at,'avg_pergroup')
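The per-group range can be sketched in base R with tapply(). The sketch treats a singleton group as having range 0 and includes it in the average. Whether range_attribute does the same is an assumption.

```r
p  <- c(1, 2, 2, 3, 3, 4, 4, 4, 5)
at <- c(3, 5, 23, 2, 1, 0, 3, 9, 2)

# Range (max - min) of the attribute within each group
ranges <- tapply(at, p, function(x) max(x) - min(x))

sum_range <- sum(ranges)    # sum of the ranges over groups
avg_range <- mean(ranges)   # average range per group
```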
Phase 1 wrapper for multiple observations
run_phase1_multiple( partitions, startingestimates, z.obs, presence.tables, nodes, effects, objects, burnin, thining, gainfactor, a.scaling, r.truncation.p1, length.p1, neighborhood, fixed.estimates, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, parallel = FALSE, cpus = 1, verbose = FALSE )
partitions |
observed partitions |
startingestimates |
vector containing initial parameter values |
z.obs |
observed statistics |
presence.tables |
data frame to indicate which times nodes are present in the partition |
nodes |
node set (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between sampling |
gainfactor |
gain factor (currently unused) |
a.scaling |
scaling factor |
r.truncation.p1 |
truncation factor (for stability) |
length.p1 |
number of samples for phase 1 |
neighborhood |
vector for the probability of choosing a particular transition in the chain |
fixed.estimates |
if some parameters are fixed, list with as many elements as effects, these elements equal a fixed value if needed, or NULL if they should be estimated |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
parallel |
boolean to indicate whether the code should be run in parallel |
cpus |
number of cpus if parallel = TRUE |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
a list
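The neighborhood argument is a probability vector over the three types of transition (actors swap, merge/division, single actor move). The base R sketch below shows how a move type could be drawn from such a vector. The labels are hypothetical and the package's internal sampler may work differently.

```r
set.seed(1)

# Default value used by estimate_multipleERPM
neighborhood <- c(0.7, 0.3, 0)

# Hypothetical labels for the three transition types
moves <- c("swap", "merge_or_divide", "single_move")

# Draw the type of transition for 1000 steps of the chain
draws <- sample(moves, size = 1000, replace = TRUE, prob = neighborhood)
table(factor(draws, levels = moves))
```

With a probability of 0 for the third entry, single actor moves are never proposed.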
Phase 1 wrapper for single observation
run_phase1_single( partition, startingestimates, z.obs, nodes, effects, objects, burnin, thining, gainfactor, a.scaling, r.truncation.p1, length.p1, neighborhood, fixed.estimates, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, parallel = TRUE, cpus = 1, verbose = FALSE )
partition |
observed partition |
startingestimates |
vector containing initial parameter values |
z.obs |
observed statistics |
nodes |
node set (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between sampling |
gainfactor |
gain factor (currently unused) |
a.scaling |
scaling factor |
r.truncation.p1 |
truncation factor (for stability) |
length.p1 |
number of samples for phase 1 |
neighborhood |
vector for the probability of choosing a particular transition in the chain |
fixed.estimates |
if some parameters are fixed, list with as many elements as effects, these elements equal a fixed value if needed, or NULL if they should be estimated |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
parallel |
boolean to indicate whether the code should be run in parallel |
cpus |
number of cpus if parallel = TRUE |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
a list
Phase 2 wrapper for multiple observations
run_phase2_multiple( partitions, estimates.phase1, inv.zcov, inv.scaling, z.obs, presence.tables, nodes, effects, objects, burnin, thining, num.steps, gainfactors, r.truncation.p2, min.iter, max.iter, multiplication.iter, neighborhood, fixed.estimates, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, double.averaging, parallel = FALSE, cpus = 1, verbose = FALSE )
partitions |
observed partitions |
estimates.phase1 |
vector containing parameter values after phase 1 |
inv.zcov |
inverted covariance matrix |
inv.scaling |
scaling matrix |
z.obs |
observed statistics |
presence.tables |
data frame to indicate which times nodes are present in the partition |
nodes |
node set (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between sampling |
num.steps |
number of sub-phases in phase 2 |
gainfactors |
vector of gain factors |
r.truncation.p2 |
truncation factor |
min.iter |
minimum number of steps in each subphase |
max.iter |
maximum number of steps in each subphase |
multiplication.iter |
used to calculate min.iter and max.iter if not specified |
neighborhood |
vector for the probability of choosing a particular transition in the chain |
fixed.estimates |
if some parameters are fixed, list with as many elements as effects, these elements equal a fixed value if needed, or NULL if they should be estimated |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
double.averaging |
boolean to indicate whether we follow the double-averaging procedure (often leads to better convergence) |
parallel |
boolean to indicate whether the code should be run in parallel |
cpus |
number of cpus if parallel = TRUE |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
a list
Phase 2 wrapper for single observation
run_phase2_single( partition, estimates.phase1, inv.zcov, inv.scaling, z.obs, nodes, effects, objects, burnin, thining, num.steps, gainfactors, r.truncation.p2, min.iter, max.iter, multiplication.iter, neighborhood, fixed.estimates, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, double.averaging, parallel = FALSE, cpus = 1, verbose = FALSE )
partition |
observed partition |
estimates.phase1 |
vector containing parameter values after phase 1 |
inv.zcov |
inverted covariance matrix |
inv.scaling |
scaling matrix |
z.obs |
observed statistics |
nodes |
node set (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between sampling |
num.steps |
number of sub-phases in phase 2 |
gainfactors |
vector of gain factors |
r.truncation.p2 |
truncation factor |
min.iter |
minimum number of steps in each subphase |
max.iter |
maximum number of steps in each subphase |
multiplication.iter |
used to calculate min.iter and max.iter if not specified |
neighborhood |
vector for the probability of choosing a particular transition in the chain |
fixed.estimates |
if some parameters are fixed, list with as many elements as effects, these elements equal a fixed value if needed, or NULL if they should be estimated |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
double.averaging |
boolean to indicate whether we follow the double-averaging procedure (often leads to better convergence) |
parallel |
boolean to indicate whether the code should be run in parallel |
cpus |
number of cpus if parallel = TRUE |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
a list
Phase 3 wrapper for multiple observations
run_phase3_multiple( partitions, estimates.phase2, z.obs, presence.tables, nodes, effects, objects, burnin, thining, a.scaling, length.p3, neighborhood, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, fixed.estimates, parallel = FALSE, cpus = 1, verbose = FALSE )
partitions |
observed partitions |
estimates.phase2 |
vector containing parameter values after phase 2 |
z.obs |
observed statistics |
presence.tables |
data frame indicating at which times nodes are present in the partitions |
nodes |
node set (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between samples |
a.scaling |
multiplicative factor for off-diagonal elements of the covariance matrix |
length.p3 |
number of samples in phase 3 |
neighborhood |
vector for the probability of choosing a particular transition in the chain |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
fixed.estimates |
if some parameters are fixed, list with as many elements as effects, these elements equal a fixed value if needed, or NULL if they should be estimated |
parallel |
boolean to indicate whether the code should be run in parallel |
cpus |
number of cpus if parallel = TRUE |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
a list
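As an illustration of how the constraint and tuning arguments above might be assembled before calling an estimation wrapper, here is a minimal base R sketch. All values are hypothetical, the three-entry layout of `neighborhood` assumes the (actor swap, merge/division, single actor move) order described for the simulation helpers in this manual, and the helper `is_contiguous` is not part of the package.

```r
# Hypothetical settings; none of these values come from the package.

# Contiguous ranges, since the documentation notes that only vectors of the
# form min:max are currently supported.
numgroups.allowed   <- 2:5
numgroups.simulated <- 1:10
sizes.allowed       <- 1:4
sizes.simulated     <- 1:10

# Probabilities of each transition type (assumed order: actor swap,
# merge/division, single actor move); they should sum to 1.
neighborhood <- c(0.7, 0.3, 0)

# Two effects assumed: estimate the first (NULL), fix the second at -0.5.
fixed.estimates <- list(NULL, -0.5)

# Illustrative check that a vector has the form min:max.
is_contiguous <- function(v) all(diff(v) == 1)

stopifnot(
  is_contiguous(numgroups.allowed),
  is_contiguous(sizes.allowed),
  all(numgroups.allowed %in% numgroups.simulated),
  all(sizes.allowed %in% sizes.simulated),
  abs(sum(neighborhood) - 1) < 1e-12
)
```

Checking these properties up front is a design choice of the sketch; whether the package performs similar checks is not stated in this manual.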
Phase 3 wrapper for single observation
run_phase3_single( partition, estimates.phase2, z.obs, nodes, effects, objects, burnin, thining, a.scaling, length.p3, neighborhood, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, fixed.estimates, parallel = FALSE, cpus = 1, verbose = FALSE )
partition |
observed partition |
estimates.phase2 |
vector containing parameter values after phase 2 |
z.obs |
observed statistics |
nodes |
node set (data frame) |
effects |
effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
objects used for statistics calculation (list with a vector "name", and a vector "object") |
burnin |
integer for the number of burn-in steps before sampling |
thining |
integer for the number of thinning steps between samples |
a.scaling |
multiplicative factor for off-diagonal elements of the covariance matrix |
length.p3 |
number of sampled partitions in phase 3 |
neighborhood |
vector for the probability of choosing a particular transition in the chain |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
fixed.estimates |
if some parameters are fixed, list with as many elements as effects, these elements equal a fixed value if needed, or NULL if they should be estimated |
parallel |
boolean to indicate whether the code should be run in parallel |
cpus |
number of cpus if parallel = TRUE |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
a list
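The description of `a.scaling` can be made concrete with a small base R sketch that multiplies the off-diagonal elements of a covariance matrix by a factor while leaving the variances untouched. The function name `scale_offdiag` and the example matrix are hypothetical, and how the package applies the factor internally is not shown in this manual.

```r
# Scale the off-diagonal elements of a square matrix S by a factor a,
# keeping the diagonal (the variances) unchanged.
scale_offdiag <- function(S, a) {
  D <- diag(diag(S), nrow = nrow(S))  # diagonal part of S
  D + a * (S - D)                     # diagonal plus scaled off-diagonal part
}

# Hypothetical 2 x 2 covariance matrix of simulated statistics.
S <- matrix(c(4, 2,
              2, 9), nrow = 2, byrow = TRUE)
scaled <- scale_offdiag(S, a = 0.5)
scaled
```

With `a = 1` the matrix is returned unchanged, and with `a = 0` only the diagonal is kept.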
This function computes, for a partition, the number of pairs of individuals in the same group who have the same value of a categorical variable. Depending on stat, the result is a total or an average, per group or per individual.
same_pairs(partition, attribute, stat)
partition |
A partition (vector) |
attribute |
A vector containing the values of the attribute |
stat |
The statistic to compute: 'avg_pergroup' for the average, 'sum_pergroup' for the sum, 'sum_perind' and 'avg_perind' for the sum and the average of the number of ties each individual has in its group. |
The statistic chosen in stat
p <- c(1,2,2,3,3,4,4,4,5)
at <- c(0,1,1,1,1,0,0,0,0)
same_pairs(p,at,'avg_pergroup')
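To make the idea of counting same-attribute pairs concrete, the following base R sketch counts, for each group, the pairs of members sharing an attribute value. It is an independent illustration, not the package's implementation: the helper `count_same_pairs` is hypothetical, and reading the per-group average as the mean over all groups (including singletons) is an assumption.

```r
# For each group, count members per attribute value and add up the number
# of pairs that can be formed within each value: choose(count, 2).
count_same_pairs <- function(partition, attribute) {
  sapply(split(attribute, partition), function(a) {
    sum(choose(as.vector(table(a)), 2))
  })
}

p  <- c(1, 2, 2, 3, 3, 4, 4, 4, 5)
at <- c(0, 1, 1, 1, 1, 0, 0, 0, 0)

per_group <- count_same_pairs(p, at)
sum(per_group)   # total over groups: 5
mean(per_group)  # average over the 5 groups: 1
```

Groups 2 and 3 each contribute one pair, group 4 contributes three, and the two singletons contribute none.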
This function computes, for a partition, the number of pairs of individuals in the same group who have close values of a numerical variable. Depending on stat, the result is a total or an average, per group or per individual.
similar_pairs(partition, attribute, stat, threshold)
partition |
A partition (vector) |
attribute |
A vector containing the values of the attribute |
stat |
The statistic to compute : 'avg_pergroup' for the average, 'sum_pergroup' for the sum, 'sum_perind' and 'avg_perind' for individuals |
threshold |
Threshold used to determine whether two individuals' attribute values are close |
The statistic chosen in stat
p <- c(1,2,2,3,3,4,4,4,5)
at <- c(3,5,23,2,1,0,3,9,2)
similar_pairs(p,at,'avg_pergroup',1)
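The role of the threshold can be illustrated with a short base R sketch based on pairwise distances within each group. The helper `count_similar_pairs` is hypothetical, and treating two values as close when their absolute difference is at most the threshold (rather than strictly below it) is an assumption, since the manual does not say which comparison the package uses.

```r
# For each group, count the pairs of members whose attribute values differ
# by at most `threshold` (assumed comparison: <=).
count_similar_pairs <- function(partition, attribute, threshold) {
  sapply(split(attribute, partition), function(a) {
    if (length(a) < 2) return(0)          # a singleton has no pairs
    sum(as.vector(dist(a)) <= threshold)  # pairwise absolute differences
  })
}

p  <- c(1, 2, 2, 3, 3, 4, 4, 4, 5)
at <- c(3, 5, 23, 2, 1, 0, 3, 9, 2)

per_group <- count_similar_pairs(p, at, threshold = 1)
sum(per_group)   # total over groups: 1
mean(per_group)  # average over the 5 groups: 0.2
```

Only group 3, with values 2 and 1, contains a pair within the threshold.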
Function that can be used to find a good length for the burn-in of the Markov chain for a given model and a given set of transitions in the chain (the neighborhood). It draws a chain and calculates the mean statistics for different burn-ins.
simulate_burnin_single( partition, theta, nodes, effects, objects, num.steps, neighborhood, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated )
partition |
A partition (vector) |
theta |
Initial model parameters |
nodes |
Node set (data frame) |
effects |
Effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
Objects used for statistics calculation (list with a vector "name", and a vector "object") |
num.steps |
Number of samples wanted |
neighborhood |
Way of choosing partitions: probability vector (probability of an actor swap, of a merge/division, of a single actor move) |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
Vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
Vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
A list containing the draws, the moving means (moving.means), and the smoothed moving means
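The logic of inspecting means along the chain to choose a burn-in can be sketched in base R on a toy chain. The AR(1) process below merely stands in for a simulated statistic, the window length and the smoothing filter are arbitrary choices, and the package may compute its moving means differently.

```r
set.seed(1)

# Toy AR(1) chain started far from its stationary mean of 0.
num.steps <- 2000
chain <- numeric(num.steps)
chain[1] <- 100
for (t in 2:num.steps) chain[t] <- 0.95 * chain[t - 1] + rnorm(1)

# Mean of the statistic over non-overlapping windows of 100 steps:
# column j of the matrix holds steps (j - 1) * 100 + 1 to j * 100.
window <- 100
moving_means <- colMeans(matrix(chain, nrow = window))

# Lightly smoothed version (centred 3-window running mean; NA at both ends).
smoothed <- as.numeric(stats::filter(moving_means, rep(1 / 3, 3), sides = 2))

round(moving_means, 2)
```

A reasonable burn-in is a point after which the windowed means stop drifting and only fluctuate around a stable level.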
Function that simulates the Markov chain for a given model and a set of transitions (the neighborhood), for multiple partitions. It calculates the autocorrelation of statistics for different thinning intervals and the average statistics for different burn-ins.
simulate_burninthining_multiple( partitions, presence.tables, theta, nodes, effects, objects, num.steps, neighborhood, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, max.thining, verbose = FALSE )
partitions |
Observed partitions |
presence.tables |
data frame indicating which nodes were present at which times |
theta |
Initial model parameters |
nodes |
Node set (data frame) |
effects |
Effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
Objects used for statistics calculation (list with a vector "name", and a vector "object") |
num.steps |
Number of samples wanted |
neighborhood |
Way of choosing partitions: probability vector (probability of an actor swap, of a merge/division, of a single actor move) |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
Vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
Vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
max.thining |
maximum number of simulated steps for the thinning |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
A list
Function that simulates the Markov chain for a given model and a set of transitions (the neighborhood), for a single partition. It calculates the autocorrelation of statistics for different thinning intervals and the average statistics for different burn-ins.
simulate_burninthining_single( partition, theta, nodes, effects, objects, num.steps, neighborhood, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, max.thining, verbose = FALSE )
partition |
Observed partition (vector) |
theta |
Initial model parameters |
nodes |
Node set (data frame) |
effects |
Effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
Objects used for statistics calculation (list with a vector "name", and a vector "object") |
num.steps |
Number of samples wanted |
neighborhood |
Way of choosing partitions: probability vector (probability of an actor swap, of a merge/division, of a single actor move) |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
Vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
Vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
max.thining |
maximum number of simulated steps for the thinning |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
A list
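How a burn-in and a thinning interval jointly determine which steps of a chain are retained can be sketched as follows. The helper `kept_steps` is hypothetical, and the convention that the first sample is taken one thinning interval after the end of the burn-in is an assumption, not something stated in this manual.

```r
# Indices of the chain steps retained as samples, assuming the first sample
# is taken `thining` steps after the burn-in and one every `thining` steps
# thereafter. The argument name mirrors the package's spelling.
kept_steps <- function(burnin, thining, n_samples) {
  burnin + thining * seq_len(n_samples)
}

steps <- kept_steps(burnin = 100, thining = 20, n_samples = 5)
steps               # 120 140 160 180 200
max(steps)          # total chain length needed: 200
```

Under this convention the chain length needed is `burnin + thining * n_samples`, which shows how larger burn-in or thinning values raise the simulation cost for a fixed number of samples.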
Function that can be used to find a good thinning interval for the Markov chain for a given model and a set of transitions in the chain (the neighborhood). It draws a chain and calculates the autocorrelation of statistics for different thinning intervals.
simulate_thining_single( partition, theta, nodes, effects, objects, num.steps, neighborhood, numgroups.allowed, numgroups.simulated, sizes.allowed, sizes.simulated, burnin, max.thining, verbose = FALSE )
partition |
A partition (vector) |
theta |
Initial model parameters |
nodes |
Node set (data frame) |
effects |
Effects/sufficient statistics (list with a vector "names", and a vector "objects") |
objects |
Objects used for statistics calculation (list with a vector "name", and a vector "object") |
num.steps |
Number of samples wanted |
neighborhood |
Way of choosing partitions: probability vector (probability of an actor swap, of a merge/division, of a single actor move) |
numgroups.allowed |
vector containing the number of groups allowed in the partition (now, it only works with vectors like num_min:num_max) |
numgroups.simulated |
vector containing the number of groups simulated |
sizes.allowed |
Vector of group sizes allowed in sampling (now, it only works for vectors like size_min:size_max) |
sizes.simulated |
Vector of group sizes allowed in the Markov chain but not necessarily sampled (now, it only works for vectors like size_min:size_max) |
burnin |
number of simulated steps for the burn-in |
max.thining |
maximum number of simulated steps for the thinning |
verbose |
logical: should intermediate results during the estimation be printed or not? Defaults to FALSE. |
A list
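The relationship between thinning and autocorrelation can be illustrated in base R on a toy chain. The AR(1) series below stands in for a simulated statistic and the grid of thinning intervals is arbitrary; this is a sketch of the general diagnostic, not of the package's internal computation.

```r
set.seed(2)

# Toy autocorrelated chain: AR(1) with coefficient 0.9.
n <- 20000
chain <- as.numeric(arima.sim(model = list(ar = 0.9), n = n))

# Lag-1 autocorrelation of the chain after keeping every k-th step.
thinnings <- c(1, 5, 10, 20)
lag1_autocor <- sapply(thinnings, function(k) {
  thinned <- chain[seq(1, n, by = k)]
  acf(thinned, lag.max = 1, plot = FALSE)$acf[2]
})
names(lag1_autocor) <- thinnings

round(lag1_autocor, 2)
```

One common rule of thumb is to pick the smallest thinning interval for which the lag-1 autocorrelation of every statistic falls below a chosen tolerance; the tolerance itself is left to the analyst.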
Function to calculate the number of partitions with k groups of sizes between smin and smax
Stirling2_constraints(n, k, smin, smax)
n |
number of nodes |
k |
number of groups |
smin |
minimum group size possible in the partition |
smax |
maximum group size possible in the partition |
a numeric
n <- 6
k <- 2
size_min <- 2
size_max <- 4
Stirling2_constraints(n,k,size_min,size_max)
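The quantity being counted can be checked independently with a short recursion in base R. The function `count_partitions` below is a hypothetical helper, not the package's implementation. It uses the standard argument that the block containing the first node has some size s between smin and smax, its other s - 1 members can be chosen in choose(n - 1, s - 1) ways, and the remaining n - s nodes must form k - 1 admissible blocks.

```r
# Number of partitions of n labelled nodes into exactly k groups whose sizes
# all lie in smin:smax. Assumes 1 <= smin <= smax.
count_partitions <- function(n, k, smin, smax) {
  stopifnot(smin >= 1, smin <= smax)
  if (n == 0 && k == 0) return(1)  # empty set, no groups: one way
  if (n <= 0 || k <= 0) return(0)  # nodes without groups, or the reverse
  total <- 0
  for (s in smin:smax) {
    if (s <= n) {
      total <- total +
        choose(n - 1, s - 1) * count_partitions(n - s, k - 1, smin, smax)
    }
  }
  total
}

count_partitions(6, 2, 2, 4)  # 25: 15 splits of sizes (2, 4) + 10 of sizes (3, 3)

# Summing over k gives the number of partitions with any number of groups.
sum(sapply(1:3, function(k) count_partitions(6, k, 2, 4)))  # 25 + 15 = 40
```

The plain recursion is exponential in the worst case, which is acceptable for small n; memoising intermediate results would be the natural next step for larger node sets.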