Title: | Generalized Multipartite Networks |
---|---|
Description: | We define generalized multipartite networks as the joint observation of several networks implying some common pre-specified groups of individuals. The aim is to fit an adapted version of the popular stochastic block model to multipartite networks, as described in Bar-hen, Barbillon and Donnet (2020) <arXiv:1807.10138>. |
Authors: | Sophie Donnet [aut, cre] , Pierre Barbillon [aut] |
Maintainer: | Sophie Donnet <[email protected]> |
License: | GPL-3 |
Version: | 0.2.1 |
Built: | 2024-10-31 05:51:32 UTC |
Source: | https://github.com/grosssbm/gremlins |
Compare two classifications on all the Functional groups
comparClassif(classif1, classif2)
comparClassif(classif1, classif2)
classif1 |
: list a length n_FG. |
classif2 |
: list a length n_FG. |
Adjusted Rand Index (ARI) for each Functional Group.
nFG <- 3; vK <- c(4,5,2) ; vNQ <- c(100,40,50); classif1 <- lapply(1:nFG,function(q){sample(1:vK[q],vNQ[q],replace=TRUE)}) classif2 <- classif1 classif2[[2]] <- sample(1:vK[2],vNQ[2],replace=TRUE) resCompar <- comparClassif (classif1,classif2)
nFG <- 3; vK <- c(4,5,2) ; vNQ <- c(100,40,50); classif1 <- lapply(1:nFG,function(q){sample(1:vK[q],vNQ[q],replace=TRUE)}) classif2 <- classif1 classif2[[2]] <- sample(1:vK[2],vNQ[2],replace=TRUE) resCompar <- comparClassif (classif1,classif2)
compute the Integrated likeilhood and the ICL criteria for the MBM
compLikICL(paramEstim, list_Net, v_distrib = NULL)
compLikICL(paramEstim, list_Net, v_distrib = NULL)
paramEstim |
Estimated parameters of MBM |
list_Net |
A list of network |
v_distrib |
Type of proababilistic distributions in each network : if 0/1 then Bernoulli, if counting then Poisson. My default = Bernoulli. Must give a vector whose length is the number of networks in list_Net |
Pseudo-Likelihood and penalty
Define a network providing its matrix of interactions and specifying the functions groups in row and col.
defineNetwork(mat, typeInter, rowFG, colFG)
defineNetwork(mat, typeInter, rowFG, colFG)
mat |
An adjacency matrix (symmetric or not) or an incidence matrix |
typeInter |
Type of the matrix, choice between "inc" (incidence), "adj" (adjacency) and "diradj" (directed adjacency) |
rowFG |
Name of the functional group in row |
colFG |
Name of the function group in column |
a list object formatted for the GREMLINS package
A <- matrix(rbinom(100,1,.2),10,10) type <- "diradj" defineNetwork(A,"diradj","FG1","FG1")
A <- matrix(rbinom(100,1,.2),10,10) type <- "diradj" defineNetwork(A,"diradj","FG1","FG1")
Extract the clusters in each functional group
extractClustersMBM(resMBM, whichModel = 1)
extractClustersMBM(resMBM, whichModel = 1)
resMBM |
A fitted Generalized BlockModel |
whichModel |
The index corresponding to the model to plot (default is 1, the best model) |
a list a length the number of Functional Groups. Each element is a list of length the number of blocks composed of the index of the individuals in each block of each cluster.
Generalized multipartite networks consist in the joint observation of several networks implying some common pre-specified groups of individuals. GREMLIM adjusts an adapted version of the popular stochastic block model to multipartite networks, as described in Bar-hen, Barbillon and Donnet (2020) The GREMLINS package provides the following top-level major functions:
defineNetwork
a function to define carefully a single network.
rMBM
a function to simulate a collection of networks involving common functional groups of entities (with various emission distributions).
multipartiteBM
a function to perform inference (model selection and estimation ) of SBM for a multipartite network.
multipartiteBMFixedModel
a function to estimate the parameters of SBM for a multipartite network for fixed numbers of blocks
We also provide some additional functions useful to analyze the results:
extractClustersMBM
a function to extract the clusters in each functional group
comparClassif
a function to compute the Adjusted Rand Index (ARI) between two classifications
predictMBM
a function to compute the predictions once the model has been fitted
compLikICL
a function to compute the Integrated Likelihood and the ICL criteria for the MBM
Pierre Barbillon, Sophie Donnet
Bar-Hen, A. and Barbillon, P. & Donnet S. (2020), "Block models for multipartite networks. Applications in ecology and ethnobiology. Journal of Statistical Modelling (to appear)
Multipartite network of mutualistic interactions between plants and pollinators, plants and birds and plants and ants.
MPEcoNetwork
MPEcoNetwork
A list a 3 binary incidence matrices
Interactions between plants (rows) and ants (cols). Matrix with 141 rows and 30 columns
Interactions between plants (rows) and birds (cols). Matrix with141 rows and 46 columns
Interactions between plants (rows) and pollinators (cols). Matrix with 141 rows and 173 columns
...
Dataset compiled and conducted at Centro de Investigaciones Costeras La Mancha (CICOLMA), located on the central coast of the Gulf of Mexico, Veracruz, Mexico. https://royalsocietypublishing.org/doi/full/10.1098/rspb.2016.1564 https://github.com/lucaspdmedeiros/multi-network_core_removal/tree/master/data
Select the number of blocks and identify the blocks per functional group using a variational EM algorithm
multipartiteBM( list_Net, v_distrib = NULL, namesFG = NULL, v_Kmin = 1, v_Kmax = 10, v_Kinit = NULL, initBM = TRUE, keep = FALSE, verbose = TRUE, nbCores = NULL, maxiterVE = NULL, maxiterVEM = NULL )
multipartiteBM( list_Net, v_distrib = NULL, namesFG = NULL, v_Kmin = 1, v_Kmax = 10, v_Kinit = NULL, initBM = TRUE, keep = FALSE, verbose = TRUE, nbCores = NULL, maxiterVE = NULL, maxiterVEM = NULL )
list_Net |
a list of networks (defined via the function defineNetwork) i.e. a multipartite network |
v_distrib |
an optional vector of characters of length the number of networks and specifying the distribution used in each network (possible values |
namesFG |
an optional vector of characters containing the names of functional groups (FG) (If Specified, must correspond to the names in |
v_Kmin |
an optional vector of integers, specifying the minimal number of blocks per functional group (must be provided in the same order as in |
v_Kmax |
an optional vector of integers specifying the maximal number of blocks per functional group provided in the same order as in |
v_Kinit |
an optional vector of integers specifying initial numbers of blocks per FG provided in the same order as in |
initBM |
an optional boolean. If initBM = TRUE an aditional initialisation is done using simple LBM or SBM on each network separatly.
Default value |
keep |
an optional boolean. If TRUE return the estimated parameters for intermediate visited models. Otherwise, only the better model (in ICL sense) is the ouput. Default value |
verbose |
an optional boolean. If TRUE, display the current step of the search algorithm |
nbCores |
an optional integer specifying the number or cores used for the estimation. Not parallelized on windows. If |
maxiterVE |
an optional integer specifying the maximum number of iterations in the VE step of the VEM algorithm. If NULL then default value |
maxiterVEM |
an optional integer specifying the maximum number of iterations of the VEM algorithm. If NULL then default value Default value |
The function multipartiteBM
selects the better numbers of blocks in each FG (with a penalized likelihood criterion). The model selection is performed with a forward backward strategy and the likelihood of each model is maximized with a variational EM).
a list of estimated parameters for the different models ordered by decreasing ICL. If keep = FALSE
, the length is of length 1 (only the better model is returned).
fittedModel
contains the results of the inference. res$fittedModel[[1]]
is a list with fields
paramEstim
a MBMfit object.
ICL
the penalized likelihood criterion ICL.
vJ
the sequence of the varational bound of the likelihood through iterations of the VEM.
convergence
TRUE if the VEM reached convergence.
list_Net
contains the data.
namesFG <- c('A','B') list_pi <- list(c(0.5,0.5),c(0.3,0.7)) # prop of blocks in each FG E <- rbind(c(1,2),c(2,2)) # architecture of the multipartite net. typeInter <- c( "inc","diradj") v_distrib <- c('gaussian','bernoulli') list_theta <- list() list_theta[[1]] <- list() list_theta[[1]]$mean <- matrix(c(6.1, 8.9, 6.6, 3), 2, 2) list_theta[[1]]$var <- matrix(c(1.6, 1.6, 1.8, 1.5),2, 2) list_theta[[2]] <- matrix(c(0.7,1.0, 0.4, 0.6),2, 2) list_Net <- rMBM(v_NQ = c(30,30),E , typeInter, v_distrib, list_pi, list_theta, namesFG = namesFG, seed = 2)$list_Net res_MBMsimu <- multipartiteBM(list_Net, v_distrib, namesFG = c('A','B'), v_Kinit = c(2,2), nbCores = 2,initBM = FALSE)
namesFG <- c('A','B') list_pi <- list(c(0.5,0.5),c(0.3,0.7)) # prop of blocks in each FG E <- rbind(c(1,2),c(2,2)) # architecture of the multipartite net. typeInter <- c( "inc","diradj") v_distrib <- c('gaussian','bernoulli') list_theta <- list() list_theta[[1]] <- list() list_theta[[1]]$mean <- matrix(c(6.1, 8.9, 6.6, 3), 2, 2) list_theta[[1]]$var <- matrix(c(1.6, 1.6, 1.8, 1.5),2, 2) list_theta[[2]] <- matrix(c(0.7,1.0, 0.4, 0.6),2, 2) list_Net <- rMBM(v_NQ = c(30,30),E , typeInter, v_distrib, list_pi, list_theta, namesFG = namesFG, seed = 2)$list_Net res_MBMsimu <- multipartiteBM(list_Net, v_distrib, namesFG = c('A','B'), v_Kinit = c(2,2), nbCores = 2,initBM = FALSE)
Estimate the parameters and give the clustering for given numbers of blocks
multipartiteBMFixedModel( list_Net, v_distrib, namesFG, v_K, classifInit = NULL, nbCores = NULL, maxiterVE = NULL, maxiterVEM = NULL, verbose = TRUE )
multipartiteBMFixedModel( list_Net, v_distrib, namesFG, v_K, classifInit = NULL, nbCores = NULL, maxiterVE = NULL, maxiterVEM = NULL, verbose = TRUE )
list_Net |
A list of network (defined via the function DefineNetwork) |
v_distrib |
Type of proababilistic distributions in each network : if 0/1 then bernoulli, if counting then poisson, gaussian or Zero Inflated Gaussian (ZIgaussian) My default = Bernoulli. Must give a vector whose length is the number of networks in list_Net |
namesFG |
Names of functional groups (must correspond to names in listNet) |
v_K |
A vector with the numbers of blocks per functional group |
classifInit |
A list of initial classification for each functional group in the same order as in namesFG |
nbCores |
Number or cores used for the estimation. Not parallelized on windows. By default : half of the cores |
maxiterVE |
Maximum number of iterations in the VE step of the VEM algorithm. Default value = 1000 |
maxiterVEM |
Maximum number of iterations of the VEM algorithm. Default value = 1000 |
verbose |
Set to TRUE to display the current step of the search algorithm |
Estimated parameters and a classification
namesFG <- c('A','B') list_pi <- list(c(0.5,0.5),c(0.3,0.7)) # prop of blocks in each FG E <- rbind(c(1,2),c(2,2)) # architecture of the multipartite net. typeInter <- c( "inc","diradj") v_distrib <- c('poisson','bernoulli') list_theta <- list() list_theta[[1]] <- matrix(c(6.1, 8.9, 6.6, 3), 2, 2) list_theta[[2]] <- matrix(c(0.7,1.0, 0.4, 0.6),2, 2) list_Net <- rMBM(v_NQ = c(20,20),E , typeInter, v_distrib, list_pi, list_theta, namesFG = namesFG, seed = 2)$list_Net #res_MBMsimu_fixed <- multipartiteBMFixedModel(list_Net, v_distrib, # namesFG = namesFG, # v_K = c(1,2), # nbCores = 2)
namesFG <- c('A','B') list_pi <- list(c(0.5,0.5),c(0.3,0.7)) # prop of blocks in each FG E <- rbind(c(1,2),c(2,2)) # architecture of the multipartite net. typeInter <- c( "inc","diradj") v_distrib <- c('poisson','bernoulli') list_theta <- list() list_theta[[1]] <- matrix(c(6.1, 8.9, 6.6, 3), 2, 2) list_theta[[2]] <- matrix(c(0.7,1.0, 0.4, 0.6),2, 2) list_Net <- rMBM(v_NQ = c(20,20),E , typeInter, v_distrib, list_pi, list_theta, namesFG = namesFG, seed = 2)$list_Net #res_MBMsimu_fixed <- multipartiteBMFixedModel(list_Net, v_distrib, # namesFG = namesFG, # v_K = c(1,2), # nbCores = 2)
Predict NAs in a Collection of Networks from a fitted MBM
predictMBM(RESMBM, whichModel = 1)
predictMBM(RESMBM, whichModel = 1)
RESMBM |
a fitted multipartite blockmodel |
whichModel |
The index corresponding to the model used for prediction (default is 1, the best model) |
the collection of matrices of predictions (probability for binary, intensity for weighted network) a
namesFG <- c('A','B') list_pi <- list(c(0.5,0.5),c(0.3,0.7)) # prop of blocks in each FG E <- rbind(c(1,2),c(2,2)) # architecture of the multipartite net. typeInter <- c( "inc","diradj") v_distrib <- c('gaussian','bernoulli') list_theta <- list() list_theta[[1]] <- list() list_theta[[1]]$mean <- matrix(c(6.1, 8.9, 6.6, 3), 2, 2) list_theta[[1]]$var <- matrix(c(1.6, 1.6, 1.8, 1.5),2, 2) list_theta[[2]] <- matrix(c(0.7,1.0, 0.4, 0.6),2, 2) list_Net <- rMBM(v_NQ = c(30,30),E , typeInter, v_distrib, list_pi, list_theta, namesFG = namesFG, seed = 2)$list_Net res_MBMsimu <- multipartiteBM(list_Net, v_distrib, namesFG = c('A','B'), v_Kinit = c(2,2), nbCores = 2,initBM = FALSE) pred <- predictMBM(res_MBMsimu)
namesFG <- c('A','B') list_pi <- list(c(0.5,0.5),c(0.3,0.7)) # prop of blocks in each FG E <- rbind(c(1,2),c(2,2)) # architecture of the multipartite net. typeInter <- c( "inc","diradj") v_distrib <- c('gaussian','bernoulli') list_theta <- list() list_theta[[1]] <- list() list_theta[[1]]$mean <- matrix(c(6.1, 8.9, 6.6, 3), 2, 2) list_theta[[1]]$var <- matrix(c(1.6, 1.6, 1.8, 1.5),2, 2) list_theta[[2]] <- matrix(c(0.7,1.0, 0.4, 0.6),2, 2) list_Net <- rMBM(v_NQ = c(30,30),E , typeInter, v_distrib, list_pi, list_theta, namesFG = namesFG, seed = 2)$list_Net res_MBMsimu <- multipartiteBM(list_Net, v_distrib, namesFG = c('A','B'), v_Kinit = c(2,2), nbCores = 2,initBM = FALSE) pred <- predictMBM(res_MBMsimu)
rMBM
simulates a collection of networks involving common functional groups of entities. The networks may be directed, undirected or bipartite. The emission distribution of the edges may be Bernoulli, Poisson, Gaussian, Zero-Inflated Gaussian, or Laplace. See the vignette for more information about the model.
rMBM( v_NQ, E, typeInter, v_distrib, list_pi, list_theta, namesFG = NULL, keepClassif = FALSE, seed = NULL )
rMBM( v_NQ, E, typeInter, v_distrib, list_pi, list_theta, namesFG = NULL, keepClassif = FALSE, seed = NULL )
v_NQ |
: number of individual in each Functional Group (FG) |
E |
: define the architecture of the Multipartite. |
typeInter |
: type of interaction in each network: undirected adjacency (adj), directed adjacency (diradj) or incidence (inc). (vector of size equal to nrow(E) ) |
v_distrib |
: vector of the distributions: 'bernoulli', 'poisson', 'gaussian', 'ZIgaussian' (for Zero inflated gaussian) or 'laplace' ( vector of size equal to nrow(E) ) |
list_pi |
: parameters of the blocks distribution |
list_theta |
: parameters of the interactions distribution. For Bernoulli a probability, for Poisson positive real number, for Gaussian a list specifying mean and var (plus p0 for ZIgaussian), for Laplace a list with location and scale |
namesFG |
: names of the FG. (default value = NULL, then the functional groups are labelled FG1, FG2, etc) |
keepClassif |
: equal to TRUE if you want to keep the simulated blocks/classification (default value = FALSE). |
seed |
: set the seed for the random simulation (default value = NULL) |
A list of lists containing the networks (list_net) and if keepClassif = TRUE the classifications (classif) Each element of list_net corresponds to a network : each network is a list containing the matrix (mat) , the type of network(diradj, adj, inc), the functional group in row (rowFG) and the functional group in columns (colFG)
namesFG <- c('A','B','C') list_pi = list(c(0.16 ,0.40 ,0.44),c(0.3,0.7),c(0.5,0.5)) E <- rbind(c(1,2),c(2,3),c(1,1)) typeInter <- c( "inc","inc", "adj") v_distrib <- c('ZIgaussian','bernoulli','poisson') list_theta <- list() list_theta[[1]] <- list() list_theta[[1]]$mean <- matrix(c(6.1, 8.9, 6.6, 9.8, 2.6, 1.0), 3, 2) list_theta[[1]]$var <- matrix(c(1.6, 1.6, 1.8, 1.7 ,2.3, 1.5),3, 2) list_theta[[1]]$p0 <- matrix(c(0.4, 0.1, 0.6, 0.5 , 0.2, 0),3, 2) list_theta[[2]] <- matrix(c(0.7,1.0, 0.4, 0.6),2, 2) m3 <- matrix(c(2.5, 2.6 ,2.2 ,2.2, 2.7 ,3.0 ,3.6, 3.5, 3.3),3,3 ) list_theta[[3]] <- (m3 + t(m3))/2 dataSim <- rMBM(v_NQ = c(100,50,40) , E = E , typeInter = typeInter, v_distrib = v_distrib, list_pi = list_pi, list_theta = list_theta, namesFG) list_net <- dataSim$list_Net classifTrue <- dataSim$classif
namesFG <- c('A','B','C') list_pi = list(c(0.16 ,0.40 ,0.44),c(0.3,0.7),c(0.5,0.5)) E <- rbind(c(1,2),c(2,3),c(1,1)) typeInter <- c( "inc","inc", "adj") v_distrib <- c('ZIgaussian','bernoulli','poisson') list_theta <- list() list_theta[[1]] <- list() list_theta[[1]]$mean <- matrix(c(6.1, 8.9, 6.6, 9.8, 2.6, 1.0), 3, 2) list_theta[[1]]$var <- matrix(c(1.6, 1.6, 1.8, 1.7 ,2.3, 1.5),3, 2) list_theta[[1]]$p0 <- matrix(c(0.4, 0.1, 0.6, 0.5 , 0.2, 0),3, 2) list_theta[[2]] <- matrix(c(0.7,1.0, 0.4, 0.6),2, 2) m3 <- matrix(c(2.5, 2.6 ,2.2 ,2.2, 2.7 ,3.0 ,3.6, 3.5, 3.3),3,3 ) list_theta[[3]] <- (m3 + t(m3))/2 dataSim <- rMBM(v_NQ = c(100,50,40) , E = E , typeInter = typeInter, v_distrib = v_distrib, list_pi = list_pi, list_theta = list_theta, namesFG) list_net <- dataSim$list_Net classifTrue <- dataSim$classif