API Reference

networksampler.compute_shortest_length_paths_matrix(A: ndarray)

Compute the shortest path length matrix, whose entry (i, j) is the length of the shortest path between nodes i and j.

Parameters:

A (numpy.ndarray) – The graph adjacency matrix

Returns:

The shortest path length matrix, as a 2D array of integers.

Return type:

numpy.array(dtype=int)

Examples

Here we use the function to generate the matrix from an adjacency matrix A:

>>> import networksampler
>>> D = networksampler.utils.compute_shortest_length_paths_matrix(A)
>>> D
array([ 0.30220482,  0.86820401,  0.1654503 ,  0.11659149,  0.54323428]) # random
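
To make the meaning of the matrix concrete, here is a minimal pure-Python sketch that computes shortest path lengths (hop counts) with a breadth-first search from every node. The helper name `shortest_path_lengths` and the `-1` marker for unreachable pairs are assumptions of this sketch, not part of the package, and the package's own implementation may differ.

```python
from collections import deque

def shortest_path_lengths(A):
    """All-pairs shortest path lengths (hop counts) for a 0/1 adjacency matrix.

    Hypothetical helper for illustration. Unreachable pairs are marked
    with -1, a convention chosen for this sketch only.
    """
    n = len(A)
    D = [[-1] * n for _ in range(n)]
    for source in range(n):
        D[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                # Visit each neighbour the first time it is reached.
                if A[u][v] and D[source][v] == -1:
                    D[source][v] = D[source][u] + 1
                    queue.append(v)
    return D

# Path graph 0 - 1 - 2 - 3
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
D = shortest_path_lengths(A)
print(D[0])  # [0, 1, 2, 3]
```

For an undirected graph the result is symmetric with a zero diagonal, which is a quick sanity check for any cached matrix.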

networksampler.generate_random_network(nodes_num=1000)

Generate a random network with unweighted, undirected edges. This function is mainly intended as a testing tool.

Parameters:

nodes_num (int) – Number of nodes in the network

Returns:

Adjacency matrix

Return type:

(numpy.ndarray)
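
As an illustration of what an unweighted, undirected adjacency matrix looks like, the sketch below builds one by including each possible edge independently with a fixed probability. The helper `random_adjacency`, its `edge_prob` and `seed` parameters, and the choice of random model are assumptions of this sketch; the documentation above does not state which model `generate_random_network` uses.

```python
import random

def random_adjacency(nodes_num, edge_prob=0.1, seed=None):
    """Random symmetric 0/1 adjacency matrix with an empty diagonal.

    Hypothetical helper: each possible edge is included independently
    with probability edge_prob (an assumed model, for illustration).
    """
    rng = random.Random(seed)
    A = [[0] * nodes_num for _ in range(nodes_num)]
    for i in range(nodes_num):
        for j in range(i + 1, nodes_num):
            if rng.random() < edge_prob:
                # Set both entries so the matrix stays symmetric.
                A[i][j] = A[j][i] = 1
    return A

A = random_adjacency(6, edge_prob=0.5, seed=0)
for row in A:
    print(row)
```

Note that a graph generated this way is not guaranteed to be connected.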

networksampler.info()

Print information about the NetworkSampler package.

Examples

>>> import networksampler
>>> networksampler.info()

networksampler.node_random_sample(A: ndarray, k: int, measure='degree', random_seed=None)

This function extracts a sample of k nodes from an adjacency matrix, based on the selected centrality measure.

Parameters:
  • A (numpy.ndarray) – The graph adjacency matrix

  • k (int) – The size of the sample to extract

  • measure ({"degree", "closeness", "eigenvector", "betweenness"}, optional) – The centrality measure to use

  • random_seed ({None, int, array_like[ints], SeedSequence, BitGenerator, Generator}, optional) – A seed to initialize the BitGenerator; set it to make the output reproducible. If None, then fresh, unpredictable entropy will be pulled from the OS. If an int or array_like[ints] is passed, then it will be passed to SeedSequence to derive the initial BitGenerator state. One may also pass in a SeedSequence instance. Additionally, when passed a BitGenerator, it will be wrapped by Generator. If passed a Generator, it will be returned unaltered.

Returns:

The index array of the sampled nodes

Return type:

numpy.array(dtype=int)
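
One plausible reading of centrality-based sampling is to draw nodes without replacement, with probability proportional to their centrality. The sketch below does this for the degree measure. The helper `degree_weighted_sample` is hypothetical, the weighting scheme is an assumption (the documentation does not spell out how the measure drives the selection), and the standard-library `random.Random` stands in for the NumPy generator that the `random_seed` argument configures.

```python
import random

def degree_weighted_sample(A, k, seed=None):
    """Sample k distinct nodes with probability proportional to degree.

    Hypothetical helper for illustration. Assumes at least k nodes
    have a nonzero degree; otherwise random.choices raises ValueError.
    """
    rng = random.Random(seed)
    candidates = list(range(len(A)))
    weights = [sum(row) for row in A]  # degree of each node
    sample = []
    for _ in range(k):
        # Pick a position among the remaining candidates, then remove it
        # so that the same node cannot be drawn twice.
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        sample.append(candidates.pop(pick))
        weights.pop(pick)
    return sample

# Star graph: node 0 is connected to nodes 1..4
A = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
first = degree_weighted_sample(A, 3, seed=42)
second = degree_weighted_sample(A, 3, seed=42)
print(first == second)  # True: the same seed reproduces the same sample
```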

networksampler.sa_sampling(A: ndarray, ns: int, p=-4, q=4, r=0.1, D=None, random_seed=None)

This function uses a simulated annealing algorithm to extract a node sample, minimizing the geodesic distance and maximizing the network coverage.

Parameters:
  • A (numpy.ndarray) – The graph adjacency matrix. The graph must be unweighted (every edge has value 1), undirected (the matrix must be symmetric) and connected (a path must exist between every pair of nodes). For networks with more than 5000 nodes, this process can be very slow; for larger networks, consider using sa_sampling_twophases.

  • ns (int) – The size of the sample to extract.

  • p (int, optional) – The parameter p

  • q (int, optional) – The parameter q

  • r (float, optional) – The decay rate

  • D ({None, numpy.ndarray}, optional) – A precomputed shortest path length matrix. Generating this matrix requires finding the shortest paths between every pair of nodes, which can be slow for a large adjacency matrix. Use this parameter to provide a cached matrix.

  • random_seed ({None, int, array_like[ints], SeedSequence, BitGenerator, Generator}, optional) – A seed to initialize the BitGenerator; set it to make the output reproducible. If None, then fresh, unpredictable entropy will be pulled from the OS. If an int or array_like[ints] is passed, then it will be passed to SeedSequence to derive the initial BitGenerator state. One may also pass in a SeedSequence instance. Additionally, when passed a BitGenerator, it will be wrapped by Generator. If passed a Generator, it will be returned unaltered.

Returns:

The index array of the sampled nodes

Return type:

(numpy.array(dtype=int), int)
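
The general shape of a simulated annealing search over node samples can be sketched as follows. Everything specific here is an assumption made for illustration: the objective `coverage_cost` (mean distance from each node to its nearest sampled node), the swap move, the geometric cooling `temperature *= 1 - r`, and the `steps` parameter are not taken from the package, and the roles of p and q are not modelled. The sketch works on a precomputed distance matrix D and requires ns to be smaller than the number of nodes.

```python
import math
import random

def coverage_cost(D, sample):
    """Mean distance from every node to its nearest sampled node.

    Hypothetical objective, used only to give the search something to minimize.
    """
    return sum(min(D[i][s] for s in sample) for i in range(len(D))) / len(D)

def anneal_sample(D, ns, r=0.1, steps=200, seed=None):
    """Generic simulated annealing over node samples (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(D)
    current = rng.sample(range(n), ns)
    current_cost = coverage_cost(D, current)
    best, best_cost = list(current), current_cost
    temperature = 1.0
    for _ in range(steps):
        # Propose swapping one sampled node for one unsampled node.
        outside = [v for v in range(n) if v not in current]
        candidate = list(current)
        candidate[rng.randrange(ns)] = rng.choice(outside)
        candidate_cost = coverage_cost(D, candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature decays.
        if delta <= 0 or rng.random() < math.exp(-delta / max(temperature, 1e-12)):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temperature *= 1.0 - r  # assumed meaning of the decay rate
    return sorted(best), best_cost

# Distance matrix of the path graph 0 - 1 - 2 - 3 - 4
D = [[abs(i - j) for j in range(5)] for i in range(5)]
best, best_cost = anneal_sample(D, ns=2, r=0.1, seed=1)
print(best, best_cost)
```

Because the distance matrix is only read, never modified, the same D can be reused across runs with different seeds or parameters, which is the use case the D argument above is meant for.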

networksampler.sa_sampling_twophases(A: ndarray, ns: int, p=-4, q=4, r=0.1, centrality_measure='betweenness', first_phase_sample_fraction=0.1, D=None, random_seed=None)

This function uses a simulated annealing algorithm to extract a node sample, minimizing the geodesic distance and maximizing the network coverage.

Parameters:
  • A (numpy.ndarray) – The graph adjacency matrix. The graph must be unweighted (every edge has value 1), undirected (the matrix must be symmetric) and connected (a path must exist between every pair of nodes)

  • ns (int) – The size of the sample to extract

  • p (int, optional) – The parameter p

  • q (int, optional) – The parameter q

  • r (float, optional) – The decay rate

  • centrality_measure ({"degree", "closeness", "eigenvector", "betweenness"}, optional) – The centrality measure to use for the first phase node extraction

  • first_phase_sample_fraction (float, optional) – The fraction of nodes to sample in the first phase, before applying the simulated annealing sampling in the second phase.

  • D ({None, numpy.ndarray}, optional) – A precomputed shortest path length matrix. Generating this matrix requires finding the shortest paths between every pair of nodes, which can be slow for a large adjacency matrix. Use this parameter to provide a cached matrix.

  • random_seed ({None, int, array_like[ints], SeedSequence, BitGenerator, Generator}, optional) – A seed to initialize the BitGenerator; set it to make the output reproducible. If None, then fresh, unpredictable entropy will be pulled from the OS. If an int or array_like[ints] is passed, then it will be passed to SeedSequence to derive the initial BitGenerator state. One may also pass in a SeedSequence instance. Additionally, when passed a BitGenerator, it will be wrapped by Generator. If passed a Generator, it will be returned unaltered.

Returns:

The index array of the sampled nodes

Return type:

(numpy.array(dtype=int), int)
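
The three requirements on A (unweighted, undirected, connected) can be checked before starting a long sampling run. The helper `adjacency_problems` below is a hypothetical pre-flight check written for this page, not a function of the package; it accepts nested lists or a 2D array and returns a list of the requirements that are violated.

```python
def adjacency_problems(A):
    """Return a list of violated requirements (empty if A is acceptable).

    Hypothetical helper for illustration only.
    """
    n = len(A)
    if n == 0 or any(len(row) != n for row in A):
        return ["matrix is not square"]
    problems = []
    if any(A[i][j] not in (0, 1) for i in range(n) for j in range(n)):
        problems.append("entries are not all 0 or 1 (weighted graph)")
    if any(A[i][j] != A[j][i] for i in range(n) for j in range(i + 1, n)):
        problems.append("matrix is not symmetric (directed graph)")
    # Depth-first search from node 0, following the nonzero entries of each row.
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in range(n):
            if A[u][v] and v not in seen:
                seen.add(v)
                stack.append(v)
    if len(seen) != n:
        problems.append("graph is not connected")
    return problems

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
two_pairs = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
print(adjacency_problems(triangle))   # []
print(adjacency_problems(two_pairs))  # ['graph is not connected']
```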

networksampler.utils.nodes_centrality_measures(A)

This function extracts from an adjacency matrix dictionaries of several centrality measures: “degree”, “closeness”, “eigenvector”, “betweenness”.

Parameters:

A (numpy.ndarray) – The graph adjacency matrix

Returns:

A tuple of dictionaries matching, in order: “degree”, “closeness”, “eigenvector”, “betweenness”.

Return type:

(dict, dict, dict, dict)
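
To illustrate the kind of dictionaries involved, the sketch below computes two of the four measures, degree and closeness, keyed by node index. The helper names and the normalizations (degree divided by n - 1, closeness as (n - 1) divided by the sum of distances) follow common textbook definitions and are assumptions here; the package may normalize differently. Eigenvector and betweenness centrality are omitted for brevity, and the graph is assumed to be connected with at least two nodes.

```python
from collections import deque

def degree_centrality(A):
    """Degree of each node divided by n - 1 (assumed normalization)."""
    n = len(A)
    return {i: sum(A[i]) / (n - 1) for i in range(n)}

def closeness_centrality(A):
    """(n - 1) divided by the sum of distances to all other nodes.

    Assumes a connected graph, so every distance is finite.
    """
    n = len(A)
    result = {}
    for source in range(n):
        # Breadth-first search gives hop distances from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if A[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[source] = (n - 1) / sum(dist.values())
    return result

# Star graph: node 0 is connected to nodes 1, 2 and 3
A = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
degree = degree_centrality(A)
closeness = closeness_centrality(A)
print(degree[0], closeness[0])  # 1.0 1.0
```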