Estimating Vertex Measures in Social Networks by Sampling Completions of RDS Trees.

Khan B, Dombrowski K, Curtis R, Wendel T - Soc Netw (2015)

Bottom Line: In this paper, we discuss the problem of missing data and describe the protocols of our completion method, and finally the results of an experiment where ECSTC was used to estimate graph dependent vertex properties from spanning trees sampled from a graph whose characteristics were known ahead of time. The results show that ECSTC methods hold more promise for obtaining network-centric properties of individuals from a limited set of data than researchers may have previously assumed. Such an approach represents a break with past strategies of working with missing data which have mainly sought means to complete the graph, rather than ECSTC's approach, which is to estimate network properties themselves without deciding on the final edge set.


Affiliation: Department of Math and Computer Science, John Jay College (CUNY), New York, USA.

ABSTRACT

This paper presents a new method for obtaining network properties from incomplete data sets. Problems associated with missing data represent well-known stumbling blocks in Social Network Analysis. The method of "estimating connectivity from spanning tree completions" (ECSTC) is specifically designed to address situations where only spanning tree(s) of a network are known, such as those obtained through respondent driven sampling (RDS). Using repeated random completions derived from degree information, this method forgoes the usual step of trying to obtain final edge or vertex rosters, and instead aims to estimate network-centric properties of vertices probabilistically from the spanning trees themselves. In this paper, we discuss the problem of missing data and describe the protocols of our completion method, and finally the results of an experiment where ECSTC was used to estimate graph dependent vertex properties from spanning trees sampled from a graph whose characteristics were known ahead of time. The results show that ECSTC methods hold more promise for obtaining network-centric properties of individuals from a limited set of data than researchers may have previously assumed. Such an approach represents a break with past strategies of working with missing data which have mainly sought means to complete the graph, rather than ECSTC's approach, which is to estimate network properties themselves without deciding on the final edge set.




Figure 1: A 100 vertex BA graph.

Mentions: To illustrate, fix p = 1 as the number of trees and k = 10 as the number of completions. Figure 1 shows a 100 vertex Barabasi-Albert (BA) graph G sampled from . Figure 2 shows three graphs, one for each of the network measures considered. Each vertex v is plotted as a bar that relates the actual measure (x-coordinate) to the estimated measure (y-coordinate). The bar corresponding to vertex v has x-coordinate μG(v); its central y-coordinate is at μμT(p), and the length of the vertical error bar is the standard deviation of the set of estimates generated by the 10 completions. The value of r is given for each plot in the upper right hand corner, and a best fit line is drawn through the centers of the error bars. Figure 3 shows analogous results for 10 completions of a single BA network with 500 vertices. Together, Figure 2 and Figure 3 show that for all three network measures, the ECSTC method is able to produce a high correlation with the actual values using only completions of a single spanning tree sample.
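
The experiment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the excerpt does not give the completion protocol or name the three network measures, so the sketch assumes (a) a breadth-first tree from a random seed as a stand-in for an RDS tree, (b) a completion rule that randomly pairs each vertex's unused degree "stubs", and (c) closeness centrality as the vertex measure. All function names (ba_graph, spanning_tree, complete, closeness, pearson) are hypothetical.

```python
import random
import statistics
from collections import deque

def ba_graph(n, m, rng):
    """Preferential-attachment graph: each new vertex links to m existing ones."""
    adj = {v: set() for v in range(n)}
    for u in range(m + 1):                     # seed: a clique on m + 1 vertices
        for v in range(u + 1, m + 1):
            adj[u].add(v); adj[v].add(u)
    stubs = [u for u in range(m + 1) for _ in adj[u]]   # degree-weighted pool
    for v in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            adj[v].add(t); adj[t].add(v)
            stubs += [v, t]
    return adj

def spanning_tree(adj, rng):
    """Random BFS tree from a random seed (assumed stand-in for an RDS tree)."""
    root = rng.choice(sorted(adj))
    tree = {v: set() for v in adj}
    seen, queue = {root}, deque([root])
    while queue:
        u = queue.popleft()
        nbrs = sorted(adj[u]); rng.shuffle(nbrs)
        for w in nbrs:
            if w not in seen:
                seen.add(w); tree[u].add(w); tree[w].add(u); queue.append(w)
    return tree

def complete(tree, degree, rng):
    """Assumed completion rule: randomly pair the unused degree stubs."""
    g = {v: set(nb) for v, nb in tree.items()}
    stubs = [v for v in g for _ in range(degree[v] - len(g[v]))]
    rng.shuffle(stubs)
    failures = 0
    while len(stubs) >= 2 and failures < 100:  # give up after repeated clashes
        u, v = stubs.pop(), stubs.pop()
        if u != v and v not in g[u]:
            g[u].add(v); g[v].add(u); failures = 0
        else:                                   # self-loop or duplicate edge
            stubs += [u, v]; rng.shuffle(stubs); failures += 1
    return g

def closeness(g):
    """Closeness centrality of every vertex of a connected graph, via BFS."""
    out = {}
    for s in g:
        dist, queue = {s: 0}, deque([s])
        while queue:
            u = queue.popleft()
            for w in g[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1; queue.append(w)
        out[s] = (len(dist) - 1) / sum(dist.values())
    return out

def pearson(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)
G = ba_graph(100, 2, rng)                       # known 100-vertex BA graph
degree = {v: len(G[v]) for v in G}              # degree information per vertex
T = spanning_tree(G, rng)                       # p = 1 tree
runs = [closeness(complete(T, degree, rng)) for _ in range(10)]   # k = 10
actual = closeness(G)                           # x-coordinate of each bar
est_mean = {v: statistics.fmean([run[v] for run in runs]) for v in G}
est_sd = {v: statistics.stdev([run[v] for run in runs]) for v in G}
r = pearson([actual[v] for v in G], [est_mean[v] for v in G])
print(f"Pearson r between actual and estimated closeness: {r:.3f}")
```

Here est_mean plays the role of the central y-coordinate of each bar and est_sd the length of its error bar. The correlation this sketch prints depends on the assumed completion rule and measure, so it should not be read as a reproduction of the paper's figures.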

