# netlogit: Logistic Regression for Network Data (sna: Tools for Social Network Analysis)

## Description

`netlogit` performs a logistic regression of the network variable in `y` on the network variables in set `x`. The resulting fits (and coefficients) are then tested against the indicated null hypothesis.

## Usage

```r
netlogit(y, x, intercept=TRUE, mode="digraph", diag=FALSE,
    nullhyp=c("qap", "qapspp", "qapy", "qapx", "qapallx", "cugtie",
    "cugden", "cuguman", "classical"), test.statistic = c("z-value","beta"),
    tol=1e-7, reps=1000)
```

## Arguments

`y`

dependent network variable. `NA`s are allowed, and the data should be dichotomous.

`x`

the stack of independent network variables. Note that `NA`s are permitted, as is dichotomous data.

`intercept`

logical; should an intercept term be fitted?

`mode`

string indicating the type of graph being evaluated. `"digraph"` indicates that edges should be interpreted as directed; `"graph"` indicates that edges are undirected. `mode` is set to `"digraph"` by default.

`diag`

boolean indicating whether or not the diagonal should be treated as valid data. Set this true if and only if the data can contain loops. `diag` is `FALSE` by default.

`nullhyp`

string indicating the particular null hypothesis against which to test the observed estimands.

`test.statistic`

string indicating the test statistic to be used for the Monte Carlo procedures.

`tol`

tolerance parameter for `qr.solve`.

`reps`

integer indicating the number of draws to use for quantile estimation. (Relevant to the null hypothesis test only – the analysis itself is unaffected by this parameter.) Note that, as for all Monte Carlo procedures, convergence is slower for more extreme quantiles. By default, `reps`=1000.
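As a rough illustration of the `x`, `mode`, and `diag` arguments, the base R sketch below builds a stack of predictor graphs as a k x n x n array (the layout implied by the `x[1,,]` indexing in the Examples) and marks which cells would count as valid data under each setting. The data and the names `x1`, `x2`, `cells_digraph`, `cells_graph`, and `cells_loops` are hypothetical, and the cell masks illustrate the argument descriptions rather than `netlogit`'s internal code.

```r
# Two hypothetical 5 x 5 predictor adjacency matrices (illustrative data only).
set.seed(1)
n  <- 5
x1 <- matrix(rbinom(n * n, 1, 0.3), n, n)
x2 <- matrix(rbinom(n * n, 1, 0.5), n, n)

# Stack them as a k x n x n array, so that x[i, , ] is the i-th graph.
x <- array(0, dim = c(2, n, n))
x[1, , ] <- x1
x[2, , ] <- x2

# Cells treated as valid data under each setting (an assumption-based
# illustration; which triangle is used for undirected graphs is a guess).
cells_digraph <- row(x1) != col(x1)    # mode = "digraph", diag = FALSE
cells_graph   <- upper.tri(x1)         # mode = "graph",   diag = FALSE
cells_loops   <- matrix(TRUE, n, n)    # mode = "digraph", diag = TRUE

c(digraph = sum(cells_digraph),
  graph   = sum(cells_graph),
  loops   = sum(cells_loops))
```

For n = 5 this gives 20 directed dyads, 10 undirected dyads, and 25 cells once loops are allowed.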

## Details

`netlogit` is primarily a front-end to the built-in `glm.fit` routine. `netlogit` handles vectorization, sets up `glm` options, and deals with null hypothesis testing; the actual fitting is taken care of by `glm.fit`.

Logistic network regression is directly analogous to standard logistic regression performed elementwise on the appropriately vectorized adjacency matrices of the networks involved. As such, it is often a more appropriate model for fitting dichotomous response networks than is linear network regression.
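The elementwise analogy can be sketched in base R without `sna`: vectorize the off-diagonal cells of each adjacency matrix and pass them to `glm` with a binomial family. This is a minimal sketch on simulated data, assuming a directed graph without loops; the variable names and the data-generating coefficients are hypothetical, and it is not the package's own code path.

```r
set.seed(42)
n <- 20

# Hypothetical predictor graphs; x2 is unrelated to the response.
x1 <- matrix(rbinom(n * n, 1, 0.5), n, n)
x2 <- matrix(rbinom(n * n, 1, 0.5), n, n)

# Simulate a dichotomous response whose log-odds depend on x1 only.
eta <- -1 + 2 * x1
y   <- matrix(rbinom(n * n, 1, plogis(eta)), n, n)

# Vectorize the valid cells (directed graph, diagonal excluded).
valid <- row(y) != col(y)
dat   <- data.frame(y = y[valid], x1 = x1[valid], x2 = x2[valid])

# Ordinary logistic regression on the vectorized dyads.
fit <- glm(y ~ x1 + x2, family = binomial, data = dat)
coef(fit)
```

The point estimates from such a fit correspond to what a network logit reports; the classical standard errors from `glm` are the part that the permutation-based null hypotheses are meant to replace.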

Because of the frequent presence of row/column/block autocorrelation in network data, classical null hypothesis tests (and associated standard errors) are generally suspect. Further, it is sometimes of interest to compare fitted parameter values to those arising from various baseline models (e.g., uniform random graphs conditional on certain observed statistics). The tests supported by `netlogit` are as follows:

`classical`

tests based on classical asymptotics.

`cug`

conditional uniform graph test (see `cugtest`) controlling for order.

`cugden`

conditional uniform graph test, controlling for order and density.

`cugtie`

conditional uniform graph test, controlling for order and tie distribution.

`qap`

QAP permutation test (see `qaptest`); currently identical to `qapspp`.

`qapallx`

QAP permutation test, using independent x-permutations.

`qapspp`

QAP permutation test, using Dekker's “semi-partialling plus” procedure.

`qapx`

QAP permutation test, using (single) x-permutations.

`qapy`

QAP permutation test, using y-permutations.
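To make the permutation idea concrete, here is a base R sketch of a y-permutation test in the spirit of `qapy`: the rows and columns of the response matrix are permuted together, the model is refit, and the observed z-value is compared with the resulting null distribution. It is a simplified illustration on simulated data with one predictor; the helper `z_x1` and all variable names are hypothetical, and the package's own implementation may differ in detail.

```r
set.seed(7)
n  <- 15
x1 <- matrix(rbinom(n * n, 1, 0.5), n, n)
y  <- matrix(rbinom(n * n, 1, plogis(-0.5 + 1.5 * x1)), n, n)
valid <- row(y) != col(y)   # directed graph, no loops

# z-value of the x1 coefficient for a given response matrix.
z_x1 <- function(ymat) {
  fit <- glm(ymat[valid] ~ x1[valid], family = binomial)
  summary(fit)$coefficients[2, "z value"]
}

z_obs <- z_x1(y)

# Null distribution: relabel the vertices of y (same permutation applied to
# rows and columns), which preserves y's structure but breaks its
# alignment with x1.
reps   <- 200
z_null <- replicate(reps, {
  p <- sample(n)
  z_x1(y[p, p])
})

# Monte Carlo estimates of the one- and two-sided tail probabilities.
p_leq       <- mean(z_null <= z_obs)
p_geq       <- mean(z_null >= z_obs)
p_two_sided <- mean(abs(z_null) >= abs(z_obs))
c(z_obs = z_obs, p_leq = p_leq, p_geq = p_geq, p_two_sided = p_two_sided)
```

The three proportions play the same role as the `Pr(<=b)`, `Pr(>=b)`, and `Pr(>=|b|)` columns in the summary output.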

Note that interpretation of quantiles for single coefficients can be complex in the presence of multicollinearity or third variable effects. Although `qapspp` is known to be robust to these conditions in the OLS case, there are no equivalent results for logistic regression. Caution is thus advised.

The statistic to be employed in the above tests may be selected via `test.statistic`. By default, the z-statistic (rather than estimated coefficient) is used, as this is more approximately pivotal; coefficient-based tests are not recommended for QAP null hypotheses, although they are provided here for legacy purposes.
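A short usage sketch of selecting the null hypothesis and test statistic follows. It uses only arguments listed under Usage, runs on small random graphs with few replications to keep it fast, and assumes the `sna` package is installed (the guard skips the block otherwise). The object names `nl_z` and `nl_b` are hypothetical.

```r
if (requireNamespace("sna", quietly = TRUE)) {
  library(sna)
  set.seed(3)

  x <- rgraph(10, 2)   # stack of two random predictor graphs
  y <- rgraph(10)      # random dichotomous response graph

  # Default statistic: the z-value.
  nl_z <- netlogit(y, x, nullhyp = "qapy",
                   test.statistic = "z-value", reps = 50)

  # Legacy alternative: the raw coefficient.
  nl_b <- netlogit(y, x, nullhyp = "qapy",
                   test.statistic = "beta", reps = 50)

  print(nl_z)
  print(nl_b)
}
```

Both calls fit the same model, so the estimates should agree; only the statistic used to build the null distribution, and hence the reported tail probabilities, can differ.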

Reasonable printing and summarizing of `netlogit` objects is provided by `print.netlogit` and `summary.netlogit`, respectively. No plot methods exist at this time.

## Value

An object of class `netlogit`

## Author(s)

Carter T. Butts

## References

Butts, C.T., and Carley, K.M. (2001). “Multivariate Methods for Interstructural Analysis.” CASOS working paper, Carnegie Mellon University.

## See Also

`glm`, `netlm`

## Examples

```r
## Not run: 
library(sna)

#Create some input graphs
x<-rgraph(20,4)

#Create a response structure
y.l<-x[1,,]+4*x[2,,]+2*x[3,,]  #Note that the fourth graph is unrelated
y.p<-apply(y.l,c(1,2),function(a){1/(1+exp(-a))})
y<-rgraph(20,tprob=y.p)

#Fit a netlogit model
nl<-netlogit(y,x,reps=100)

#Examine the results
summary(nl)

## End(Not run)
```

### Example output

```
Loading required package: statnet.common

Attaching package: 'statnet.common'

The following object is masked from 'package:base':

order

network: Classes for Relational Data
Version 1.13.0 created on 2015-08-31.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
Mark S. Handcock, University of California -- Los Angeles
David R. Hunter, Penn State University
Martina Morris, University of Washington
Skye Bender-deMoll, University of Washington
For citation information, type citation("network").
Type help("network-package") to get started.

sna: Tools for Social Network Analysis
Version 2.4 created on 2016-07-23.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
For citation information, type citation("sna").
Type help(package="sna") to get started.

Network Logit Model

Coefficients:
            Estimate   Exp(b)     Pr(<=b) Pr(>=b) Pr(>=|b|)
(intercept)  0.1426240  1.1532961 0.63    0.37    0.7
x1           1.2737458  3.5742158 1.00    0.00    0.0
x2           3.5528050 34.9111052 1.00    0.00    0.0
x3           1.5810174  4.8598979 1.00    0.00    0.0
x4          -0.1913785  0.8258199 0.27    0.73    0.6

Goodness of Fit Statistics:

Null deviance: 526.7919 on 380 degrees of freedom
Residual deviance: 190.5348 on 375 degrees of freedom
Chi-Squared test of fit improvement:
336.2571 on 5 degrees of freedom, p-value 0
AIC: 200.5348 	BIC: 220.2356
Pseudo-R^2 Measures:
(Dn-Dr)/(Dn-Dr+dfn): 0.4694642
(Dn-Dr)/Dn: 0.6383111
Contingency Table (predicted (rows) x actual (cols)):

          Actual
Predicted     0     1
        0    15    13
        1    29   323

Total Fraction Correct: 0.8894737
Fraction Predicted 1s Correct: 0.9176136
Fraction Predicted 0s Correct: 0.5357143
False Negative Rate: 0.03869048
False Positive Rate: 0.6590909

Test Diagnostics:

Null Hypothesis: qap
Replications: 100
Distribution Summary:

       (intercept)       x1       x2       x3       x4
Min       -3.22813 -1.93496 -2.31429 -2.24605 -2.28840
1stQ      -0.61746 -0.52743 -0.65229 -0.83190 -0.61240
Median     0.06627  0.16235  0.08688 -0.11347  0.07947
Mean       0.07156  0.13032  0.09500 -0.06160  0.06654
3rdQ       0.79539  0.65294  0.87295  0.72190  0.68786
Max        2.91819  2.42589  2.46216  2.30809  2.30236
```

sna documentation built on May 30, 2017, 12:18 a.m.