1 Data summary

There are 5490 species in total, with the following breakdown:

source Freq
skin 826
world 4519
both 145

Some bacteria (145) appear in both the skin and the “world” databases (as should be expected, given that the provenance of the “world” database is unclear).

There are 23 traits:

trait n.levels
oxygen 5
general_motility 3
specific_motility 7
spore 3
pigment 3
shape 5
aggregation 4
gram 4
acid.phosphatase 2
alkaline.phosphatase 2
aesculin.hydrolysis 2
alpha.galactosidase 2
arylsulfatase 2
catalase 2
oxidase 2
urease 2
gelatinase 2
pyrazinamidase 2
tellurite.reductase 2
H2S.production 2
indole.production 2
methane.production 2
nitrate.reduction 2

Most are two-level factors; a few have more levels.

We regress the probability of a bacterium occurring on skin against each of these traits individually. We do it in two ways: an “Any Skin” regression (i.e. both + skin vs. all) and an “Only Skin” regression (i.e. skin vs. [world + skin], eliminating the “both” category). We account for phylogenetic relationships using the phylolm package in R (reference: Ho, L. S. T. and Ané, C. (2014). A linear-time algorithm for Gaussian and non-Gaussian trait evolution models. Systematic Biology, 63(3):397-408). To obtain an “overall significance” for each fitted regression, we compared it with a null model using a likelihood ratio test. Finally, we were curious how much the phylogenetic correction changes the results relative to a naive logistic regression, so we fit that too and compared the two.
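To make the two response codings concrete, the following minimal sketch recomputes the sample size and raw skin prevalence for each analysis from the frequency table in the data summary. The function and variable names are illustrative assumptions and are not taken from the original analysis code.

```python
# Counts taken from the frequency table in the data summary.
counts = {"skin": 826, "world": 4519, "both": 145}


def any_skin_response(source):
    """Any Skin coding: 1 if the species occurs on skin at all, else 0."""
    return 1 if source in ("skin", "both") else 0


def only_skin_response(source):
    """Only Skin coding: 1 for skin, 0 for world, None (dropped) for both."""
    if source == "both":
        return None
    return 1 if source == "skin" else 0


def prevalence(response):
    """Return (positives, sample size, proportion) for a response coding."""
    positives = 0
    n = 0
    for source, count in counts.items():
        y = response(source)
        if y is None:
            continue  # species excluded from this analysis
        n += count
        positives += y * count
    return positives, n, positives / n


any_skin = prevalence(any_skin_response)
only_skin = prevalence(only_skin_response)
print("Any Skin:  %d of %d species (%.3f)" % any_skin)
print("Only Skin: %d of %d species (%.3f)" % only_skin)
```

The raw prevalences are only a sanity check on the codings; they ignore both the traits and the phylogeny.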

2 Results: Any Skin analysis

2.1 Comparing p-values of glm and phylo-glm

2.2 Complete table of test statistics

trait beta.phylo p.phylo beta.glm p.glm
oxygen NA 0.0000000 NA 0.0000000
general_motility -0.6673549 0.0000000 -0.2551113 0.0006563
specific_motility NA 0.0000000 NA 0.0000000
spore -0.9993405 0.0000002 -0.4854541 0.0000095
pigment -1.1540611 0.0000000 -1.1349547 0.0000000
shape NA 0.0000000 NA 0.0000000
aggregation NA 0.0000000 NA 0.0000000
gram NA 0.0000000 NA 0.0000664
acid.phosphatase -1.4806000 0.0000000 -1.4685487 0.0000000
alkaline.phosphatase -1.1453974 0.0000000 -1.0814522 0.0000000
aesculin.hydrolysis -0.6018593 0.0000000 -0.5037737 0.0000000
alpha.galactosidase -1.3160602 0.0000000 -0.4460588 0.0034644
arylsulfatase 1.2843430 0.0223193 0.9562938 0.0157410
catalase -0.7969992 0.0000000 -0.9161372 0.0000000
oxidase -0.9880408 0.0000000 -0.9951707 0.0000000
urease -0.4479860 0.0000259 -0.1081602 0.2929794
gelatinase -0.4069435 0.0000079 -0.1598911 0.0831803
pyrazinamidase 0.8256022 0.0004150 0.2698825 0.2793105
tellurite.reductase 1.2791464 0.0224166 1.0032992 0.0354029
H2S.production 0.9799971 0.0000000 0.6606778 0.0000001
indole.production 0.8222832 0.0000001 0.4483304 0.0046263
methane.production 1.0220270 0.1911093 1.1350538 0.0791270
nitrate.reduction -0.3993381 0.0000002 -0.1375591 0.0853473
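Traits with more than two levels have no single beta in the table above, so their p-values can only come from an overall test such as the likelihood ratio test against the null model described in the data summary. The sketch below shows how such a test statistic and p-value could be computed from two log-likelihoods. The log-likelihood values are hypothetical placeholders (the report does not list them), and the chi-square tail function is a small standard-library implementation written for this illustration, not the routine used in the original analysis.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-square distribution (integer df >= 1).

    Uses the recurrence Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with t = x / 2.
    """
    t = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)              # Q(1, t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))   # Q(1/2, t)
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_null, loglik_full, df):
    """Return (statistic, p-value) for nested models differing by df parameters."""
    stat = 2.0 * (loglik_full - loglik_null)
    return stat, chi2_sf(max(stat, 0.0), df)


# Hypothetical log-likelihoods for a four-level trait, which adds three
# coefficients relative to the intercept-only null model.
stat, p_value = likelihood_ratio_test(loglik_null=-2500.0, loglik_full=-2490.0, df=3)
print("LRT statistic = %.2f, p = %.3g" % (stat, p_value))
```

The degrees of freedom equal the number of extra coefficients in the full model, which is the number of fitted trait levels minus one.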

2.3 All fits

The fitted phylogenetically corrected, trait-specific logistic regressions are listed below:

2.3.1 oxygen

Estimate StdErr z.value p.value
(Intercept) -2.375734 0.0804623 -29.52605 0
Xanaerobic 1.478374 0.1184405 12.48200 0
Xfacultative anaerobe 1.272976 0.1147091 11.09743 0
Xmicroaerophile 2.653446 0.2212847 11.99110 0
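The estimates in these tables are on the log-odds scale. As a rough reading aid, the sketch below applies the inverse logit to the oxygen coefficients to obtain the implied probability of occurring on skin for each level. It assumes the coefficients can be read like ordinary logistic regression coefficients and ignores the phylogenetic correlation structure, so the values are approximate. The reference level is not named in the table and is labelled generically here.

```python
import math


def inv_logit(eta):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Estimates copied from the oxygen table (Any Skin analysis).
intercept = -2.375734
coefs = {
    "anaerobic": 1.478374,
    "facultative anaerobe": 1.272976,
    "microaerophile": 2.653446,
}

p_reference = inv_logit(intercept)
p_levels = {level: inv_logit(intercept + beta) for level, beta in coefs.items()}

print("reference level: %.3f" % p_reference)
for level, p in p_levels.items():
    print("%s: %.3f" % (level, p))
```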

2.3.2 general_motility

Estimate StdErr z.value p.value
(Intercept) -1.0135254 0.0938487 -10.799569 0
Xnon-motile -0.6673549 0.0852684 -7.826522 0
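The z.value and p.value columns follow from the estimate and its standard error. A minimal sketch, assuming the usual Wald construction (z = estimate / standard error, two-sided normal p-value), reproduces the general_motility row and also reports the corresponding odds ratio with an approximate 95% interval. The helper name `wald_summary` is an assumption made for illustration.

```python
import math


def wald_summary(estimate, std_err):
    """Return Wald z, two-sided p-value, odds ratio and approximate 95% CI."""
    z = estimate / std_err
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    odds_ratio = math.exp(estimate)
    ci = (math.exp(estimate - 1.96 * std_err), math.exp(estimate + 1.96 * std_err))
    return z, p, odds_ratio, ci


# Xnon-motile row of the general_motility table (Any Skin analysis).
z, p, odds_ratio, ci = wald_summary(-0.6673549, 0.0852684)
print("z = %.4f, p = %.3g" % (z, p))
print("odds ratio = %.3f (95%% CI %.3f to %.3f)" % (odds_ratio, ci[0], ci[1]))
```

An odds ratio below 1 means the listed level is associated with lower odds of occurring on skin than the reference level.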

2.3.3 specific_motility

Estimate StdErr z.value p.value
(Intercept) -0.4812075 0.7388473 -0.6512949 0.5148561
Xflagella -0.4064318 0.7412733 -0.5482887 0.5834937
Xgliding -0.6166526 0.7621322 -0.8091150 0.4184490
Xmotile- unknown mechanism -1.5929393 0.7502380 -2.1232453 0.0337333
Xnon-motile -1.2208705 0.7418856 -1.6456318 0.0998396
Xtype iv pili 2.4505569 1.4428790 1.6983800 0.0894361

2.3.4 spore

Estimate StdErr z.value p.value
(Intercept) -1.2947448 0.0973004 -13.306671 0e+00
Xyes -0.9993405 0.1923513 -5.195392 2e-07

2.3.5 pigment

Estimate StdErr z.value p.value
(Intercept) -1.444485 0.0948632 -15.227035 0
Xyes -1.154061 0.1205390 -9.574175 0

2.3.6 shape

Estimate StdErr z.value p.value
(Intercept) -1.0356685 0.1255341 -8.2500980 0.0000000
Xovoid/coccobacillus -0.0626183 0.1726287 -0.3627338 0.7168038
Xrod -0.8125424 0.1363415 -5.9596125 0.0000000
Xspirillum/corkscrew 0.4066053 0.2932925 1.3863476 0.1656408

2.3.7 aggregation

Estimate StdErr z.value p.value
(Intercept) -2.473021 0.2329036 -10.618220 0e+00
Xclump 1.293162 0.1532813 8.436526 0e+00
Xsingly 1.102990 0.2041619 5.402525 1e-07

2.3.8 gram

Estimate StdErr z.value p.value
(Intercept) -1.146090 0.1033365 -11.090857 0.0000000
Xpositive -1.081417 0.1885756 -5.734660 0.0000000
Xvariable -0.741431 0.3530883 -2.099846 0.0357424

2.3.9 acid.phosphatase

Estimate StdErr z.value p.value
(Intercept) -0.9895071 0.0860664 -11.49702 0
Xpositive -1.4806000 0.1143179 -12.95160 0

2.3.10 alkaline.phosphatase

Estimate StdErr z.value p.value
(Intercept) -1.052032 0.0778663 -13.51074 0
Xpositive -1.145397 0.0977170 -11.72158 0

2.3.11 aesculin.hydrolysis

Estimate StdErr z.value p.value
(Intercept) -1.2528917 0.0627988 -19.950870 0
Xpositive -0.6018593 0.0807793 -7.450664 0

2.3.12 alpha.galactosidase

Estimate StdErr z.value p.value
(Intercept) -0.3519542 0.1243067 -2.831336 0.0046354
Xpositive -1.3160602 0.1337305 -9.841139 0.0000000

2.3.13 arylsulfatase

Estimate StdErr z.value p.value
(Intercept) -1.587303 0.0468452 -33.88401 0.0000000
Xpositive 1.284343 0.5621027 2.28489 0.0223193

2.3.14 catalase

Estimate StdErr z.value p.value
(Intercept) -1.1601256 0.0596632 -19.444578 0
Xpositive -0.7969992 0.0801983 -9.937854 0

2.3.15 oxidase

Estimate StdErr z.value p.value
(Intercept) -1.0745325 0.0725912 -14.80251 0
Xpositive -0.9880408 0.0903872 -10.93121 0

2.3.16 urease

Estimate StdErr z.value p.value
(Intercept) -1.407276 0.0551256 -25.528562 0.00e+00
Xpositive -0.447986 0.1064823 -4.207141 2.59e-05

2.3.17 gelatinase

Estimate StdErr z.value p.value
(Intercept) -1.3813284 0.0559147 -24.704193 0.0e+00
Xpositive -0.4069435 0.0911049 -4.466758 7.9e-06

2.3.18 pyrazinamidase

Estimate StdErr z.value p.value
(Intercept) -1.5790769 0.0489905 -32.232300 0.000000
Xpositive 0.8256022 0.2338562 3.530384 0.000415

2.3.19 tellurite.reductase

Estimate StdErr z.value p.value
(Intercept) -1.588889 0.0468986 -33.879217 0.0000000
Xpositive 1.279146 0.5602345 2.283234 0.0224166

2.3.20 H2S.production

Estimate StdErr z.value p.value
(Intercept) -1.6329813 0.0552886 -29.535600 0
Xpositive 0.9799971 0.1229248 7.972329 0

2.3.21 indole.production

Estimate StdErr z.value p.value
(Intercept) -1.6070571 0.0512777 -31.340269 0e+00
Xpositive 0.8222832 0.1512857 5.435301 1e-07

2.3.22 methane.production

Estimate StdErr z.value p.value
(Intercept) -1.596219 0.0465006 -34.326835 0.0000000
Xpositive 1.022027 0.7817819 1.307305 0.1911093

2.3.23 nitrate.reduction

Estimate StdErr z.value p.value
(Intercept) -1.3279400 0.0593752 -22.365232 0e+00
Xpositive -0.3993381 0.0765168 -5.218956 2e-07

3 Results: Only Skin analysis

3.1 Comparing p-values of glm and phylo-glm

3.2 Complete table of test statistics

trait beta.phylo p.phylo beta.glm p.glm
oxygen NA 0.0000000 NA 0.0000000
general_motility -0.6103038 0.0000000 -0.2172449 0.0067401
specific_motility NA 0.0000000 NA 0.0000000
spore -0.9246427 0.0000080 -0.4403182 0.0001516
pigment -1.0270767 0.0000000 -1.0801465 0.0000000
shape NA 0.0000000 NA 0.0000001
aggregation NA 0.0000000 NA 0.0000000
gram NA 0.0000000 NA 0.0022679
acid.phosphatase -1.5317251 0.0000000 -1.6949543 0.0000000
alkaline.phosphatase -1.1614503 0.0000000 -1.2227146 0.0000000
aesculin.hydrolysis -0.5509379 0.0000000 -0.5631656 0.0000000
alpha.galactosidase -1.1294707 0.0000000 -0.5786830 0.0008132
arylsulfatase 1.1312177 0.0495710 1.0132686 0.0134625
catalase -0.7377420 0.0000000 -0.9293000 0.0000000
oxidase -0.9886098 0.0000000 -1.0905776 0.0000000
urease 0.3555861 0.0001264 -0.0716958 0.5101770
gelatinase -0.3715614 0.0001523 -0.1819595 0.0673040
pyrazinamidase 0.5430504 0.0366891 0.2192366 0.4189684
tellurite.reductase 1.2520990 0.0274548 1.0109366 0.0437029
H2S.production 0.8998944 0.0000000 0.6936979 0.0000001
indole.production 0.7385433 0.0000038 0.4782827 0.0041286
methane.production 1.0417895 0.1888750 1.0086151 0.1541685
nitrate.reduction -0.3281991 0.0000501 -0.1496014 0.0812417
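One way to summarise how much the phylogenetic correction matters is to list the traits whose significance changes between the two fits. The sketch below does this for the Only Skin table above, using a 0.05 threshold. The threshold is an assumption chosen for illustration and is not stated in the report.

```python
# (p.phylo, p.glm) pairs copied from the Only Skin table of test statistics.
p_values = {
    "oxygen": (0.0000000, 0.0000000),
    "general_motility": (0.0000000, 0.0067401),
    "specific_motility": (0.0000000, 0.0000000),
    "spore": (0.0000080, 0.0001516),
    "pigment": (0.0000000, 0.0000000),
    "shape": (0.0000000, 0.0000001),
    "aggregation": (0.0000000, 0.0000000),
    "gram": (0.0000000, 0.0022679),
    "acid.phosphatase": (0.0000000, 0.0000000),
    "alkaline.phosphatase": (0.0000000, 0.0000000),
    "aesculin.hydrolysis": (0.0000000, 0.0000000),
    "alpha.galactosidase": (0.0000000, 0.0008132),
    "arylsulfatase": (0.0495710, 0.0134625),
    "catalase": (0.0000000, 0.0000000),
    "oxidase": (0.0000000, 0.0000000),
    "urease": (0.0001264, 0.5101770),
    "gelatinase": (0.0001523, 0.0673040),
    "pyrazinamidase": (0.0366891, 0.4189684),
    "tellurite.reductase": (0.0274548, 0.0437029),
    "H2S.production": (0.0000000, 0.0000001),
    "indole.production": (0.0000038, 0.0041286),
    "methane.production": (0.1888750, 0.1541685),
    "nitrate.reduction": (0.0000501, 0.0812417),
}

ALPHA = 0.05  # illustrative significance threshold (assumption)

phylo_only = [t for t, (pp, pg) in p_values.items() if pp < ALPHA <= pg]
glm_only = [t for t, (pp, pg) in p_values.items() if pg < ALPHA <= pp]

print("significant only with phylogenetic correction:", phylo_only)
print("significant only in the naive glm:", glm_only)
```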

3.3 All fits

The fitted phylogenetically corrected, trait-specific logistic regressions are listed below. Here we simply removed all of the bacteria that appear in both datasets:

3.3.1 oxygen

Estimate StdErr z.value p.value
(Intercept) -2.575701 0.0868948 -29.64159 0
Xanaerobic 1.484504 0.1271637 11.67397 0
Xfacultative anaerobe 1.284803 0.1232022 10.42841 0
Xmicroaerophile 2.730389 0.2322806 11.75470 0

3.3.2 general_motility

Estimate StdErr z.value p.value
(Intercept) -1.2147833 0.0998921 -12.16095 0
Xnon-motile -0.6103038 0.0903818 -6.75251 0

3.3.3 specific_motility

Estimate StdErr z.value p.value
(Intercept) -0.8316165 0.8384370 -0.9918653 0.3212632
Xflagella -0.2529220 0.8411760 -0.3006767 0.7636610
Xgliding -0.4706323 0.8613019 -0.5464197 0.5847775
Xmotile- unknown mechanism -1.6753865 0.8527452 -1.9646978 0.0494492
Xnon-motile -0.9857832 0.8410776 -1.1720479 0.2411778
Xtype iv pili 2.9393890 1.5542025 1.8912522 0.0585907

3.3.4 spore

Estimate StdErr z.value p.value
(Intercept) -1.4997859 0.1039369 -14.429775 0e+00
Xyes -0.9246427 0.2070186 -4.466472 8e-06

3.3.5 pigment

Estimate StdErr z.value p.value
(Intercept) -1.718059 0.1040514 -16.511645 0
Xyes -1.027077 0.1310526 -7.837132 0

3.3.6 shape

Estimate StdErr z.value p.value
(Intercept) -1.3797372 0.1366235 -10.0988254 0.0000000
Xovoid/coccobacillus -0.0605245 0.1934197 -0.3129179 0.7543430
Xrod -0.6437381 0.1475047 -4.3641875 0.0000128
Xspirillum/corkscrew 0.4490007 0.3166017 1.4181879 0.1561359

3.3.7 aggregation

Estimate StdErr z.value p.value
(Intercept) -2.585585 0.2463655 -10.494913 0.0e+00
Xclump 1.261416 0.1627980 7.748351 0.0e+00
Xsingly 1.024163 0.2159151 4.743360 2.1e-06

3.3.8 gram

Estimate StdErr z.value p.value
(Intercept) -1.3385514 0.1042542 -12.839301 0.0000000
Xpositive -0.9135018 0.1922746 -4.751028 0.0000020
Xvariable -0.7890653 0.3914050 -2.015982 0.0438019

3.3.9 acid.phosphatase

Estimate StdErr z.value p.value
(Intercept) -1.157024 0.0990605 -11.67997 0
Xpositive -1.531725 0.1256630 -12.18915 0

3.3.10 alkaline.phosphatase

Estimate StdErr z.value p.value
(Intercept) -1.168761 0.0946727 -12.34528 0
Xpositive -1.161450 0.1026653 -11.31298 0

3.3.11 aesculin.hydrolysis

Estimate StdErr z.value p.value
(Intercept) -1.4616811 0.0663248 -22.038238 0
Xpositive -0.5509379 0.0866195 -6.360435 0

3.3.12 alpha.galactosidase

Estimate StdErr z.value p.value
(Intercept) -1.343752 0.0759557 -17.691267 0
Xpositive -1.129471 0.1860058 -6.072233 0

3.3.13 arylsulfatase

Estimate StdErr z.value p.value
(Intercept) -1.741263 0.0544349 -31.988002 0.000000
Xpositive 1.131218 0.5760798 1.963647 0.049571

3.3.14 catalase

Estimate StdErr z.value p.value
(Intercept) -1.388701 0.0641218 -21.657239 0
Xpositive -0.737742 0.0864356 -8.535164 0

3.3.15 oxidase

Estimate StdErr z.value p.value
(Intercept) -1.2718199 0.0773189 -16.44901 0
Xpositive -0.9886098 0.0984742 -10.03927 0

3.3.16 urease

Estimate StdErr z.value p.value
(Intercept) -1.7065267 0.0633325 -26.945495 0.0000000
Xpositive 0.3555861 0.0927610 3.833358 0.0001264

3.3.17 gelatinase

Estimate StdErr z.value p.value
(Intercept) -1.6021722 0.0609986 -26.26573 0.0000000
Xpositive -0.3715614 0.0981072 -3.78730 0.0001523

3.3.18 pyrazinamidase

Estimate StdErr z.value p.value
(Intercept) -1.7296406 0.0556216 -31.096580 0.0000000
Xpositive 0.5430504 0.2599313 2.089207 0.0366891

3.3.19 tellurite.reductase

Estimate StdErr z.value p.value
(Intercept) -1.734239 0.0538544 -32.20237 0.0000000
Xpositive 1.252099 0.5678478 2.20499 0.0274548

3.3.20 H2S.production

Estimate StdErr z.value p.value
(Intercept) -1.7969525 0.0604586 -29.722015 0
Xpositive 0.8998944 0.1306732 6.886601 0

3.3.21 indole.production

Estimate StdErr z.value p.value
(Intercept) -1.7563308 0.0575271 -30.530517 0.0e+00
Xpositive 0.7385433 0.1596971 4.624651 3.8e-06

3.3.22 methane.production

Estimate StdErr z.value p.value
(Intercept) -1.740111 0.0531164 -32.760349 0.000000
Xpositive 1.041789 0.7928899 1.313914 0.188875

3.3.23 nitrate.reduction

Estimate StdErr z.value p.value
(Intercept) -1.5359349 0.0637884 -24.078575 0.00e+00
Xpositive -0.3281991 0.0809339 -4.055148 5.01e-05