Data summary
There are 5490 species in total, with the following breakdown:
skin | 826 |
world | 4519 |
both | 145 |
Some bacteria (145) appear in both the skin and the world databases (as should be expected: where in the heck did that “world” database come from, anyway?).
There are 23 traits:
oxygen | 5 |
general_motility | 3 |
specific_motility | 7 |
spore | 3 |
pigment | 3 |
shape | 5 |
aggregation | 4 |
gram | 4 |
acid.phosphatase | 2 |
alkaline.phosphatase | 2 |
aesculin.hydrolysis | 2 |
alpha.galactosidase | 2 |
arylsulfatase | 2 |
catalase | 2 |
oxidase | 2 |
urease | 2 |
gelatinase | 2 |
pyrazinamidase | 2 |
tellurite.reductase | 2 |
H2S.production | 2 |
indole.production | 2 |
methane.production | 2 |
nitrate.reduction | 2 |
Most are two-level traits; a few have more levels.
We regress the probability of a bacterium occurring on skin against each of these traits individually. We do it in two ways: an “Any Skin” regression (i.e. both + skin vs. all) and an “Only Skin” regression (i.e. skin vs. [world + skin], eliminating the “both” group). We account for phylogenetic relationships using the phylolm package in R (reference: Ho, L. S. T. and Ane, C. (2014). A linear-time algorithm for Gaussian and non-Gaussian trait evolution models. Systematic Biology, 63(3):397-408). To obtain an “overall significance” for a fitted regression, we compared it with a null model using a likelihood ratio test. Finally, we were curious how much the phylogenetic correction changes a naive logistic regression, so we fit that too and compared the two.
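To make the two comparisons and the null-model test concrete, here is a minimal sketch. It is written in Python purely for illustration; the analysis itself was run in R. It assumes that “Any Skin” means skin + both as positives against the world-only species, that “Only Skin” drops the “both” species before comparing skin with world, and that the likelihood ratio statistic is referred to a chi-square distribution with degrees of freedom equal to the number of extra coefficients. The function names and the log-likelihood values in the usage lines are hypothetical.

```python
import math

# Species counts taken from the data summary.
counts = {"skin": 826, "world": 4519, "both": 145}

def any_skin(counts):
    """(positives, total): skin + both are positives, world-only are negatives."""
    positives = counts["skin"] + counts["both"]
    return positives, positives + counts["world"]

def only_skin(counts):
    """(positives, total) once the 'both' species are removed."""
    return counts["skin"], counts["skin"] + counts["world"]

def chi2_sf(x, df):
    """Upper tail P(X > x) of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        total = sum(h ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-h) * total
    # Odd df: complementary error function plus a finite sum.
    total = sum(h ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * total

def lr_test(loglik_null, loglik_full, df):
    """Likelihood ratio statistic and its chi-square p-value."""
    stat = 2.0 * (loglik_full - loglik_null)
    return stat, chi2_sf(stat, df)

print(any_skin(counts))   # (971, 5490)
print(only_skin(counts))  # (826, 5345)

# Made-up log-likelihoods for a three-level trait (2 extra coefficients).
stat, p = lr_test(-100.0, -95.0, df=2)
print(stat, p)
```

For a two-level trait the test has one degree of freedom; for a multi-level trait such as oxygen the degrees of freedom would be the number of non-reference levels.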
Results: Any Skin analysis
Comparing p-values of glm and phylo-glm

Complete table of test statistics
oxygen | NA | 0.0000000 | NA | 0.0000000 |
general_motility | -0.6673549 | 0.0000000 | -0.2551113 | 0.0006563 |
specific_motility | NA | 0.0000000 | NA | 0.0000000 |
spore | -0.9993405 | 0.0000002 | -0.4854541 | 0.0000095 |
pigment | -1.1540611 | 0.0000000 | -1.1349547 | 0.0000000 |
shape | NA | 0.0000000 | NA | 0.0000000 |
aggregation | NA | 0.0000000 | NA | 0.0000000 |
gram | NA | 0.0000000 | NA | 0.0000664 |
acid.phosphatase | -1.4806000 | 0.0000000 | -1.4685487 | 0.0000000 |
alkaline.phosphatase | -1.1453974 | 0.0000000 | -1.0814522 | 0.0000000 |
aesculin.hydrolysis | -0.6018593 | 0.0000000 | -0.5037737 | 0.0000000 |
alpha.galactosidase | -1.3160602 | 0.0000000 | -0.4460588 | 0.0034644 |
arylsulfatase | 1.2843430 | 0.0223193 | 0.9562938 | 0.0157410 |
catalase | -0.7969992 | 0.0000000 | -0.9161372 | 0.0000000 |
oxidase | -0.9880408 | 0.0000000 | -0.9951707 | 0.0000000 |
urease | -0.4479860 | 0.0000259 | -0.1081602 | 0.2929794 |
gelatinase | -0.4069435 | 0.0000079 | -0.1598911 | 0.0831803 |
pyrazinamidase | 0.8256022 | 0.0004150 | 0.2698825 | 0.2793105 |
tellurite.reductase | 1.2791464 | 0.0224166 | 1.0032992 | 0.0354029 |
H2S.production | 0.9799971 | 0.0000000 | 0.6606778 | 0.0000001 |
indole.production | 0.8222832 | 0.0000001 | 0.4483304 | 0.0046263 |
methane.production | 1.0220270 | 0.1911093 | 1.1350538 | 0.0791270 |
nitrate.reduction | -0.3993381 | 0.0000002 | -0.1375591 | 0.0853473 |
All Fits
Fitted phylogenetically corrected trait-specific logistic regressions below. Each row gives the term, estimate, standard error, z value, and p-value:
oxygen
(Intercept) | -2.375734 | 0.0804623 | -29.52605 | 0 |
Xanaerobic | 1.478374 | 0.1184405 | 12.48200 | 0 |
Xfacultative anaerobe | 1.272976 | 0.1147091 | 11.09743 | 0 |
Xmicroaerophile | 2.653446 | 0.2212847 | 11.99110 | 0 |
general_motility
(Intercept) | -1.0135254 | 0.0938487 | -10.799569 | 0 |
Xnon-motile | -0.6673549 | 0.0852684 | -7.826522 | 0 |
specific_motility
(Intercept) | -0.4812075 | 0.7388473 | -0.6512949 | 0.5148561 |
Xflagella | -0.4064318 | 0.7412733 | -0.5482887 | 0.5834937 |
Xgliding | -0.6166526 | 0.7621322 | -0.8091150 | 0.4184490 |
Xmotile- unknown mechanism | -1.5929393 | 0.7502380 | -2.1232453 | 0.0337333 |
Xnon-motile | -1.2208705 | 0.7418856 | -1.6456318 | 0.0998396 |
Xtype iv pili | 2.4505569 | 1.4428790 | 1.6983800 | 0.0894361 |
spore
(Intercept) | -1.2947448 | 0.0973004 | -13.306671 | 0e+00 |
Xyes | -0.9993405 | 0.1923513 | -5.195392 | 2e-07 |
pigment
(Intercept) | -1.444485 | 0.0948632 | -15.227035 | 0 |
Xyes | -1.154061 | 0.1205390 | -9.574175 | 0 |
shape
(Intercept) | -1.0356685 | 0.1255341 | -8.2500980 | 0.0000000 |
Xovoid/coccobacillus | -0.0626183 | 0.1726287 | -0.3627338 | 0.7168038 |
Xrod | -0.8125424 | 0.1363415 | -5.9596125 | 0.0000000 |
Xspirillum/corkscrew | 0.4066053 | 0.2932925 | 1.3863476 | 0.1656408 |
aggregation
(Intercept) | -2.473021 | 0.2329036 | -10.618220 | 0e+00 |
Xclump | 1.293162 | 0.1532813 | 8.436526 | 0e+00 |
Xsingly | 1.102990 | 0.2041619 | 5.402525 | 1e-07 |
gram
(Intercept) | -1.146090 | 0.1033365 | -11.090857 | 0.0000000 |
Xpositive | -1.081417 | 0.1885756 | -5.734660 | 0.0000000 |
Xvariable | -0.741431 | 0.3530883 | -2.099846 | 0.0357424 |
acid.phosphatase
(Intercept) | -0.9895071 | 0.0860664 | -11.49702 | 0 |
Xpositive | -1.4806000 | 0.1143179 | -12.95160 | 0 |
alkaline.phosphatase
(Intercept) | -1.052032 | 0.0778663 | -13.51074 | 0 |
Xpositive | -1.145397 | 0.0977170 | -11.72158 | 0 |
aesculin.hydrolysis
(Intercept) | -1.2528917 | 0.0627988 | -19.950870 | 0 |
Xpositive | -0.6018593 | 0.0807793 | -7.450664 | 0 |
alpha.galactosidase
(Intercept) | -0.3519542 | 0.1243067 | -2.831336 | 0.0046354 |
Xpositive | -1.3160602 | 0.1337305 | -9.841139 | 0.0000000 |
arylsulfatase
(Intercept) | -1.587303 | 0.0468452 | -33.88401 | 0.0000000 |
Xpositive | 1.284343 | 0.5621027 | 2.28489 | 0.0223193 |
catalase
(Intercept) | -1.1601256 | 0.0596632 | -19.444578 | 0 |
Xpositive | -0.7969992 | 0.0801983 | -9.937854 | 0 |
oxidase
(Intercept) | -1.0745325 | 0.0725912 | -14.80251 | 0 |
Xpositive | -0.9880408 | 0.0903872 | -10.93121 | 0 |
urease
(Intercept) | -1.407276 | 0.0551256 | -25.528562 | 0.00e+00 |
Xpositive | -0.447986 | 0.1064823 | -4.207141 | 2.59e-05 |
gelatinase
(Intercept) | -1.3813284 | 0.0559147 | -24.704193 | 0.0e+00 |
Xpositive | -0.4069435 | 0.0911049 | -4.466758 | 7.9e-06 |
pyrazinamidase
(Intercept) | -1.5790769 | 0.0489905 | -32.232300 | 0.000000 |
Xpositive | 0.8256022 | 0.2338562 | 3.530384 | 0.000415 |
tellurite.reductase
(Intercept) | -1.588889 | 0.0468986 | -33.879217 | 0.0000000 |
Xpositive | 1.279146 | 0.5602345 | 2.283234 | 0.0224166 |
H2S.production
(Intercept) | -1.6329813 | 0.0552886 | -29.535600 | 0 |
Xpositive | 0.9799971 | 0.1229248 | 7.972329 | 0 |
indole.production
(Intercept) | -1.6070571 | 0.0512777 | -31.340269 | 0e+00 |
Xpositive | 0.8222832 | 0.1512857 | 5.435301 | 1e-07 |
methane.production
(Intercept) | -1.596219 | 0.0465006 | -34.326835 | 0.0000000 |
Xpositive | 1.022027 | 0.7817819 | 1.307305 | 0.1911093 |
nitrate.reduction
(Intercept) | -1.3279400 | 0.0593752 | -22.365232 | 0e+00 |
Xpositive | -0.3993381 | 0.0765168 | -5.218956 | 2e-07 |
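As a reading aid for the fit tables above, the sketch below shows how the four numbers in each row relate, and how a coefficient can be turned into an odds ratio or a fitted probability. It is written in Python purely for illustration, and the function names are hypothetical. It assumes the third number is the Wald statistic (estimate divided by standard error) with a two-sided normal p-value, which the printed values are consistent with. It also treats the coefficients as ordinary log-odds, which is only an approximate reading for a phylogenetically corrected fit.

```python
import math

def wald(estimate, std_error):
    """Wald z statistic and two-sided normal p-value."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# specific_motility, row "Xmotile- unknown mechanism" (Any Skin fit).
z, p = wald(-1.5929393, 0.7502380)
print(round(z, 4), round(p, 4))

# oxygen (Any Skin fit): reference level vs. the "anaerobic" level.
intercept, b_anaerobic = -2.375734, 1.478374
p_reference = inv_logit(intercept)
p_anaerobic = inv_logit(intercept + b_anaerobic)
odds_ratio = math.exp(b_anaerobic)
print(p_reference, p_anaerobic, odds_ratio)
```

Read this way, a positive coefficient raises the odds of being found on skin relative to the trait's reference level, and a negative one lowers them.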
Results: Only Skin
Comparing p-values of glm and phylo-glm

Complete table of test statistics
oxygen | NA | 0.0000000 | NA | 0.0000000 |
general_motility | -0.6103038 | 0.0000000 | -0.2172449 | 0.0067401 |
specific_motility | NA | 0.0000000 | NA | 0.0000000 |
spore | -0.9246427 | 0.0000080 | -0.4403182 | 0.0001516 |
pigment | -1.0270767 | 0.0000000 | -1.0801465 | 0.0000000 |
shape | NA | 0.0000000 | NA | 0.0000001 |
aggregation | NA | 0.0000000 | NA | 0.0000000 |
gram | NA | 0.0000000 | NA | 0.0022679 |
acid.phosphatase | -1.5317251 | 0.0000000 | -1.6949543 | 0.0000000 |
alkaline.phosphatase | -1.1614503 | 0.0000000 | -1.2227146 | 0.0000000 |
aesculin.hydrolysis | -0.5509379 | 0.0000000 | -0.5631656 | 0.0000000 |
alpha.galactosidase | -1.1294707 | 0.0000000 | -0.5786830 | 0.0008132 |
arylsulfatase | 1.1312177 | 0.0495710 | 1.0132686 | 0.0134625 |
catalase | -0.7377420 | 0.0000000 | -0.9293000 | 0.0000000 |
oxidase | -0.9886098 | 0.0000000 | -1.0905776 | 0.0000000 |
urease | 0.3555861 | 0.0001264 | -0.0716958 | 0.5101770 |
gelatinase | -0.3715614 | 0.0001523 | -0.1819595 | 0.0673040 |
pyrazinamidase | 0.5430504 | 0.0366891 | 0.2192366 | 0.4189684 |
tellurite.reductase | 1.2520990 | 0.0274548 | 1.0109366 | 0.0437029 |
H2S.production | 0.8998944 | 0.0000000 | 0.6936979 | 0.0000001 |
indole.production | 0.7385433 | 0.0000038 | 0.4782827 | 0.0041286 |
methane.production | 1.0417895 | 0.1888750 | 1.0086151 | 0.1541685 |
nitrate.reduction | -0.3281991 | 0.0000501 | -0.1496014 | 0.0812417 |
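Because the Only Skin analysis differs from the Any Skin analysis only in dropping the “both” species, a quick sanity check is to see whether any trait's estimate changes sign between the two. The sketch below does this for the first estimate column of each summary table, using only the two-level traits (the multi-level traits report NA there). It is an illustrative check with hypothetical names, not part of the original analysis.

```python
# First estimate column of the Any Skin and Only Skin summary tables.
any_est = {
    "general_motility": -0.6673549, "spore": -0.9993405,
    "pigment": -1.1540611, "acid.phosphatase": -1.4806000,
    "alkaline.phosphatase": -1.1453974, "aesculin.hydrolysis": -0.6018593,
    "alpha.galactosidase": -1.3160602, "arylsulfatase": 1.2843430,
    "catalase": -0.7969992, "oxidase": -0.9880408,
    "urease": -0.4479860, "gelatinase": -0.4069435,
    "pyrazinamidase": 0.8256022, "tellurite.reductase": 1.2791464,
    "H2S.production": 0.9799971, "indole.production": 0.8222832,
    "methane.production": 1.0220270, "nitrate.reduction": -0.3993381,
}
only_est = {
    "general_motility": -0.6103038, "spore": -0.9246427,
    "pigment": -1.0270767, "acid.phosphatase": -1.5317251,
    "alkaline.phosphatase": -1.1614503, "aesculin.hydrolysis": -0.5509379,
    "alpha.galactosidase": -1.1294707, "arylsulfatase": 1.1312177,
    "catalase": -0.7377420, "oxidase": -0.9886098,
    "urease": 0.3555861, "gelatinase": -0.3715614,
    "pyrazinamidase": 0.5430504, "tellurite.reductase": 1.2520990,
    "H2S.production": 0.8998944, "indole.production": 0.7385433,
    "methane.production": 1.0417895, "nitrate.reduction": -0.3281991,
}

def sign_flips(a, b):
    """Traits whose estimates have opposite signs in the two analyses."""
    return [trait for trait in a if a[trait] * b[trait] < 0]

sign_changes = sign_flips(any_est, only_est)
print(sign_changes)  # ['urease']
```

Only urease changes sign in that column (-0.4479860 in Any Skin, 0.3555861 in Only Skin), which may be worth re-checking against the underlying fits.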
All fits
Fitted phylogenetically corrected trait-specific logistic regressions below, here simply removing all of the bacteria that appear in both datasets. Each row gives the term, estimate, standard error, z value, and p-value:
oxygen
(Intercept) | -2.575701 | 0.0868948 | -29.64159 | 0 |
Xanaerobic | 1.484504 | 0.1271637 | 11.67397 | 0 |
Xfacultative anaerobe | 1.284803 | 0.1232022 | 10.42841 | 0 |
Xmicroaerophile | 2.730389 | 0.2322806 | 11.75470 | 0 |
general_motility
(Intercept) | -1.2147833 | 0.0998921 | -12.16095 | 0 |
Xnon-motile | -0.6103038 | 0.0903818 | -6.75251 | 0 |
specific_motility
(Intercept) | -0.8316165 | 0.8384370 | -0.9918653 | 0.3212632 |
Xflagella | -0.2529220 | 0.8411760 | -0.3006767 | 0.7636610 |
Xgliding | -0.4706323 | 0.8613019 | -0.5464197 | 0.5847775 |
Xmotile- unknown mechanism | -1.6753865 | 0.8527452 | -1.9646978 | 0.0494492 |
Xnon-motile | -0.9857832 | 0.8410776 | -1.1720479 | 0.2411778 |
Xtype iv pili | 2.9393890 | 1.5542025 | 1.8912522 | 0.0585907 |
spore
(Intercept) | -1.4997859 | 0.1039369 | -14.429775 | 0e+00 |
Xyes | -0.9246427 | 0.2070186 | -4.466472 | 8e-06 |
pigment
(Intercept) | -1.718059 | 0.1040514 | -16.511645 | 0 |
Xyes | -1.027077 | 0.1310526 | -7.837132 | 0 |
shape
(Intercept) | -1.3797372 | 0.1366235 | -10.0988254 | 0.0000000 |
Xovoid/coccobacillus | -0.0605245 | 0.1934197 | -0.3129179 | 0.7543430 |
Xrod | -0.6437381 | 0.1475047 | -4.3641875 | 0.0000128 |
Xspirillum/corkscrew | 0.4490007 | 0.3166017 | 1.4181879 | 0.1561359 |
aggregation
(Intercept) | -2.585585 | 0.2463655 | -10.494913 | 0.0e+00 |
Xclump | 1.261416 | 0.1627980 | 7.748351 | 0.0e+00 |
Xsingly | 1.024163 | 0.2159151 | 4.743360 | 2.1e-06 |
gram
(Intercept) | -1.3385514 | 0.1042542 | -12.839301 | 0.0000000 |
Xpositive | -0.9135018 | 0.1922746 | -4.751028 | 0.0000020 |
Xvariable | -0.7890653 | 0.3914050 | -2.015982 | 0.0438019 |
acid.phosphatase
(Intercept) | -1.157024 | 0.0990605 | -11.67997 | 0 |
Xpositive | -1.531725 | 0.1256630 | -12.18915 | 0 |
alkaline.phosphatase
(Intercept) | -1.168761 | 0.0946727 | -12.34528 | 0 |
Xpositive | -1.161450 | 0.1026653 | -11.31298 | 0 |
aesculin.hydrolysis
(Intercept) | -1.4616811 | 0.0663248 | -22.038238 | 0 |
Xpositive | -0.5509379 | 0.0866195 | -6.360435 | 0 |
alpha.galactosidase
(Intercept) | -1.343752 | 0.0759557 | -17.691267 | 0 |
Xpositive | -1.129471 | 0.1860058 | -6.072233 | 0 |
arylsulfatase
(Intercept) | -1.741263 | 0.0544349 | -31.988002 | 0.000000 |
Xpositive | 1.131218 | 0.5760798 | 1.963647 | 0.049571 |
catalase
(Intercept) | -1.388701 | 0.0641218 | -21.657239 | 0 |
Xpositive | -0.737742 | 0.0864356 | -8.535164 | 0 |
oxidase
(Intercept) | -1.2718199 | 0.0773189 | -16.44901 | 0 |
Xpositive | -0.9886098 | 0.0984742 | -10.03927 | 0 |
urease
(Intercept) | -1.7065267 | 0.0633325 | -26.945495 | 0.0000000 |
Xpositive | 0.3555861 | 0.0927610 | 3.833358 | 0.0001264 |
gelatinase
(Intercept) | -1.6021722 | 0.0609986 | -26.26573 | 0.0000000 |
Xpositive | -0.3715614 | 0.0981072 | -3.78730 | 0.0001523 |
pyrazinamidase
(Intercept) | -1.7296406 | 0.0556216 | -31.096580 | 0.0000000 |
Xpositive | 0.5430504 | 0.2599313 | 2.089207 | 0.0366891 |
tellurite.reductase
(Intercept) | -1.734239 | 0.0538544 | -32.20237 | 0.0000000 |
Xpositive | 1.252099 | 0.5678478 | 2.20499 | 0.0274548 |
H2S.production
(Intercept) | -1.7969525 | 0.0604586 | -29.722015 | 0 |
Xpositive | 0.8998944 | 0.1306732 | 6.886601 | 0 |
indole.production
(Intercept) | -1.7563308 | 0.0575271 | -30.530517 | 0.0e+00 |
Xpositive | 0.7385433 | 0.1596971 | 4.624651 | 3.8e-06 |
methane.production
(Intercept) | -1.740111 | 0.0531164 | -32.760349 | 0.000000 |
Xpositive | 1.041789 | 0.7928899 | 1.313914 | 0.188875 |
nitrate.reduction
(Intercept) | -1.5359349 | 0.0637884 | -24.078575 | 0.00e+00 |
Xpositive | -0.3281991 | 0.0809339 | -4.055148 | 5.01e-05 |