Lewis Carroll's Doublets net of English words: network heterogeneity in a complex system.

Fushing H, Chen C, Hsieh YC, Farrell P - PLoS ONE (2014)

Bottom Line: Phonological communities are seen at the network level. And a balancing act between the language's global efficiency and redundancy is seen at the system level. Because the Doublets net is a modular complex cognitive system, the community geometry and computable multi-scale structural information may provide a foundation for understanding computational learning in many systems whose network structure has yet to be fully analyzed.


Affiliation: Department of Statistics, University of California Davis, Davis, California, United States of America.

ABSTRACT
Lewis Carroll's English word game Doublets is represented as a system of networks in which each node is an English word and each edge indicates that the two words it connects are equal in length but differ by exactly one letter. We show that this system, which we call the Doublets net, constitutes a complex body of linguistic knowledge concerning English word structure that has computable multiscale features. Distributed morphological, phonological and orthographic constraints and the language's local redundancy are seen at the node level. Phonological communities are seen at the network level. And a balancing act between the language's global efficiency and redundancy is seen at the system level. We develop a new measure of intrinsic node-to-node distance and a computational algorithm, called community geometry, which reveal the implicit multiscale structure within binary networks. Because the Doublets net is a modular complex cognitive system, the community geometry and computable multi-scale structural information may provide a foundation for understanding computational learning in many systems whose network structure has yet to be fully analyzed.
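
The construction described in the abstract can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' implementation: the function names `differs_by_one` and `build_doublets_net` and the toy word list are assumptions introduced here, and a real Doublets net would be built from a full English lexicon rather than a dozen words.

```python
from collections import defaultdict
from itertools import combinations


def differs_by_one(a: str, b: str) -> bool:
    """True when a and b have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def build_doublets_net(words):
    """Return an adjacency dict mapping each word to its one-letter neighbors."""
    vocab = {w.upper() for w in words}
    net = {w: set() for w in vocab}

    # Words of different lengths are never connected, so each length
    # forms its own network within the overall system.
    by_length = defaultdict(list)
    for w in vocab:
        by_length[len(w)].append(w)

    # Pairwise comparison is quadratic; fine for a toy list only.
    for group in by_length.values():
        for a, b in combinations(group, 2):
            if differs_by_one(a, b):
                net[a].add(b)
                net[b].add(a)
    return net


# Hypothetical toy vocabulary, chosen only for illustration.
toy_words = ["MARE", "CARE", "MORE", "MALE", "MARK", "CORE",
             "HEAD", "HEAL", "TEAL", "TELL", "TALL", "TAIL", "CAT"]
net = build_doublets_net(toy_words)
print(sorted(net["MARE"]))   # one-letter neighbors of MARE in the toy list
print(len(net["CAT"]))       # the only 3-letter word here has no neighbors
```

The degree of a node is simply `len(net[word])`. For a realistic lexicon, the pairwise loop would likely be replaced by indexing words on wildcard patterns such as `?ARE`, but that is a design choice outside what the abstract specifies.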

Figure 1 (pone-0114177-g001): Local to global views of the 4-letter word network. Typical patterns of network connectivity from the view of a node to its immediate neighbors (a), the view to its two-step neighbors (b), and the overall view of the whole network (c).

To better see these features and their implications, we can zoom in on the network of all 4-letter words and look at the connectivity of one node, say “MARE”, as shown in Fig. 1(a). There are 25 edges in all connecting to the node “MARE”, so we say that the degree of this node is 25. The 25 immediate connections from “MARE” divide exactly into four small groups, each defined by one of the four letter slots in the word. That is, only 8 of the 25 ( = 26 − 1) possible choices of English letters can substitute for “M” in the first slot of “?-A-R-E”. Likewise, there are 3 for “M-?-R-E”, 7 for “M-A-?-E” and 7 for “M-A-R-?”. This kind of limitation on letter combinations illustrates the third distinctive feature of the system, which is especially important. The word “MARE” in fact has the largest degree among all 4-letter words. Yet it reaches only one quarter of its capacity of 100 ( = 4 × 25). Hence, this network is unlikely to have the so-called scale-free degree distribution, which would prescribe a power law of $k^{-\gamma}$ for its heavy tail, indexed by degree $k$ and a positive constant $\gamma$ (see [16], [17]). On the other hand, 75 of the 100 possible single-letter changes are impermissible due to linguistic constraints that increase redundancy or efficiency [18], [15]. For instance, “M-L-R-E” violates a letter-combinatoric (orthographic) constraint specific to English that reflects a set of sound-combinatoric (phonological) constraints that is only partially specific to English. The bilabial nasal consonant phoneme, for which the International Phonetic Alphabet symbol is /m/ and which is usually indicated by the letter “M” in the English spelling system, can only be followed, when word-initial, by a vowel phoneme. There is, in effect, a systematic phonological and orthographic conspiracy against not only “M-L-R-E”, but also “M-T-R-E”, “M-S-R-E”, “M-N-R-E”, and so forth.
The main components of this conspiracy are: an English-specific constraint with the effect of forcing the nasal phoneme /m/ to be a consonantal part of the onset of a syllable when it is word-initial; universal constraints on consonant-vowel (C-V) structure that disallow most syllable onsets of the form C-C-C-, including /m-C-r-/ [19], [20]; and an English-specific orthographic constraint limiting the set of single-character representations of vowel phonemes to the letters “A”, “E”, “I”, “O”, “U” and “Y”, even though the language actually has upwards of thirteen vowel phonemes in all, with the precise number depending on the dialect and exactly how the count is made.
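
The slot-by-slot counting used for “MARE” can be made concrete with a short Python sketch. The helper names `slot_degrees` and `capacity` and the toy vocabulary are assumptions made for illustration; the per-slot counts 8, 3, 7 and 7 quoted above depend on the authors' word list, which is not reproduced here, so the sketch only restates them as given numbers and checks the arithmetic.

```python
from string import ascii_uppercase


def slot_degrees(word, vocabulary):
    """Count, for each letter slot, how many single-letter substitutions yield a word."""
    word = word.upper()
    counts = []
    for i, current in enumerate(word):
        n = 0
        for letter in ascii_uppercase:
            candidate = word[:i] + letter + word[i + 1:]
            if letter != current and candidate in vocabulary:
                n += 1
        counts.append(n)
    return counts


def capacity(word):
    """Maximum possible degree: 25 alternative letters in each slot."""
    return len(word) * (len(ascii_uppercase) - 1)


# Hypothetical toy vocabulary; a real lexicon would give larger counts.
toy_vocab = {"MARE", "BARE", "CARE", "DARE", "MORE", "MERE",
             "MALE", "MAKE", "MARK", "MARS", "TIRE"}
toy_counts = slot_degrees("MARE", toy_vocab)
print(toy_counts, sum(toy_counts), capacity("MARE"))

# Per-slot counts for "MARE" as quoted in the text (not recomputed here).
paper_counts = [8, 3, 7, 7]
paper_degree = sum(paper_counts)                  # 25
paper_fraction = paper_degree / capacity("MARE")  # 25 / 100
print(paper_degree, paper_fraction)
```

With the quoted counts, the degree is 25 out of a capacity of 100, so the remaining 75 single-letter substitutions are the ones ruled out by the constraints discussed above.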

