A Basic Guide to WordBuilder

By CaesarVincens

The first thing to do when creating a script is to define tokens. Type “Tokens”, then the name of the set, then the members of the set, each separated by a space. For now, make three token sets. Name the first “Onsets” and add any consonants that can start a word; if a group of consonants can begin a word, add each group as well. Name the second “Vowels” and add all your vowels and diphthongs. Name the last “Codas” and add any consonants (or groups) that can end a word; if you have none, leave this set out. Later we may create special sets, but these general ones will do for now. We can also make certain phonemes be chosen more often, but more on that later. You should now have something like this (though probably with different letters).

Tokens Onsets p t k b d g s x m n l r w j pr tr kr br dr gr
Tokens Vowels i e a o u ï ë ä ö ü
Tokens Codas p t k b d g s x m n l mb nd ng

Next we make the starting rules. These allow you to generate words in prescribed amounts, with a default amount of 100. Let’s make two to start with and call them “noun” and “verb”.

StartingRule noun
StartingRule verb

Now we need to define those rules. Rules take the form “Rule [rule name] [weight] {”. The name is used to call the rule from other rules and functions. The weight determines the chance of this particular definition being used instead of another rule with the same name. If no weight is given, the weight is set to 1. Finally, the opening brace begins the commands of the rule.

After we have defined our rules “noun” and “verb” we can set their commands, what they will do. Let’s look at an example of the rule “noun”.

Rule noun {
  Rule Syllable
  Random {
    1 Rule Animate
    1 Rule Inanimate
  }
}

The first line we have already discussed, so we will skip that. As you can see from this example, rules can contain other rules inside of them, even themselves. The third line has the command “Random”. Random picks one command from a weighted list. In this example each of the rules has an equal chance. If we changed one of the lines to read “2 Rule Animate”, for example, that rule would be chosen twice as often as rules weighted at 1. Go ahead and create a similar rule with the rule “Syllable” and the command “Random” with two or three rules in its list.
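As a sketch of the weighting just described, here is the same rule with “Animate” weighted at 2. Only the weight has changed; with these values “Animate” should be picked about two times out of three.

```
Rule noun {
  Rule Syllable
  Random {
    2 Rule Animate
    1 Rule Inanimate
  }
}
```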

Let’s take a look at the rule “Syllable” now.

Rule Syllable {
  Loop 8[1] 6[2] 4[3] 1[4] {
    Loop 4[0] 3[1] {
      Token Onsets
    }
    Token Vowels
    Loop 2[0] 1[1] {
      Token Codas
    }
  }
}

The first command in this rule creates a loop, which repeats a specified number of times. If there is only one number, the loop repeats that many times; if there is more than one, one is chosen at random, and again we can weight the numbers. This makes the “Loop” command very powerful. Let’s take a closer look at that line.

Loop 8[1] 6[2] 4[3] 1[4] {

There are four possibilities for this loop: it runs once, twice, three times, or four times. Each possibility has been weighted, so the loop is eight times as likely to run once as to run four times. There are two ways to weight a value: either repeat the value as many times as its weight, so we could write eight 1’s in the example, or put the value in brackets with a multiplier immediately in front of the opening bracket. So we could write the previous example as follows, with the same results.

Loop 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 4 {

For weights of two or three, repeating the value is fine, but for more I recommend the brackets and multiplier. Begin the rule “Syllable” and add the first loop. Don’t use 0; use 1 for one-syllable words, 2 for two-syllable words, and so on. For now, make three or four possibilities and give them weights if you want.
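To illustrate the single-number case, here is a minimal sketch of a fixed loop. The rule name “TwoSyllables” is made up for this example, and it reuses the Token commands seen in “Syllable” above. It should always produce exactly two onset-plus-vowel syllables, with no randomness in the syllable count.

```
Rule TwoSyllables {
  Loop 2 {
    Token Onsets
    Token Vowels
  }
}
```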

We can weight the tokens we defined earlier in the same way. Going back to the tokens, let’s refine the chance of different sets.

Tokens Onsets 6[p t k] 5[b d g] 4[s x] 3[m n] 3[l r] 2[w j] pr tr kr br dr gr
Tokens Vowels 2[i e a o u] ï ë ä ö ü
Tokens Codas 6[p t k] 5[b d g] 4[s x] 3[m n] 3[l] mb nd ng

So now in my set the plain vowels, for example, are twice as likely to be used as the umlauted vowels. Notice also that I don’t have to set the chance for each letter individually; I can set it for a whole group. Go ahead and do that for your tokens. You can leave some letters outside of brackets if you like.

Going back to the rule “Syllable” let’s look at the next section.

Rule Syllable {
  Loop 8[1] 6[2] 4[3] 1[4] {
    Loop 4[0] 3[1] {
      Token Onsets
    }
    Token Vowels
    Loop 2[0] 1[1] {
      Token Codas
    }
  }
}

Each loop repeats all the commands within its braces a set number of times, so the second loop can run either 0 times or once. That means we may or may not get an onset, with no onset being slightly more likely (weights of 4 to 3). The command “Token” chooses a random token from the named token set and adds it to the end of the current word; the command “Prefix” does the same but adds the token to the beginning of the word. Again, the weights of the items in the set affect the chance of each being chosen. After that loop closes, we have a Token command outside the inner loops, which means we will always get a vowel in this example. Finally, the last loop controls the chance of a coda; here no coda is twice as likely as a coda.

Taken all together, the rule “Syllable” will produce a word of one, two, three, or four syllables. Each syllable will contain a vowel and may also have an onset or a coda. Just by changing the weights and token sets used in this rule, we can create almost any word.
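The “Prefix” command mentioned above is not used in “Syllable”, so here is a small sketch of it. The rule name “Reversed” is hypothetical, and I am assuming “Prefix” takes a token set name exactly as “Token” does. The vowel is added first, then an onset is placed in front of it, which should give the same onset-plus-vowel shape built from the other direction.

```
Rule Reversed {
  Token Vowels
  Prefix Onsets
}
```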

Now that we have defined a rule to create a word let’s see what else we can do. Returning to our rule “noun”, we see that there were two other rules that might happen.

Rule noun {
  Rule Syllable
  Random {
    1 Rule Animate
    1 Rule Inanimate
  }
}

Remember that the command “Random” chooses a command at random from a weighted list. Another way to get the same effect is to create two rules with the same name, say “Gender”; when “Gender” is called, one of them is chosen according to their weights. The rule “noun” would then look like this:

Rule noun {
  Rule Syllable
  Rule Gender
}

There are advantages to each method. If we make two rules with different names, we will have this:

Rule Animate {
  Mark Noun Animate
  Branch Pl Pl
}
Rule Inanimate {
  Mark Noun Inanimate
  Branch Pl Pl
}

Or if we use only one name:

Rule Gender {
  Mark Noun Animate
  Branch Pl Pl
}
Rule Gender {
  Mark Noun Inanimate
  Branch Pl Pl
}
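Same-named rules can also carry weights, using the “Rule [rule name] [weight] {” form described earlier. As a sketch, the following should make animate nouns about three times as common as inanimate ones; the weights 3 and 1 are arbitrary choices for illustration.

```
Rule Gender 3 {
  Mark Noun Animate
  Branch Pl Pl
}
Rule Gender 1 {
  Mark Noun Inanimate
  Branch Pl Pl
}
```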

Whichever version you use, these rules introduce two new commands: Mark and Branch. Mark sets up a classification for your words: the first word after “Mark” sets the type and the second sets the value, and both can be anything you want. If we wanted, we could write “Mark Pineapple Blue”. Branch is an excellent command for morphology. It creates a dependent form of your word; the first word after “Branch” names the form, and the second names the rule used to create it. So now we need the rule “Pl”. This can be simple.

Rule Pl {
  Literal s
}

Literal adds the defined string to the end of the current word or branch. It has a counterpart “Prelit” to add a string to the beginning of a word. This is useful for case inflections or derivations or similar morphology.
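As a sketch of “Prelit”, here is an alternative “Pl” that marks the plural with a prefix instead of a suffix. The string “i” is an arbitrary choice, and I am assuming “Prelit” takes its string the same way “Literal” does. Use this in place of the “Pl” rule above rather than alongside it; otherwise the two definitions would share a name and one would be picked at random.

```
Rule Pl {
  Prelit i
}
```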

At this point, we should be able to create nouns, but we also have a starting rule for verbs.

Rule verb {
  Rule Syllable
  Rule Verb-End
  Random {
    1 Mark Verb Stative
    1 Mark Verb Active
    1 Mark Verb Causative
  }
}

For this rule, we put the Mark command directly onto the main rule instead of in a later rule, but we also added a new rule, Verb-End. This rule will make verb stems into regular infinitives instead of ending in any random letter.

Rule Verb-End {
  Token Vowels
  Token VerbEnds
}

This is a simple rule; we just add two letters, with no loops or random commands. I add the vowel so that the word doesn’t end in two consonants. But we haven’t defined the token set “VerbEnds” yet, so let’s do that now. We need to go back to the token sets we defined earlier, because token sets must all come before the starting rules.

Now we add the following (you can use different letters if you want).

Tokens VerbEnds t k l

If we wanted the verbs to all end in vowels, or only in certain vowels plus certain consonants, we could have created another token set for that as well.
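For instance, here is a sketch of infinitives that always end in one of a few vowels. The set name “VerbVowels” and its members are made up for this example. The set goes with the other token sets, before the starting rules, and this “Verb-End” would replace the one above.

```
Tokens VerbVowels a e i

Rule Verb-End {
  Token VerbVowels
}
```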

Now that you have all the rules and token sets, you should have something like the following:

Tokens Onsets 6[p t k] 5[b d g] 4[s x] 3[m n] 3[l r] 2[w j] pr tr kr br dr gr
Tokens Vowels 2[i e a o u] 1[ï ë ä ö ü]
Tokens Codas 6[p t k] 5[b d g] 4[s x] 3[m n] 3[l] mb nd ng
Tokens VerbEnds t k l
StartingRule noun
StartingRule verb
Rule noun {
  Rule Syllable
  Random {
    1 Rule Animate
    1 Rule Inanimate
  }
}
Rule Syllable {
  Loop 8[1] 6[2] 4[3] 1[4] {
    Loop 4[0] 3[1] {
      Token Onsets
    }
    Token Vowels
    Loop 2[0] 1[1] {
      Token Codas
    }
  }
}
Rule Animate {
  Mark Noun Animate
  Branch Pl Pl
}
Rule Inanimate {
  Mark Noun Inanimate
  Branch Pl Pl
}
Rule Pl {
  Literal s
}
Rule verb {
  Rule Syllable
  Rule Verb-End
  Random {
    1 Mark Verb Stative
    1 Mark Verb Active
    1 Mark Verb Causative
  }
}
Rule Verb-End {
  Token Vowels
  Token VerbEnds
}

Try generating some words and changing weight values. You can find more commands and other hints here.

And this concludes the tutorial.
