Definition

The Chomsky Normal Form (CNF) for a context-free grammar has a restricted format, in that the right-side of a production has either two variables or one terminal. CNF's restrictions result in many efficient algorithms, such as improving speed in Brute Force parsing.

How to Transform a Context-Free Grammar into CNF

We will begin by loading the grammar in the file grammarTransform.jff. Then click on the Convert menu, then click on Transform Grammar. Now, there are four steps that we have to go through in order to transform the grammar into CNF. They are removing lambda, unit, and useless productions, followed by a final step. Each step will be discussed in the following sections.

Remove Lambda Productions

After clicking Transform Grammar, JFLAP will open the tab Lambda Removal and you should able to see a window similar to this.

The first step is to identify the variables that can derive a lambda production. It is fairly obvious to see that the variable “C” can derive lambda via the production “C->lambda” as it is a lambda production. Therefore, we should add the variable “C” to the set of variables that derive lambda by clicking on any “C” production. After clicking, your window should look like this :

Now JFLAP tells us that there is one more variable that we need to identify. Notice that there is production “A->C”. Since variable “C” could derive lambda, it can be concluded that “A->C” production could lead to a lambda production. Thus, we click on this production to add “A” to the set of variables that derive lambda. Your window should now look like this :

We have successfully identified all the variables that can derive lambda. Note that a copy of the grammar appears on the right-side of the window for us to modify. Get rid of all lambda productions by clicking on them and clicking Delete. There is one. Now, we have to add new productions to make sure that our grammar still accepts/rejects the same strings as it did before the lambda removal. JFLAP notifies you to add 6 more productions. A new production(s) must be created for productions with right-hand sides that contain a variable that could derive lambda, it could have disappeared before. However, do not add any lambda productions. You can click on the production on the left side and click Complete Selected to let JFLAP add new productions related to that production. You can also manually add all the productions. For instance, since variable “C” could derive lambda, we could have derived terminal “c” from variable “C” (“C->cC” and derived C->lambda). However, since we do not have lambda any more we have to add production “C->c” into our grammar. After entering all the productions, your window should look like :

Since lambda removal is complete, we can click Proceed and move on to the next step.

Remove Unit Productions

Now it's time to remove unit productions. Your unit removal window would look like this :

There is a picture with each variable from the grammar shown as a node in a graph with no edges. We will use the graph to see relationships between variables in unit productions. For each unit production “X->Y” add a directed edge(or transition) from node X to Node Y. JFLAP notifies you that you need to add 2 more transitions to the unit production visualization. The variable “A” is directly connected to variable “C” through the production “A->C”. For the similar reason, we can connect variable “C” to “B”. After we finish adding these 2 transitions, our window should look like :

A copy of the grammar you can modify is now shown in the lower right window. First remove the unit productions by selecting them in the bottom right panel. Select the production and click DELETE. After the deletion is complete, we need to add new productions. For unit productions, both implicit and explicit (such as “A->B” from “A->C” and “C->B” as shown from the visualization), additional rules must be added to handle the missing unit productions. We have to make sure this changed grammar still accepts/rejects the same strings as it did before. For example, since we removed the production from variable “A” to variable “C”, we cannot derive terminal “c” from variable “A”. So, we have to add the production “A->c” into our grammar. JFLAP kindly notifies us how many more productions we have to add. After we finish adding all the productions, we are finished with unit removal and the window should look like this :

Now, we click on Proceed button and move on to the next step.

Remove Useless Productions

After successfully removing both lambda and unit productions, your grammar window would look like :

A production is useless if no derivation can use it. Removing useless productions is a two-step process. First, find useless variables that cannot derive any string of terminals and remove productions with those variables. Second, find variables that are unreachable from the start symbol and remove productions with those variables.

As JFLAP indicates, first we click on variables that derive terminals directly. They are variables “S”, “A”, “C”, and “F”. Also, if there are productions with only these variables on the right-side, then the variable on the left-side can also derive a string of terminals, and should be added such as “D” from “D->dF”. Click on variable “D” to add it. After clicking these variables, your window should look like :

There are two new items shown. A visualization graph and below it, a copy of the grammar is shown, with rules missing that have variables that could not derive a string of terminals, productions with the variables “B” and “E”.

The visualization has a node for each variable in the remaining grammar. This graph is different then the graph for removing unit-productions. For this graph, add a transition from variable “X” to variable “Y” if there is a production with “X” on the left-side and “Y” anywhere on the right-side.

Now, we have to add 5 transitions among these variables. From the production “S->bAC”, we know that variable “S” depends on both “A” and “C”. We also know that “A” depends on “C” based on the production “A->cC”. We will add these transitions to our state diagram, by clicking the arrow in the panel and creating transitions. Add the remaining transitions. Now your window would look like :

We now need to remove rules with those variables that are not reachable from “S”. That would be rules with “D” and “F” as those nodes are not reachable from “S” in the graph. Click on them and then click DELETE. After eliminating all of the useless productions, your window should look like :

Now we have successfully removed useless productions. By clicking on the Proceed button, we will move on to our final step.

Convert to CNF

Now, we are at the final step of converting our grammar to CNF. In CNF, the right side of a production is either one terminal or two variables. If you have done everything correctly in previous steps, your window should look like this now :

Notice that the production “S->bAC” does not follow our CNF restriction and it must be converted. You can convert this production by clicking on the production on the right panel and click Convert Selected. Now your JFLAP window will look like :

We have replaced the terminal “b” in “S->bAC” with a new variable “B(b)” and a production “B(b)->b”. The production “S->B(b)AC” now has all variables on the right-side, but has too many. Clicking Convert Selected again expands this production into 2 productions “S->B(b)D(1)” and “D(1)->AC” and another new variable “D(1)”. After following this step, your window should look like :

As JFLAP indicates, we have to convert 4 more productions. Try to select those productions, and create the missing productions.

Alternatively, click on Do All to finish the conversion. After we are finished with the conversion, the grammar will be in CNF. Our final window should look like below and we can export this grammar to utilize in fast parsing.