CSE 212: Programming with Data Structures

Learning Activity: Understanding Code Using Reviews

Overview

We have frequently been asked to write code for either school or work. However, how often have we been asked to complete the companion activity of reading code? We have likely looked at code from websites and books, but there is a significant difference between looking at code and reading code. When we read code, we are attempting to understand the code like we would a book. If we open a book and read a few random pages, we might get a high-level summary of what the book is about. However, to fully understand the book, we would need not only to read it cover to cover, but also to become personally acquainted with the diverse set of characters, follow a potentially winding plot, and discover the underlying messages the author has woven into the story. This type of reading takes effort. Reading code for understanding takes an equal amount of effort.

Preparation Material

There are multiple reasons why we might be asked to read code in our teams:

When we read code others have written, we refer to this activity as a review. A review should follow a methodical process. Many companies provide a review checklist that engineers use to ensure both that they understand the code and that they have checked it against all the coding standards. When you review code, there are several strategies that you can use, including the following:

  • Read the code "cover to cover"
  • "Execute" the code manually
  • Analyze the use of data structures

Each of these methods is discussed below. When working with them, we should try to apply principles from the scientific method, which requires us to form a hypothesis about what we think the code is doing. As we read through the code, we test our hypothesis and look at the results. If we are incorrect, we correct our misconceptions and form a new hypothesis. This iterative process is necessary to fully understand code. There are no shortcuts. We can be tempted to search for the answer online or in a book. While this may produce a faster answer, we will not have obtained a full understanding of the code.

In this process, it is possible that we might find defects in the code that we were given. Code reviews are a common tool for increasing software quality. Performing these steps will require us to keep a good notebook to record our hypotheses, experiments, and conclusions. A good reviewer will always have a pen and paper ready to complete their task.

Read Code "Cover to Cover"

Unlike a book, code does not begin on page one. We need to find where the code begins and follow it as it calls functions, runs loops, and branches in different directions with decision statements. If the code has multiple functions, the creation of a structure chart (also called a calling tree) will be helpful. The structure chart will use boxes to represent functions and arrows to represent functions calling functions. Frequently drawn with the starting function at the top and working downwards, these diagrams can help us navigate through the code. On the arrows, we may frequently write the inputs and outputs related to each function. This will help us better understand the data in the code and which functions are responsible for creating, modifying, and using that data.

Shows a structure chart where the main function calls the readAccessRecords function and the searchAccessRecords function. The readAccessRecords function receives an accessFile input and produces accessRecords and recordCount as outputs. The searchAccessRecords function receives the accessRecords, numRecords, startTime, and endTime while producing no output. The searchAccessRecords function also calls the displayAccessRecord function, providing accessRecord as an input with no outputs.
Structure Chart
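
To connect the chart back to code, the following is a minimal C# sketch of a program that would produce a structure chart like this one. The function names come from the chart (written in C# casing). Everything else is an assumption made only for illustration, because the original program is not shown: the AccessRecord type and its fields, the parameter types, the file name, and the hard-coded sample records.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record type; the chart does not say what a record holds.
public record AccessRecord(string User, int Time);

public static class AccessLog
{
    // Arrow main -> ReadAccessRecords: input accessFile, outputs accessRecords and recordCount.
    public static (List<AccessRecord> accessRecords, int recordCount) ReadAccessRecords(string accessFile)
    {
        // Assumption: a real version would parse accessFile; fixed sample data is returned instead.
        var accessRecords = new List<AccessRecord>
        {
            new AccessRecord("ana", 5),
            new AccessRecord("ben", 12),
            new AccessRecord("cam", 20),
        };
        return (accessRecords, accessRecords.Count);
    }

    // Arrow main -> SearchAccessRecords: four inputs, no output.
    public static void SearchAccessRecords(List<AccessRecord> accessRecords, int numRecords, int startTime, int endTime)
    {
        for (var i = 0; i < numRecords; i++)
        {
            if (accessRecords[i].Time >= startTime && accessRecords[i].Time <= endTime)
            {
                // Arrow SearchAccessRecords -> DisplayAccessRecord: input accessRecord, no output.
                DisplayAccessRecord(accessRecords[i]);
            }
        }
    }

    public static void DisplayAccessRecord(AccessRecord accessRecord)
    {
        Console.WriteLine($"{accessRecord.User} at {accessRecord.Time}");
    }

    // The top box of the chart: main calls the two functions directly below it.
    public static void Main()
    {
        var (accessRecords, recordCount) = ReadAccessRecords("access.log");
        SearchAccessRecords(accessRecords, recordCount, 10, 25);
    }
}
```

When reviewing, we would work in the opposite direction: read code like this and draw one box per function and one arrow per call, labeling each arrow with the parameters and return values.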

If the code contains classes, then creating a UML (Unified Modeling Language) class diagram that shows the classes (with their member data and member functions) and the relationships between them will help us visualize the software and decide the order in which to read the code based on its dependencies. For example, in the diagram below, we can see that the Order HAS-A list of Products (object composition, shown with the filled-in diamond) and that PerishableProduct and ElectronicProduct each IS-A Product (inheritance, shown with the open triangle). With this drawn, we would probably start by understanding the Product base class, since it has no dependency on other objects. Second, we would review the different types of Products. Finally, we would review the Order class, which contains the list of Product objects that we already understand.

Shows the classes and relationships described in the text above. The Order class has an integer order_id and a list of Product objects called products. The Order class has an init function, an add_product function that takes a product object as an input, and a display function. The Product class has a string name and a float price with an init function and a display function. The ElectronicProduct has a string url_download and both an init and download function. The PerishableProduct has an expiry datetime object and both an init and donate function.
UML Class Diagram
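
As a rough illustration of how a class diagram maps to code, here is a C# sketch of the four classes in the diagram. The member names follow the diagram (converted from its snake_case names to C# casing). The method bodies, the choice to return strings instead of printing, and the sample values in Main are assumptions, since the diagram gives only names and types.

```csharp
using System;
using System.Collections.Generic;

// Base class: it depends on no other class, so we would read it first.
public class Product
{
    public string Name { get; }
    public double Price { get; }

    public Product(string name, double price)
    {
        Name = name;
        Price = price;
    }

    // Assumption: returns text rather than printing it.
    public string Display() => $"{Name}: {Price}";
}

// Inheritance (open triangle): an ElectronicProduct IS-A Product.
public class ElectronicProduct : Product
{
    public string UrlDownload { get; }

    public ElectronicProduct(string name, double price, string urlDownload) : base(name, price)
    {
        UrlDownload = urlDownload;
    }

    // Assumption: the diagram names this function but not its behavior.
    public string Download() => $"Downloading {Name} from {UrlDownload}";
}

// Inheritance (open triangle): a PerishableProduct IS-A Product.
public class PerishableProduct : Product
{
    public DateTime Expiry { get; }

    public PerishableProduct(string name, double price, DateTime expiry) : base(name, price)
    {
        Expiry = expiry;
    }

    // Assumption: the diagram names this function but not its behavior.
    public string Donate() => $"Donating {Name} before {Expiry:yyyy-MM-dd}";
}

// Composition (filled-in diamond): an Order HAS-A list of Products.
public class Order
{
    private readonly List<Product> _products = new List<Product>();

    public int OrderId { get; }
    public IReadOnlyList<Product> Products => _products;

    public Order(int orderId)
    {
        OrderId = orderId;
    }

    public void AddProduct(Product product) => _products.Add(product);

    public string Display()
    {
        var lines = new List<string> { $"Order {OrderId}" };
        foreach (var product in _products)
        {
            lines.Add(product.Display());
        }
        return string.Join(Environment.NewLine, lines);
    }
}

public static class OrderDemo
{
    public static void Main()
    {
        var order = new Order(1);
        order.AddProduct(new ElectronicProduct("E-book", 9.99, "https://example.com/ebook"));
        order.AddProduct(new PerishableProduct("Milk", 2.50, new DateTime(2030, 1, 1)));
        Console.WriteLine(order.Display());
    }
}
```

The reading order suggested by the diagram matches the order of the classes here: Product first, then its two derived classes, then Order.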

When we are looking at a single function, it can be useful to diagram the behavior of the function. Simple flow charts can quickly give us a better perspective of the loops and decisions in our code. In the flow chart, use diamonds to represent decisions, boxes to represent actions, and arrows to show flow.

Shows the flow of a random number guessing game. After getting the seed from the user and seeding the generator, a random number is selected and the guess counter is reset. A loop begins with the user being prompted for a guess. The guess counter is updated and then the guess is compared with the actual number. Either the phrase higher, lower, or congratulations is displayed. If the correct answer was guessed, the guess count is also displayed. The user is then prompted to play the game again. If they want to play again, the process above repeats, starting with the selection of a random number and the resetting of the guess counter. Otherwise, the software says goodbye and the program ends.
Flow Chart
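
To show how the boxes and diamonds correspond to statements, here is one possible C# sketch of the guessing game in the flow chart. Several details are assumptions because the chart does not specify them: the 1 to 100 range, the exact prompt wording, the meaning of "Higher" (the secret number is higher than the guess), and the use of a reader and writer so the function can be exercised without a keyboard.

```csharp
using System;
using System.IO;

public static class GuessingGame
{
    // Reads one line as an integer; stops with an exception if the input runs out.
    private static int ReadInt(TextReader input)
    {
        var line = input.ReadLine() ?? throw new EndOfStreamException("No more input.");
        return int.Parse(line);
    }

    public static void Run(TextReader input, TextWriter output)
    {
        output.Write("Enter a seed: ");
        var random = new Random(ReadInt(input));      // box: seed the generator

        var playAgain = true;
        while (playAgain)                             // diamond: play again?
        {
            var number = random.Next(1, 101);         // box: select a random number (assumed 1..100)
            var guessCount = 0;                       // box: reset the guess counter
            var guessed = false;

            while (!guessed)                          // diamond: was the guess correct?
            {
                output.Write("Guess: ");
                var guess = ReadInt(input);           // box: prompt for a guess
                guessCount++;                         // box: update the guess counter

                if (guess < number)
                {
                    output.WriteLine("Higher");
                }
                else if (guess > number)
                {
                    output.WriteLine("Lower");
                }
                else
                {
                    output.WriteLine("Congratulations");
                    output.WriteLine($"Guesses: {guessCount}");
                    guessed = true;
                }
            }

            output.Write("Play again (yes/no)? ");
            var answer = input.ReadLine() ?? "no";
            playAgain = answer.Trim().ToLowerInvariant() == "yes";
        }

        output.WriteLine("Goodbye");                  // box: say goodbye
    }

    public static void Main() => Run(Console.In, Console.Out);
}
```

Each while loop corresponds to an arrow in the chart that points back to an earlier step, and each if corresponds to a diamond.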

"Execute" the Code Manually

It is not always practical to run code that we are given. If we can, then running the code with inputs that we generate can be helpful to understand the software. However, the goal in this process is to understand the software without running the code on the computer. Instead, we are going to run the software in our minds and on paper.

If we created diagrams in the previous step from our reading of the code "cover to cover," then we can run the code from the diagrams. This approach is incomplete, but it is very helpful for gaining more understanding of what the software will do.

To execute code manually in our minds and on paper, we must start at the beginning (or if we are looking at one piece of the software, perhaps start at the beginning of one of the functions). If inputs are provided at the beginning (or at any other place along the way), we will have to develop useful inputs to see what will happen. For example, consider the following code:

public static string DoSomething(string text) {
    var newText = "";
    foreach (var letter in text) {
        if (letter != ' ') {
            var newLetter = (char)(letter + 1);
            newText += newLetter;
        }
        else {
            newText += letter;
        }
    }

    return newText;
}

The method needs a string as input, so we will propose something like "Hello". We then step through each line of code and record the value of each variable in our notebook. If we come across a language feature that we don't understand (e.g. the (char) type cast), we can go online to read about it. Each char is stored as a numeric character code; for the letters used here, that code matches the ASCII value. As soon as we add 1 to a char (letter + 1), the result is an integer, so the (char) cast tells C# to treat that number as a char again. When we finish evaluating the code, we get Ifmmp, which appears to be a simple childhood cipher in which each letter of the original text is replaced by the next letter in the alphabet. While stepping through the code, we noticed that it checks whether each letter is a space, so it would be good to repeat the test with an input that contains a space. If we try "Hello World", we end up with Ifmmp Xpsme. Not only does the function perform the encoding, but it also preserves the spaces.

text   letter  letter + 1  newLetter  newText
Hello  H       73          I          I
Hello  e       102         f          If
Hello  l       109         m          Ifm
Hello  l       109         m          Ifmm
Hello  o       112         p          Ifmmp

This type of simple analysis, evaluating the variables step by step, is often a valuable way to understand what the code is doing.
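
The notebook trace should come first, but when running the code is practical, a hand trace can be checked afterward. The sketch below is one way to do that: it repeats the logic of DoSomething and records one row per non-space letter, with the columns letter, letter + 1, newLetter, and newText. The names TraceDemo, Trace, and shifted are hypothetical and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class TraceDemo
{
    // Same logic as DoSomething, but each non-space step is also recorded as a row.
    public static (string result, List<string> rows) Trace(string text)
    {
        var rows = new List<string>();
        var newText = "";
        foreach (var letter in text)
        {
            if (letter != ' ')
            {
                int shifted = letter + 1;        // char + int produces an int
                var newLetter = (char)shifted;   // the cast turns the number back into a char
                newText += newLetter;
                rows.Add($"{letter} {shifted} {newLetter} {newText}");
            }
            else
            {
                newText += letter;               // spaces are copied unchanged
            }
        }
        return (newText, rows);
    }

    public static void Main()
    {
        var (result, rows) = Trace("Hello");
        Console.WriteLine("letter letter+1 newLetter newText");
        foreach (var row in rows)
        {
            Console.WriteLine(row);              // first row: H 73 I I
        }
        Console.WriteLine(result);               // Ifmmp
    }
}
```

If the printed rows disagree with the notebook, the disagreement points to the exact step where our hypothesis about the code was wrong.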

Analyze the Use of Data Structures

When code contains a data structure like a list or a stack (or one of the others that we will learn about during the course), we should consider why that data structure was used. Data structures are used both to store information and to use that information in different ways.

For example, in this week's material you are also learning about stacks and queues. A stack can be used to remember where we have been so that we can retrace our steps in reverse order. With this in mind, when you see a stack in a program, you can form a hypothesis about what the code is doing. During the activities this week, you will be reading code that uses stacks and queues. Knowing the strengths, weaknesses, and common uses of these data structures will help you better understand the purpose and behavior of programs that contain them.
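
As a small illustration of the "remember where we have been" idea, the following C# sketch pushes a series of visited pages onto a Stack<string> and then pops them to walk back in reverse order. The page names and the BackTracking and Retrace names are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class BackTracking
{
    // Records each visited page on a stack, then retraces the path backwards.
    public static List<string> Retrace(IEnumerable<string> visited)
    {
        var history = new Stack<string>();
        foreach (var page in visited)
        {
            history.Push(page);              // remember where we have been
        }

        var wayBack = new List<string>();
        while (history.Count > 0)
        {
            wayBack.Add(history.Pop());      // the most recent page comes off first
        }
        return wayBack;
    }

    public static void Main()
    {
        var wayBack = Retrace(new[] { "home", "courses", "cse212" });
        Console.WriteLine(string.Join(", ", wayBack));   // cse212, courses, home
    }
}
```

Seeing Push and Pop used this way in unfamiliar code is a strong hint that the program needs to undo or revisit earlier steps, which gives us a hypothesis to test as we keep reading.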

Activity Instructions

Complete each of the following:

  1. Review the following code for a Mystery Program.
  2. While reviewing this code:
    • Diagram the function calls that are made.
    • Take notes on the variables and their use.
  3. Without running the code, predict the output. Even if there are a few function calls that you are not familiar with, try to come up with a written description of what this program does.
  4. Try to predict what the rest of the program would do if the integer array d had the values [2, 2, 3, 3, 3].
  5. When you have finished reviewing the code, compare your findings with the following solution:
Solution

This program rolls a set of 5 standard six-sided dice. It then calculates and displays a score based on the number of pairs, triples, and so forth.

If the list of values were 2, 2, 3, 3, 3, the program would display a score of 30.

Key Terms

flow chart
A diagram that models the behavior of a program, algorithm, or function. Actions are shown in boxes, decisions are shown in diamonds, and arrows are used to show execution flow.
review
A formal process of ensuring code is written correctly. Code is usually reviewed against the design and coding standards. Frequently, checklists are used to help the reviewer.
structure chart
A diagram showing which functions call which functions. Frequently, the arrows used to show function calls also include parameters that are passed between the functions.
UML
Unified Modeling Language. A formal modeling language used to represent object-oriented designs. UML includes many types of diagrams, including class diagrams, activity diagrams, and state diagrams.

Submission

When you have completed all of the learning activities for this week, you will return to I-Learn and submit the associated quiz there.
