CSE 270 | Reading Material

6.2 Automated Unit Testing

Introduction

Developers typically write unit tests to ensure that individual units of code (such as functions, methods, or classes) work correctly in isolation. The process of writing unit tests involves several key steps:

Identify Units of Code:
- Determine the smallest testable components in your codebase. These can be individual functions, methods, or classes.
Understand Requirements:
- Have a clear understanding of the requirements and expected behavior of the unit you are testing.
Choose a Testing Framework:
- Select a testing framework that suits the programming language and technology stack you are working with. Popular testing frameworks include:
  - Python: Pytest, unittest
  - JavaScript/Node.js: Jest, Mocha, Jasmine
  - Java: JUnit, TestNG
  - C#: NUnit, xUnit
Write Test Cases:
- Create individual test cases for each unit, covering different scenarios and edge cases. Test cases typically consist of the following components:
  - Arrange: Set up the initial conditions and inputs for the test.
  - Act: Execute the unit or method under test.
  - Assert: Verify that the actual output matches the expected result.
- Unit test cases are classic examples of “white box testing” because writing them requires understanding, at a low level, what the function does and which paths through the code need to be examined.
Run Tests Locally:
- Execute the unit tests locally (on your own computer) until they pass.
- If a unit test does not pass, determine whether the test itself is incorrect or the code being tested needs to be updated.
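
The Arrange, Act, Assert structure described in the steps above can be sketched as follows. The function apply_discount and its test are hypothetical, invented only for illustration. The test uses a plain assert, so it should run under Pytest or when called directly.

```python
def apply_discount(price, percent):
    """Hypothetical unit under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (100 - percent) / 100


def test_apply_discount_reduces_price():
    # Arrange: set up the inputs and the expected result
    price = 200
    percent = 25
    expected = 150

    # Act: call the unit under test
    actual = apply_discount(price, percent)

    # Assert: compare the actual output with the expected result
    assert actual == expected
```

Keeping the three phases visually separate makes it easier to see, when a test fails, whether the setup, the call, or the expectation is at fault.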

A Simple Example in Python

In this example I have created a simple function in Python that needs to be tested. This function computes the factorial of a number.

# factorial.py
def factorial(n):
    if n < 0:
        raise ValueError
    if n == 0:
        return 1
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

The given factorial function has three main paths based on the input value n:

If n is negative, it raises a ValueError.
If n is 0, it returns 1.
For positive n, it calculates the factorial using a loop.

Let's generate test cases for each of these paths using Pytest. By default, Pytest collects tests from files named test_*.py or *_test.py, so we name our test file test_factorial.py. The test functions within that file must also be prefixed with test_ so Pytest knows to execute them as tests.

# test_factorial.py
import pytest
from factorial import factorial


def test_factorial_negative_input():
    with pytest.raises(ValueError):
        factorial(-1)


def test_factorial_zero():
    assert factorial(0) == 1


def test_factorial_positive_input():
    assert factorial(5) == 120


def test_factorial_large_input():
    assert factorial(30) == 265252859812191058636308480000000

Recall our discussion about equivalence classes from earlier units. Each of these tests exercises one particular path through the code of the function. We could write additional tests for the factorial of 3, or the factorial of 10, but these tests would simply exercise the same code path that is exercised by the test for factorial of 5. That means that all positive tests are basically in the same equivalence class. Adding tests within the same equivalence class often adds overhead to the testing process without providing additional value.

Now consider our boundary value discussion. We’ve tested negative numbers, zero and positive numbers. Is there any benefit to testing large positive or negative numbers? There might be if we are concerned about overflow conditions. A test for factorial of 30 gives us a fairly large number as output and could be considered a good boundary value test.
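
One caveat, offered as a sketch rather than as part of the original example: Python integers have arbitrary precision, so factorial(30) cannot overflow in Python itself. The overflow concern applies when the same logic is written in a language with fixed-width integers. Assuming a signed 64-bit integer type, the boundary sits between 20! and 21!, which the check below illustrates. The constant INT64_MAX and the test name are assumptions introduced here.

```python
import math

INT64_MAX = 2**63 - 1  # largest signed 64-bit integer (assumed target type)


def factorial(n):
    if n < 0:
        raise ValueError
    if n == 0:
        return 1
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


def test_factorial_int64_boundary():
    # 20! is the largest factorial that fits in a signed 64-bit integer
    assert factorial(20) <= INT64_MAX
    # 21! would overflow that type; Python still computes it exactly
    assert factorial(21) > INT64_MAX
    # Cross-check the large result against the standard library
    assert factorial(21) == math.factorial(21)
```

Choosing inputs on either side of a known limit like this is usually more informative than picking an arbitrary large number.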

Code Coverage and Unit Tests

Code coverage metrics serve as a bird's-eye view, giving you an understanding of how much of the program you have written has been covered with your tests. They tell you which code paths have been evaluated and which may still be missing or untested.

Many languages have code coverage tools to test the effectiveness of your unit tests. In Python a commonly used tool is coverage.py. Using coverage.py, you can generate a report that shows you what percentage of your code is covered by a test.

As an example, if I were to remove the test test_factorial_negative_input from the test file and run the coverage tool, I would get output similar to this:

Name                Stmts   Miss  Cover   Missing
-------------------------------------------------
factorial.py            9      1    89%   5
test_factorial.py       8      0   100%
-------------------------------------------------
TOTAL                  17      1    94%

The coverage report tells me that no test executes line 5 of my factorial.py program. This is the line that raises ValueError for negative numbers; the check itself still runs on every call, but the raise statement is never reached.
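
To make the idea of a "missed" line concrete, here is a toy sketch that records which lines of factorial run when only the zero and positive tests are exercised. It uses the standard library's sys.settrace and is not how you would use coverage.py in practice; the helper executed_lines is hypothetical, and line positions are reported as offsets from the def line rather than as file line numbers.

```python
import sys


def factorial(n):
    if n < 0:
        raise ValueError
    if n == 0:
        return 1
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


def executed_lines(func, *args):
    """Return the line offsets (relative to the def line) that ran in func."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore every other function
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    except ValueError:
        pass  # the negative-input path raises; its lines are still recorded
    finally:
        sys.settrace(previous)
    return seen


body_lines = set(range(1, 9))  # offsets of the eight statements in the body
covered = (
    executed_lines(factorial, 0)
    | executed_lines(factorial, 5)
    | executed_lines(factorial, 30)
)
missed = body_lines - covered
print(sorted(missed))  # prints [2], the offset of "raise ValueError"
```

Adding a call with a negative argument, as test_factorial_negative_input does, is what brings the missed set down to empty.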

Some organizations insist on a particular code coverage percentage before code can be promoted to production. Using a code coverage tool in this way comes with its own set of advantages and disadvantages.

Pros of Using a Code Coverage Tool

Code coverage tools highlight areas of your codebase that have not been exercised by tests. This helps in identifying gaps in test coverage.
Code coverage acts as a quality assurance metric by quantifying how much of the code is exercised by tests. Higher code coverage is often associated with more robust and reliable software.
Coverage tools integrate easily with continuous integration (CI) and continuous delivery (CD) pipelines, so code coverage can be checked automatically during the development lifecycle.

Cons of Using a Code Coverage Tool

While higher code coverage is generally desirable, it does not guarantee the quality of tests or the absence of defects. Code coverage metrics focus on the quantity of code exercised, not the effectiveness of tests.
Overreliance on code coverage metrics may give a false sense of security. Achieving high code coverage doesn't necessarily mean that all possible scenarios have been tested.
Developers might unintentionally focus on writing tests for code that is easier to test, leading to neglect of more complex or critical areas that may be harder to cover.
Code coverage tools may not cover all execution paths, especially in complex systems. Certain paths may only be triggered under specific conditions that are hard to replicate in tests.
Establishing a universal benchmark for what constitutes sufficient code coverage can be challenging. The ideal code coverage percentage may vary based on project requirements and industry standards.
The instrumentation required for code coverage might introduce a performance overhead, impacting the execution speed of the code.
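
The first two points can be illustrated with a small sketch. The function is_adult and its intended rule are hypothetical. The test below executes every statement in the function, so a coverage tool would report 100% for it, yet a defect at the boundary goes unnoticed.

```python
def is_adult(age):
    """Hypothetical function. Intended rule: adults are 18 or older."""
    return age > 18  # Bug: the comparison should be age >= 18


def test_is_adult_typical_values():
    # These calls run every statement in is_adult and both pass
    assert is_adult(30) is True
    assert is_adult(5) is False


# A boundary value check exposes the defect: the intended answer is True
print(is_adult(18))  # prints False
```

Full coverage here says only that the line ran, not that the right inputs were chosen to challenge it.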

Astute development managers will resist the urge to impose specific code coverage requirements and focus on holistic quality assurance processes within the company.

Test Driven Development

In the example above, I created a function to compute factorials, then I created tests to go along with it. There is a practice called test-driven development (TDD) that does this the other way around. Using this method, the developer first considers what tests need to pass for the function to be considered “correct”, then writes the code that makes those tests pass.

Proponents of TDD suggest that the iterative nature of TDD encourages developers to write modular, clean, and maintainable code. The focus on passing tests ensures that code meets the specified requirements. Not everyone loves this approach, though. For example, individuals accustomed to traditional development practices may resist adopting TDD. There may be skepticism about the benefits it brings.

Using the test-driven development approach, the developer would follow these steps when implementing a function or feature.

Write a test that defines a new function or an improvement to an existing one. This test is expected to fail because the functionality it describes has not been implemented yet.
Write the minimum amount of code required to make the failing test pass.
Once the test is passing, refactor the code to improve its structure, readability, or performance. This step is optional and subject to time limitations on the project.
Create another test that checks for additional functionality that hasn’t been included yet.
Update the implementation to include that functionality as before, and continue until the feature is fully developed.
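
The cycle above might look like this for a hypothetical is_leap_year function. The function, the test names, and the order of the iterations are all assumptions chosen for illustration; the file shows the state after two passes through the cycle.

```python
# Iteration 1: this test was written first and failed,
# because is_leap_year did not exist yet.
def test_year_divisible_by_4_is_leap():
    assert is_leap_year(2024) is True
    assert is_leap_year(2023) is False


# Iteration 2: added once the first test passed. It failed against the
# iteration 1 implementation, which was only: return year % 4 == 0
def test_century_rule():
    assert is_leap_year(1900) is False
    assert is_leap_year(2000) is True


# Implementation after iteration 2: just enough code to pass both tests.
def is_leap_year(year):
    if year % 400 == 0:
        return True
    if year % 100 == 0:
        return False
    return year % 4 == 0
```

Each new test pulls a little more behavior into the implementation, and the earlier tests guard against regressions while the code changes.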

Behavior Driven Development

Behavior-Driven Development (BDD) is a software development approach that encourages collaboration between different stakeholders, such as developers, testers, and non-technical individuals like product owners or business analysts. BDD aims to create a shared understanding of the software's behavior and requirements through the use of natural language specifications.

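When integrating BDD with unit tests, the primary tool often employed is a testing framework that supports BDD-style syntax. One popular choice for this is Cucumber, which allows you to write feature specifications in a natural language format called Gherkin. In Python, the behave library reads the same style of feature specification and is the library used in the step definitions later in this section.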

Here's how BDD works with unit tests:

Write Feature Specifications

In BDD, you begin by writing high-level feature specifications using natural language. These specifications describe the expected behavior of the software in a user-centric way. You will recognize the familiar given-when-then syntax of acceptance criteria.

Feature: Login Functionality

  Scenario: Successful login
    Given the user is on the login page
    When they enter valid credentials
    Then they should be redirected to the dashboard

Convert Specifications to Step Definitions

Each line in the feature specification is associated with a step definition, which is implemented in code. These step definitions translate the natural language into executable code.

from behave import given, when, then
from some_module import login_functionality


@given('the user is on the login page')
def step_given_user_on_login_page(context):
    # Implement code for navigating to the login page
    pass


@when('they enter valid credentials')
def step_when_user_enters_valid_credentials(context):
    # Store the credentials on the context so the Then step can use them
    context.credentials = {'username': 'user', 'password': 'pass'}


@then('they should be redirected to the dashboard')
def step_then_redirected_to_dashboard(context):
    assert login_functionality(context.credentials) == 'dashboard_url'

Implement Unit Tests

Behind the scenes, each step definition often corresponds to a unit test. These unit tests focus on testing individual units of code that fulfill the specified behavior. Here’s an example:

def test_successful_login():
    # Arrange: Set up test data and environment
    setup_login_page()
    valid_credentials = {'username': 'user', 'password': 'pass'}
    # Act: Perform the action being tested
    result = login_functionality(valid_credentials)
    # Assert: Verify the expected outcome
    assert result == 'dashboard_url'

Execute Tests

When you execute the tests using a BDD framework, the framework matches each line of the specification to its step definition and runs it, so the specifications become executable tests. If a test fails, the natural language specification helps in understanding which behavior failed.
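
How a specification line ends up running code can be shown with a toy runner. This is a simplified sketch and not how behave or Cucumber is implemented: real tools parse feature files, support parameters in step text, and report results per step. The names STEPS, step, and run_scenario are hypothetical, and the login logic is a placeholder.

```python
STEPS = {}  # maps step text to the function that implements it


def step(text):
    """Register a function as the implementation of one step line."""
    def decorator(func):
        STEPS[text] = func
        return func
    return decorator


@step("the user is on the login page")
def open_login_page(context):
    context["page"] = "login"


@step("they enter valid credentials")
def enter_credentials(context):
    # Placeholder standing in for a real login call
    context["page"] = "dashboard"


@step("they should be redirected to the dashboard")
def check_dashboard(context):
    assert context["page"] == "dashboard"


def run_scenario(lines):
    """Run each Given/When/Then line by looking up its registered step."""
    context = {}
    for line in lines:
        _keyword, text = line.split(" ", 1)  # drop Given/When/Then
        STEPS[text](context)
    return context


scenario = [
    "Given the user is on the login page",
    "When they enter valid credentials",
    "Then they should be redirected to the dashboard",
]
result = run_scenario(scenario)
```

If the When step were left out of the scenario, the assertion in check_dashboard would fail, and the failing line of natural language would point directly at the unmet behavior.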

Benefits of BDD

BDD promotes collaboration between team members by providing a common language for discussing and specifying requirements. Team members can iterate on the feature specifications and associated tests as the software evolves.
BDD scenarios guide the creation of unit tests, ensuring that the specified behaviors are thoroughly tested.
BDD scenarios provide traceability between high-level requirements and the corresponding code.

Challenges of BDD

Team members may need time to become accustomed to the BDD syntax and workflow.
Ensuring that feature specifications and corresponding step definitions stay in sync with the evolving code can be a challenge.
