W05 Learning Activity: Release and Maintenance
Overview
When a new product or feature is completed, it needs to be released to the customer. When something breaks, it needs to be fixed. These situations require us to carefully consider how we update and maintain our software systems.
By the end of this lesson you will understand how to work through the complexities of releasing software updates and how to keep a software system running well.
Key Points to Remember:
- Once a piece of software is deployed, it cannot be forgotten. It requires proper care and feeding.
- Preventive maintenance is critical for preserving the flexibility to make changes in the future.
- Technical debt, the cost associated with making choices to get something done quickly, is a natural part of the software development process. You need to be aware of it and have a plan for it, or you will live with those choices for a very long time.
Preparation Material
Maintenance is a critical part of the Software Development Lifecycle. It involves making changes to the software system to improve its performance, adapt it to new environments, fix bugs, and enhance its functionality.
Types of Software Maintenance
There are four main types of maintenance in a software project:
- Corrective Maintenance: This type of maintenance involves fixing defects or bugs identified in the software after it has been deployed. Corrective maintenance aims to restore the software to its proper functioning state. It typically involves identifying the root cause of the issue, developing a fix, testing it, and deploying the patch. Corrective maintenance is crucial for ensuring the reliability and stability of the software system.
- Adaptive Maintenance: Adaptive maintenance involves making changes to the software to adapt it to new environments or requirements. This could include updating the software to be compatible with new operating systems, hardware platforms, or third-party software components. Adaptive maintenance ensures that the software remains functional and effective in evolving technological landscapes. It may also involve modifying the software to comply with new regulations or industry standards.
- Perfective Maintenance: Perfective maintenance involves enhancing the functionality of the software to meet changing user needs or improve its performance. This could include adding new features, optimizing existing algorithms, improving user interfaces, or enhancing system security. The goal of perfective maintenance is to make the software more efficient, user-friendly, and competitive in the marketplace. It often involves gathering feedback from users and stakeholders to prioritize enhancements that provide the most value.
- Preventive Maintenance: Preventive maintenance aims to proactively identify and address potential issues before they occur. This could involve refactoring code to improve its maintainability, performing regular system audits to identify vulnerabilities, or implementing monitoring tools to detect performance bottlenecks. It can also include updating third-party libraries and other dependencies. Preventive maintenance helps prevent costly downtime, security breaches, and performance degradation by addressing issues early on. It also contributes to the long-term stability and sustainability of the software system.
Technical Debt
A major part of both perfective and preventive maintenance is removing technical debt.
Technical debt refers to the implied cost or consequences that arise from choosing an easy or quick solution over a more robust or sustainable one when developing software. The term debt is meant to convey the consequences of shortcuts, trade-offs, or compromises made during the development process.
Shortcuts and Quick Fixes
Technical debt often occurs when developers opt for quick fixes or shortcuts to meet deadlines or deliver features faster. These shortcuts may involve using inefficient code, bypassing proper documentation, or neglecting testing procedures. This is a natural process as you try to find a way to get something to work. The trouble is that once the task is complete, it can be difficult to make time to circle back and fill in the missing pieces.
Accumulation Over Time
Like financial debt, technical debt accumulates interest over time. Every shortcut or quick fix adds complexity and decreases the overall quality of the software. As more features are added and the codebase grows, technical debt can become a significant burden that slows down development and increases the risk of errors and bugs.
Types of Technical Debt
There are several types of technical debt, including:
- Design Debt: Arises from poor architectural decisions or inadequate system design.
- Code Debt: Occurs due to writing code that is hard to understand, maintain, or extend.
- Testing Debt: Results from inadequate test coverage or skipping proper testing procedures.
- Documentation Debt: Arises when documentation is incomplete, outdated, or non-existent.
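As a small, hypothetical illustration of code debt, the sketch below contrasts a quick fix with a clearer refactor of the same calculation. The function names, record fields, and tax rate are all invented for the example:

```python
# Hypothetical example of "code debt": both functions compute an order total,
# but the quick fix hides its intent and is hard to extend safely.

# Quick fix: magic number and cryptic indexing, written under deadline pressure.
def total_v1(items):
    return sum(i[1] * i[2] for i in items) * 1.07

# Refactored: named fields and an explicit, overridable tax rate.
TAX_RATE = 0.07

def subtotal(items):
    """Sum price * quantity for (name, price, quantity) tuples."""
    return sum(price * quantity for _name, price, quantity in items)

def total_v2(items, tax_rate=TAX_RATE):
    return subtotal(items) * (1 + tax_rate)

cart = [("widget", 10.0, 2), ("gadget", 5.0, 1)]
# Both versions agree today, but only the second is safe to change tomorrow.
assert abs(total_v1(cart) - total_v2(cart)) < 1e-9
```

The two versions behave identically, which is exactly why the debt is easy to ignore: the cost only appears later, when someone needs to change the tax rate or add a new field.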
Impact on Development
Technical debt can have various impacts on the development process, including:
- Reduced Productivity: Developers spend more time fixing bugs, refactoring code, or understanding complex systems, leading to decreased productivity.
- Increased Maintenance Costs: As technical debt accumulates, it becomes more costly to maintain and extend the software, requiring more time and resources.
- Higher Risk of Failure: Technical debt increases the likelihood of errors, bugs, and system failures, which can negatively impact user experience and business operations.
- Difficulty in Scaling: Accumulated technical debt can hinder the scalability of the software, making it harder to add new features or accommodate increased user demand.
Managing Technical Debt
To mitigate technical debt, teams must prioritize refactoring, code reviews, and quality assurance practices. It's not easy for teams and companies to do, but it is essential to strike a balance between delivering features quickly and maintaining the long-term health and sustainability of the software. Regularly addressing technical debt helps prevent it from becoming overwhelming and ensures that the software remains agile, reliable, and maintainable over time.
Keep in mind that technical debt is a natural part of the software development process, especially as you try to create rapid prototypes or get features working quickly. The important thing is to be aware of its existence and actively manage it to avoid long-term negative consequences.
Change Control Processes
As software projects grow and gain active users, it becomes very important to define and follow a rigorous process for making changes. These change control processes include approvals for rolling out or rolling back features, as well as updating documentation and artifacts.
Depending on the software lifecycle model you are using, the process can be more deliberate and methodical or more nimble and spontaneous. It can require more formal documentation or be more informal in nature.
Processes that are more agile in nature have become very popular because they reduce the size and impact of any one deployment, making each one less likely to cause a major problem and easier to roll back if an issue does arise. In addition, because deployments are smaller, they happen more frequently, making the whole process more routine and more heavily automated.
Continuous Integration and Continuous Delivery/Deployment
Most modern systems employ the principle of Continuous Integration and Continuous Delivery/Deployment, often abbreviated CI/CD or even CICD. The goal of CI/CD is to automate and streamline the process of building, testing, and deploying software.
Continuous Integration (CI)
CI is a development practice where developers integrate their code changes into a shared repository frequently, preferably several times a day. Each integration triggers an automated build process, which includes compiling the code, running automated tests, and generating build artifacts. The primary goal of CI is to detect integration errors early and ensure that the codebase remains in a working state at all times. CI systems often provide feedback to developers quickly, allowing them to address issues promptly and maintain code quality.
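The fail-fast behavior described above can be sketched in a few lines. The stage names and placeholder steps below are illustrative and not tied to any real CI product:

```python
# A minimal sketch of the stages a CI server runs on each integration.

def run_pipeline(stages):
    """Run each (name, step) pair in order; stop at the first failure."""
    results = []
    for name, step in stages:
        ok = step()
        results.append((name, ok))
        if not ok:
            break  # fail fast so developers get feedback quickly
    return results

# Placeholder steps standing in for real build and test commands.
pipeline = [
    ("compile", lambda: True),
    ("unit tests", lambda: True),
    ("package artifacts", lambda: True),
]

for name, ok in run_pipeline(pipeline):
    print(f"{name}: {'passed' if ok else 'FAILED'}")
```

Real CI servers add triggering on commits, parallelism, and artifact storage, but the core loop is the same: run ordered stages and stop as soon as one fails.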
Continuous Delivery (CD)
CD extends CI by automating the process of deploying code changes to either production or staging environments after the compilation and tests have finished. With CD, every successful build that passes the automated tests is considered deployment-ready. The deployment process in CD typically involves deploying the build artifacts to a staging environment for further testing and validation. Once validated, the same artifacts can be deployed to production with minimal manual intervention, enabling rapid and reliable software releases.
Continuous Deployment
Continuous Deployment takes Continuous Delivery a step further by automating the deployment of code changes directly to production without human intervention. In a continuous deployment setup, every successful build that passes the automated tests is automatically deployed to production. Continuous Deployment requires a high level of confidence in the automated tests and deployment process to ensure that new changes do not introduce regressions or disruptions in the production environment.
Key Components of CI/CD
- Version Control System (VCS): A central repository where developers commit their code changes. Git is the most common VCS tool, though a few organizations use other systems, such as Subversion and Mercurial.
- CI Server: A server that orchestrates the CI process, including triggering builds, running tests, and generating build artifacts. Popular CI servers include Jenkins, GitHub Actions, Travis CI, CircleCI, and GitLab CI/CD.
- Automated Tests: Tests that are executed automatically as part of the CI/CD pipeline to validate the correctness and stability of the software. These tests can include unit tests, integration tests, and end-to-end tests.
- Deployment Pipeline: A series of automated steps that packages, tests, and deploys code changes from the VCS to the target environment. The deployment pipeline typically includes stages for building, testing, staging, and production deployment.
- Infrastructure as Code (IaC): The practice of defining and managing infrastructure configurations using code. IaC tools like Terraform or AWS CloudFormation enable the automation of infrastructure provisioning and configuration, making it easier to maintain consistent environments across different stages of the deployment pipeline.
Benefits of CI/CD
- Faster Time to Market: CI/CD enables faster delivery of software updates by automating repetitive tasks and reducing manual overhead.
- Improved Code Quality: Automated testing and deployment help identify and address bugs and integration issues early in the development process, resulting in higher code quality.
- Reduced Risk: By automating deployment and testing, CI/CD reduces the risk of human error and ensures that changes are thoroughly validated before reaching production.
- Increased Collaboration: CI/CD encourages collaboration among team members by providing a centralized platform for integrating, testing, and deploying code changes.
Overall, CI/CD practices play a crucial role in enabling agile and DevOps principles, allowing organizations to deliver software faster, more reliably, and with higher quality.
Drawbacks of CI/CD
While CI/CD is very popular and brings a number of important benefits, there are some potential drawbacks as well. For example, traditional processes make each deployment more of an event, so every release receives more thought and discussion, which can help teams consider the larger, big-picture ramifications of each change. More deliberate processes can also apply more scrutiny to technical debt, whereas the rapid cadence of CI/CD can encourage a short-term focus on getting something working and then moving on to the next feature.
Similarly, when using a CI/CD approach, it can be easy to forget to update important documentation, such as requirements documents or other records that capture the institutional knowledge of the evolving system. Without proper documentation, important knowledge is held solely by engineers who may leave the company or be assigned to other tasks.
All of these issues can be overcome within a CI/CD approach, but it is important to recognize them in order to avoid their pitfalls.
Regression Testing
As mentioned above, and also in the lesson on testing and verification, regression testing is a critical component of a well-maintained system. The focus of regression testing is to ensure that new features or bug fixes do not cause unintended consequences in other parts of the system. Regression testing is the process of retesting the unchanged parts of the software to verify that they still work as intended after the modifications.
Regression testing can be performed manually, where testers re-run the tests themselves, or it can be automated. Automated regression testing is often preferred for its efficiency and ability to quickly detect regressions in large and complex software systems. In particular, regression testing is a natural part of a CI/CD system where the expected functionality can be verified in automated test cases that run every time new code is added to the project.
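A regression suite like this can be automated with standard tooling. The sketch below uses Python's built-in unittest module; the discount function and its expected behavior are invented for the example:

```python
# A minimal automated regression suite using Python's built-in unittest.
import unittest

def apply_discount(price, percent):
    """An existing, unchanged function; the suite verifies it still behaves
    correctly after other parts of the system are modified."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

class RegressionTests(unittest.TestCase):
    def test_basic_discount(self):
        self.assertEqual(apply_discount(100.0, 25), 75.0)

    def test_zero_discount_is_identity(self):
        self.assertEqual(apply_discount(19.99, 0), 19.99)

    def test_invalid_percent_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(50.0, 150)

# Run the suite programmatically, as a CI pipeline would on every commit.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegressionTests)
result = unittest.TextTestRunner().run(suite)
```

In a CI/CD pipeline, a suite like this runs on every commit, so a change that breaks previously working behavior is caught before it reaches production.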
Rolling out New Features
It takes careful thought and planning to roll out new features and make sure your enhancements have the desired outcomes. One technique to help with this rollout is to use feature flags (also known as feature toggles or feature switches). With feature flags, a new feature is deployed in the code, but conditional logic evaluated at runtime can enable or disable the feature, or control its visibility to certain users.
Using feature flags, you can deploy a feature, but leave it completely turned off until an appropriate time, or you can also roll out the feature gradually and monitor its behavior. The following are potential benefits of a feature flags approach:
- Dynamic Control: Feature flags allow developers to dynamically control the activation or deactivation of features without modifying the codebase. This flexibility enables gradual rollouts, A/B testing, and can mitigate risks associated with deploying new features.
- Gradual Rollouts: By gradually enabling features for a subset of users, developers can monitor their performance, gather feedback, and ensure that they function as expected before rolling them out to the entire user base. This approach helps identify and address issues early in the development cycle.
- A/B Testing: Feature flags facilitate A/B testing, where different variants of a feature are tested against each other to determine which one yields better results. By segmenting users and enabling different feature variations, developers can collect data and make data-driven decisions to optimize user experience and business outcomes.
- Risk Mitigation: Feature flags allow developers to mitigate risks associated with deploying new features by controlling their visibility or behavior in production environments. If a newly released feature causes performance issues or unexpected behavior, developers can quickly disable it without having to roll back the entire release.
- Continuous Delivery: Feature flags align with the principles of Continuous Delivery by enabling smaller, more frequent releases. Developers can continuously deploy code changes to production while controlling the exposure of new features using feature flags. This approach reduces the time-to-market and enhances agility.
- Codebase Simplification: Feature flags help keep the codebase clean and maintainable by avoiding the need for conditional statements scattered throughout the code. Instead, feature flag logic is centralized, making it easier to manage and understand which features are enabled or disabled in different environments.
- User Personalization: Feature flags can be used to personalize user experiences by selectively enabling features based on user attributes, preferences, or behaviors. This enables developers to tailor the application's functionality to specific user segments, enhancing user satisfaction and engagement.
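A minimal sketch of a feature flag with gradual rollout might look like the following. The flag names, rollout percentages, and in-memory flag table are assumptions for illustration; real systems typically load flag state from a configuration service so it can change at runtime without a deploy:

```python
# A minimal feature-flag sketch with gradual percentage rollout.
import hashlib

FLAGS = {
    "new_checkout": {"enabled": True, "rollout_percent": 25},
    "dark_mode":    {"enabled": False, "rollout_percent": 0},
}

def is_enabled(flag_name, user_id):
    flag = FLAGS.get(flag_name)
    if flag is None or not flag["enabled"]:
        return False
    # Hash the user id into a stable bucket from 0-99, so the same user
    # consistently sees the same behavior across sessions and releases.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < flag["rollout_percent"]

# Roughly 25% of users should see the new checkout flow.
enabled_users = sum(is_enabled("new_checkout", f"user-{i}") for i in range(1000))
print(f"new_checkout enabled for {enabled_users} of 1000 users")
```

Because the bucket comes from a stable hash rather than a random draw, raising the rollout percentage only adds users to the enabled group; no one who already has the feature loses it mid-rollout.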
Overall, feature flags are a powerful tool for managing feature releases, conducting experiments, and maintaining control over the software's behavior in production environments. By leveraging feature flags effectively, development teams can iterate rapidly, gather feedback iteratively, and deliver value to users more efficiently.
Supporting Multiple Versions
While feature flags help roll out incremental feature additions, in some circumstances, it is necessary to truly support multiple live versions of a product. This is especially true for products that are downloaded and used offline, or for services that are used by other systems that may not all be able to update at the same time as your release.
When using a microservices architecture, multiple versions can be easily maintained by including a version number in the API call. This ensures that systems that call that service can specify and receive the appropriate functionality.
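To illustrate, here is a hypothetical sketch of version-aware routing; the endpoint paths and response shapes are invented for the example:

```python
# A sketch of version-aware dispatch for a service API. Callers pin a
# version in the request path (e.g., /v1/users/42) and keep receiving the
# behavior they were built against, even after v2 ships.

def get_user_v1(user_id):
    # Original response shape.
    return {"id": user_id, "name": "Ada Lovelace"}

def get_user_v2(user_id):
    # v2 split the name field; v1 callers are unaffected.
    return {"id": user_id, "first_name": "Ada", "last_name": "Lovelace"}

ROUTES = {
    ("v1", "users"): get_user_v1,
    ("v2", "users"): get_user_v2,
}

def handle(path):
    """Route a path like '/v1/users/42' to the matching versioned handler."""
    _, version, resource, user_id = path.split("/")
    handler = ROUTES.get((version, resource))
    if handler is None:
        return {"error": "unknown version or resource"}
    return handler(user_id)

print(handle("/v1/users/42"))
print(handle("/v2/users/42"))
```

Old clients keep calling /v1 while new clients adopt /v2, so both versions stay live until the older one can be formally retired.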
Maintaining multiple live versions of a product can be challenging. Here are some principles to keep in mind:
- Versioning Scheme: Establish a clear versioning scheme for your software. Semantic Versioning (SemVer) is a popular choice, where versions are represented as MAJOR.MINOR.PATCH (for example, 1.0.0). This scheme helps users understand the significance of each update.
- Long-Term Support (LTS) Releases: Designate certain versions as Long-Term Support (LTS) releases, which receive extended support and maintenance beyond regular versions. This ensures stability for users who prefer not to upgrade frequently.
- Version Control: Use version control systems like Git to manage code changes for different versions. Create branches for each major version and maintain separate codebases for them.
- Isolated Environments: Maintain isolated environments for each version of the software, including development, testing, staging, and production environments. This prevents conflicts between different versions and facilitates testing and deployment.
- Backward Compatibility: Strive to maintain backward compatibility whenever possible. New versions should be able to work with data and configurations from older versions, ensuring a smooth upgrade process for users.
- Documentation: Provide comprehensive documentation for each version, including release notes, upgrade guides, and compatibility information. This helps users understand changes between versions and how to migrate to newer releases.
- Parallel Support: Allocate resources for supporting multiple versions simultaneously, including technical support, bug fixes, and security updates. Prioritize critical issues and allocate resources based on the popularity and importance of each version.
- Communication: Keep users informed about the status of each version, including upcoming releases, end-of-life dates, and recommended upgrade paths. Regularly communicate through release announcements, newsletters, and community forums.
- Automated Testing: Implement automated testing pipelines for each version to ensure that updates and patches do not introduce regressions. Continuous Integration (CI) and Continuous Deployment (CD) pipelines can help automate testing and deployment processes across multiple versions.
- Migration Paths: Provide clear migration paths for users to upgrade from older versions to newer ones. Offer tools, scripts, and utilities to assist with data migration, configuration changes, and compatibility checks.
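Under the SemVer scheme described above, versions can be compared mechanically. This minimal sketch parses MAJOR.MINOR.PATCH strings and flags breaking upgrades; it ignores pre-release and build-metadata tags for simplicity:

```python
# A minimal sketch of parsing and comparing Semantic Versioning strings
# (MAJOR.MINOR.PATCH). Pre-release tags like "-beta.1" are not handled here.

def parse_semver(version):
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)

def is_breaking_upgrade(current, target):
    """A change in MAJOR signals incompatible API changes under SemVer."""
    return parse_semver(target)[0] > parse_semver(current)[0]

# Tuple comparison orders versions correctly, including multi-digit parts.
assert parse_semver("1.4.2") < parse_semver("1.10.0")
assert is_breaking_upgrade("1.9.3", "2.0.0")
assert not is_breaking_upgrade("1.9.3", "1.10.0")
```

Note that comparing the raw strings would get "1.10.0" vs "1.4.2" wrong; parsing each component to an integer is what makes the ordering correct.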
The Life and Death of a Software System
When you are first developing a product, it is hard to consider that at some point it will need to be retired. But whether it is phased out completely or replaced by a subsequent version, planning for the End of Life (EOL) of software, also called sunsetting it, is an important step in maintaining it. Managing the EOL process is essential to address various issues and ensure a smooth transition for users and stakeholders.
One of the most important elements of the EOL process is clear communication with stakeholders. The more they understand the timing, process, and rationale, the better they can prepare for a transition to the next solution. In many cases, users are reluctant to leave the old system due to familiarity, convenience, or the cost of making a change. You can proactively address users' concerns by highlighting the benefits of the new system, including enhanced security, compatibility, and sustainability. If necessary, you may also supply tools, guidance, and best practices for migrating. In any case, you need to clearly articulate the timing of the discontinuation and the level of support that will continue afterward.
End of Life for Dependent Libraries
In addition to managing the end of life of your system, it is likely that you will need to work through the EOL or upgrade process for libraries or other systems that yours depends on. The two main issues you will need to navigate are the costs and risks associated with either upgrading or staying with an older library.
The most obvious cost of upgrading a dependent library is that you will have to devote development resources to understanding the changes and integrating them into your codebase, ensuring that no regression bugs are introduced. In addition, upgrading one library may trigger the need to upgrade others, leading to a cascading effect of changes.
On the other hand, there are also costs and risks associated with not upgrading. One of the most significant risks is increased security vulnerability. Outdated systems are often more vulnerable to security flaws, increasing the risk of data breaches, malware attacks, and unauthorized access. In addition, over time, vendors may provide less support and maintenance for the older products, reducing compatibility or compliance.
From a strictly cost perspective, delaying updates contributes to accumulating technical debt, which makes future enhancements more challenging and costly.
Migrating Infrastructure
In addition to working through issues surrounding software-related migrations, you may also need to move a system from one infrastructure to another. This is a challenging process that involves various technical, operational, and organizational considerations. The following are the general steps you will need to follow:
- Assessment and Planning:
- Evaluate Current Infrastructure: Assess the existing infrastructure, including hardware, software, network, and dependencies.
- Identify Goals and Requirements: Determine the objectives of the migration, such as improving performance, scalability, or cost efficiency.
- Define Migration Scope: Identify the software components, data, and services that need to be migrated.
- Risk Assessment: Evaluate potential risks and challenges associated with the migration and develop mitigation strategies.
- Create a Migration Plan: Develop a detailed plan outlining the migration process, timeline, responsibilities, and resources required.
- Preparation:
- Backup Data: Ensure that all data and configurations are backed up securely to prevent data loss during the migration.
- Document Dependencies: Document all dependencies, including software libraries, frameworks, databases, and third-party services.
- Update Documentation: Update documentation to reflect any changes or configurations required for the new infrastructure.
- Notify Stakeholders: Communicate the migration plan and timeline to stakeholders, including users, IT staff, and management.
- Setup New Infrastructure:
- Provision Resources: Provision hardware, virtual machines, or cloud instances for the new infrastructure based on the requirements identified in the assessment phase.
- Install Software Dependencies: Install and configure necessary software dependencies, including libraries, frameworks, databases, and middleware.
- Network Configuration: Configure networking settings, including IP addresses, subnets, DNS, firewalls, and load balancers.
- Migration Execution:
- Data Migration: Transfer data from the old infrastructure to the new one, ensuring data integrity, consistency, and security.
- Software Installation: Install and configure the software components on the new infrastructure, following best practices and vendor guidelines.
- Testing: Conduct comprehensive testing to validate the functionality, performance, and security of the migrated software.
- User Acceptance Testing (UAT): Involve end-users in UAT to ensure that the migrated software meets their requirements and expectations.
- Validation and Verification:
- Functional Testing: Verify that all software features and functionalities work as expected in the new environment.
- Performance Testing: Assess the performance of the software on the new infrastructure under various load conditions.
- Security Testing: Conduct security testing to identify and address any vulnerabilities or compliance issues.
- Deployment and Go-Live:
- Go-Live Planning: Schedule the migration during a low-traffic period to minimize disruption to operations.
- Deployment: Deploy the migrated software to the production environment, following the deployment plan and rollback procedures if necessary.
- Monitoring: Monitor the software closely after deployment to detect any issues or anomalies and take corrective actions as needed.
- Post-Migration Support:
- Training and Support: Provide training and support to IT staff and end-users to familiarize them with the new infrastructure and address any questions or concerns.
- Documentation Update: Update documentation to reflect any changes or configurations made during the migration process.
- Performance Optimization: Continuously monitor and optimize the performance of the software on the new infrastructure to ensure optimal operation.
- Post-Migration Evaluation:
- Lessons Learned: Conduct a post-migration review to identify successes, challenges, and areas for improvement.
- Feedback Collection: Gather feedback from stakeholders, including users and IT staff, to inform future migrations and improvements.
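As one concrete illustration of the integrity check in the "Data Migration" step above, the hypothetical sketch below fingerprints every record on both sides of the migration and reports IDs that went missing or changed in transit; the record shapes are invented for the example:

```python
# A hypothetical post-migration integrity check: hash every record on both
# sides and report ids that are missing from, or changed in, the target.
import hashlib
import json

def record_fingerprints(records):
    """Map record id -> a stable hash of the record's contents."""
    fingerprints = {}
    for record in records:
        canonical = json.dumps(record, sort_keys=True).encode()
        fingerprints[record["id"]] = hashlib.sha256(canonical).hexdigest()
    return fingerprints

def diff_migration(source_records, target_records):
    src = record_fingerprints(source_records)
    dst = record_fingerprints(target_records)
    missing = sorted(set(src) - set(dst))
    changed = sorted(i for i in src if i in dst and src[i] != dst[i])
    return {"missing": missing, "changed": changed}

old = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]
new = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@EXAMPLE.com"}]
print(diff_migration(old, new))  # prints {'missing': [], 'changed': [2]}
```

Serializing each record with sorted keys makes the hash independent of field order, so only genuine content differences are flagged.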
It is interesting to note how similar these steps are to the Software Development Lifecycle: a migration is itself a technical project, so the same principles apply.
Monitoring
Another critical part of maintenance is to carefully monitor the system as it runs, to identify any problems that arise, or, ideally, to catch these issues before they become major problems.
Performance Monitoring
Performance monitoring focuses on assessing the system's efficiency, responsiveness, and resource utilization. It may involve monitoring metrics in the following areas:
- Response Time: Measure the time taken for the system to respond to user requests or events.
- Throughput: Monitor the rate at which the system processes requests or transactions.
- Resource Utilization: Track the usage of CPU, memory, disk I/O, and network bandwidth to identify potential bottlenecks or resource constraints.
- Error Rates: Monitor the frequency and types of errors or exceptions occurring in the system.
- Availability: Measure the uptime and downtime of the system to ensure high availability and reliability.
The following are some of the tools and techniques used in performance monitoring:
- Logging: Capture and analyze logs generated by the software components to identify performance issues or anomalies.
- Metrics Collection: Use monitoring tools to collect and visualize performance metrics in real-time or over time to identify trends and patterns.
- Alerting: Set up alerts and notifications to notify stakeholders when performance metrics exceed predefined thresholds or indicate abnormal behavior.
- Profiling and Tracing: Use profiling tools to analyze code execution and identify performance hotspots or inefficiencies. Tracing tools can track the flow of requests through the system to pinpoint performance bottlenecks.
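A threshold alert of the kind described above can be sketched as a sliding-window average over recent samples; the metric name, window size, and threshold below are illustrative:

```python
# A minimal sketch of threshold-based alerting over collected metrics.
from collections import deque

class MetricMonitor:
    def __init__(self, window=5, threshold=0.5):
        self.samples = deque(maxlen=window)  # sliding window of recent values
        self.threshold = threshold

    def record(self, value):
        """Record a sample; return True if the window average breaches the threshold."""
        self.samples.append(value)
        average = sum(self.samples) / len(self.samples)
        return average > self.threshold

# e.g., alert when the recent average error rate exceeds 20%.
monitor = MetricMonitor(window=3, threshold=0.2)
for error_rate in [0.05, 0.10, 0.45, 0.60]:
    if monitor.record(error_rate):
        print(f"ALERT: error-rate window average exceeded {monitor.threshold}")
```

Averaging over a window rather than alerting on single samples keeps one momentary spike from paging anyone, while a sustained rise still triggers promptly.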
It is important not only to watch for problems, but also to do prognostic, or preventive, monitoring. This means watching for anomalies and making predictions about what your system will need in the future, so that you can begin to prepare for it now.
Continue the conversation
After completing this reading, ask 3-5 follow up questions about software release and maintenance to an AI system of your choice. (You may use ChatGPT, Bing, Claude AI, Gemini, or any other system of your choosing.)
Good questions may include:
- Explain _______ (for example, CI/CD, technical debt, feature flags)
- What are the benefits or drawbacks of ________ ?
- How can a software product be upgraded while it is being used?
- What should you do when you want to retire a software product?
- What kind of monitoring should be done for a live software product?
- Why is software maintenance difficult?
Submission
After you are comfortable with these topics, return to Canvas to take the associated quiz.
ChatGPT assisted in the creation of this learning material.
Footnotes
- See this 2023 article on Stack Overflow for a breakdown of version control system popularity.