Advanced Git bisect and blame techniques for tracking down bugs and regressions in your codebase

June 13, 2023

We have all experienced the slowest and most laborious method of locating a problematic git commit. It involves checking out an old commit, ensuring the faulty code is absent, proceeding to a slightly more recent commit, conducting another inspection, and repeating this process iteratively until the flawed commit is discovered.

What is git bisect and when to use it

Git bisect is a powerful tool within git designed to help developers efficiently trace bugs and regressions in their codebase. It automates the process of searching through the git commit history, employing a binary search algorithm to quickly identify the specific commit responsible for introducing an issue.

By specifying the known good and bad commits, git bisect divides the search space in half at each step, automatically checking out the midpoint commit and prompting the developer to test it. Based on the test result, git moves on to the next midpoint, continuing the process until the problematic commit is pinpointed. This streamlined approach significantly reduces the time and effort required to identify the culprit commit, particularly in larger codebases. Since it is a binary search, your bad commit should be found in about log2(n) tests. So if you have 20,000 commits to test, you will find your bug within about 15 checks, since log2(20,000) is roughly 14.3!

F = Fail
W = Working
? = Unknown
        > HEAD                                                 > First Commit
        |                                                      |
        F----?----?----?----?----?----?----?----?----?----?----W
        |                                                      |
        F----F----F----F----F----F----F----?----?----?----?----W
        |                             ^                        |
(T1)                                  | First test verifies commit is bad
        F----F----F----F----F----F----F----?----?----W----W----W
        |                                            ^         |
(T2)                                                 | Second test is good
        F----F----F----F----F----F----F----W----W----W----W----W
        |                                  ^                   |
(T3)                                       | Third test is good
        F----F----F----F----F----F----F----W----W----W----W----W
        |                             ^                        |
                                      | Done bisecting, found first failing commit
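
The halving shown in the diagram can be sketched in a few lines of Python. This is a simplified model, not git's actual implementation: it assumes a linear history ordered oldest to newest, a known good first commit, a known bad last commit, and that every commit after the first bad one is also bad. The names first_bad_commit and is_bad are made up for illustration.

```python
import math

def first_bad_commit(commits, is_bad):
    """Return (first bad commit, number of tests run).

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1  # indices of the known good / known bad commits
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2  # midpoint of the remaining range
        tests += 1
        if is_bad(commits[mid]):
            bad = mid   # the culprit is at mid or earlier
        else:
            good = mid  # the culprit is after mid
    return commits[bad], tests

# Hypothetical history: 20,000 commits, the regression lands in commit 13,579.
commits = list(range(20_000))
culprit, tests = first_bad_commit(commits, lambda c: c >= 13_579)
print(culprit, tests, math.ceil(math.log2(len(commits))))
```

The number of tests never exceeds the rounded-up log2 of the range, which is where the figure of about 15 checks for 20,000 commits comes from.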

Reproduce a problem scenario with grey box testing

Grey box testing, with its blend of knowledge about the internal structure and external behavior of the software, is particularly effective when it comes to reproducing problem scenarios. It enables testers to utilize their understanding of the software's internals to design targeted test cases that replicate the specific problem scenario.

Identify the feature or functionality area where the issue occurs.
Gather functional test data by analyzing user inputs, outputs, actions taken, possible paths, and system constraints. This helps form an API test strategy that establishes communication between different software components.
Execute tests separately or simultaneously to single out and replicate the error sequence until it's reproducible across all conditions.
Make sure the test is compatible and executable across all commits in the range of git commit history you are planning to test.

Remember that problem reproducibility is pivotal for a successful and efficient git bisect and test process.

Let's consider a scenario where a web application is experiencing intermittent issues with a particular API endpoint. Testers, armed with knowledge of the API implementation, can design grey box tests that exercise various input combinations, simulate different response scenarios, and observe how the system behaves. By leveraging their understanding of the internal workings of the API, testers can create targeted functional tests that specifically aim to reproduce the problem scenario.

Here's an example of a grey box test for the problematic API endpoint:

def test_api_endpoint():
    # Set up test data
    ...
    
    # Invoke the API endpoint with specific inputs
    response = make_api_request(input_data)
    
    # Check the response and compare with expected results
    assert response.status_code == expected_status_code
    assert response.json() == expected_response_data

In this grey box test, testers can manipulate the input data and observe how the API endpoint behaves. They can intentionally craft scenarios that trigger the reported problem, monitor the response status code and payload, and compare them against the expected results.
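
To make the idea concrete, here is a minimal self-contained sketch. Everything in it is hypothetical: list_items stands in for the real endpoint, and MAX_PAGE_SIZE is an internal limit the tester is assumed to know about. The point is only to show how knowledge of the internals guides the choice of inputs around a boundary.

```python
MAX_PAGE_SIZE = 100  # hypothetical internal limit known to the tester

def list_items(params):
    """Hypothetical stand-in for the endpoint: returns (status_code, json_body)."""
    try:
        size = int(params.get("page_size", 20))
    except ValueError:
        return 400, {"error": "page_size must be an integer"}
    if size < 1 or size > MAX_PAGE_SIZE:
        return 400, {"error": "page_size out of range"}
    return 200, {"items": list(range(size))}

# (input, expected status) pairs chosen around the known internal boundary
CASES = [
    ({"page_size": "1"}, 200),
    ({"page_size": "100"}, 200),
    ({"page_size": "101"}, 400),
    ({"page_size": "0"}, 400),
    ({"page_size": "abc"}, 400),
    ({}, 200),
]

def test_api_endpoint():
    for params, expected_status in CASES:
        status, body = list_items(params)
        assert status == expected_status, (params, status)
        if status == 200:
            # A successful response must contain exactly the requested number of items.
            assert len(body["items"]) == int(params.get("page_size", 20))

test_api_endpoint()
print("all grey box cases passed")
```

In a real project the stand-in function would be replaced by a call to the running service, and the cases would be derived from the reported problem.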

Grey box testing and how to use it with git bisect - finding the code changes that caused the problem

Grey box testing is a great testing strategy for git bisect as it enhances the ability to reproduce problem scenarios accurately across the commit history.

By leveraging the power of both techniques, developers and testers can streamline the regression testing process and quickly pinpoint problematic code alterations.

To employ grey box testing with git bisect, go through the following steps:

Establish a set of test cases that cover the problem scenario. Make sure the tests are compatible and executable across all commits in the range of git commit history you are planning to test.
Bisect your commit history: use git bisect to repeatedly halve the range of commits between the last known good commit and the first known bad commit.
Run the set of tests to verify whether the particular commit passes or fails. By running the grey box test cases as part of the git bisect process, developers can systematically identify the commit that introduced the problem.
Once you have found the fault, it is essential to create additional unit and/or functional tests that cover the offending code, so that future regressions are caught when the tests run as part of an automated test suite.

Here's an example of how grey box testing can be combined with git bisect:

$ git bisect start
$ git bisect bad  # Mark the currently checked-out commit as bad
$ git bisect good <commit>  # Specify the last known good commit

# At each step, git checks out a commit and runs the grey box test cases
$ git bisect run python run_grey_box_tests.py

# The script's exit code drives the search: 0 marks the commit as good, 125 skips it, and any other code from 1 to 127 marks it as bad

$ git bisect reset  # Once the problematic commit is identified, return to the commit you started from
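
The contents of run_grey_box_tests.py are not shown in this article, so the following is only a sketch of what such a script might look like. It relies on the documented exit code convention of git bisect run (0 for good, 125 for skip, other codes from 1 to 127 for bad). The function name bisect_exit_code and the stand-in commands are assumptions made for illustration.

```python
import subprocess
import sys

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`

def bisect_exit_code(test_command, setup_command=None):
    """Run the tests for the checked-out commit and map the result to an exit code."""
    if setup_command is not None:
        # A commit that cannot be built or installed is untestable, so skip it
        # instead of blaming it for the bug.
        if subprocess.run(setup_command).returncode != 0:
            return SKIP
    result = subprocess.run(test_command)
    return GOOD if result.returncode == 0 else BAD

# Stand-in commands so the sketch runs anywhere; a real script would
# invoke the project's build step and grey box test suite here.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(3)"]

print(bisect_exit_code(passing))                         # tests pass
print(bisect_exit_code(failing))                         # tests fail
print(bisect_exit_code(passing, setup_command=failing))  # setup fails
```

A real script would end with sys.exit(bisect_exit_code(...)) so that git receives the result. Normalizing every test failure to 1 avoids accidentally returning a code above 127, which would abort the bisect.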


Git blame & bisect - who did what, when and why!

It is crucial to have visibility into the history of code changes and understand who made each modification. Git blame is a powerful command that allows teams to determine the author and the commit details for each line of code within a file. By using git blame, developers can effectively track the origin of specific code changes and gain insights into the evolution of the codebase.

When the commit that introduced the problem has been tracked down with git bisect and grey box testing, use git show to list the files and code changes in that commit, and git blame to see which commit and author last modified each line that may have contributed to the problem.

This information is invaluable when trying to understand the context and reasoning behind a specific change or when investigating issues introduced by certain code alterations.
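
As a rough illustration, git blame can also be scripted. The sketch below parses the output of git blame --line-porcelain, in which every source line is preceded by header fields such as "author" and is itself prefixed with a tab, and counts lines per author. The helper names and the abbreviated sample output are made up for this example; real output contains more header fields per line.

```python
import subprocess
from collections import Counter

def authors_from_porcelain(porcelain_text):
    """Count source lines per author in `git blame --line-porcelain` output."""
    counts = Counter()
    author = None
    for line in porcelain_text.split("\n"):
        if line.startswith("author "):
            author = line[len("author "):]
        elif line.startswith("\t"):  # the source line itself, prefixed with a tab
            counts[author] += 1
    return counts

def blame_authors(path, rev="HEAD"):
    """Run git blame on `path` at `rev`; must be called inside a git repository."""
    out = subprocess.run(
        ["git", "blame", "--line-porcelain", rev, "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return authors_from_porcelain(out)

# Abbreviated, hypothetical sample of porcelain output for a three-line file.
sample = "\n".join([
    "1111111111111111111111111111111111111111 1 1 2",
    "author Alice",
    "\tdef handler(request):",
    "1111111111111111111111111111111111111111 2 2",
    "author Alice",
    "\t    size = int(request['page_size'])",
    "2222222222222222222222222222222222222222 3 3 1",
    "author Bob",
    "\t    return size",
])
print(authors_from_porcelain(sample))
```

Passing the commit found by git bisect as rev shows who owned each line of a file at the moment the problem was introduced.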

The value of adding grey box tests to an automated testing pipeline to avoid further regressions in the future

Test automation is a key aspect of modern software development, enabling teams to detect and address issues quickly while maintaining code quality. Continuous integration (CI) practices further enhance this process by automatically running tests on every code change, ensuring that the software remains stable and functional.

Grey box test cases that focus on specific modules and their internal behaviors exercise different code paths, edge cases, and boundary conditions. They not only verify a bug fix but may also detect unintended side effects or regressions.

The work you have done scripting the grey box test sequence for the git bisect sequence should not be lost. Any test scenario that can be automated should be added to the automated test suite, protecting your code base against future regressions that can easily happen after refactoring, adding new features or even fixing seemingly unrelated bugs.

Incorporating grey box tests into the automated testing pipeline leads to higher test coverage, fewer regressions and improved code stability and confidence. It empowers teams to proactively identify and prevent issues that might arise due to internal changes, ensuring that the software maintains its intended functionality.

We recommend watching Fireship’s video, where he talks all about Git techniques, including, yes, you guessed it, git bisect.

Making your code future-proof from day 1: Why adopting a test-driven development approach improves code quality, stability, and reliability now and in the future

In the fast-paced world of software development, ensuring code quality, stability, and reliability is paramount. One effective approach to achieve these goals is through test-driven development (TDD), a methodology that emphasizes writing tests before writing the actual module code. By adopting a TDD approach, developers can reason about the component's functionality, interfaces, dependencies and scope early in the process. TDD forces development teams to design their code in a clean, modular, decoupled fashion with limited dependencies and scope. If you can't write a test for a method or a suite of tests for a component - change the design, scope and dependencies of the module until it is easy to write tests.

TDD is not limited to white-box and unit testing. Adding grey-box tests for APIs and interfaces while you develop them will validate that the interfaces work exactly as intended.

As the development progresses, new features or refactoring of the codebase may introduce bugs in existing code. With a comprehensive suite of regression tests already in place, TDD helps to ensure that previously implemented functionality continues to work correctly. This reduces the risk of regressions and provides confidence when making modifications.

TDD also promotes collaboration among team members. Tests act as a common language between developers, testers, and other stakeholders, enabling clearer communication about requirements and expected behavior.

Test automation and continuous integration (CI) are an absolute no-brainer. Fast-paced projects that release early and release often must also test early and test often.

By embracing test-driven development and test automation, developers improve the immediate quality and reliability of their code and establish a foundation for long-term maintainability. With a robust suite of tests, future changes or enhancements can be made with confidence, knowing that the existing functionality will remain intact. This proactive approach to testing empowers development teams to deliver high-quality software, increase productivity, and build a codebase that is future-proof.

Conclusion: every problem is an opportunity to improve automated test coverage!

Tracking down bugs and regressions in your codebase can be a daunting task, but it also presents an opportunity to enhance your automated test coverage. Git blame and bisect techniques in combination with grey-box testing offer powerful tools for isolating problematic commits and identifying the root cause of issues. However, it's important to recognize that relying solely on this approach is reactive in nature. By investing in test-driven development (TDD) and automated testing with continuous integration (CI) from day 1 of your project, your team can proactively catch issues before they make their way into the commit history or, even worse, the product or service.

Embrace the power of testing and turn every problem into building stronger code that you can confidently run in production and still watch your favourite show on Netflix before you have a deep and peaceful 8h sleep at night.

Previous: Managing Commit History Like a Pro: Tips and Techniques for Streamlined Code History

Next up: Advanced Git merge conflict resolution techniques

Until then, happy coding everyone!

Sean Manwarring

If you enjoyed this article and want to read more like it, visit our blog space 📰

Visit our flagship app Workzone - Automate and control your Pull Request Workflow, Reviewers, Compliance, Approvals and Merge Process.
