Gerrit/Code review

This is a guide to reviewing and merging contributions to Wikimedia code repositories, written primarily for developers performing code reviews.

Goals

  • Provide quick reviews to avoid bitrot and feelings of abandonment. Fast feedback encourages developers to continue contributing and hence helps broaden our contributor base.
  • Be nice. Contributed patches are gifts. Reviews influence the perception a volunteer will have about the entire project.
  • In the long run, encourage code contributors to also become reviewers.

Finding patches to review

Some developers may ask you to review their code via drone.

There is a lot of code to review. How can you break the task up into more manageable portions?

There are several basic patterns:

By request

Contributors can request code review in Gerrit by adding a reviewer and clicking "Add to attention set". This is a helpful way to indicate to a user that you wish them to do a review.

 

Reviewers should therefore bookmark and monitor requests for code review in this form using the Gerrit URL query has:attention.

When a patch has your attention you can either:

  • Provide a review for the patch as requested (which will move the attention back to the patch contributor) OR
  • If you are not the right person to review the patch, comment and/or suggest alternative reviewers using the "Add to attention set" feature. You can use the Maintainers page to identify alternative reviewers.
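
The attention-set query can be turned into a bookmarkable URL. The following is a minimal sketch in Python, not an official tool: the base URL `https://gerrit.wikimedia.org/r` and the `/q/` search path are assumptions about the Wikimedia Gerrit installation, and `search_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Assumption: base URL of the Wikimedia Gerrit instance.
GERRIT_BASE = "https://gerrit.wikimedia.org/r"


def search_url(query: str, base: str = GERRIT_BASE) -> str:
    """Return a bookmarkable search URL for a Gerrit query string."""
    # Keep ':' readable; percent-encode anything else that is unsafe in a path.
    return f"{base}/q/{quote(query, safe=':')}"


if __name__ == "__main__":
    print(search_url("has:attention"))
```

The helper only builds the URL; it does not contact Gerrit. Other search terms can be appended to the query string to narrow the list.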

By project

If you are a maintainer, consider setting up email notifications for new patchsets in your projects (repositories) via "Watched Projects" in Gerrit. Alternatively, you can add yourself to the Gerrit Reviewer Bot, which will automatically add you as a reviewer to each new patchset.

  • Identify major pieces of work or sets of related revisions. A quick chronological review can be a good way to do this; or you can choose a repository with many open changesets.
  • Open all changeset pages as tabs in a browser window. Open relevant files in a text editor.
  • Review new or unfamiliar files in their entirety. Pick a changeset with a major change to the relevant file and review.

By author

  • Pick an author with (many) open changesets and load them in Gerrit.
  • Work through the revisions chronologically, or proceed one topic/repository at a time.
  • This method allows the reviewer to get to know individual developers: their skills, faults and interests. The work has a sense of progress and continuity.

If someone already added you as a potential reviewer and you know you will not review the patch, remove yourself from the list of reviewers.[1]

New contributors to our projects

You can add (some of) the queries below to your menu by editing your user preferences. A "new contributor" is defined as a person who has contributed five or fewer changesets in total.

Chronological (and Reverse Chronological)

  • Start at the oldest open changeset and review until you finish the queue. Alternatively, start at the latest revision and read each diff in turn until you reach the end. This approach is good for minor revisions, but requires constant switching between projects and their contexts.

Review checklist

Is it wanted

  • The very first question is whether the contribution is a good idea. If the contribution is not helpful or does not align with the direction and scope of the project, explain and provide feedback on better ideas.[2]

General

  • Contributed code should work as advertised, that is, any bugs found in the code should be fixed. (But be careful not to blame the current developer for code written by a previous developer.)
  • Maintain backwards compatibility for stable interfaces if this is relatively simple to do.
  • If a breaking change is required in order to make significant improvements, make sure the Stable interface policy is followed.
  • Read relevant bug reports or documentation.
  • Familiarise yourself with any relevant technical issues. Read relevant specifications or manual sections.

Performance

  • Code that is run many times in a request, or code that is run on startup, should be reviewed for performance (e.g. by a member of the Wikimedia Performance Team). Suspicious code may need to be benchmarked.
  • Any web-accessible code which is very inefficient (in time, memory, query count, etc.) should be flagged for a fix (e.g. by creating a task for Performance Team in Phabricator).
  • Database schema changes or changes to high-traffic queries should be reviewed by a database expert. (A corresponding Phabricator task should have the tag "schema-change" associated.)

Design

  • Does this change make the user experience better or worse for end users? If it has a user experience or visual design impact, consider consulting the #wikimedia-design IRC channel, the design mailing list, or one of the design maintainers.

Style

Readability

  • Functions should do what their names say. Choosing the correct verb is essential: a get*() function should get something, and a set*() function should set something.
  • Variable names should use English; abbreviations should be avoided where possible.
  • Doc comments on functions are preferred.
  • Overly-complicated code should be simplified. This usually means making it longer. For instance:
    • Ternary operators (?:) may need to be replaced with if/else.
    • Long expressions may need to be broken up into several statements.
    • Clever use of operator precedence, shortcut evaluation, assignment expressions, etc. may need to be rewritten.
  • Duplication, whether within files or across files, should be avoided.
    • It is bad for readability since it's necessary to play "spot the difference" to work out what has changed. Reading many near-copies of some code necessarily takes longer than reading a single abstraction.
    • It is bad for maintainability, since if a bug (or missing feature) is found in the copied code, all instances of it have to be fixed.
    • Some new developers might copy large sections of code from other extensions or from the core, and change some minor details in it. If a developer seems to be writing code which is "too good" for their level of experience, try grep'ing the code base for some code fragments, to identify the source. Guide the developer towards either rewriting or refactoring.
    • Taking shortcuts can be counterproductive, since the amount of time spent figuring out the shortcut and verifying that it works could have been spent just typing out the original idea in full.
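
As an illustration of "simplifying by making it longer", the sketch below shows the same logic written twice in Python: once as a chained conditional expression and once as plain if statements. The function names and the size thresholds are hypothetical and chosen only for the example.

```python
def size_label_terse(lines_changed: int) -> str:
    # Chained conditional expressions: compact, but the reader has to
    # untangle the nesting to see which branch applies.
    return "small" if lines_changed < 50 else "medium" if lines_changed < 500 else "large"


def size_label_clear(lines_changed: int) -> str:
    # Same behaviour, one decision per statement.
    if lines_changed < 50:
        return "small"
    if lines_changed < 500:
        return "medium"
    return "large"


if __name__ == "__main__":
    for n in (10, 120, 900):
        print(n, size_label_terse(n), size_label_clear(n))
```

Both functions return identical results; the second is longer but can be read top to bottom without tracking operator nesting.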

Security

  • The reviewer should have read and understood the security guide and should be aware of the security issues discussed there.
  • There should not be the remotest possibility of arbitrary server-side code execution. This means that there should be no eval() or create_function(), and no /e modifier on preg_replace().
  • Check for text protocol injection issues (XSS, SQL injection, etc.). Insist on output-side escaping.
  • Check any write actions for CSRF.
  • Be wary of special entry points which may bypass the security provisions in WebStart.php.
  • Be wary of unnecessary duplication of security-relevant MW core functionality, such as using $_REQUEST instead of $wgRequest, or escaping SQL with addslashes() instead of $dbr->addQuotes().
  • Only if you work on ancient code: Check for register_globals issues, especially classic arbitrary inclusion vulnerabilities. (Register Globals was removed in PHP 5.4.0, and MediaWiki ≥1.27 requires PHP ≥5.5.9.)
  • If in doubt, consider contacting the Wikimedia Security Team.
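
The two injection checks above can be illustrated with a language-neutral sketch. MediaWiki code is PHP and has its own helpers, so the Python below is only an analogy using the standard library: `render_greeting`, `find_user_ids`, and the `users` table are hypothetical names invented for the example.

```python
import html
import sqlite3


def render_greeting(user_name: str) -> str:
    """Escape at output time, right where the value enters the HTML."""
    return f"<p>Hello, {html.escape(user_name)}!</p>"


def find_user_ids(conn: sqlite3.Connection, user_name: str) -> list[int]:
    """Bind the value as a parameter instead of concatenating it into SQL."""
    rows = conn.execute("SELECT id FROM users WHERE name = ?", (user_name,))
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES ('alice')")
    print(render_greeting("<script>alert(1)</script>"))
    print(find_user_ids(conn, "alice"))
    # The injection attempt is treated as a literal name and matches nothing.
    print(find_user_ids(conn, "x' OR '1'='1"))
```

When reviewing, the pattern to look for is the same in any language: values are escaped at the point of output, and queries go through the database abstraction rather than string concatenation.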

Architecture

  • Names which are in a shared (global) namespace should be prefixed (or otherwise specific to the extension in question) to avoid conflicts between extensions. This includes:
    • Global variables
    • Global functions
    • Class names
    • Message names
    • Table names
  • The aim of modularity is to separate concerns. Modules should not depend on each other in unexpected or complex ways. The interfaces between modules should be as simple as possible without resorting to code duplication.
  • Check against the Architecture Principles.
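
A reviewer could spot-check the prefixing rule with a small script. The sketch below is a hypothetical helper in Python, not an existing tool: `unprefixed_names`, the extension prefix `myext`, and the sample names are all invented for illustration, and the simple "starts with the prefix" rule is a placeholder for whatever convention the project actually uses.

```python
def unprefixed_names(names: list[str], prefix: str) -> list[str]:
    """Return the names that do not start with the extension prefix (case-insensitive)."""
    wanted = prefix.lower()
    return [name for name in names if not name.lower().startswith(wanted)]


if __name__ == "__main__":
    # Hypothetical class, message, and table names from one extension.
    shared_names = ["MyExtHooks", "myext-title", "myext_log", "Utils"]
    print(unprefixed_names(shared_names, "myext"))
```

Any name the helper reports, such as a generic `Utils` class, is a candidate for a conflict with another extension and worth raising in the review.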

Logic

  • Point out shortcuts and ask the developer to do a proper job.

Complete the review

  Warning: Please do not use "Verified +2". This is a "force-merge" that bypasses tests. See also Gerrit/Privilege_policy#Merging_without_review

Giving positive feedback

  • If you want to help to review the code, but don't feel comfortable (yet) making the final decision, you can use Code-Review +1 in Gerrit and indicate whether you've "verified" and/or "inspected" the code.
  • If the revision is good and has passed all tests above, mark it Code-Review +2 in Gerrit. If you are particularly impressed by someone's work, say so in a comment. When you mark a commit with Code-Review +2, you're saying:
    • I've inspected this code, and
    • the code makes sense, and
    • the code works and does something that we want to do, and
    • the code doesn't do anything that we don't want it to do, and
    • the code follows our development guidelines, and
    • the code will still make sense in five years.

Giving negative feedback

  • If the revision is trivial, broken, and has no obvious value, mark the commit as Code-Review -2.
  • If the revision has issues but also has some utility, or is showing signs of heading in the right direction, mark it Code-Review -1 and add a comment explaining what is wrong and how to fix it. Never mark something Code-Review -1 without adding a comment. If someone marks your code Code-Review -1, it means that your code is good but needs improvement.
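
The "never vote negatively without a comment" rule can be enforced mechanically by anyone who scripts their reviews. The following is a minimal sketch in Python, assuming a review is submitted as a JSON object with a `message` and a `labels` map, as in Gerrit's REST review input. `build_review` is a hypothetical helper, and applying the rule to -2 as well as -1 is an assumption made for this example.

```python
import json


def build_review(score: int, message: str = "") -> str:
    """Build a JSON review body, refusing negative scores without a comment."""
    if score not in (-2, -1, 0, 1, 2):
        raise ValueError("Code-Review score must be between -2 and +2")
    if score < 0 and not message.strip():
        raise ValueError("negative reviews need an explanatory comment")
    return json.dumps({"message": message, "labels": {"Code-Review": score}})


if __name__ == "__main__":
    print(build_review(-1, "Please add a test for the empty-input case."))
```

The helper only builds the request body; sending it to Gerrit, and authenticating, is outside the scope of the sketch.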

You have to weigh up the costs and benefits of each course of action. If you reject the change completely (Code-Review -2), then that change will be lost, and the developer may be discouraged. If you tolerate the fault, the end product will suffer. If you fix it yourself, then you're letting yourself get distracted, and perhaps you're making the developer believe that it is acceptable to submit low-quality code and then let someone else worry about the details.

General guidelines on comment style, especially when giving negative feedback:

  1. Focus your comments on the code and any objectively-observed behavior, not motivations; for example, don't state or imply assumptions about motivating factors like whether the developer was too lazy or inexperienced to do things right. Ask questions instead of making demands to foster a technical discussion: "What do you think about...?" "Did you consider...?" "Can you clarify...?"[3]
  2. Be empathetic and kind. Recognise that the developer has probably put a lot of work into their idea, and thank them for their contribution if you feel comfortable and sincere in doing so. "Why didn't you just..." provides a judgement, putting people on the defensive.[3] Be positive.
  3. Let them know where they can appeal your decision. For example, invite them to discuss the issue on the wikitech-l mailing list or on IRC.
  4. Be clear. Don't sugarcoat things so much that the central message is obscured.
  5. Most importantly, give the feedback quickly. Don't just leave negative feedback to someone else or hope they aren't persistent enough to get their contribution accepted.

Contacts

During the review you may have some questions or problems. Don't worry! We can try to help you.

For questions related to Wikimedia Gerrit, code review, or specific patches, see the list of IRC channels and choose a relevant one. See also the Communication page for additional platforms.

For example, for questions related to MediaWiki patches, feel free to join #mediawiki.

If you have problems with jenkins-bot, feel free to contact #wikimedia-releng.

See also


References
