Code Style and Conventions#

Make your code consistent through style conventions. For interactive reading and executing code blocks, launch Binder and find b08-pystyle.ipynb, or install Python and JupyterLab locally.

Take a deep breath, take off, and look at what you have learned so far from a new perspective. After this chapter, it will be worth having another look at your old code and formatting it robustly. The style guidelines presented here go beyond visual aesthetics and aid in writing effective code.

Background and PEP#

This style guide highlights parts of the PEP 8 - Style Guide for Python Code by Guido van Rossum, Barry Warsaw, and Nick Coghlan. The full document is available at python.org and only aspects with relevance for the applications shown in this eBook are featured in this chapter.

What is PEP? PEP stands for Python Enhancement Proposal, a document in which Python developers communicate features and developments of Python. At the time of writing these lines, there are twelve (minus two) PEPs dedicated to the development of Python modules, bug fix releases, and style guides (read the full and current list of PEPs at python.org). This section features recommendations stated in PEP 8, the style guide for Python code.

Many IDEs, including PyCharm or Atom, provide auto-completion and tooltips with PEP style guidance to aid consistent programming. Thus, when your IDE underlines anything in your script, check the reason for that and consider modifying the code accordingly.

The Zen of Python#

Are we getting spiritual? Far from it. The Zen of Python is an informational PEP (20) by Tim Peters to guide programmers. It is a couple of lines summarizing good practice in coding. Python’s Easter Egg import this prints the Zen of Python in any Python interpreter:

import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Code Layout#

Maximum Line Length#

The maximum length of a line of code is 79 characters. Comments and docstrings should not exceed 72 characters per line.
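A minimal sketch of how a long statement can stay within the limit: Python joins lines implicitly inside parentheses, so a long string can be wrapped without backslashes. The variable names and the snippet checked below are hypothetical and only serve as illustration.

```python
# Adjacent string literals inside parentheses are concatenated,
# so no backslash is needed to continue the line.
long_message = (
    "This sentence would exceed the 79-character limit if it were "
    "written on a single line of code."
)

# Find the lines of a (hypothetical) code snippet that are too long.
snippet = "x = 1\n" + "y = " + "1 + " * 30 + "1\n"
too_long = [line for line in snippet.splitlines() if len(line) > 79]
print(len(too_long))  # 1
```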

Indentation#

Indentation designates the shifting of code (blocks) to the right. Indentation is necessary, for example, in loops or functions to assign code blocks to a for or def statement. The indented body of a function or class additionally forms a local namespace, where variables defined in the indented region are valid only there (loops and if conditions do not create a new namespace). Multiple levels of indentation occur when nested statements are used (e.g., an if condition nested in a for loop). One level of indentation corresponds to 4 spaces.

for i in range(1, 2):
    print("I'm one level indented.")
    if i == 1:
        print("I'm two levels indented.")
I'm one level indented.
I'm two levels indented.

Because long lines of code are bad practice, we sometimes need to use line breaks, for example, when assigning a list or calling a function. In these cases, the next, continuing line is also indented, and there are different options to indent multi-line assignments. Here, we want to use the style convention of aligning with the opening delimiter (first example) or of using a hanging indent (second example):

a_too_long_word_list = ["Do", "not", "hard-code", "something", "like", "this.",
                        "There", "are", "better", "ways."]
a_better_indented_list = [
    "Do",
    "not",
    "hard-code",
    "something",
    "like",
    "this.",
    "...",
    ]

Recall: PyCharm, Atom, and many other IDEs lay out indentation automatically.
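The same two options apply to function definitions and calls. The following sketch uses hypothetical function and argument names; it is one possible layout and not the only one that PEP 8 accepts.

```python
# Option 1: align continuation lines with the opening delimiter.
def flow_velocity(discharge, width,
                  depth):
    return discharge / (width * depth)


# Option 2: hanging indent; the extra indentation level sets the
# arguments apart from the function body.
def flow_velocity_hanging(
        discharge, width,
        depth):
    return discharge / (width * depth)


# Hanging indent in a function call with one argument per line.
velocity = flow_velocity(
    discharge=12.0,
    width=4.0,
    depth=1.5,
)
print(velocity)  # 2.0
```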

Line Breaks of Expressions with Binary Operators#

When binary operators are part of an expression that exceeds the maximum line length of 79 characters, the line break should be before the binary operators.

import pandas as pd

dummy_df = pd.get_dummies(pd.Series(['variable1', 'parameter2', 'sensor3']))
print(dummy_df.head(3))

dum_sum = (dummy_df['variable1']
           + dummy_df['parameter2']
           - dummy_df['sensor3'])
   parameter2  sensor3  variable1
0           0        0          1
1           1        0          0
2           0        1          0

Blank Lines#

To separate code blocks, hitting the Enter key many times is a very inviting option. However, the random and mood-driven use of blank lines results in unstructured code. This is why the PEP 8 authors also provide guidance on the use of blank lines:

  • Surround class definitions and top-level functions (i.e., functions where the def-line is not indented) with two blank lines.

  • Surround methods (e.g., functions within a class) with one blank line.

  • Use blank lines sparingly in all other code to indicate logical sections.

# blank 1 before top-level function
# blank 2 before top-level function
def top_level_function():
    pass
# blank 1 after top-level function
# blank 2 after top-level function
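The rule for methods can be sketched in the same way. The class and function names below are hypothetical and only illustrate where the blank lines go.

```python
import math


def circle_area(radius):
    """Top-level function: surrounded by two blank lines."""
    return math.pi * radius ** 2


class Pipe:
    """Class definition: also surrounded by two blank lines."""

    def __init__(self, diameter):
        self.diameter = diameter

    def area(self):
        # methods are separated by one blank line
        return circle_area(self.diameter / 2)


pipe = Pipe(diameter=2.0)
print(round(pipe.area(), 2))  # 3.14
```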

Blanks (Whitespaces)#

Whitespaces help to relax the code layout, but unnecessary whitespaces should be avoided, for example:

  • In parentheses, brackets or braces (no: list( e1, e2 ) vs. yes: list(e1, e2))

  • In parentheses with trailing commas (no: a_tuple = (1, ) vs. yes: a_tuple = (1,))

  • Immediately before a comma

  • Between function name and argument parentheses (no: fun (arg) vs. yes: fun(arg)) and similar for list or dictionary elements

  • Around the = sign of unannotated function parameters indicating a default value (no: def fun(arg = 0.0) vs. yes: def fun(arg=0.0))

  • Before : unless parentheses or brackets follow the : (e.g., a_dict = {a_key: a_value})

Whitespaces should be added:

  • Around any operator, boolean, or (augmented) assignment (e.g., ==, <, >, !=, <=, >=, in, not in, is, is not, and, or, not, +=, -=)

  • After colons : if a value precedes the : and no parentheses or brackets follow immediately after the : (e.g., a_dict = {a_key: a_value})
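The whitespace rules listed above can be summarized in a short sketch. All names are hypothetical, and the comments only point to the rule that each line illustrates.

```python
def scale(value, factor=2.0):
    # no blanks around "=" for a default value
    return value * factor


# no blanks directly inside brackets; one blank after each comma
values = [1, 2, 3]
# no blank between the trailing comma and the closing parenthesis
a_tuple = (1,)
# one blank after the colon, none before it
a_dict = {"a_key": values[0]}

# blanks around comparison, boolean, and augmented assignment operators
total = 0
for value in values:
    if value >= 2 and value != 3:
        total += scale(value)
print(total)  # 4.0
```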

Packages and Modules#

Imports#

Imports are at the top of the script, right after any docstrings or other module comments. Import standard libraries first, then third-party packages, and lastly locally stored (own) modules. Preferably use absolute imports (e.g., import package.module or from package import module) and avoid wildcard imports (from module import *). Give every import its own line and do not use commas to import multiple modules in one statement:

# DO:
import os
import numpy as np
# DO NOT:
import os, sys
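The order of the import groups may be sketched as follows. The third-party and local imports are commented out so that the sketch runs with a plain Python installation; my_package and my_module are hypothetical names.

```python
"""Module docstring comes first (hypothetical example module)."""

# 1. standard library
import os
import sys
from pathlib import Path

# 2. third-party packages (commented out to keep the sketch runnable)
# import numpy as np
# import pandas as pd

# 3. local modules (hypothetical names)
# from my_package import my_module

script_dir = Path(os.getcwd())
print(script_dir.is_absolute(), sys.version_info.major)
```

Separating the three groups with a blank line makes it easy to see at a glance which dependencies need to be installed.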

Naming Packages and Script#

New, custom packages or modules should have short and all-lowercase names, where underscores may be used to improve readability (discouraged for packages).

Important

Never use a minus - sign in a Python file name, because the minus sign may cause import errors.
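The reason is that the import statement expects a valid Python identifier, and a minus sign is read as the subtraction operator. A quick way to check a candidate name is the built-in string method isidentifier(); the file names in this sketch are hypothetical.

```python
# Hypothetical file names (without the .py extension)
candidate_names = ["flow_tools", "flow-tools", "2d_tools"]

for name in candidate_names:
    # str.isidentifier() tells whether a string is a valid identifier
    status = "ok" if name.isidentifier() else "not a valid module name"
    print(f"{name}: {status}")
```

Only flow_tools passes the check; the other two names contain a minus sign or start with a number.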

Comments#

Block and Inline Comments#

Block comments start with a single #, followed by a whitespace and the comment text, and are indented to the same level as the code they describe.

Inline comments follow a statement on the same line and are separated from it by at least two whitespaces. However, the usage of inline comments is discouraged (i.e., use them sparingly).
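A brief sketch of both comment types, using hypothetical variable names:

```python
water_depth = 1.5  # in meters (inline comment after two blanks)
flow_width = 4.0

# Block comments describe the code that follows and are indented
# to the same level as that code.
if water_depth > 0:
    # Each line starts with "#" followed by one blank.
    wetted_area = water_depth * flow_width

print(wetted_area)  # 6.0
```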

Docstrings#

Docstrings are short text descriptions within a module, function, class, or method with specifications of arguments, usage, and output. The __doc__ attribute of any object, such as an instance of a standard type or a class method, holds the object’s docstring, which can be printed. For example:

a_list = [1, 2]
print(a_list.__doc__)
Built-in mutable sequence.

If no argument is given, the constructor creates a new empty list.
The argument must be an iterable if specified.

When writing a Python function, docstrings are introduced immediately after the def ... line with triple double quotes ("""):

def let_there_be_light(*args, **kwargs):
    """
    Bright function accepting any input argument with indifferent behavior.
    :param an_input_argument: STR or anything else
    :param another_input_argument: FLOAT or anything else
    :return: True (in all cases)
    """
    print("Sunrise")
    return True

print(let_there_be_light.__doc__)
    Bright function accepting any input argument with indifferent behavior.
    :param an_input_argument: STR or anything else
    :param another_input_argument: FLOAT or anything else
    :return: True (in all cases)
    

Note that the recommendations on docstrings are provided with PEP 257 rather than PEP 8.
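As a sketch of a docstring that matches the function signature, the hypothetical function below opens with a one-line summary, followed by a blank line and the parameter descriptions. The standard library function inspect.getdoc() returns the docstring with the leading indentation removed.

```python
import inspect


def froude_number(velocity, depth, gravity=9.81):
    """Return the Froude number of an open-channel flow.

    :param velocity: FLOAT of the flow velocity in m/s
    :param depth: FLOAT of the water depth in m
    :param gravity: FLOAT of gravitational acceleration in m/s2
    :return: FLOAT of the dimensionless Froude number
    """
    return velocity / (gravity * depth) ** 0.5


# print only the summary line of the cleaned docstring
print(inspect.getdoc(froude_number).splitlines()[0])
```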

Name Conventions#

Definition of Name Styles#

The naming conventions use the following styles (source: python.org):

  • b (single lowercase letter)

  • B (single uppercase letter)

  • lowercase

  • lower_case_with_underscores

  • UPPERCASE

  • UPPER_CASE_WITH_UNDERSCORES

  • CamelCase or CapWords or CapitalizedWords or StudlyCaps.
    Note: When using acronyms in CapWords, capitalize all the letters of the acronym (e.g., HTTPResponse is better than HttpResponse).

  • mixedCase (differs from CapitalizedWords by initial lowercase character!)

  • Capitalized_Words_With_Underscores (deprecated)

Some variable name formats trigger a particular behavior of Python:

  • _single_leading_underscore variables indicate weak internal use and will not be imported with from module import *

  • __double_leading_underscore variables invoke name mangling in classes (e.g., a method called __dlu of the class MyClass will be mangled into _MyClass__dlu)

  • __double_leading_and_trailing_underscore__ variables are magic objects or attributes in user-controlled namespaces (e.g., __init__ or __call__ in classes)
    Only use documented magic attributes and never invent them. Read more about magic methods in the chapter on Python classes.

  • single_trailing_underscore_ variables are used to avoid conflicts with Python keywords (e.g., MyClass(class_='AnotherClass'))
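Name mangling can be observed directly. The sketch below uses an attribute instead of a method, but the mechanism is the same; the class and attribute names follow the example in the list above.

```python
class MyClass:
    def __init__(self):
        self._internal = "weak internal use"
        self.__dlu = "mangled"

    def reveal(self):
        # inside the class, the short name still works
        return self.__dlu


obj = MyClass()
print(obj.reveal())           # mangled
print(obj._MyClass__dlu)      # mangled
print(hasattr(obj, "__dlu"))  # False
```

Mangling does not make the attribute private. It only makes accidental name clashes in subclasses less likely.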

Object Names#

Use the above-defined styles for naming Python items as follows:

  • Classes: CamelCase (CapWords) letters only such as MyClass

  • Constants: UPPERCASE letters only, where underscores may improve readability (e.g., define water density at the module level as RHO = 1000)

  • Exceptions: CamelCase (CapWords) letters only; exceptions should be classes, typically with the suffix Error (e.g., TypeError)

  • Functions: lowercase letters only, where underscores may improve readability; sometimes mixedCase applies to ensure backward compatibility of prevailing styles

  • Methods (class function, non-public): _lowercase letters only with a leading underscore, where underscores may improve readability

  • Methods (class function, public): lowercase letters only, where underscores may improve readability

  • Modules: lowercase letters only, where underscores may improve readability

  • Packages: lowercase letters only, where underscores are discouraged

  • Variables: lowercase letters only, where underscores may improve readability

  • Variables (global): lowercase letters only, where underscores may improve readability; note that “global” usage should be limited to one module only.

Important

Never start a variable name with a number. Do use array_2d, but do not use 2d_array.
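The following sketch combines several of the naming conventions in one place. All names are hypothetical and merely illustrate the styles listed above.

```python
RHO_WATER = 1000.0  # constant at the module level, in kg/m3


class NegativeDepthError(ValueError):
    """Exception: CapWords with the suffix Error."""


class WaterColumn:
    """Class: CapWords."""

    def __init__(self, depth):
        if depth < 0:
            raise NegativeDepthError("depth must not be negative")
        self.depth = depth

    def _unit_weight(self):
        # non-public method: leading underscore
        return RHO_WATER * 9.81

    def bottom_pressure(self):
        # public method: lowercase with underscores
        return self._unit_weight() * self.depth


# variable: lowercase with underscores, not starting with a number
column_2d = WaterColumn(depth=2.0)
print(round(column_2d.bottom_pressure()))  # 19620
```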

More Code Style Recommendations#

To ensure code compatibility and program efficiency, the PEP 8 style guide provides other general recommendations (read more in the Python docs):

  • When defining a function, prefer def statements over lambda expressions, which are reasonable for one-time usage only

  • When exceptions are expected, use try - except clauses (see the errors and exceptions section)

  • Ensure that methods and functions return objects consistently, for example:

import numpy as np


def a_function_with_return(x):
    if x > 0:
        return np.sqrt(x)
    else:
        return np.nan
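The first two recommendations of the list can be sketched in a similar manner. The function names are hypothetical, and the standard library module math replaces numpy here only to keep the sketch free of third-party dependencies.

```python
import math


# Preferred: a def statement gives the function a name and a docstring.
def double(x):
    """Return twice the input."""
    return 2 * x


# A lambda is reasonable for one-time use, e.g., as a sort key.
depths = [1.2, 0.4, 2.5]
sorted_desc = sorted(depths, key=lambda depth: -depth)


def safe_sqrt(x):
    """Return the square root of x, or nan if x is not a valid input."""
    try:
        return math.sqrt(x)
    except (TypeError, ValueError):
        # every branch returns a float
        return math.nan


print(double(2), sorted_desc, safe_sqrt(4), safe_sqrt(-1))
```

Catching only TypeError and ValueError keeps other, unexpected errors visible instead of silencing them.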

Learning Success Check-up#

Take the learning success test for this Jupyter notebook.