Code Styles and Conventions¶
Summary: Make your code consistent through style conventions.
The html version of this notebook is hosted at https://hydro-informatics.github.io/hypy_pystyle.html.
Take a deep breath, step back, and look at what you have learned so far from a new perspective. After this chapter, it is worth revisiting old code and reformatting it consistently. The style guidelines presented here go far beyond visual aesthetics and help to write effective code.
Background and PEP¶
This style guide highlights parts of the PEP 8 - Style Guide for Python Code by Guido van Rossum, Barry Warsaw and Nick Coghlan. The full document is available at python.org and only aspects with relevance for the applications shown at hydro-informatics.github.io are featured on this page.
So what is PEP? PEP stands for Python Enhancement Proposals, in which Python developers communicate features and developments of Python. At the time of writing these lines, there are twelve (minus two) PEPs dedicated to the development of Python modules, bug fix releases, and also style guides (read the full and current list of PEPs at python.org). Here, we will use recommendations of PEP 8, the style guide for Python code.
Many IDEs, including PyCharm, provide auto-completion and tooltips with PEP style guidance to aid consistent programming. So if PyCharm underlines anything in your script, check the reason for that and consider modifying the code accordingly.
The Zen of Python¶
Are we getting spiritual now? Far from it. The Zen of Python is an informational PEP (20) by Tim Peters to guide programmers. It is a couple of lines summarizing good practice in coding. The Easter egg import this prints the Zen of Python in any Python interpreter:
import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Code layout¶
Maximum line length¶
The maximum length of a code line is 79 characters. Comments and docstrings should not exceed 72 characters per line.
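A minimal sketch of how a long string can be kept below the line limit: adjacent string literals inside parentheses are joined by Python, so no backslash is needed. The variable names and the small length check are assumptions made for illustration only.

```python
# Implicit line continuation inside parentheses keeps every line below
# 79 characters. Adjacent string literals are concatenated by Python.
discharge_message = (
    "The measured discharge exceeds the design discharge, "
    "so the flood protection measures should be re-evaluated."
)

# Check the longest line of a (hypothetical) code snippet against the
# limit of 79 characters.
snippet = 'x = 1\nprint("a short line")'
longest_line = max(len(line) for line in snippet.splitlines())
within_limit = longest_line <= 79
print(within_limit)
```

Most IDEs and linters perform such a length check automatically, so the manual check above only serves to make the rule tangible.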
Indentation¶
Indentation designates the shifting of code (blocks) to the right. Indentation is necessary, for example, in loops or functions to assign code blocks to a for or def statement. Multiple levels of indentation occur when nested statements are used (e.g., an if condition nested in a for loop). One level of indentation corresponds to 4 spaces.
for i in range(1, 2):
    print("I'm one level indented.")
    if i == 1:
        print("I'm two levels indented.")
I'm one level indented.
I'm two levels indented.
Because long lines of code are bad practice, we sometimes need line breaks, for example, when assigning a list or calling a function. In these cases, the continuation line is also indented and there are different options to indent multi-line assignments. Here, we align the continuation line with the opening delimiter:
a_too_long_list = ["Do", "not", "hard-code", "something", "like", "this.",
                   "There", "are", "better", "ways."]
Recall that PyCharm and many other IDEs lay out indentation automatically.
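As a hedged illustration of the "different options" mentioned above, the following sketch contrasts alignment with the opening delimiter and a hanging indent, which PEP 8 also accepts. The list names and values are made up for this example.

```python
# Option 1: align continuation lines with the opening delimiter.
flow_depths = [0.5, 0.8, 1.2,
               1.7, 2.3, 3.0]

# Option 2: hanging indent, i.e., nothing follows the opening delimiter
# and the continuation lines are indented by one level (4 spaces).
flow_velocities = [
    0.3, 0.6, 0.9,
    1.1, 1.4, 1.8,
]

# Both layouts create ordinary lists with six elements each.
print(len(flow_depths), len(flow_velocities))
```

Whichever option is chosen, using it consistently within a project matters more than the choice itself.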
Line breaks of expressions with binary operators¶
When binary operators are part of an expression that exceeds the maximum line length of 79 characters, the line break should be before binary operators.
import pandas as pd

# dtype=int keeps the dummy columns numeric so that + and - work
dummy_df = pd.get_dummies(pd.Series(['variable1', 'parameter2', 'sensor3']),
                          dtype=int)
dum_sum = (dummy_df['variable1']
           + dummy_df['parameter2']
           - dummy_df['sensor3'])
Blank lines¶
To separate code blocks, hitting the Enter key many times is a very inviting option. However, the random and mood-driven use of blank lines results in unstructured code. This is why the PEP 8 authors also provide guidance on the use of blank lines:
Surround class definitions and top-level functions (i.e., functions where the def line is not indented) with two blank lines.
Surround methods (e.g., functions within a class) with one blank line.
Use blank lines sparingly in all other code to indicate logical sections.
# blank 1 before top-level function
# blank 2 before top-level function
def top_level_function():
    pass
# blank 1 after top-level function
# blank 2 after top-level function
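The blank-line rules can be sketched with real blank lines instead of placeholder comments. The class Pipe and the function describe are hypothetical names invented for this illustration.

```python
import math


class Pipe:
    """Hypothetical example class for illustrating blank lines."""

    def __init__(self, diameter):
        self.diameter = diameter

    def area(self):
        return math.pi * self.diameter ** 2 / 4


def describe(pipe):
    # a single blank line may separate logical sections in a function
    area = pipe.area()

    return "Pipe with a cross-section of {:.2f} m2".format(area)


print(describe(Pipe(1.0)))
```

Two blank lines surround the class and the top-level function, while one blank line separates the methods inside the class.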
Blanks (whitespaces)¶
Whitespace relaxes the code layout, but superfluous whitespace should be avoided, for example:
in parentheses, brackets or braces (no: list( e1, e2 ) vs. yes: list(e1, e2))
in parentheses with trailing commas (no: a_tuple = (1, ) vs. yes: a_tuple = (1,))
immediately before any comma
between function name and argument parentheses (no: fun (arg) vs. yes: fun(arg)) and similar for list or dictionary elements
around the = sign of unannotated function parameters indicating a default value (no: def fun(arg = 0.0) vs. yes: def fun(arg=0.0))
before : unless parentheses or brackets follow the : (e.g., a_dict = {a_key: a_value})
Whitespaces should be added:
around any operator, boolean, or (augmented) assignment (e.g., ==, <, >, !=, <>, <=, >=, in, not in, is, is not, and, or, not, +=, -=)
after colons : if a value antecedes the : and no parentheses or brackets follow immediately after the : (e.g., a_dict = {a_key: a_value})
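The whitespace rules above can be summarized in a short sketch. All names and numbers are assumptions chosen for illustration and have no meaning beyond this example.

```python
# No spaces around = for a default value of an unannotated parameter,
# but spaces around the binary operator *.
def scale(value, factor=2.0):
    return value * factor


# No spaces directly inside brackets or braces; one space after each
# comma and after each colon, none before them.
depths = [0.4, 1.1, 2.6]
limits = {"low": 0.5, "high": 2.0}

# No space between the trailing comma and the closing parenthesis.
a_tuple = (1,)

# Spaces around comparisons, booleans, and augmented assignments.
count = 0
for depth in depths:
    if depth >= limits["low"] and depth < limits["high"]:
        count += 1

print(scale(depths[0]), count, a_tuple)
```

Only one of the three depths lies between the two limits, so count ends up being 1.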
Packages and modules¶
Imports¶
Imports are at the top of the script, right after any docstrings or other module comments. Import standard libraries first, then third-party packages, and lastly locally stored (own) modules. Preferably use absolute imports (e.g., import package.module or from package import module) and avoid wildcard imports (from module import *). Every import should have its own line, so avoid using the comma sign for multiple imports:
# DO:
import os
import numpy as np
# DO NOT:
import os, sys
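A minimal sketch of the recommended import order follows. To keep it runnable without extra installations, the third-party and local groups are only indicated as comments, and my_flow_tools is a hypothetical module name.

```python
# Group 1: standard library, one module per line. Importing several
# names from ONE module on a single line is fine.
import math
import os
from os.path import basename, splitext

# Group 2: third-party packages follow after one blank line, for
# example (kept as a comment so the sketch runs without extras):
# import numpy as np

# Group 3: local modules come last (hypothetical module name):
# import my_flow_tools

file_stem = splitext(basename(os.path.join("data", "flows.csv")))[0]
print(file_stem, math.floor(2.7))
```

The absolute import from os.path makes it obvious where basename and splitext come from, which a wildcard import would hide.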
Name convention¶
New, custom packages or modules should have short and all-lowercase names, where underscores may be used to improve readability (discouraged for packages).
Comments¶
Block and inline comments¶
Block comments start with a single # at the first place of a line (at the indentation level of the code they describe), followed by a whitespace and the comment text.
Inline comments follow a statement on the same line and are separated from it by at least two whitespaces. The usage of inline comments is discouraged (i.e., do not use them or use them sparingly).
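The two comment types can be sketched as follows. The channel dimensions are illustrative numbers without any reference to a real case.

```python
# Block comment: starts with a hash sign and one space, and describes
# the code that follows. Here, the hydraulic radius of a rectangular
# channel is computed.
width = 5.0
depth = 1.2
area = width * depth
wetted_perimeter = width + 2 * depth
hydraulic_radius = area / wetted_perimeter

slope = 0.001  # inline comment: two spaces before the hash sign
print(round(hydraulic_radius, 3))
```

The single inline comment is acceptable here because it adds information that the variable name alone does not carry.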
Docstrings¶
Docstrings are short text descriptions within a module, function, class or method with specifications of arguments, usage and output. The __doc__ attribute of any object, such as an instance of a standard type or a class method, holds the object’s docstring, which can be printed. For example:
a_list = [1, 2]
print(a_list.__doc__)
Built-in mutable sequence.
If no argument is given, the constructor creates a new empty list.
The argument must be an iterable if specified.
When writing a Python function, docstrings are introduced immediately after the def ... line with triple double quotes:
def let_there_be_light(*args, **kwargs):
    """
    Bright function accepting any input argument with indifferent behavior.
    :param an_input_argument: STR or anything else
    :param another_input_argument: FLOAT or anything else
    :return: True (in all cases)
    """
    print("Sunrise")
    return True


print(let_there_be_light.__doc__)
Bright function accepting any input argument with indifferent behavior.
:param an_input_argument: STR or anything else
:param another_input_argument: FLOAT or anything else
:return: True (in all cases)
Note that the recommendations on docstrings are provided with PEP 257 rather than PEP 8.
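As a complementary sketch, the following function documents its actual parameters and starts the summary on the same line as the opening quotes, as PEP 257 suggests. The function froude_number is a hypothetical example, and the :param: notation simply continues the style used above.

```python
import inspect


def froude_number(velocity, depth, gravity=9.81):
    """Calculate the Froude number of open channel flow.

    :param velocity: FLOAT of flow velocity in m/s
    :param depth: FLOAT of water depth in m
    :param gravity: FLOAT of gravitational acceleration in m/s2
    :return: FLOAT of the dimensionless Froude number
    """
    return velocity / (gravity * depth) ** 0.5


# inspect.getdoc returns the docstring with cleaned-up indentation
print(inspect.getdoc(froude_number).splitlines()[0])
```

The built-in help(froude_number) shows the same text together with the function signature.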
Name conventions¶
Definition of name styles¶
The naming conventions use the following styles (source: python.org):
b (single lowercase letter)
B (single uppercase letter)
lowercase
lower_case_with_underscores
UPPERCASE
UPPER_CASE_WITH_UNDERSCORES
CamelCase or CapWords or CapitalizedWords or StudlyCaps. Note: When using acronyms in CapWords, capitalize all the letters of the acronym (e.g., HTTPResponse is better than HttpResponse).
mixedCase (differs from CapitalizedWords by initial lowercase character!)
Capitalized_Words_With_Underscores (discouraged)
Some variable name formats imply a particular behavior of Python:
_single_leading_underscore variables indicate weak internal use and will not be imported with from module import *
__double_leading_underscore variables invoke name mangling in classes (e.g., a method called __dlu within the class MyClass will be mangled into _MyClass__dlu)
__double_leading_and_trailing_underscore__ variables are magic objects or attributes in user-controlled namespaces (e.g., __init__ or __call__ in classes). Only use documented magic objects or attributes and never invent them. Read more about magic methods on the page on Python classes.
single_trailing_underscore_ variables are used to avoid conflicts with Python keywords (e.g., MyClass(class_='AnotherClass'))
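Name mangling and the trailing underscore can be tried out with a small sketch. MyClass and __dlu follow the example names used above, while call_dlu and make are hypothetical helpers added for illustration.

```python
class MyClass:
    def __dlu(self):
        # two leading underscores: the name is mangled to _MyClass__dlu
        return "name-mangled"

    def call_dlu(self):
        # inside the class, the short name still works
        return self.__dlu()


def make(class_="AnotherClass"):
    # the trailing underscore avoids a clash with the keyword class
    return class_


an_object = MyClass()
print(an_object.call_dlu())
print(an_object._MyClass__dlu())
print(hasattr(an_object, "__dlu"))
print(make())
```

Outside of the class, the method is only reachable through its mangled name, which is why the hasattr check for "__dlu" yields False.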
Object names¶
Use the following styles for naming:
Classes: CamelCase (CapWords) letters only such as MyClass
Constants: UPPERCASE letters only, where underscores may improve readability (e.g., use at a module level for example to assign water density RHO = 1000)
Exceptions: CamelCase (CapWords) letters only (exceptions should be classes and typically use the suffix Error, e.g., TypeError)
Functions: lowercase letters only, where underscores may improve readability; sometimes mixedCase applies to ensure backwards compatibility of prevailing styles
Methods (class function, non-public): _lowercase letters only with leading underscore, where underscores may improve readability
Methods (class function, public): lowercase letters only, where underscores may improve readability
Modules: lowercase letters only, where underscores may improve readability
Packages: lowercase letters only, where underscores are discouraged
Variables: lowercase letters only, where underscores may improve readability
Variables (global): lowercase letters only, where underscores may improve readability; note that “global” should limit to variable usage within a module only
Important: Never start a variable name with a number. Do use array_2d, but do not use 2d_array.
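The naming styles can be seen side by side in the following sketch. Apart from RHO, which is taken from the list above, all names (FlowError, WaterColumn, print_pressure) are hypothetical and only serve as illustration.

```python
# Constant (module level): water density in kg/m3
RHO = 1000


class FlowError(ValueError):
    """Exception class: CapWords with the suffix Error."""


class WaterColumn:
    """Class: CapWords."""

    def __init__(self, depth):
        self.depth = depth

    def _check_depth(self):
        # non-public method: lowercase with a leading underscore
        if self.depth < 0:
            raise FlowError("The depth must not be negative.")

    def hydrostatic_pressure(self, gravity=9.81):
        # public method: lowercase with underscores
        self._check_depth()
        return RHO * gravity * self.depth


def print_pressure(water_column):
    # function and variable names: lowercase with underscores
    pressure_pa = water_column.hydrostatic_pressure()
    print("Pressure: {:.0f} Pa".format(pressure_pa))


print_pressure(WaterColumn(2.0))
```

A negative depth, such as WaterColumn(-1.0), would raise the custom FlowError when the pressure is requested.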
More code style recommendations¶
In order to ensure code compatibility and program efficiency, the PEP 8 style guide provides some general recommendations:
Prefer is or is not over equality operators when comparing to singletons such as None
Prefer is not over not ... is expressions
When defining a function, prefer def statements over lambda expressions, which are only reasonable for one-time usage
When exceptions are expected, use try-except clauses (see errors and exceptions)
Ensure that methods and functions return objects consistently, for example:
import numpy as np


def a_function_with_return(x):
    if x > 0:
        return np.sqrt(x)
    else:
        return None
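The recommendations can be combined in one hedged sketch that relies on the standard library only. The functions safe_sqrt and to_float are hypothetical names invented for this example.

```python
import math


def safe_sqrt(x):
    # every branch returns explicitly: either a float or None
    if x > 0:
        return math.sqrt(x)
    else:
        return None


def to_float(text):
    # an expected exception is handled with a try-except clause
    try:
        return float(text)
    except ValueError:
        return None


result = safe_sqrt(-4)
# Compare to the singleton None with "is not" rather than with "!=",
# and write "is not" rather than "not ... is".
if result is not None:
    print(result)
else:
    print("no real square root")

print(to_float("3.5"), to_float("n/a"))
```

Because both functions return None explicitly in the failure case, calling code can rely on a single is None check.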