Top 9 Nooby Python Habits

Top habits that a Python developer should avoid, whether you are a beginner looking to improve or a more experienced developer who just wants to catch any bad habits you still hold on to.

Not using Context Managers

Context managers allow you to allocate and release resources precisely when you want to. So whenever you need to manage resources such as files, sockets or database connections, don’t rely on if statements or try/finally blocks to close them. Just use a context manager. Let’s check out a typical nooby behavior and how we can rewrite it like a pro.

def close_file_manually(filename):
  f = open(filename, "w")
  f.write("Something here\n")
  f.close()

The code above is one of the most common patterns I’ve seen from beginners. If f.write() raises an exception, f.close() is never reached and the file stays open. Instead of explicitly closing the file, you can rewrite the code as follows:

def handle_file_like_a_pro(filename):
  with open(filename, "w") as f:
    f.write("Something here\n")

We leave the responsibility of closing the file to the context manager. There is no need to close the file explicitly; we just put our code inside the with block, and the file is closed when the block exits, even if an exception is raised. Let’s examine another common piece of code that can benefit from a context manager:

import socket

def send_message(host, port):
  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  try:
    s.connect((host, port))
    s.sendall(b'Hello World!')
  finally:
    s.close()

In Python, most resources that need to be closed come with a context manager, and you can even create your own custom context manager if needed. Rewriting the code above:

def send_message(host, port):
  with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((host, port))
    s.sendall(b'Hello World!')
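
A custom context manager can be sketched with contextlib.contextmanager from the standard library. This is only a minimal illustration: FakeConnection and open_connection are hypothetical names invented for this example, standing in for any resource that must be closed.

```python
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for a resource that must be closed."""

    def __init__(self):
        self.closed = False
        self.sent = []

    def send(self, message):
        self.sent.append(message)

    def close(self):
        self.closed = True


@contextmanager
def open_connection():
    """Hypothetical helper: hand out a connection and always close it."""
    conn = FakeConnection()
    try:
        yield conn      # the body of the with block runs here
    finally:
        conn.close()    # runs even if the body raises an exception


with open_connection() as conn:
    conn.send("Hello World!")

print(conn.closed)  # True
```

The try/finally lives in one place, inside the context manager, so the calling code never has to remember to close anything.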

Avoid default mutable arguments

Python, like many other programming languages, lets you define default arguments when defining a function or a method. For instance, the following function raises x to a given power, which by default is 2 (squaring).

def power(x, power=2):
  return x**power

This is the Pythonic way of implementing functions and methods that fall back to a default value when no argument is given. However, it should be avoided for mutable types such as lists or dictionaries. Default arguments are evaluated once, when the function is defined, not each time it is called. So a mutable default is not recreated on every call; instead, every call that omits the argument reuses the same object (for example, the same list).

Let’s examine the following code:

def create_or_append(n, lst=[]):
  lst.append(n)
  return lst

lst1 = create_or_append(10)
lst2 = create_or_append(9)
lst3 = create_or_append(5, lst2)

Suppose that, for whatever reason, you need a function that appends a new element to a copy of a list, or creates a new list if none was passed. If you run the code above, the value of lst1, lst2 and lst3 is [10, 9, 5]. All three variables point to the same list. Try figuring out why this happened!

Instead, if you need to deal with default mutable arguments, you could code something like this:

def create_or_append(n, lst=None):
  if lst is None:
    lst = []
  else:
    lst = lst.copy()
  lst.append(n)
  return lst

lst1 = create_or_append(10)
lst2 = create_or_append(9)
lst3 = create_or_append(5, lst2)
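
To see the difference side by side, here is a small sketch that runs both versions. The names append_shared and append_fresh are made up for this comparison so that both functions can live in the same file; they mirror the two create_or_append versions above.

```python
def append_shared(n, lst=[]):     # buggy: the default list is created only once
    lst.append(n)
    return lst


def append_fresh(n, lst=None):    # fixed: a new list (or a copy) on every call
    if lst is None:
        lst = []
    else:
        lst = lst.copy()
    lst.append(n)
    return lst


shared1 = append_shared(10)
shared2 = append_shared(9)
print(shared1 is shared2)           # True
print(append_shared.__defaults__)   # ([10, 9],)

fresh1 = append_fresh(10)
fresh2 = append_fresh(9)
fresh3 = append_fresh(5, fresh2)
print(fresh1, fresh2, fresh3)       # [10] [9] [9, 5]
```

The __defaults__ attribute shows where the shared list lives: it is stored on the function object itself, which is why it survives between calls.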

Know when to use comprehension

I believe that one of the coolest features in Python is the ability to create a complex list using just one line of code, aka a list comprehension. Check out an example of how to create a list of just the even numbers from 2 to 1000 using this powerful feature:

evens = [x for x in range(2, 1001) if x % 2 == 0]

Cool, huh? You can do the same for dictionaries and sets, and generator expressions use the same syntax (there is no tuple comprehension, but you can pass a generator expression to tuple()). Knowing how to use them is fundamental to writing cleaner code. However, if the expression is getting too complex and long, you should stop and rewrite the code using a traditional block of code. For instance, the following list comprehension is considered a nightmare and should never be used in your code. Whenever legibility is being compromised, you should rethink what you are doing.

lst = [[1,2,3],[-1,-5,6]]
flatten = [item
           for sublist in lst
           for item in sublist
           if item > 0]

Instead, a cleaner version looks like this:

lst = [[1,2,3],[-1,-5,6]]

flatten = []

for sublist in lst:
  for item in sublist:
    if item > 0:
      flatten.append(item)

Much better!
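
For reference, the other comprehension forms mentioned earlier can be sketched like this. The words list is just sample data made up for the example.

```python
words = ["apple", "kiwi", "banana", "kiwi"]   # sample data for illustration

lengths = {word: len(word) for word in words}    # dict comprehension
unique = {word for word in words}                # set comprehension
total = sum(len(word) for word in words)         # generator expression
as_tuple = tuple(len(word) for word in words)    # tuple built from a generator

print(lengths)    # {'apple': 5, 'kiwi': 4, 'banana': 6}
print(total)      # 19
print(as_tuple)   # (5, 4, 6, 4)
```

Each of these stays on one short line, which is about the level of complexity where a comprehension still helps readability.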

Equality vs Identity

Beginners can get a bit confused about the difference between the == (equality) and is (identity) operators. I used to think that they were the same thing and switched between them without actually knowing what each one was for. In most cases, beginners end up giving up on the identity operator is and sticking with the equality operator ==.

Equality is about whether two objects represent essentially the same value. For instance, if A=2 and B=2, then A==B -> True. Identity, on the other hand, is all about whether two names point to the same object; no value comparison is done. For instance, if A=[] and B=A, then A is B -> True.

In most cases you will use the equality operator; it is rare to need the identity operator unless you are doing some specific low-level programming. However, for singletons such as None, True and False you should use the identity operator instead (for True and False, a plain truthiness test such as if flag: is usually all you need). It is clearer and demonstrates that you really know what you are doing.
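
Here is a short sketch of the difference in practice. The describe function is a hypothetical example, added only to show why checking for None with is matters when a value like 0 is valid.

```python
a = [1, 2, 3]
b = a            # b points to the same object as a
c = [1, 2, 3]    # same value, but a different object

print(a == c)    # True
print(a is c)    # False
print(a is b)    # True


def describe(value=None):
    """Hypothetical helper: tell a missing argument apart from a falsy one."""
    if value is None:
        return "nothing passed"
    return f"got {value!r}"


print(describe())    # nothing passed
print(describe(0))   # got 0
```

Had describe used if not value: instead, the call describe(0) would have been treated as if nothing was passed.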

Improve Iteration Patterns

Most beginners abuse the len-range pattern to iterate over a collection. Check out the following code:

b = [1, 2, 3]
for i in range(len(b)):
  v = b[i]
  ...
  ...

There’s nothing wrong with the code; it will work as expected, but it is not Pythonic. Here we are using the index only to access the object within the list, and nothing more! Instead, you can rewrite this as follows:

b = [1, 2, 3]
for v in b:
  ...
  ...

The rule of thumb is to avoid the len-range pattern. Even if you need the index, you can use enumerate instead. For example:

b = [1, 2, 3]

for index, value in enumerate(b):
  ...
  ...

If you want the index only to access the values of different lists that you know have the same length, you can use zip instead. Suppose you want to add the values of three lists at the same index. Normally a beginner would think of coding something like:

a = [1, 2, 3]
b = [10, 15, 20]
c = [2, 6, 9]
result = []
for index in range(len(a)):
  result.append(a[index] + b[index] + c[index])

Instead you should rewrite this and use zip:

a = [1, 2, 3]
b = [10, 15, 20]
c = [2, 6, 9]
result = []
for av, bv, cv in zip(a, b, c):
  result.append(av+bv+cv)

Even if you still need the index for some more complex reason, you can combine enumerate and zip:

for index, (av, bv, cv) in enumerate(zip(a, b, c)):
  # do something
  ...
  ...
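
Putting the pieces together, a runnable sketch of these patterns could look like the following. It reuses the lists from the examples above; the start=1 argument and the row label are just illustrative choices.

```python
a = [1, 2, 3]
b = [10, 15, 20]
c = [2, 6, 9]

# zip combined with a list comprehension replaces the explicit append loop
result = [av + bv + cv for av, bv, cv in zip(a, b, c)]
print(result)   # [13, 23, 32]

# enumerate accepts a start value if you want to count from 1
for index, (av, bv, cv) in enumerate(zip(a, b, c), start=1):
    print(f"row {index}: {av} + {bv} + {cv} = {av + bv + cv}")

# zip stops at the shortest input, so extra items are silently dropped
pairs = list(zip([1, 2, 3], ["x", "y"]))
print(pairs)    # [(1, 'x'), (2, 'y')]
```

The last lines are the reason the lists should have the same length: zip will not complain if they don't.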

Using print for debugging or logging

This is natural: every Python programmer starts out using print() to debug or log events. Eventually, though, you should leave this habit behind and start using the logging module from the standard library.
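
As a starting point, a minimal logging setup could look like the sketch below. The logger name "shop" and the apply_discount function are hypothetical, invented only to have something to log.

```python
import logging

# Configure the root logger once, near the start of the program
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("shop")   # hypothetical logger name


def apply_discount(price, percent):
    """Hypothetical function that logs instead of printing."""
    logger.debug("apply_discount(price=%s, percent=%s)", price, percent)
    if not 0 <= percent <= 100:
        logger.warning("percent %s is out of range, price left unchanged", percent)
        return price
    return price * (100 - percent) / 100


print(apply_discount(200, 25))    # 150.0
print(apply_discount(200, 150))   # 200
```

Unlike print(), each message carries a level, so you can silence the debug output in production by changing level to logging.WARNING without touching the rest of the code.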

Not learning how to package

Many beginners don’t bother learning how to package a Python project. They often create modules and start importing them right away, depending solely on the system path. That works most of the time, but sometimes you need to use modules that are not in your current working directory, which can lead to import problems in your project. For that reason, it is fundamental to learn how packaging is done.

Not using virtual environment

So you have started a new project and you’re still searching for frameworks and libraries that you can use. You start installing a bunch of them, and your Python environment is now full of unnecessary libraries. Then you want to generate a requirements.txt, but you notice that packages used in other projects are listed as well. Moreover, you need to pin a specific package version for your new project, but another project needs a newer version of the same package. What a mess, huh?

Virtual environments to the rescue! Don’t start a project without creating a virtual environment. This should become a habit for you. I recommend starting with virtualenvwrapper since it is really easy to configure and use.
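
If you would rather not install anything extra, the standard library ships its own venv module. Note that this is a different tool from virtualenvwrapper, shown here only as a rough sketch. The create_environment helper and the ".venv" folder name are assumptions made for this example.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(project_dir):
    """Hypothetical helper: create a virtual environment in <project_dir>/.venv."""
    env_dir = Path(project_dir) / ".venv"   # ".venv" is just a common naming choice
    # with_pip=False keeps this sketch fast; pass with_pip=True to bootstrap pip
    venv.create(env_dir, with_pip=False)
    return env_dir


# A throwaway folder stands in for a real project directory
with tempfile.TemporaryDirectory() as project_dir:
    env_dir = create_environment(project_dir)
    config_exists = (env_dir / "pyvenv.cfg").exists()
    print(config_exists)   # True
```

In day-to-day work you would normally get the same result from the command line with python -m venv .venv and then activate the environment before installing anything.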

Do your unit tests

Please help DevOps and SysOps with this. It is really frustrating to see a production environment break because some components of the software were not tested beforehand. Every method, function, component, class and even module should have its unit tests.
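
As a small illustration, here is what a unit test for the power function from earlier could look like with the standard unittest module. The test names and the cases chosen are only a sketch, not a complete test suite.

```python
import unittest


def power(x, power=2):
    return x ** power


class PowerTests(unittest.TestCase):
    def test_defaults_to_squaring(self):
        self.assertEqual(power(3), 9)

    def test_explicit_exponent(self):
        self.assertEqual(power(2, power=10), 1024)

    def test_zero_exponent(self):
        self.assertEqual(power(5, 0), 1)


# Run the tests programmatically so the example works when pasted into a script
suite = unittest.defaultTestLoader.loadTestsFromTestCase(PowerTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())   # True
```

In a real project the tests would usually live in their own file and be run with python -m unittest, which is also what a continuous integration job would call.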

Moreover, don’t be the type of developer who waits until tons of features are written to push all that code at once. Learn about continuous integration and why it plays a great role in software development.