I would like to explain why a function's output type should stay as constant as possible, regardless of the input values, by implementing a similar function in two different ways.
If we need a function that returns the multiples of three from a given list of integers, it can be written simply like this:
from typing import List

def extract_multiples_of_three(values: List[int]) -> List[int]:
    return [v for v in values if v % 3 == 0]
and it works like this:
>>> extract_multiples_of_three(values=[9, 5, 2, 0, -6, 9])
[9, 0, -6, 9]
>>> extract_multiples_of_three(values=[9, 5, 2])
[9]
This implementation returns an empty list [] if the given list does not contain any multiples of three.
>>> extract_multiples_of_three(values=[5, 2])
[]
What happens if we re-implement extract_multiples_of_three() to return an integer when the given list contains only one multiple of three, and to return None when it contains none?
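from typing import List, Optional, Union

def extract_multiples_of_three(values: List[int]) -> Optional[Union[int, List[int]]]:
    out = [v for v in values if v % 3 == 0]
    if len(out) == 0:
        # No match: return None instead of an empty list.
        return None
    elif len(out) == 1:
        # Exactly one match: return the bare integer.
        return out[0]
    else:
        # Two or more matches: return the list.
        return out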
>>> extract_multiples_of_three(values=[9, 5, 2, 0, -6, 9])
[9, 0, -6, 9]
>>> extract_multiples_of_three(values=[9, 5, 2])
9
>>> extract_multiples_of_three(values=[5, 2])
None
As the type annotation of the second implementation shows, the function can return different types of objects depending on the input values.
The problems with this implementation are:
- The function implementation becomes more complex.
- Testing becomes more complicated.
- A linter's type checker cannot tell which type will be returned.
In the example above, new if-else statements were inserted. The example is quite simple, but even so, a reader has to check every if and return statement to learn what the function will output. In addition, a caller has to check which type was returned whenever the following processing depends on the return type.
>>> multiples_of_threes = [-30, 21, 999]
>>> input_values = [9, 5, 2, 0, -6, 9]
>>> new_members = extract_multiples_of_three(values=input_values)
>>> if isinstance(new_members, list):
...     multiples_of_threes.extend(new_members)
... elif isinstance(new_members, int):
...     multiples_of_threes.append(new_members)
... else:
...     pass
>>> multiples_of_threes
[-30, 21, 999, 9, 0, -6, 9]
Since the type of the output depends on the input, we need more test cases, at least one for each condition.
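To make the testing cost concrete, here is a minimal sketch of the tests each version needs. The names `extract_list` and `extract_mixed` are hypothetical stand-ins for the first and second implementations, renamed only so that both fit in one snippet; the test functions are also illustrative.

```python
from typing import List, Optional, Union


def extract_list(values: List[int]) -> List[int]:
    # Stand-in for the first implementation: always returns a list.
    return [v for v in values if v % 3 == 0]


def extract_mixed(values: List[int]) -> Optional[Union[int, List[int]]]:
    # Stand-in for the second implementation: the return type
    # depends on how many multiples of three were found.
    out = [v for v in values if v % 3 == 0]
    if len(out) == 0:
        return None
    elif len(out) == 1:
        return out[0]
    else:
        return out


def test_extract_list() -> None:
    # One kind of assertion covers every input: compare lists.
    assert extract_list([9, 5, 2, 0, -6, 9]) == [9, 0, -6, 9]
    assert extract_list([9, 5, 2]) == [9]
    assert extract_list([5, 2]) == []


def test_extract_mixed() -> None:
    # Each branch needs its own case, and each case also has to
    # pin down which type came back.
    many = extract_mixed([9, 5, 2, 0, -6, 9])
    assert isinstance(many, list) and many == [9, 0, -6, 9]
    one = extract_mixed([9, 5, 2])
    assert isinstance(one, int) and one == 9
    assert extract_mixed([5, 2]) is None


test_extract_list()
test_extract_mixed()
```

With the list-only version, the empty and single-element cases are just ordinary list comparisons. With the mixed version, forgetting any one of the three cases leaves a whole return type untested.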
A user can expect the second implementation to always return a list if the input values include at least two multiples of three. Because list(range(10)) always includes at least two multiples of three, the user might use the function like this:
multiples_of_threes = extract_multiples_of_three(values=list(range(10)))
for number in multiples_of_threes:
    # .. do something ..
    pass
But because the linter does not know this rule, it reports type errors such as "int" is not iterable and Object of type "None" cannot be used as iterable value. To avoid these errors, the user again needs to add if-else statements at the call site.
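As a rough sketch of what that looks like in practice, the caller has to narrow the return type before iterating. The variables `numbers` and `total` are hypothetical, and the summing loop is only a placeholder for the real work.

```python
from typing import List, Optional, Union


def extract_multiples_of_three(values: List[int]) -> Optional[Union[int, List[int]]]:
    # Second implementation: return type depends on the number of matches.
    out = [v for v in values if v % 3 == 0]
    if len(out) == 0:
        return None
    elif len(out) == 1:
        return out[0]
    else:
        return out


result = extract_multiples_of_three(values=list(range(10)))

# Narrow the union to a plain list so the type checker accepts the loop,
# even though we "know" this input always yields a list.
numbers: List[int]
if isinstance(result, list):
    numbers = result
elif isinstance(result, int):
    numbers = [result]
else:
    numbers = []

total = 0
for number in numbers:
    total += number  # placeholder for "do something"
```

The narrowing code rebuilds exactly the list that the first implementation would have returned directly, so the extra branches in the function only move work to every call site.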
If a function needs to return different kinds of objects depending on the input, in most cases it is putting multiple different functionalities into a single function. In such a case, it is better to redesign the logic before adding a new if-else statement to the function.
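One possible redesign, shown only as a sketch, is to split the concerns into separate functions that each have a single return type. The helper `contains_multiple_of_three` is a hypothetical name introduced here for illustration; it is not part of the original example.

```python
from typing import List


def extract_multiples_of_three(values: List[int]) -> List[int]:
    # Always returns a list, possibly empty.
    return [v for v in values if v % 3 == 0]


def contains_multiple_of_three(values: List[int]) -> bool:
    # Hypothetical helper: answers the "is there any match?" question
    # that the second implementation encoded as None versus not None.
    return any(v % 3 == 0 for v in values)


values = [9, 5, 2]
if contains_multiple_of_three(values):
    # A caller that wants a single value takes it from the list explicitly.
    first = extract_multiples_of_three(values)[0]
    print(first)
```

Each function now answers one question with one type, and the caller decides explicitly how to combine them.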
So, I recommend designing a function to always return the same type of value. In other words, I recommend the approach of the first implementation, which always returns a list of integers. The annotation -> Optional[Union[int, List[int]]] means that the output can be None, an int, or a list of integers, and I don't recommend this.
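For comparison, here is a small sketch of a caller using the first implementation. The loop over three sample inputs is an illustrative assumption; the inputs themselves are the ones used earlier.

```python
from typing import List


def extract_multiples_of_three(values: List[int]) -> List[int]:
    # First implementation: always returns a list of integers.
    return [v for v in values if v % 3 == 0]


multiples_of_threes = [-30, 21, 999]
for input_values in ([9, 5, 2, 0, -6, 9], [9, 5, 2], [5, 2]):
    # No isinstance checks: extend() handles many, one, or zero matches.
    multiples_of_threes.extend(extract_multiples_of_three(values=input_values))

print(multiples_of_threes)
# [-30, 21, 999, 9, 0, -6, 9, 9]
```

The same single line handles all three inputs, which is the practical benefit of a constant return type.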