Tracing how defaultdict(type) executes, with reference to the source code

Background

Python provides two ways to read or update a dictionary key that does not exist yet.

One is dict.setdefault(), and the other is defaultdict(type). This post focuses on how defaultdict() works under the hood.
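
As a quick illustration of the two approaches, here is a minimal sketch that groups words by their first letter. The sample data and variable names are made up for this example.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana"]  # hypothetical sample data

# Approach 1: dict.setdefault() inserts the default only if the key is absent.
by_letter_plain = {}
for word in words:
    by_letter_plain.setdefault(word[0], []).append(word)

# Approach 2: defaultdict(list) builds the default on the first d[key] lookup.
by_letter_dd = defaultdict(list)
for word in words:
    by_letter_dd[word[0]].append(word)

print(by_letter_plain)                  # {'a': ['apple', 'avocado'], 'b': ['banana']}
print(by_letter_dd == by_letter_plain)  # True
```

Both loops end with the same contents; the difference is only in where the default value gets created.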


Analyzing the Python source

defaultdict:
A defaultdict handles lookups of missing keys gracefully. Its core method is __missing__(), which calls default_factory.

# Generated stub for collections.defaultdict; the real class is implemented in C.
from collections import defaultdict
class defaultdict(dict):

    def __init__(self, default_factory=None, **kwargs): # known case of _collections.defaultdict.__init__
        """
        defaultdict(default_factory[, ...]) --> dict with default factory

        The default factory is called without arguments to produce
        a new value when a key is not present, in __getitem__ only.
        A defaultdict compares equal to a dict with the same items.
        All remaining arguments are treated the same as if they were
        passed to the dict constructor, including keyword arguments.

        # (copied from class doc)
        """
        pass

    def __missing__(self, key): # real signature unknown; restored from __doc__
        """
        __missing__(key) # Called by __getitem__ for missing key; pseudo-code:
          if self.default_factory is None: raise KeyError((key,))
          self[key] = value = self.default_factory()
          return value
        """
        pass

    default_factory = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
    """Factory for default value called by __missing__()."""


An example:

from collections import defaultdict

k = defaultdict(list)
# Check get() and `in` first: once k['params'] runs, the key exists in the dict.
print(k.get('params')) # result: None
print('params' in k) # result: False
print(k['params']) # result: []

Analysis:

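First, defaultdict(list) is instantiated with the list type as its default_factory. Looking up k['params'] then goes through __getitem__. The dictionary has no 'params' key, so __getitem__ falls back to __missing__, which calls default_factory (here list) with no arguments and gets an empty list, []. That value is stored under the key and returned as the key's default value.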
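
The call chain described above can be observed with a small tracing subclass. TracingDefaultDict and the calls list are hypothetical names introduced only for this sketch.

```python
from collections import defaultdict

calls = []  # records every key for which __missing__ ran


class TracingDefaultDict(defaultdict):
    def __missing__(self, key):
        calls.append(key)
        return super().__missing__(key)


k = TracingDefaultDict(list)
first = k["params"]   # key absent: __getitem__ falls back to __missing__
second = k["params"]  # key present now: __missing__ is not called again

print(calls)            # ['params']
print(first is second)  # True
```

The second lookup returns the very same list object, which shows that the default was stored in the dictionary on the first access.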

Notes:

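1. The line default_factory = property(lambda self: object(), lambda self, v: None, lambda self: None) is only a placeholder in the generated stub, not the real implementation. In CPython, default_factory is an ordinary writable attribute of the defaultdict instance that holds the callable passed to the constructor, here list. __missing__ calls it with no arguments and stores the result as the value of the missing key; default_factory itself is not reassigned.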

2. If no default_factory is passed when the defaultdict is instantiated, looking up a missing key raises KeyError.

3. __missing__ responds only to __getitem__ calls; get() and __contains__ do not trigger it. In other words, neither the in operator nor dict.get() returns the default_factory value!
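
The three notes above can be checked with a short sketch. The variable names are illustrative only.

```python
from collections import defaultdict

# Note 1: default_factory is a plain attribute holding the callable.
d = defaultdict(list)
print(d.default_factory is list)  # True

# Note 2: without a default_factory, a missing key raises KeyError.
empty = defaultdict()
try:
    empty["params"]
except KeyError as exc:
    print("KeyError:", exc)

# Note 3: get() and `in` neither call __missing__ nor insert the key.
print(d.get("params"))  # None
print("params" in d)    # False
print(len(d))           # 0
```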