How do I determine the size of an object in Python?

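The answer "just use sys.getsizeof" is not a complete answer.

That answer does work for built-in objects directly, but it does not account for what those objects may contain, in particular what custom objects, tuples, lists, dicts, and sets contain. They can contain instances of each other, as well as numbers, strings, and other objects.

A more complete answer

Using 64-bit Python 3.6 from the Anaconda distribution, with sys.getsizeof, I determined the minimum size of the following objects. Note that sets and dicts preallocate space, so they do not grow again until a set number of items has been added (this may vary by implementation of the language).

Python 3:

    Empty
    Bytes  type        scaling notes
    28     int         +4 bytes about every 30 powers of 2
    37     bytes       +1 byte per additional byte
    49     str         +1-4 per additional character (depending on max width)
    48     tuple       +8 per additional item
    64     list        +8 for each additional
    224    set         5th increases to 736; 21st, 2272; 85th, 8416; 341st, 32992
    240    dict        6th increases to 368; 22nd, 1184; 43rd, 2280; 86th, 4704; 171st, 9320
    136    func def    does not include default args and other attrs
    1056   class def   no slots
    56     class inst  has a __dict__ attr, same scaling as dict above
    888    class def   with slots
    16     __slots__   seems to store in mutable tuple-like structure
                       first slot grows to 48, and so on.

How do you interpret this? Say you have a set with 10 items in it. If each item is 100 bytes, how big is the whole data structure? The set itself is 736 bytes, because it has resized once, to 736 bytes. Then you add the sizes of the items (10 × 100 = 1000 bytes), for 1736 bytes in total.

Some caveats for function and class definitions:

Each class definition has a proxy __dict__ (48 bytes) structure for class attributes. Each slot has a descriptor (like a property) in the class definition.

Slotted instances start at 48 bytes with their first element and grow by 8 bytes for each additional one. Only empty slotted objects are 16 bytes, and an instance with no data makes very little sense.

Also, each function definition has code objects, possibly a docstring, and other possible attributes, even a __dict__.

Python 2.7 analysis, confirmed with guppy.hpy and sys.getsizeof:

    Bytes  type        empty + scaling notes
    24     int         NA
    28     long        NA
    37     str         + 1 byte per additional character
    52     unicode     + 4 bytes per additional character
    56     tuple       + 8 bytes per additional item
    72     list        + 32 for first, 8 for each additional
    232    set         sixth item increases to 744; 22nd, 2280; 86th, 8424
    280    dict        sixth item increases to 1048; 22nd, 3352; 86th, 12568 *
    120    func def    does not include default args and other attrs
    64     class inst  has a __dict__ attr, same scaling as dict above
    16     __slots__   class with slots has no dict, seems to store in
                       mutable tuple-like structure.
    904    class def   has a proxy __dict__ structure for class attrs
    104    old class   makes sense, less stuff, has real dict though.

Note that dictionaries (but not sets) got a more compact representation in Python 3.6.

I think 8 bytes per additional referenced item makes a lot of sense on a 64-bit machine. Those 8 bytes point to the place in memory where the contained item lives. If I recall correctly, the 4 bytes per character are a fixed width for unicode in Python 2, whereas in Python 3 str became a unicode type whose width equals the maximum width of its characters.

(For more on slots, see this answer.)

A more complete function

We want a function that searches the elements of lists, tuples, sets, and dicts, as well as obj.__dict__ and obj.__slots__, and other things we may not have thought of yet.

We want to rely on gc.get_referents to do this search because it works at the C level (which makes it very fast). The downside is that get_referents can return redundant members, so we need to make sure we do not double count.

Classes, modules, and functions are singletons: they exist once in memory. We are not very interested in their size, since there is not much we can do about them; they are part of the program. So we avoid counting them if they happen to be referenced.

We use a blacklist of types so that we do not include the entire program in our size count.

    import sys
    from types import ModuleType, FunctionType
    from gc import get_referents

    # Custom objects know their class.
    # Function objects seem to know way too much, including modules.
    # Exclude modules as well.
    BLACKLIST = type, ModuleType, FunctionType


    def getsize(obj):
        """sum size of object & members."""
        if isinstance(obj, BLACKLIST):
            raise TypeError('getsize() does not take argument of type: '+ str(type(obj)))
        seen_ids = set()
        size = 0
        objects = [obj]
        while objects:
            need_referents = []
            for obj in objects:
                if not isinstance(obj, BLACKLIST) and id(obj) not in seen_ids:
                    seen_ids.add(id(obj))
                    size += sys.getsizeof(obj)
                    need_referents.append(obj)
            objects = get_referents(*need_referents)
        return size

To contrast this with the whitelisted function that follows: most objects know how to traverse themselves for the purposes of garbage collection, which is approximately what we are looking for when we want to know how expensive certain objects are in memory. gc.get_referents uses this functionality. However, this measure will be much more expansive in scope than we intend if we are not careful.

For example, functions know quite a lot about the modules they are created in.

Another point of contrast is that strings used as dictionary keys are usually interned, so they are not duplicated. Checking id(key) also lets us avoid counting duplicates, which is what we do in the next section. The blacklist solution skips counting keys that are strings altogether.

Whitelisted types, recursive visitor (old implementation)

To cover most of these types myself, instead of relying on the gc module, I wrote this recursive function to try to estimate the size of most Python objects, including most built-ins, types in the collections module, and custom types (slotted and otherwise).

This sort of function gives much finer-grained control over the types we count toward memory usage, but carries the danger of leaving types out:

    import sys
    from numbers import Number
    from collections import deque

    try:  # Python 3.3+: the ABCs live in collections.abc
        from collections.abc import Set, Mapping
    except ImportError:  # Python 2
        from collections import Set, Mapping

    try: # Python 2
        zero_depth_bases = (basestring, Number, xrange, bytearray)
        iteritems = 'iteritems'
    except NameError: # Python 3
        zero_depth_bases = (str, bytes, Number, range, bytearray)
        iteritems = 'items'

    def getsize(obj_0):
        """Recursively iterate to sum size of object & members."""
        _seen_ids = set()
        def inner(obj):
            obj_id = id(obj)
            if obj_id in _seen_ids:
                return 0
            _seen_ids.add(obj_id)
            size = sys.getsizeof(obj)
            if isinstance(obj, zero_depth_bases):
                pass # bypass remaining control flow and return
            elif isinstance(obj, (tuple, list, Set, deque)):
                size += sum(inner(i) for i in obj)
            elif isinstance(obj, Mapping) or hasattr(obj, iteritems):
                size += sum(inner(k) + inner(v) for k, v in getattr(obj, iteritems)())
            # Check for custom object instances - may subclass above too
            if hasattr(obj, '__dict__'):
                size += inner(vars(obj))
            if hasattr(obj, '__slots__'): # can have __slots__ with __dict__
                size += sum(inner(getattr(obj, s)) for s in obj.__slots__ if hasattr(obj, s))
            return size
        return inner(obj_0)

I tested it rather casually (I should unit test it). Foo is not defined in this transcript; its 16-byte result is consistent with an empty slotted class.

    >>> getsize(['a', tuple('bcd'), Foo()])
    344
    >>> getsize(Foo())
    16
    >>> getsize(tuple('bcd'))
    194
    >>> getsize(['a', tuple('bcd'), Foo(), {'foo': 'bar', 'baz': 'bar'}])
    752
    >>> getsize({'foo': 'bar', 'baz': 'bar'})
    400
    >>> getsize({})
    280
    >>> getsize({'foo':'bar'})
    360
    >>> getsize('foo')
    40
    >>> class Bar():
    ...     def baz():
    ...         pass
    >>> getsize(Bar())
    352
    >>> getsize(Bar().__dict__)
    280
    >>> sys.getsizeof(Bar())
    72
    >>> getsize(Bar.__dict__)
    872
    >>> sys.getsizeof(Bar.__dict__)
    280

This implementation breaks down on class definitions and function definitions because we do not go after all of their attributes, but since they should exist only once in memory for the process, their size does not matter much.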
