Python Code Optimization

Author: yihuang
Email: yi.codeplayer@gmail.com
Company: 深圳云悦科技

Not covered today

Outline

  1. Good coding habits
  2. How CPython executes code
  3. A concise Cython tutorial

Use the right data structure [1/2]

Use the right data structure [2/2]
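
As a hedged illustration of this point (the data and function names below are assumptions made for the example): a membership test on a list compares elements one by one, while a set answers the same question with a hash lookup, so the container should match the access pattern.

```python
import timeit

# Hypothetical data: 100,000 integer IDs stored two ways.
ids_list = list(range(100_000))
ids_set = set(ids_list)

def in_list(x):
    # O(n): x is compared against the elements one by one.
    return x in ids_list

def in_set(x):
    # O(1) on average: a single hash lookup.
    return x in ids_set

if __name__ == "__main__":
    # Look up a value near the end of the list, the worst case for the scan.
    print("list:", timeit.timeit(lambda: in_list(99_999), number=200))
    print("set: ", timeit.timeit(lambda: in_set(99_999), number=200))
```

Both functions return the same answer; only the cost of getting it differs, and the gap grows with the size of the container.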

tuple vs. list
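
A small sketch of some differences that matter in practice, assuming CPython (sizes and constant handling are implementation details, and the variable names are made up for the example): a tuple is usually smaller than the equivalent list, a tuple of constants is stored once in the code object, and only the tuple can be used as a dictionary key.

```python
import sys

point_tuple = (1, 2, 3)
point_list = [1, 2, 3]

# In CPython a tuple stores its items inline in one allocation, while a
# list keeps a separate, resizable item array, so it needs more memory.
tuple_size = sys.getsizeof(point_tuple)
list_size = sys.getsizeof(point_list)

def constant_tuple():
    # CPython stores this literal as a single constant of the function,
    # so no new tuple is built on each call.
    return (1, 2, 3)

# Tuples are hashable and can serve as dict keys; lists cannot.
labels = {point_tuple: "origin offset"}
try:
    labels[point_list] = "fails"
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
```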

Small-object caching and free lists
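
The small-object cache can be observed from Python. This is a sketch that assumes CPython, where integers from -5 to 256 are preallocated and shared. It is an implementation detail, so code should never rely on it for correctness.

```python
# Build the values at run time so the compiler cannot merge constants.
a = int("100")
b = int("100")

# In CPython both names refer to the same preallocated int object.
small_shared = a is b

# Equality is what programs should test; identity is only shown here
# to make the cache visible.
values_equal = a == b
```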

Precomputation [1]

Precomputation [2]
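
A minimal sketch of the idea, with hypothetical names (`WORD_RE`, `SQUARES`, and both functions are assumptions for illustration): work that does not depend on the input is done once at import time, and the hot functions only reuse the result.

```python
import re

# Computed once at import time: a compiled pattern and a lookup table.
WORD_RE = re.compile(r"[A-Za-z]+")
SQUARES = [i * i for i in range(256)]

def count_words(lines):
    # Reuses the precompiled pattern for every line.
    return sum(len(WORD_RE.findall(line)) for line in lines)

def sum_of_squares(data):
    # A table lookup replaces one multiplication per byte value.
    return sum(SQUARES[b] for b in data)
```

For example, `count_words(["hello world", "foo"])` counts three words, and `sum_of_squares(bytes([1, 2, 3]))` adds 1 + 4 + 9.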

Other opportunities for precomputation:

KISS

Outline

  1. Good coding habits
  2. How CPython executes code
  3. A concise Cython tutorial

name resolution[1]

Local variables:

def test(a):
    a
LOAD_FAST 0

name resolution[2]

LOAD_FAST i

PyObject *PyEval_EvalFrameEx(...) {
    register PyObject **fastlocals;
    ...
    fastlocals = f->f_localsplus;
    ...
    fastlocals[i]
    ...

name resolution[3]

Module-level (global) variables:

a = 1
def test():
    a
LOAD_GLOBAL 0
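
The two cases above can be checked with the standard `dis` module. This is a sketch: the function names are made up for the example, and the exact opcode names vary between Python versions (newer releases may show variants of `LOAD_FAST`).

```python
import dis

a = 1  # module-level name

def read_local(x):
    return x    # local: read from the frame's fast-locals array

def read_global():
    return a    # global: looked up by name in globals, then builtins

def opnames(func):
    # Collect the opcode names of a function's bytecode.
    return [ins.opname for ins in dis.get_instructions(func)]

if __name__ == "__main__":
    dis.dis(read_local)
    dis.dis(read_global)
```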

name resolution[4]

LOAD_GLOBAL 0

PyObject *PyEval_EvalFrameEx(...) {
    PyObject *names;
    ...
    names = co->co_names;
    ...
    w = PyTuple_GetItem(names, i);
    x = PyDict_GetItem(f->f_globals, w);
    if (x == NULL) {
        x = PyDict_GetItem(f->f_builtins, w);
        if (x == NULL) {
          load_global_error:
    ...
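
Since a global or builtin lookup costs one or two dictionary probes while a local is a plain array index, a common trick in hot loops is to bind the name to a local first. A hedged sketch follows (the function names are assumptions, and the gain depends on the interpreter version, so measure before relying on it):

```python
import math

def norms_global(points):
    # `math` is looked up as a global and `sqrt` as an attribute
    # on every iteration.
    result = []
    for x, y in points:
        result.append(math.sqrt(x * x + y * y))
    return result

def norms_local(points):
    # Bind the callables to local names once; inside the loop they
    # are then read as fast locals.
    sqrt = math.sqrt
    result = []
    append = result.append
    for x, y in points:
        append(sqrt(x * x + y * y))
    return result
```

Both versions compute the same list. The second trades a little readability for fewer lookups, which is only worth it in code that profiling has shown to be hot.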

function call[1]

test(1, 2, 3, a=1, b=2)
PyObject *func = LOAD_NAME 'test';
PyObject *args = PyTuple_New(3);
PyTuple_SET_ITEM(args, 0, 1);
PyTuple_SET_ITEM(args, 1, 2);
PyTuple_SET_ITEM(args, 2, 3);
PyObject *kwargs = PyDict_New();
PyDict_SetItem(kwargs, "a", 1);
PyDict_SetItem(kwargs, "b", 2);
PyObject_Call(func, args, kwargs);
...
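
To get a feel for the cost of the call sequence above, the following sketch compares a loop that makes one Python-level call per iteration with the same loop inlined. The function names are hypothetical and the timings depend on the machine and interpreter, so no numbers are claimed here.

```python
import timeit

def add(a, b):
    return a + b

def total_with_calls(n):
    # One Python-level call per iteration: arguments are passed,
    # a new frame is set up, and the result is handed back.
    total = 0
    for i in range(n):
        total = add(total, i)
    return total

def total_inlined(n):
    # Same arithmetic with the call removed from the hot loop.
    total = 0
    for i in range(n):
        total = total + i
    return total

if __name__ == "__main__":
    print("calls:  ", timeit.timeit(lambda: total_with_calls(10_000), number=100))
    print("inlined:", timeit.timeit(lambda: total_inlined(10_000), number=100))
```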

function call[2]

Optimization approaches:

object model - the cost of objects

obj.a

object model - the cost of objects

object PyObject_GenericGetAttr(object obj, object name):
    # Look up a descriptor on the class (walking the MRO)
    descr = PyType_Lookup(Py_TYPE(obj), name)
    if descr is not None and PyDescr_IsData(descr):
        # A data descriptor takes priority: use it directly
        return descr.__get__(obj, Py_TYPE(obj))
    else:
        # Otherwise try the instance dictionary
        r = obj.__dict__.get(name)
        if r is not None:
            return r
        elif descr is not None:
            # Finally fall back to the non-data descriptor
            return descr.__get__(...)
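
The lookup order in this pseudocode can be observed from plain Python. A small sketch, with class names invented for the example:

```python
class DataDescr:
    # Defines __set__, so it is a data descriptor.
    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")

class NonDataDescr:
    # Only __get__: a non-data descriptor.
    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"

class Demo:
    x = DataDescr()
    y = NonDataDescr()

d = Demo()
# Write straight into the instance dictionary, bypassing __set__.
d.__dict__["x"] = "from instance dict"
x_value = d.x       # the data descriptor wins over the instance dict
y_before = d.y      # no instance entry yet: the descriptor is used
d.__dict__["y"] = "from instance dict"
y_after = d.y       # the instance dict wins over a non-data descriptor
```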

Lazy evaluation - non-data descriptors
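
A minimal sketch of this technique, assuming a hypothetical `lazy` decorator and `Report` class: because the instance dictionary is consulted before a non-data descriptor, the descriptor can compute a value on first access and store it under the same name, so later accesses never reach the descriptor again.

```python
class lazy:
    """Non-data descriptor that computes a value once per instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # Store the result in the instance dict under the same name.
        # Later lookups find it there and skip this descriptor.
        obj.__dict__[self.name] = value
        return value

class Report:
    def __init__(self, rows):
        self.rows = rows
        self.computations = 0

    @lazy
    def total(self):
        self.computations += 1
        return sum(self.rows)
```

Reading `report.total` twice runs the computation only once. The standard library's `functools.cached_property` offers similar behavior.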

Simplify the design

Outline

What is Cython?

Compile pure Python to eliminate the overhead of interpretation

cdef: eliminate the overhead of name lookups and function calls

Add type signatures to Python and get very close to pure C

C programs written in Python-like syntax

cdef object int_to_decimal_string(Py_ssize_t n):
    cdef char buf[32], *p, *bufend
    cdef unsigned long absn
    cdef char c = '0'
    p = bufend = buf + sizeof(buf)
    if n < 0:
        absn = 0UL - n
    else:
        absn = n

    ...

Using external C libraries

cdef extern from "Python.h":
    Py_ssize_t PyByteArray_GET_SIZE(object array)

    ctypedef class __builtin__.bytearray [object PyByteArrayObject]:
        cdef Py_ssize_t ob_alloc
        cdef char *ob_bytes
        cdef Py_ssize_t ob_size

Extension Type

cdef class Connection(object):
    cdef public int port
    cdef public object _sock

    cdef send_command(self, tuple args):
        self._sock.sendall(
            self._pack_command(args))

That covers the essentials of Cython

See the manual for details

http://docs.cython.org/

Mockup

Memory leaks [1]

Memory leaks [2]

Thanks

Q & A