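Python 3 has two types that represent sequences of characters: bytes and str. Instances of bytes contain raw 8-bit values; instances of str contain Unicode characters. The most common way to turn Unicode characters into binary data is the UTF-8 encoding. Use the encode method to go from str to bytes, and the decode method to go from bytes back to str.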
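A minimal sketch of the round trip described above, assuming UTF-8 as the encoding. The helper names `to_str` and `to_bytes` are illustrative and not part of the standard library.

```python
def to_str(bytes_or_str):
    """Return a str, decoding from UTF-8 if given bytes."""
    if isinstance(bytes_or_str, bytes):
        return bytes_or_str.decode('utf-8')
    return bytes_or_str


def to_bytes(bytes_or_str):
    """Return bytes, encoding to UTF-8 if given a str."""
    if isinstance(bytes_or_str, str):
        return bytes_or_str.encode('utf-8')
    return bytes_or_str


text = 'café'
raw = to_bytes(text)
print(raw)          # b'caf\xc3\xa9'
print(to_str(raw))  # café
```

Helpers like these let the core of a program work on str only, with conversion done at the boundaries.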
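Python 2 lets you write arbitrary binary data to a file opened in the default mode, but Python 3 does not allow this. In Python 3 the open function has an encoding parameter; its default is platform-dependent (commonly UTF-8). A file opened in text mode ('w' or 'r') therefore requires str instances containing Unicode characters and will not accept bytes. To read or write binary data, open the file in 'wb' or 'rb' mode instead.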
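    import os

    file = 'random.bin'  # hypothetical path, for illustration only

    with open(file, 'wb') as f:
        f.write(os.urandom(10))  # write 10 random bytes

    with open(file, 'rb') as f:
        data = f.read()  # returns bytes, not str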
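The difference between the two modes can be checked directly. The sketch below writes into a throwaway temporary directory (an assumption made for the example) and shows that text mode rejects bytes while binary mode accepts them.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'demo.bin')  # throwaway location

error = None
try:
    with open(path, 'w') as f:       # text mode expects str
        f.write(os.urandom(10))
except TypeError as exc:
    error = exc
    print(f'text mode rejected bytes: {exc}')

with open(path, 'wb') as f:          # binary mode accepts bytes
    f.write(b'\x00\xff')

with open(path, 'rb') as f:
    data = f.read()
print(data)  # b'\x00\xff'
```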
A complex expression may be correct, but it is hard to read; a helper function states the intent more clearly.
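    # Assumed sample input: each key maps to a list of strings
    my_values = {'red': ['5'], 'blue': ['0'], 'green': ['']}

    # Helper-function version
    def get_first_int(values, key, default=0):
        found = values.get(key, [''])
        if found[0]:
            found = int(found[0])
        else:
            found = default
        return found

    green = get_first_int(my_values, 'green')

    # Complex-expression version: same result, harder to read
    green = int(my_values.get('green', [''])[0] or 0)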
    a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    b = a[::2]   # ['a', 'c', 'e', 'g']
    c = b[1:-1]  # ['c', 'e']
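    # Assumed sample input
    chile_ranks = {'ghost': 1, 'habanero': 2, 'cayenne': 3}

    # Dict comprehension: invert the mapping
    dic = {rank: name for name, rank in chile_ranks.items()}
    # Set comprehension: the distinct name lengths
    chile_len_set = {len(name) for name in chile_ranks.keys()}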
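Putting list-comprehension syntax inside a pair of parentheses produces a generator expression, which yields one item at a time instead of building the whole list in memory.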
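    # List comprehension: reads the whole file and keeps every length in memory
    it = [len(x) for x in open('./tmp_file')]

    # Generator expression: computes one length at a time
    it = (len(x) for x in open('./tmp_file'))
    print(next(it))

    # Generator expressions can be chained
    roots = ((x, x ** 0.5) for x in it)
    print(next(roots))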
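When resolving a name, the Python interpreter searches the scopes in this order: the current function's scope, any enclosing scopes (such as containing functions), the scope of the module that contains the code (the global scope), and finally the built-in scope.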
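The lookup order can be seen in a small sketch. All names here (`x`, `outer`, `inner`, `uses_builtin`, `shadows_local`) are hypothetical and chosen only for the illustration.

```python
x = 'global'


def outer():
    x = 'enclosing'

    def inner():
        # No local x here, so the lookup falls through to outer's scope.
        return x

    return inner()


def uses_builtin():
    # len is defined neither locally nor in this module;
    # it is found in the built-in scope.
    return len('abc')


def shadows_local():
    x = 'local'  # assignment creates a new local; the module-level x is untouched
    return x


print(outer())          # enclosing
print(shadows_local())  # local
print(x)                # global
print(uses_builtin())   # 3
```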
    from datetime import datetime

    def decode(data, default={}): ...              # mutable default
    def log(message, when=datetime.now()): ...     # default computed at definition time
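Both of these default values are problematic, because a default argument value is evaluated only once, when the function is defined. Every call that omits when gets the same timestamp, and every call that omits default shares the same dict object. To get a fresh time on each call, set the default to None and describe the actual behavior in the docstring, as in the following form: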
    from datetime import datetime

    def log(message, when=None):
        """Log a message with a timestamp.

        Args:
            message: Message to print.
            when: datetime of when the message occurred.
                Defaults to the present time.
        """
        when = datetime.now() if when is None else when
        print('%s: %s' % (when, message))
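The same None-default pattern applies to the mutable `default={}` case. The sketch below assumes `decode` is meant to parse JSON and fall back to a default on failure; the function bodies and the name `decode_shared` are assumptions, since the notes show only the signature.

```python
import json


def decode_shared(data, default={}):
    """Problematic: every failed call returns the same dict object."""
    try:
        return json.loads(data)
    except ValueError:
        return default


def decode(data, default=None):
    """Load JSON data from a string.

    Args:
        data: JSON text to decode.
        default: Value to return if decoding fails.
            Defaults to a new empty dict.
    """
    try:
        return json.loads(data)
    except ValueError:
        return {} if default is None else default


a = decode_shared('bad data')
b = decode_shared('also bad')
a['stuff'] = 5
print(b)  # {'stuff': 5}, because a and b are the same object

c = decode('bad data')
d = decode('also bad')
c['stuff'] = 5
print(d)  # {}
```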
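In CPython, the global interpreter lock (GIL) means that threads cannot execute Python bytecode truly in parallel;
moreover, running a CPU-bound task on multiple threads can even be slower than running it on a single thread, because switching between threads has its own cost.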
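A minimal sketch for checking this claim on your own machine. The brute-force `factorize` function and the sample numbers are assumptions chosen to make the work CPU-bound; actual timings depend on the hardware and the interpreter build, so none are quoted here.

```python
import time
from threading import Thread


def factorize(number):
    """Return all divisors of number by brute force (CPU-bound)."""
    return [i for i in range(1, number + 1) if number % i == 0]


def run_serial(numbers):
    return [factorize(n) for n in numbers]


def run_threaded(numbers):
    results = [None] * len(numbers)

    def worker(index, n):
        results[index] = factorize(n)

    threads = [Thread(target=worker, args=(i, n)) for i, n in enumerate(numbers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == '__main__':
    numbers = [213_907, 122_801, 141_639, 115_627]  # arbitrary sample inputs
    for label, fn in (('serial', run_serial), ('threaded', run_threaded)):
        start = time.perf_counter()
        fn(numbers)
        print(f'{label}: {time.perf_counter() - start:.3f}s')
```

Both functions return the same results; only the elapsed time differs.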
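The running times of ProcessPoolExecutor and ThreadPoolExecutor are compared below.
ProcessPoolExecutor is the faster of the two, because it uses the low-level machinery provided by multiprocessing to run the work in separate processes: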
    from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
    import time


    def gcd(pair):
        # Brute-force greatest common divisor (deliberately CPU-heavy)
        a, b = pair
        for i in range(min(a, b), 0, -1):
            if a % i == 0 and b % i == 0:
                return i


    def test_futures_pool(pairs):
        start_time = time.time()
        with ProcessPoolExecutor(max_workers=2) as pool:
            results = list(pool.map(gcd, pairs))
        print(results)
        print(f'cost time: {time.time() - start_time}')


    def test_thread_pool(pairs):
        start_time = time.time()
        with ThreadPoolExecutor(max_workers=2) as pool:
            results = list(pool.map(gcd, pairs))
        print(results)
        print(f'cost time: {time.time() - start_time}')


    if __name__ == '__main__':
        pairs = [(1963309, 2265973), (20305646, 45862136)]
        test_thread_pool(pairs)
        test_futures_pool(pairs)

    '''
    results:
    [1, 2]
    cost time: 1.1633341312408447
    [1, 2]
    cost time: 1.0789978504180908
    '''
*args lets a function accept a variable number of positional (non-keyword) arguments, which are collected into a tuple.
    def test_args(normal_arg, *args):
        print("first normal arg:" + normal_arg)
        for arg in args:
            print("another arg through *args :" + arg)

    test_args("normal", "python", "java", "C#")
**kwargs lets a function accept a variable number of keyword arguments, which are collected into a dict.
    def test_kwargs(**kwargs):
        # kwargs is always a dict (possibly empty), so no None check is needed
        for key, value in kwargs.items():
            print("{} = {}".format(key, value))

    test_kwargs(name="python", value="5")
Example of unpacking a list with * and a dict with ** at the call site:
    def test_args(name, age, gender):
        print(f'name: {name}, age: {age}, gender: {gender}')


    if __name__ == '__main__':
        info = ['Tom', 10, 'male']
        test_args(*info)

        info = {'name': 'Tom', 'age': 10, 'gender': 'male'}
        # info must not contain extra key/value pairs:
        # an unexpected key raises TypeError
        test_args(**info)
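The two forms are often combined so that a function can collect whatever it receives and pass it along unchanged. A minimal sketch, with `describe` and `forward` as hypothetical names:

```python
def describe(*args, **kwargs):
    """Return a summary of how the arguments were collected."""
    return f'args={args}, kwargs={kwargs}'


def forward(func, *args, **kwargs):
    # Collect whatever the caller passed, then unpack it again for func.
    return func(*args, **kwargs)


print(describe(1, 2, lang='python'))
# args=(1, 2), kwargs={'lang': 'python'}
print(forward(describe, 'a', flag=True))
# args=('a',), kwargs={'flag': True}
```

This collect-then-unpack shape is what decorators and wrapper functions typically rely on.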