Commonly Used Python Modules

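I. The time and datetime modules

1. Python usually represents time in one of three ways:

Timestamp: the offset, in seconds, from 1970-01-01 00:00:00 UTC (the Unix epoch)

Format string: a formatted time string such as '2018-01-07 13:16:59'

struct_time: a structured time, a tuple of nine elements: (year, month, day, hour, minute, second, day of the week, day of the year, daylight saving time flag)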

import time

print(time.time())
# Timestamp: 1515302219.4076796
print(time.strftime("%Y-%m-%d %X"))
# Formatted time string: 2018-01-07 13:16:59
print(time.localtime())
# struct_time in the local time zone: time.struct_time(tm_year=2018, tm_mon=1, tm_mday=7, tm_hour=13, tm_min=16, tm_sec=59, tm_wday=6, tm_yday=7, tm_isdst=0)
print(time.gmtime())
# struct_time in UTC: time.struct_time(tm_year=2018, tm_mon=1, tm_mday=7, tm_hour=5, tm_min=16, tm_sec=59, tm_wday=6, tm_yday=7, tm_isdst=0)

2. Converting between the three time representations

(1) Convert a timestamp to a struct_time in the local time zone

print(time.localtime(1473525444.037215))
# Result (on a machine set to UTC+8): time.struct_time(tm_year=2016, tm_mon=9, tm_mday=11, tm_hour=0, tm_min=37, tm_sec=24, tm_wday=6, tm_yday=255, tm_isdst=0)

(2) Convert a timestamp to a struct_time in UTC

print(time.gmtime(1473525444.037215))
# Result: time.struct_time(tm_year=2016, tm_mon=9, tm_mday=10, tm_hour=16, tm_min=37, tm_sec=24, tm_wday=5, tm_yday=254, tm_isdst=0)

(3) Convert a struct_time to a timestamp

print(time.mktime(time.localtime()))
# The current time as a timestamp: 1515302770.0

(4) Convert a struct_time to a formatted time string

print(time.strftime("%Y-%m-%d %X", time.localtime()))
# The current time as a formatted time string: 2018-01-07 13:34:20

(5) Convert a formatted time string to a struct_time

print(time.strptime('2018-01-07 13:34:26', '%Y-%m-%d %X'))
# Result: time.struct_time(tm_year=2018, tm_mon=1, tm_mday=7, tm_hour=13, tm_min=34, tm_sec=26, tm_wday=6, tm_yday=7, tm_isdst=-1)

(6) Convert a formatted time string to a timestamp

print(time.mktime(time.strptime('2018-01-07 13:34:26', "%Y-%m-%d %H:%M:%S")))  # Result: 1515303266.0

(7) Convert a timestamp to a formatted time string

a = 1515302770.0
b = time.localtime(a)
print(time.strftime("%Y-%m-%d %H:%M:%S", b))     # Result: 2018-01-07 13:26:10
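The conversions above go through time.localtime() and time.mktime(), so their results depend on the machine's time zone. As a minimal sketch of the same round trip that gives identical results everywhere, the example below stays in UTC and uses calendar.timegm() (the UTC counterpart of time.mktime()); the variable names are chosen only for illustration.

```python
import calendar
import time

ts = 1473525444  # the timestamp from the examples above, truncated to whole seconds

utc_struct = time.gmtime(ts)                                 # timestamp -> struct_time (UTC)
utc_string = time.strftime("%Y-%m-%d %H:%M:%S", utc_struct)  # struct_time -> string
parsed = time.strptime(utc_string, "%Y-%m-%d %H:%M:%S")      # string -> struct_time
round_trip = calendar.timegm(parsed)                         # struct_time (UTC) -> timestamp

print(utc_string)   # 2016-09-10 16:37:24
print(round_trip)   # 1473525444
```

Because every step interprets the value as UTC, the final timestamp equals the one the chain started with.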

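3. Formatting a time as a string of the form 'Sun Jun 20 23:21:05 1993'. time.asctime() takes a time tuple or struct_time, and time.ctime() takes a timestamp in seconds; both default to the current time.

print(time.asctime())       # Result: Sun Jan  7 14:03:39 2018
print(time.ctime())          # Result: Sun Jan  7 14:03:39 2018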
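Both functions also accept an explicit argument, and they differ in what that argument is. The sketch below reuses the timestamp 1515302770 from the earlier conversion example and goes through time.gmtime() so that the first output does not depend on the local time zone.

```python
import time

ts = 1515302770  # timestamp reused from the conversion example

# asctime() expects a struct_time (or a 9-tuple).
from_struct = time.asctime(time.gmtime(ts))
print(from_struct)  # Sun Jan  7 05:26:10 2018

# ctime() expects a timestamp; it is equivalent to asctime(localtime(ts)).
from_timestamp = time.ctime(ts)
print(from_timestamp == time.asctime(time.localtime(ts)))  # True
```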

4. The datetime module

import datetime
print(datetime.datetime.now()) # Returns 2018-01-07 14:08:16.536681
print(datetime.date.fromtimestamp(time.time()) )  # Convert a timestamp directly to a date: 2018-01-07
print(datetime.datetime.now() + datetime.timedelta(3)) # Current time + 3 days
print(datetime.datetime.now() + datetime.timedelta(-3)) # Current time - 3 days
print(datetime.datetime.now() + datetime.timedelta(hours=3)) # Current time + 3 hours
print(datetime.datetime.now() + datetime.timedelta(minutes=30)) # Current time + 30 minutes
c_time  = datetime.datetime.now()
print(c_time.replace(minute=3,hour=2)) # Replace fields: set the minute to 3 and the hour to 2, giving 2018-01-07 02:03:16.536681
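The datetime class can also parse and format strings itself, and subtracting two datetime objects yields a timedelta. A minimal sketch with a fixed starting time (taken from the sample output above, without the microseconds) so that the results are reproducible:

```python
from datetime import datetime, timedelta

start = datetime.strptime("2018-01-07 14:08:16", "%Y-%m-%d %H:%M:%S")  # string -> datetime
later = start + timedelta(days=3, hours=3, minutes=30)

print(later.strftime("%Y-%m-%d %H:%M:%S"))  # 2018-01-10 17:38:16

elapsed = later - start          # datetime - datetime gives a timedelta
print(elapsed.days)              # 3
print(elapsed.total_seconds())   # 271800.0
```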

II. The logging module

1. Log levels
CRITICAL = 50
ERROR = 40
WARNING = 30   # default log level
INFO = 20
DEBUG = 10
NOTSET = 0    # no level set
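These values are available as constants in the logging module, and a logger uses them to decide whether a call is worth handling. A small sketch follows; the logger name "level_demo" is made up for the example.

```python
import logging

# The module-level constants carry the numeric values listed above.
print(logging.WARNING)            # 30
print(logging.getLevelName(30))   # WARNING

# A logger only handles records at or above its own level.
logger = logging.getLogger("level_demo")   # hypothetical logger name
logger.setLevel(logging.INFO)
print(logger.isEnabledFor(logging.DEBUG))  # False
print(logger.isEnabledFor(logging.ERROR))  # True
```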

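2. Format fields that can be used in the format parameter

%(name)s: the name of the Logger (not a user name)
%(levelno)s: the log level as a number
%(levelname)s: the log level as text
%(pathname)s: full path of the module that issued the logging call, if available
%(filename)s: file name of the module that issued the logging call
%(module)s: name of the module that issued the logging call
%(funcName)s: name of the function that issued the logging call
%(lineno)d: source line number of the logging call
%(created)f: time the log record was created, as a UNIX timestamp (a floating-point number)
%(relativeCreated)d: time the log record was created, in milliseconds, relative to when the logging module was loaded
%(asctime)s: time the log record was created, as a string. The default format is "2003-07-08 16:49:45,896"
%(thread)d: thread ID, if available
%(threadName)s: thread name, if available
%(process)d: process ID, if available
%(message)s: the message supplied by the caller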
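To see how a few of these fields are filled in, the sketch below formats a hand-built LogRecord. Building the record manually is only a device to keep the output independent of the current time and call site; the logger name, path, and message are invented for the example.

```python
import logging

formatter = logging.Formatter(
    "%(name)s | %(levelno)s | %(levelname)s | %(filename)s:%(lineno)d | %(message)s"
)

# Normally a Logger creates the record; here it is built by hand.
record = logging.LogRecord(
    name="demo",                 # hypothetical logger name
    level=logging.WARNING,
    pathname="/tmp/example.py",  # hypothetical path
    lineno=42,
    msg="disk usage at %d%%",
    args=(91,),
    exc_info=None,
)
line = formatter.format(record)
print(line)  # demo | 30 | WARNING | example.py:42 | disk usage at 91%
```

Note that %(filename)s is the last component of %(pathname)s, and %(message)s is the message after its arguments have been merged in.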

Example: write logs to the file access.log

import logging
logging.basicConfig(filename='access.log',
                    format='%(asctime)s - %(name)s - %(levelname)s -%(module)s:  %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S %p',
                    level=10)
logging.debug('debug message')
logging.info('info message')
logging.warning('warning message')
logging.error('error message')
logging.critical('critical message')

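3. The Formatter, Handler, Logger, and Filter objects of the logging module

Logger: the object that produces log records

Filter: the object that filters log records

Handler: receives log records and controls where they are output; FileHandler writes to a file, and StreamHandler writes to a stream (the terminal by default)

Formatter: defines a log format; different Formatter objects can be bound to different Handler objects to control each handler's output format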

4. The logging output flow

'''
critical=50
error =40
warning =30
info = 20
debug =10
'''
import logging
# (1) Logger: produces log records, passes them through any Filter, then hands them to each Handler for output
logger=logging.getLogger(__file__)
# (2) Filter: rarely used, omitted here
# (3) Handler: receives the records from the logger and controls how they are output
h1=logging.FileHandler('t1.log') # write to a file
h2=logging.FileHandler('t2.log') # write to a file
h3=logging.StreamHandler() # write to the terminal
# (4) Formatter: the log format
formatter1=logging.Formatter('%(asctime)s - %(name)s - %(levelname)s -%(module)s:  %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S %p',)
formatter2=logging.Formatter('%(asctime)s :  %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S %p',)
formatter3=logging.Formatter('%(name)s %(message)s',)
# (5) Bind a format to each Handler
h1.setFormatter(formatter1)
h2.setFormatter(formatter2)
h3.setFormatter(formatter3)
# (6) Add the Handlers to the logger and set the log level
logger.addHandler(h1)
logger.addHandler(h2)
logger.addHandler(h3)
logger.setLevel(10)
# (7) Test
logger.debug('debug')
logger.info('info')
logger.warning('warning')
logger.error('error')
logger.critical('critical')
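The walkthrough skips the Filter object. For completeness, here is a minimal sketch of one: the KeywordFilter class and the logger name are hypothetical, and an in-memory io.StringIO buffer stands in for the file or terminal so the result can be inspected.

```python
import io
import logging

class KeywordFilter(logging.Filter):
    """Hypothetical filter: drop any record whose message contains a keyword."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword

    def filter(self, record):
        # Returning False discards the record; True lets it through.
        return self.keyword not in record.getMessage()

buffer = io.StringIO()                      # stands in for a file or the terminal
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("filter_demo")   # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False                    # keep the demo output away from the root logger
logger.addHandler(handler)
logger.addFilter(KeywordFilter("password"))

logger.info("user logged in")
logger.info("password=123456")              # rejected by the filter

output = buffer.getvalue()
print(output)  # INFO user logged in
```

A filter can be attached to a logger, as here, or to an individual handler with handler.addFilter().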

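5. Setting levels

The logger is the first stage of filtering; only records that pass it reach the handlers. A level can be set on both the logger and the handler, but note:

The logger's level should be less than or equal to the handler's level. If the logger's level is higher, records between the two levels are dropped at the logger and never reach the handler.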
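The two stages can be observed with a small experiment. This is a sketch only: the logger name is made up, and an io.StringIO buffer is used in place of a real output so the surviving records can be checked.

```python
import io
import logging

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setLevel(logging.ERROR)              # second gate: the handler
handler.setFormatter(logging.Formatter("%(levelname)s"))

logger = logging.getLogger("gate_demo")      # hypothetical logger name
logger.setLevel(logging.INFO)                # first gate: the logger
logger.propagate = False
logger.addHandler(handler)

logger.debug("dropped by the logger")        # below INFO, never reaches the handler
logger.warning("dropped by the handler")     # passes the logger, but is below ERROR
logger.error("written")                      # passes both gates

passed = buffer.getvalue().split()
print(passed)  # ['ERROR']
```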

6. Using loggers in an application

(1) The logging configuration file

import os
import logging.config
standard_format = '[%(asctime)s][%(threadName)s:%(thread)d][task_id:%(name)s][%(filename)s:%(lineno)d]' \
                  '[%(levelname)s][%(message)s]' # name is the name passed to getLogger
simple_format = '[%(levelname)s][%(asctime)s][%(filename)s:%(lineno)d]%(message)s'
id_simple_format = '[%(levelname)s][%(asctime)s] %(message)s'
logfile_dir = os.path.dirname(os.path.abspath(__file__))  # directory of the log file
logfile_name = 'access.log'  # name of the log file
# Create the log directory if it does not exist
if not os.path.isdir(logfile_dir):
    os.mkdir(logfile_dir)
logfile_path = os.path.join(logfile_dir, logfile_name)
# Logging configuration dictionary
LOGGING_DIC = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'standard': {
            'format': standard_format
        },
        'simple': {
            'format': simple_format
        },
    },
    'filters': {},
    'handlers': {
        # Log to the terminal
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',  # print to the screen
            'formatter': 'simple'
        },
        # Log to a file, collecting records at DEBUG and above
        'default': {
            'level': 'DEBUG',
            'class': 'logging.handlers.RotatingFileHandler',  # save to a file
            'formatter': 'standard',
            'filename': logfile_path,  # log file
            'maxBytes': 1024*1024*5,  # maximum log size: 5 MB
            'backupCount': 5,
            'encoding': 'utf-8',  # log file encoding, so Chinese text is not garbled
        },
    },
    'loggers': {
        # Root logger configuration; loggers obtained with logging.getLogger(__name__) pass their records here
        '': {
            'handlers': ['default', 'console'],  # use both handlers defined above, so records go to the file and the screen
            'level': 'DEBUG',
            'propagate': True,  # pass records up to parent loggers
        },
    },
}
def load_my_logging_cfg():
    logging.config.dictConfig(LOGGING_DIC)  # load the logging configuration defined above
    logger = logging.getLogger(__name__)  # create a logger instance
    logger.info('It works!')  # record that this file ran
if __name__ == '__main__':
    load_my_logging_cfg()

(2) Using the logger

import time
import logging
import my_logging  # import the custom logging configuration
logger = logging.getLogger(__name__)  # create a logger instance
def demo():
    logger.debug("start range... time:{}".format(time.time()))
    logger.info("中文测试开始。。。")  # Chinese text, to check the log file encoding
    for i in range(10):
        logger.debug("i:{}".format(i))
        time.sleep(0.2)
    else:
        logger.debug("over range... time:{}".format(time.time()))
    logger.info("Test finished.")
if __name__ == "__main__":
    my_logging.load_my_logging_cfg()  # load the custom logging configuration at the program's entry point
    demo()
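The configuration works because a logger created with logging.getLogger(__name__) has no handlers of its own and passes its records up to its ancestors. The sketch below shows that propagation on a small scale; it uses a named parent logger ("shop", a made-up name) rather than the root logger, and an in-memory buffer instead of a file, so it does not touch any global configuration.

```python
import io
import logging

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("[%(name)s] %(message)s"))

parent = logging.getLogger("shop")           # hypothetical logger names
parent.setLevel(logging.DEBUG)
parent.propagate = False                     # stop here so the root logger stays untouched
parent.addHandler(handler)

child = logging.getLogger("shop.orders")     # no handlers and no level of its own
child.info("order created")                  # handled by the parent's handler

captured = buffer.getvalue()
print(captured)  # [shop.orders] order created
```

The child also inherits its effective level from the parent, which is why the INFO record is not discarded.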

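III. The re module (regular expressions)

1. A regular expression is a combination of symbols with special meanings that describes a pattern of characters or strings.

2. Regular expression metacharacters

. matches any character except a newline

^ matches the start of the string

$ matches the end of the string

[] matches any one character from the specified character class

? repeats the preceding character 0 or 1 times

* repeats the preceding character 0 or more times

{m} repeats the preceding character exactly m times

{m,n} repeats the preceding character m to n times

\d matches a digit, equivalent to [0-9]

\D matches any non-digit character, equivalent to [^0-9]

\s matches any whitespace character, equivalent to [ \t\n\r\f\v]

\S matches any non-whitespace character, equivalent to [^ \t\n\r\f\v]

\w matches any alphanumeric character or underscore, equivalent to [a-zA-Z0-9_]

\W matches any character that is not alphanumeric or an underscore, equivalent to [^a-zA-Z0-9_]

The equivalences for \d, \s, and \w hold for ASCII text; with str patterns, Python 3 also matches Unicode digits, whitespace, and letters.

\b matches the start or end of a word, e.g. print(re.findall(r'er\b','never hello word 123')) gives ['er']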
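A few of the entries above, tried out on short sample strings (the sample text is invented for the example):

```python
import re

# \b keeps 'cat' from matching inside 'concat' or 'cats'.
words = re.findall(r"\bcat\b", "cat concat cats cat.")
print(words)   # ['cat', 'cat']

numbers = "7 42 123 4567"
exactly_three = re.findall(r"\b\d{3}\b", numbers)    # {m}: exactly m repetitions
two_to_three = re.findall(r"\b\d{2,3}\b", numbers)   # {m,n}: between m and n repetitions
print(exactly_three)   # ['123']
print(two_to_three)    # ['42', '123']

# \s matches spaces, tabs, and newlines alike.
blanks = re.findall(r"\s", "a b\tc\nd")
print(blanks)  # [' ', '\t', '\n']
```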

3. Matching with regular expressions

import re
# \w and \W
print(re.findall(r'\w','hello\t word\n 123')) # ['h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'd', '1', '2', '3']
print(re.findall(r'\W','hello\t egon\n 123')) # ['\t', ' ', '\n', ' ']
# \s and \S
print(re.findall(r'\s','hello\t word\n 123'))  # ['\t', ' ', '\n', ' ']; \n and \t are whitespace, so \s matches them
print(re.findall(r'\S','hello\t word\n 123'))  # ['h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'd', '1', '2', '3']
# \n and \t
print(re.findall(r'\n','hello\t word\n 123'))   # ['\n']
print(re.findall(r'\t','hello\t word\n 123'))   # ['\t']
# \d and \D
print(re.findall(r'\d','hello\t word\n 123'))  # ['1', '2', '3']
print(re.findall(r'\D','hello\t word\n 123'))   # ['h', 'e', 'l', 'l', 'o', '\t', ' ', 'w', 'o', 'r', 'd', '\n', ' ']
# \A and \Z
print(re.findall(r'\Ahe','hello\t word\n 123'))  # ['he']; \A anchors at the start of the string, like ^
print(re.findall(r'123\Z','hello\t word\n 123'))  # ['123']; \Z anchors at the end of the string, like $
# ^ and $
print(re.findall(r'^h','hello\t word\n 123'))   # ['h']
print(re.findall(r'123$','hello\t word\n 123')) # ['123']
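^ and $ behave like \A and \Z only in the default mode. With the re.M (re.MULTILINE) flag, ^ and $ also match at the start and end of every line, while \A and \Z still refer to the whole string. A short sketch with an invented two-line string:

```python
import re

text = "first line\nsecond line"

default_start = re.findall(r"^\w+", text)           # only the start of the string
multiline_start = re.findall(r"^\w+", text, re.M)   # the start of every line
string_start = re.findall(r"\A\w+", text, re.M)     # \A ignores re.M
multiline_end = re.findall(r"\w+$", text, re.M)     # the end of every line
string_end = re.findall(r"\w+\Z", text, re.M)       # \Z ignores re.M

print(default_start)    # ['first']
print(multiline_start)  # ['first', 'second']
print(string_start)     # ['first']
print(multiline_end)    # ['line', 'line']
print(string_end)       # ['line']
```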

4. Repetition matching: | . | * | ? | .* | .*? | + | {n,m} |

# .
print(re.findall(r'a.b','a1b a*b a b aaab')) # ['a1b', 'a*b', 'a b', 'aab']
print(re.findall(r'a.b','a\nb'))       # []; . does not match a newline
print(re.findall(r'a.b','a\nb',re.S))  # ['a\nb']
print(re.findall(r'a.b','a\nb',re.DOTALL)) # ['a\nb']; same as the previous line
# *
print(re.findall(r'ab*','abbbb bbbbbbb a')) # ['abbbb', 'a']
# ?
print(re.findall(r'ab?','abbb a bbbbbb')) # ['ab', 'a']
# Match all numbers, including decimals
print(re.findall(r'\d+\.?\d*',"adf123as1.13dfa12adsf1asdf3")) # ['123', '1.13', '12', '1', '3']
# .* is greedy by default
print(re.findall(r'a.*b','a1b22222222b')) # ['a1b22222222b']
# .*? is non-greedy: recommended
print(re.findall(r'a.*?b','a1b22222222b')) # ['a1b']
# +
print(re.findall(r'ab+','a')) # []
print(re.findall(r'ab+','abbb a ab')) # ['abbb', 'ab']
# {n,m} sets how many times the preceding character may repeat
print(re.findall(r'ab{2}','abbb')) # ['abb']
print(re.findall(r'ab{2,4}','abbb')) # ['abbb']; greedy, so all three b's are taken
print(re.findall(r'ab{1,}','abbb')) # ['abbb']; 'ab{1,}' needs at least one b, same as 'ab+'
print(re.findall(r'ab{0,}','abbb')) # ['abbb']; 'ab{0,}' allows zero or more b's, same as 'ab*'
# []
print(re.findall(r'a[1*-]b','a1b a*b a-b')) # ['a1b', 'a*b', 'a-b']; inside [] these are literal characters, and an unescaped - should be placed at the start or end of []
print(re.findall(r'a[^1*-]b','a1b a*b a-b a=b')) # ['a=b']; ^ at the start of [] negates the class
print(re.findall(r'a[a-z]b','a1b a*b a-b a=b aeb')) # ['aeb']
print(re.findall(r'a[a-zA-Z]b','a1b a*b a-b a=b aeb aEb')) # ['aeb', 'aEb']
print(re.findall(r'a\\c',r'a\c')) # ['a\\c']; the r prefix makes a raw string, so Python passes the backslashes to the regex engine unchanged, and the regex \\ matches one literal backslash
# (): grouping
print(re.findall(r'ab+','ababab123')) # ['ab', 'ab', 'ab']
print(re.findall(r'(ab)+123','ababab123')) # ['ab']; the whole of 'ababab123' matches, but findall returns only the group, which holds its last repetition
print(re.findall(r'(?:ab)+123','ababab123')) # ['ababab123']; with a capturing group findall returns the group's content, and ?: makes the group non-capturing so the whole match is returned
print(re.findall(r'href="(.*?)"','<a href="http://www.baidu.com">Click</a>'))# ['http://www.baidu.com']
print(re.findall(r'href="(?:.*?)"','<a href="http://www.baidu.com">Click</a>'))# ['href="http://www.baidu.com"']
# |
print(re.findall(r'compan(?:y|ies)','companies my company comparies'))   # ['companies', 'company']
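The grouping examples above use a single group. When a pattern has more than one capturing group, findall returns a list of tuples, and groups can be given names with (?P<name>...). A brief sketch on an invented key=value string:

```python
import re

text = "width=80 height=24 depth=3"

# Two capturing groups: findall returns one tuple per match.
pairs = re.findall(r"(\w+)=(\d+)", text)
print(pairs)  # [('width', '80'), ('height', '24'), ('depth', '3')]

# Named groups make the parts of a match easier to refer to.
match = re.search(r"(?P<key>\w+)=(?P<value>\d+)", text)
print(match.group("key"), match.group("value"))  # width 80
print(match.groupdict())                          # {'key': 'width', 'value': '80'}
```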

5. Functions provided by the re module

print(re.findall(r'e','make love') )      # ['e', 'e']; returns every match, collected in a list
print(re.search(r'e','make love').group()) # e; scans for the first match and returns a match object, whose group() method gives the matched string; search returns None if nothing matches
print(re.match(r'e','alex make love'))    # None; like search, but matches only at the start of the string, so search with ^ can replace match
print(re.split(r'[ab]','abcd'))            # ['', '', 'cd']; splits at each 'a' or 'b', and the empty strings are the pieces before 'a' and between 'a' and 'b'
print(re.sub(r'a','A','alex make love'))     # Alex mAke love; with no count given, every occurrence is replaced
print(re.sub(r'a','A','alex make love',count=1))    # Alex make love; count is the maximum number of replacements
print(re.sub(r'^(\w+)(.*?\s)(\w+)(.*?\s)(\w+)(.*?)$',r'\5\2\3\4\1','alex make love')) # love make alex
print(re.subn(r'a','A','alex make love'))    # ('Alex mAke love', 2); the result includes the number of replacements made
obj=re.compile(r'\d{2}')
print(obj.search('abc123eeee').group())   # 12
print(obj.findall('abc123eeee'))           # ['12']; reuses the compiled obj
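Because search and match return None when nothing matches, calling .group() directly on the result raises AttributeError in that case. The sketch below wraps the check in a small helper (first_number is a made-up name) and also shows finditer, which yields match objects with position information.

```python
import re

def first_number(text):
    """Hypothetical helper: return the first run of digits in text, or None."""
    match = re.search(r"\d+", text)
    return match.group() if match else None

print(first_number("abc123eeee45"))  # 123
print(first_number("no digits"))     # None

# finditer yields match objects, which also expose the match positions.
pattern = re.compile(r"\d+")
spans = [(m.group(), m.span()) for m in pattern.finditer("abc123eeee45")]
print(spans)  # [('123', (3, 6)), ('45', (10, 12))]
```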