
Python Modules

2021-03-05 21:02:08  Source: Internet

Tags: matching, python, re, module, import, print, path, findall

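I. What a module is

A Python module is a Python file whose name ends in .py and which contains Python object definitions and Python statements.

Modules let you organize Python code logically: grouping related code in one module makes it easier to use and to understand. A module can define functions, classes, and variables, and it can also contain executable code.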
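
As a rough illustration of the definition above, the sketch below writes a tiny module to a temporary folder and imports it. The module name demo_greeting and its contents are hypothetical, made up only for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module source, invented for this illustration.
MODULE_SOURCE = '''
GREETING = "hello"          # a module-level variable

def greet(name):            # a module-level function
    return f"{GREETING}, {name}"
'''

with tempfile.TemporaryDirectory() as tmp:
    # A module is just a .py file: write one into a temporary folder.
    Path(tmp, "demo_greeting.py").write_text(MODULE_SOURCE, encoding="utf-8")
    sys.path.insert(0, tmp)            # let the import system search the folder
    importlib.invalidate_caches()
    try:
        demo_greeting = importlib.import_module("demo_greeting")
    finally:
        sys.path.remove(tmp)

message = demo_greeting.greet("world")
print(message)                 # hello, world
print(demo_greeting.GREETING)  # hello
```

In everyday code the file would simply sit next to your script and be loaded with a plain import statement; the temporary folder is only there to keep the sketch self-contained.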

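II. Importing modules

1. The import statement

For example: import time

2. The from ... import * statement

Imports every public name from a module, for example all the classes, functions, and variables of the time module (names that start with an underscore are skipped unless the module lists them in __all__):

from time import *

3. The from ... import ... statement

Syntax: from modname import name1, name2, ... nameN

For example: from time import sleep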
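
A small sketch of how the import forms differ in the names they bind, using the time module from the examples above. The variable names are chosen only for illustration.

```python
import time                      # form 1: binds the module itself
from time import gmtime, sleep   # form 3: binds selected names directly

year_via_module = time.gmtime(0).tm_year   # qualified access: module.name
year_via_name = gmtime(0).tm_year          # unqualified access to the imported name
same_function = time.sleep is sleep        # both names refer to one function object

print(year_via_module, year_via_name, same_function)  # 1970 1970 True
```

The qualified form keeps it obvious where a name comes from, which is one reason the star import is usually avoided outside of interactive sessions.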

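III. Built-in modules

1. time

# time() returns the current timestamp (seconds since the 1970 epoch, as a float)
# from time import time
# print(time())
import time
print(time.time())  # 1614844968.122427

# ctime() converts a timestamp (float seconds) to a string in time.asctime() form; with no argument it uses the current time
from time import ctime
print(ctime())  # Thu Mar  4 16:03:13 2021

# asctime() takes a time tuple and returns a readable 24-character string such as "Thu Dec 11 18:07:14 2008" (Thursday, 11 December 2008, 18:07:14); with no argument it uses the current local time
from time import asctime
print(asctime())  # Thu Mar  4 16:03:13 2021

# strftime() takes a time tuple and returns it as a readable string laid out according to the format argument; with no tuple it uses the current local time
import time
print(time.strftime('%Y-%m-%d %H:%M:%S'))  # 2021-03-04 16:08:44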
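%y  Two-digit year (00-99)
%Y  Four-digit year, with century (for example 2021)
%m  Month (01-12)
%d  Day of the month (01-31)
%H  Hour, 24-hour clock (00-23)
%I  Hour, 12-hour clock (01-12)
%M  Minute (00-59)
%S  Second (00-59, or up to 61 to allow for leap seconds)
%a  Locale's abbreviated weekday name
%A  Locale's full weekday name
%b  Locale's abbreviated month name
%B  Locale's full month name
%c  Locale's date and time representation
%j  Day of the year (001-366)
%p  Locale's equivalent of AM or PM
%U  Week number of the year (00-53), with Sunday as the first day of the week
%w  Weekday as a number (0-6), with Sunday as 0
%W  Week number of the year (00-53), with Monday as the first day of the week
%x  Locale's date representation
%X  Locale's time representation
%Z  Name of the current time zone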
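
To see a few of these directives produce predictable output, the sketch below formats a fixed time tuple (the Unix epoch in UTC) instead of the current time. It sticks to directives that do not depend on the locale.

```python
import time

epoch_utc = time.gmtime(0)  # struct_time for 1970-01-01 00:00:00 UTC

stamp = time.strftime('%Y-%m-%d %H:%M:%S', epoch_utc)
day_of_year = time.strftime('%j', epoch_utc)
weekday_number = time.strftime('%w', epoch_utc)   # Sunday is 0

print(stamp)           # 1970-01-01 00:00:00
print(day_of_year)     # 001
print(weekday_number)  # 4 (a Thursday)

# strptime() goes the other way: it parses a string back into a struct_time
parsed = time.strptime(stamp, '%Y-%m-%d %H:%M:%S')
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)  # 1970 1 1
```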

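2. random

# random() returns a random float in the range [0.0, 1.0)
import random
print(random.random())  # 0.7915018961358002

# randint(a, b) returns a random integer N with a <= N <= b (both ends included)
import random
print(random.randint(1, 2))

# randrange(start, stop, step) returns one random value from range(start, stop, step); step defaults to 1
import random
print(random.randrange(1, 100, 2))  # for example 69

# sample() picks the requested number of distinct items from a sequence and returns them as a list
import random
# a = (1, 2, 3, 4, 5)
# a = '123456'
# a = [1, 2, 3, 4, 5]
# A set has to be converted to a sequence first: passing a set directly raises TypeError on Python 3.11+
a = list({1, 2, 3, 4, 5, 6, 'b'})
print(random.sample(a, 2))

# choice() picks one random item from an indexable sequence
import random
a = ['a', 'b', 'c']
# a = (1, 2, 3, 4)
# a = 'abcd'
print(random.choice(a))

# shuffle() reorders a list in place and returns None
import random
a = ['a', 'b', 'c']
# a = [1, 2, 3, 4]
random.shuffle(a)
print(a)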
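
Because the functions above return different values on every run, a seeded generator is handy when results need to be repeatable, for example in tests. The following is a minimal sketch; the seed 42 and the variable names are arbitrary choices for this example.

```python
import random

rng = random.Random(42)        # private generator; 42 is an arbitrary seed

numbers = list(range(1, 11))
picked = rng.sample(numbers, 3)      # 3 distinct items, input left unchanged
odd = rng.randrange(1, 100, 2)       # one value from 1, 3, 5, ..., 99
rng.shuffle(numbers)                 # reorders the list in place
first = rng.choice(numbers)

# Re-creating the generator with the same seed repeats the same sequence
replay = random.Random(42).sample(list(range(1, 11)), 3)
print(picked == replay)  # True
```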

3. string

# digits: the string of the digits 0-9
import string
print(string.digits)  # 0123456789

# hexdigits: the digits 0-9 plus the letters a-f and A-F
import string
print(string.hexdigits)  # 0123456789abcdefABCDEF

# ascii_uppercase: the 26 uppercase letters
import string
print(string.ascii_uppercase)  # ABCDEFGHIJKLMNOPQRSTUVWXYZ

# ascii_lowercase: the 26 lowercase letters
import string
print(string.ascii_lowercase)  # abcdefghijklmnopqrstuvwxyz
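
One common use of these constants is as an alphabet for random strings. The sketch below combines them with random.choice; make_code and ALPHABET are hypothetical names introduced for this example.

```python
import random
import string

# Letters in both cases plus digits: 26 + 26 + 10 characters
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits

def make_code(length=6):
    """Return a random code of the given length drawn from ALPHABET."""
    return ''.join(random.choice(ALPHABET) for _ in range(length))

code = make_code()
print(len(ALPHABET))  # 62
print(code)           # six random letters and digits, different on each run
```

For passwords or tokens, the standard library's secrets module is the safer choice, since random is not designed for security purposes.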

4. base64 and hashlib

import base64
a = base64.b64encode(b"1234")
print(a)  # b'MTIzNA=='
b = base64.b64decode(a)
print(b)  # b'1234'
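
b64encode() works on bytes, so text has to be encoded first. A minimal round-trip sketch, assuming UTF-8 as the text encoding:

```python
import base64

text = 'python模块'
encoded = base64.b64encode(text.encode('utf-8'))     # str -> bytes -> Base64 bytes
decoded = base64.b64decode(encoded).decode('utf-8')  # and back again

print(encoded.decode('ascii'))  # Base64 output is plain ASCII
print(decoded == text)          # True
```

Base64 is an encoding, not encryption: anyone can decode it, so it only makes binary data safe to carry as text.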

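MD5 hashing: MD5 is a digest (hash) algorithm rather than reversible encryption. Run over a string, a file, or an archive, it produces a fixed-length 128-bit value that is, for practical purposes, unique to the input. Collisions can be constructed deliberately, though, so MD5 should not be relied on for security.

import hashlib
md5 = hashlib.md5()
md5.update('123456'.encode('utf-8'))
print(md5.hexdigest())

# Result:
# e10adc3949ba59abbe56e057f20f883e

# Check: the same bytes always produce the same digest
import hashlib

md5 = hashlib.md5()
md5.update('123456'.encode('utf-8'))
print(md5.hexdigest())

# Result:
# e10adc3949ba59abbe56e057f20f883e

# Check: different bytes produce a different digest (barring a collision)
import hashlib

md5 = hashlib.md5()
md5.update('12345'.encode('utf-8'))
print(md5.hexdigest())

# Result:
# 827ccb0eea8a706c4c34a16891f84e7b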

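If the data is large, you can call update() several times, one chunk at a time; the final result is the same as hashing everything at once:

import hashlib
md5 = hashlib.md5()
md5.update('how to use md5 in '.encode('utf-8'))  # update() needs bytes, not str
md5.update('python hashlib?'.encode('utf-8'))
print(md5.hexdigest())

MD5 is the most common digest algorithm. It is fast and always produces a 128-bit (16-byte) result, usually written as a 32-character hexadecimal string. Another common digest algorithm is SHA1, which is called in exactly the same way as MD5:

import hashlib
sha1 = hashlib.sha1()
sha1.update('how to use sha1 in '.encode('utf-8'))
sha1.update('python hashlib?'.encode('utf-8'))
print(sha1.hexdigest())

The SHA1 result is 160 bits (20 bytes), usually written as a 40-character hexadecimal string. SHA256 and SHA512 are more secure than SHA1; the more secure algorithms generally run slower and produce longer digests.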
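
The chunked update() pattern is what makes it possible to hash files that do not fit in memory. The sketch below hashes a stream piece by piece and checks the result against a one-shot digest. digest_stream is a hypothetical helper written for this example, and an in-memory BytesIO stands in for a real file opened in binary mode.

```python
import hashlib
import io

def digest_stream(stream, algorithm='sha256', chunk_size=8192):
    """Hash a binary stream chunk by chunk and return the hex digest."""
    hasher = hashlib.new(algorithm)
    # read() returns b'' at end of stream, which stops the iteration
    for chunk in iter(lambda: stream.read(chunk_size), b''):
        hasher.update(chunk)
    return hasher.hexdigest()

data = b'how to use md5 in python hashlib?'
streamed = digest_stream(io.BytesIO(data), 'md5', chunk_size=8)
one_shot = hashlib.md5(data).hexdigest()

print(streamed == one_shot)                   # True
print(len(digest_stream(io.BytesIO(data))))   # 64 hex characters for SHA256
```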

5. The os module

# getcwd() returns the current working directory as an absolute path
import os
print(os.getcwd())  # F:\flask_projects\dcs

# os.path.isfile() returns True if the path is an existing file, otherwise False
import os
a_path = r'F:\flask_projects\dcs'
b_path = r'F:\flask_projects\dcs\python模块.py'
print(os.path.isfile(a_path))  # False
print(os.path.isfile(b_path))  # True

# os.path.exists() returns True if the folder or file exists
import os
a_path = r'F:\flask_projects\dcs'
b_path = r'F:\flask_projects\dcs\dcs1'
print(os.path.exists(a_path))  # True
# print(os.path.exists(b_path))  # False
# Create the folder if it does not exist
if not os.path.exists(b_path):
    os.mkdir(b_path)

a_path = r'F:\flask_projects\dcs\python模块.py'
b_path = r'F:\flask_projects\dcs\python模块1.py'
print(os.path.exists(a_path))  # True
print(os.path.exists(b_path))  # False
# Create the file if it does not exist
import os
import codecs
# codecs deals with encoding conversion; codecs.open() opens a file with the given encoding
if not os.path.exists(b_path):
    with codecs.open(b_path, 'a+', encoding='utf-8') as f:
        f.write('')  # write() needs a string argument; opening in 'a+' mode already creates the file

# mkdir() creates a folder; it raises FileExistsError if the folder already exists, and returns None
import os
a_path = r'F:\flask_projects\dcs\shishi'
os.mkdir(a_path)

# remove() deletes a file
import os
a_path = r'F:\flask_projects\dcs\haha.txt'
os.remove(a_path)

# path.isdir() returns True if the path is an existing folder, otherwise False
import os
a_path = r'F:\flask_projects\dcs\haha'
print(os.path.isdir(a_path)) # True

# listdir() returns a list of all files and folders in the given directory
import os
a_path = r'F:\flask_projects\dcs'
print(os.listdir(a_path))
# ['.idea', 'dcs01.py', 'dcs02.py', 'dcs03.py', 'haha', 'python函数.py', 'python模块.py', '作业.py']

# rename() renames a file or folder
import os
a_path = r'F:\flask_projects\dcs\haha'
b_path = r'F:\flask_projects\dcs\xixi'
os.rename(a_path,b_path)


# path.split() splits a path into (directory, last component) and returns a tuple
import os
a_path = r'F:\flask_projects\dcs\xixi'
b_path = r'F:\flask_projects\dcs\dcs03.py'
print(os.path.split(a_path))  # ('F:\\flask_projects\\dcs', 'xixi')
print(os.path.split(b_path))  # ('F:\\flask_projects\\dcs', 'dcs03.py')
# path.splitext() splits off the file extension
print(os.path.splitext(b_path))  # ('F:\\flask_projects\\dcs\\dcs03', '.py')

# path.join() joins path components
a_path = 'F:\\flask_projects\\dcs'
b_path = 'dcs03.py'
print(os.path.join(a_path, b_path))  # F:\flask_projects\dcs\dcs03.py
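
The snippets above depend on paths that exist only on the author's machine. The sketch below runs the same calls inside a temporary folder so it can be tried anywhere; the folder and file names (dcs1, notes.txt, renamed.txt) are placeholders chosen for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    sub = os.path.join(base, 'dcs1')
    if not os.path.exists(sub):        # create the folder only if it is missing
        os.mkdir(sub)

    file_path = os.path.join(sub, 'notes.txt')
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write('hello')

    renamed = os.path.join(sub, 'renamed.txt')
    os.rename(file_path, renamed)

    listing = os.listdir(sub)          # ['renamed.txt']
    is_file = os.path.isfile(renamed)  # True
    is_dir = os.path.isdir(sub)        # True

    head, tail = os.path.split(renamed)   # tail is 'renamed.txt'
    stem, ext = os.path.splitext(tail)    # ('renamed', '.txt')

    os.remove(renamed)
    gone = not os.path.exists(renamed)    # True

print(listing, is_file, is_dir, tail, stem, ext, gone)
```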

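6. The re module

A regular expression combines symbols that have special meanings into a pattern that describes characters or strings; put another way, a regular expression is a rule describing a class of things. Regular expression support is built into Python through the re module. A pattern is compiled into a series of bytecodes, which are then executed by a matching engine written in C.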

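Metacharacter	Matches
\w	A letter (including Chinese characters), digit, or underscore
\W	Any character that is not a letter (including Chinese characters), digit, or underscore
\s	Any whitespace character
\S	Any non-whitespace character
\d	A digit
\D	Any non-digit character
\A	The start of the string
\Z	The end of the string only (unlike $, not before a trailing newline)
\n	A newline
\t	A tab
^	The start of the string
$	The end of the string, or just before a newline at the end of the string
.	Any character except a newline; with the re.DOTALL flag it matches any character, newline included
[...]	Any one character in the set
[^...]	Any one character not in the set
*	Zero or more of the preceding expression (greedy)
+	One or more of the preceding expression (greedy)
?	Zero or one of the preceding expression; placed after another quantifier (*?, +?, {n,m}?) it makes that quantifier non-greedy
{n}	Exactly n of the preceding expression
{n,m}	From n to m of the preceding expression (greedy)
a|b	Either a or b
()	The expression inside the parentheses, treated as a group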

import re
# \w and \W
print(re.findall(r'\w', '世界jx 12*() _'))  # ['世', '界', 'j', 'x', '1', '2', '_']
print(re.findall(r'\W', '世界jx 12*() _'))  # [' ', '*', '(', ')', ' ']

# \s and \S
print(re.findall(r'\s', '世界barry*(_ \t \n'))  # [' ', '\t', ' ', '\n']
print(re.findall(r'\S', '世界barry*(_ \t \n'))  # ['世', '界', 'b', 'a', 'r', 'r', 'y', '*', '(', '_']


# \d and \D
print(re.findall(r'\d', '1234567890 alex *（_'))  # ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
print(re.findall(r'\D', '1234567890 alex *（_'))  # [' ', 'a', 'l', 'e', 'x', ' ', '*', '（', '_']

# \A and ^
print(re.findall(r'\Ahel', 'hello 世界 -_- 666'))  # ['hel']
print(re.findall(r'^hel', 'hello 世界 -_- 666'))  # ['hel']


# \Z, \z and $
print(re.findall(r'666\Z', 'hello 世界 *-_-* 666'))  # ['666']
# print(re.findall(r'666\z', 'hello 世界 *-_-* 666'))  # \z raises re.error (bad escape) on most Python 3 releases; use \Z instead
print(re.findall(r'666$', 'hello 世界 *-_-* 666'))  # ['666']

# \n and \t
print(re.findall(r'\n', 'hello \n 世界 \t*-_-*\t \n666'))  # ['\n', '\n']
print(re.findall(r'\t', 'hello \n 世界 \t*-_-*\t \n666'))  # ['\t', '\t']
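
The anchors only behave differently when the text contains newlines, which none of the inputs above do at the end. A short sketch of the difference, with sample strings made up for this example:

```python
import re

text = 'hello 666\n'                           # note the trailing newline
dollar = re.findall(r'666$', text)             # $ also matches just before a final newline
strict_end = re.findall(r'666\Z', text)        # \Z matches only at the very end

lines = 'a1\nb2'
caret_multi = re.findall(r'^\w+', lines, re.MULTILINE)    # ^ matches at every line start
start_multi = re.findall(r'\A\w+', lines, re.MULTILINE)   # \A ignores MULTILINE

print(dollar)       # ['666']
print(strict_end)   # []
print(caret_multi)  # ['a1', 'b2']
print(start_multi)  # ['a1']
```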


# Repetition

# . ? * + {m,n} .* .*?

import re
# . matches any character except a newline (with re.DOTALL it also matches \n)
print(re.findall(r'a.b', 'ab aab a*b a2b a牛b a\nb'))  # ['aab', 'a*b', 'a2b', 'a牛b']
print(re.findall(r'a.b', 'ab aab a*b a2b a牛b a\nb', re.DOTALL))  # ['aab', 'a*b', 'a2b', 'a牛b', 'a\nb']


# ? matches 0 or 1 of the expression to its left
print(re.findall(r'a?b', 'ab aab abb aaaab a牛b aba**b'))  # ['ab', 'ab', 'ab', 'b', 'ab', 'b', 'ab', 'b']


# * matches 0 or more of the expression to its left (greedy)
print(re.findall(r'a*b', 'ab aab aaab abbb'))  # ['ab', 'aab', 'aaab', 'ab', 'b', 'b']
print(re.findall(r'ab*', 'ab aab aaab abbbbb'))  # ['ab', 'a', 'ab', 'a', 'a', 'ab', 'abbbbb']

# + matches 1 or more of the expression to its left (greedy)
print(re.findall(r'a+b', 'ab aab aaab abbb'))  # ['ab', 'aab', 'aaab', 'ab']


# {m,n} matches m to n of the expression to its left (greedy)
print(re.findall(r'a{2,4}b', 'ab aab aaab aaaaabb'))  # ['aab', 'aaab', 'aaaab']


# .* is greedy: it matches as much as possible, here from the first a to the last b
print(re.findall(r'a.*b', 'ab aab a*()b'))  # ['ab aab a*()b']


# .*? Here the ? does not mean "0 or 1 of the character to its left".
# It modifies the greedy .* so that it matches as little as possible (non-greedy). Recommended!
print(re.findall(r'a.*?b', 'ab a1b a*()b, aaaaaab'))  # ['ab', 'a1b', 'a*()b', 'aaaaaab']
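
The difference between greedy and non-greedy matching matters most when a delimiter occurs more than once. A small sketch on a made-up HTML fragment:

```python
import re

html = '<b>one</b> and <b>two</b>'

greedy = re.findall(r'<b>.*</b>', html)     # runs from the first <b> to the last </b>
lazy = re.findall(r'<b>.*?</b>', html)      # stops at the nearest </b>
inner = re.findall(r'<b>(.*?)</b>', html)   # a group keeps only the text inside

print(greedy)  # ['<b>one</b> and <b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']
print(inner)   # ['one', 'two']
```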


import re
# []: one pair of square brackets matches exactly one character out of those listed inside
# - inside [] denotes a range; to match a literal -, do not put it between two characters (put it first or last, or escape it)
# ^ at the start of [] negates the set
print(re.findall(r'a.b', 'a1b a3b aeb a*b arb a_b'))  # ['a1b', 'a3b', 'aeb', 'a*b', 'arb', 'a_b']
print(re.findall(r'a[abc]b', 'aab abb acb adb afb a_b'))  # ['aab', 'abb', 'acb']
print(re.findall(r'a[0-9]b', 'a1b a3b aeb a*b arb a_b'))  # ['a1b', 'a3b']
print(re.findall(r'a[a-z]b', 'a1b a3b aeb a*b arb a_b'))  # ['aeb', 'arb']
print(re.findall(r'a[a-zA-Z]b', 'aAb aWb aeb a*b arb a_b'))  # ['aAb', 'aWb', 'aeb', 'arb']
print(re.findall(r'a[0-9][0-9]b', 'a11b a12b a34b a*b arb a_b'))  # ['a11b', 'a12b', 'a34b']
print(re.findall(r'a[*-+]b', 'a-b a*b a+b a/b a6b'))  # ['a*b', 'a+b']
# Placed first, - is a literal hyphen rather than a range operator
print(re.findall(r'a[-*+]b', 'a-b a*b a+b a/b a6b'))  # ['a-b', 'a*b', 'a+b']
print(re.findall(r'a[^a-z]b', 'acb adb a3b a*b'))  # ['a3b', 'a*b']
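
There are three ways to get a literal hyphen into a character set, and one way to get a range by accident. A brief sketch; the sample text is made up for this example:

```python
import re

text = 'a-b a*b a+b a,b a/b'

first = re.findall(r'a[-*+]b', text)      # hyphen first: literal
last = re.findall(r'a[*+-]b', text)       # hyphen last: literal
escaped = re.findall(r'a[*\-+]b', text)   # escaped hyphen: literal anywhere
as_range = re.findall(r'a[*-+]b', text)   # hyphen in the middle: the range from * to +

print(first)     # ['a-b', 'a*b', 'a+b']
print(last)      # ['a-b', 'a*b', 'a+b']
print(escaped)   # ['a-b', 'a*b', 'a+b']
print(as_range)  # ['a*b', 'a+b']
```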

import re
# Exercise:
# From the string 'alex_sb ale123_sb wusir12_sb wusir_sb ritian_sb', extract alex, wusir, and ritian
print(re.findall(r'([a-z]+)_sb', 'alex_sb ale123_sb wusir12_sb wusir_sb ritian_sb'))  # ['alex', 'wusir', 'ritian']


# Groups:

# () defines a sub-pattern; findall returns only the text the group captured
print(re.findall(r'(.*?)_sb', 'alex_sb wusir_sb 日天_sb'))  # ['alex', ' wusir', ' 日天']

# Practical example:
print(re.findall(r'href="(.*?)"', '<a href="http://www.baidu.com">点击</a>'))  # ['http://www.baidu.com']


# | matches either the left or the right alternative
print(re.findall(r'alex|太白|wusir', 'alex太白wusiraleeeex太太白odlb'))  # ['alex', '太白', 'wusir', '太白']
print(re.findall(r'compan(y|ies)', 'Too many companies have gone bankrupt, and the next one is my company'))  # ['ies', 'y']
print(re.findall(r'compan(?:y|ies)', 'Too many companies have gone bankrupt, and the next one is my company'))  # ['companies', 'company']
# Adding ?: at the start of a group makes it non-capturing, so findall returns the whole match rather than just the group's contents.
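
When a pattern has more than one capturing group, findall returns a list of tuples, one item per group. The sketch below uses that to parse key=value pairs; the sample text reuses values that appear elsewhere in this article and is otherwise made up.

```python
import re

text = 'host=192.168.1.1 port=21 user=root'

pairs = re.findall(r'(\w+)=(\S+)', text)         # two groups -> list of 2-tuples
settings = dict(pairs)                            # tuples convert directly to a dict
whole = re.findall(r'(?:\w+)=(?:\S+)', text)      # non-capturing -> whole matches

print(pairs)             # [('host', '192.168.1.1'), ('port', '21'), ('user', 'root')]
print(settings['port'])  # 21
print(whole)             # ['host=192.168.1.1', 'port=21', 'user=root']
```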

Common functions

import re

# 1 findall: finds every match and returns them as a list
print(re.findall('a', 'alexwusirbarryeval'))  # ['a', 'a', 'a']


# 2 search: stops at the first match and returns a match object; call group() on it to get the matched string. Returns None if nothing matches.
print(re.search('sb|alex', 'alex sb sb barry 日天'))  # <re.Match object; span=(0, 4), match='alex'>
print(re.search('alex', 'alex sb sb barry 日天').group())  # alex


# 3 match: like search, but only matches at the start of the string and returns None otherwise; search with ^ can replace it
print(re.match('barry', 'barry alex wusir 日天'))  # <re.Match object; span=(0, 5), match='barry'>
print(re.match('barry', 'barry alex wusir 日天').group()) # barry


# 4 split: splits on any of the given separators
print(re.split('[ ：:,;；， ]','xubin xiaosir,日天， 女神;世界：男神'))  # ['xubin', 'xiaosir', '日天', '', '女神', '世界', '男神']


# 5 sub: substitution

print(re.sub('男神', '世界', '男神太帅了'))
# 世界太帅了

print(re.sub(r'([a-zA-Z]+)([^a-zA-Z]+)([a-zA-Z]+)([^a-zA-Z]+)([a-zA-Z]+)', r'\5\2\3\4\1', 'alex is sb'))
# sb is alex

# 6 compile: compiles a pattern once so that the resulting object can be reused
obj = re.compile(r'\d{2}')
print(obj.search('abc123eeee').group())  # 12
print(obj.findall('abc123eeee'))  # ['12'], reusing obj


# 7 finditer: returns an iterator over the match objects
ret = re.finditer(r'\d', 'ds3sy4784a')
print(ret)  # <callable_iterator object at 0x10195f940>
print(next(ret).group())  # first result
print(next(ret).group())  # second result
print([i.group() for i in ret])  # all remaining results
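
finditer is the function to reach for when the position of each match matters, because every item is a full match object. A short sketch combining it with a compiled pattern:

```python
import re

number = re.compile(r'\d+')          # compiled once, reused below

# Each match object knows both the matched text and where it was found
found = [(m.group(), m.span()) for m in number.finditer('ds3sy4784a')]
print(found)  # [('3', (2, 3)), ('4784', (5, 9))]

# The same compiled object also offers sub(), split(), and the rest
masked = number.sub('#', 'ds3sy4784a')
print(masked)  # ds#sy#a
```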

Named group examples

# Matching with named groups:
ret = re.search(r"<(?P<tag_name>\w+)>\w+</(?P=tag_name)>", "<h1>hello</h1>")
# A group can be given a name with the (?P<name>...) form
# The matched text is then available through group('name')
print(ret.group('tag_name'))  # h1
print(ret.group())  # <h1>hello</h1>

ret = re.search(r"<(\w+)>\w+</\1>", "<h1>hello</h1>")
# Without names, a backreference such as \1 refers to a group by number and requires the same text that the group matched
# The matched text is available through group(number)
print(ret.group(1))  # h1
print(ret.group())  # <h1>hello</h1>
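
Named groups pay off when a pattern has several parts. The sketch below pulls a date apart with groupdict() and rearranges it with \g<name> references in sub(); the group names and sample text are chosen for this example.

```python
import re

date_pattern = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')

match = date_pattern.search('posted 2021-03-05 21:02:08')
parts = match.groupdict()            # group names become dictionary keys
print(parts)  # {'year': '2021', 'month': '03', 'day': '05'}

# In the replacement string, \g<name> refers to a named group
reordered = date_pattern.sub(r'\g<day>/\g<month>/\g<year>', 'posted 2021-03-05')
print(reordered)  # posted 05/03/2021
```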

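7. xlrd

xlrd is a third-party package (not part of the standard library) for reading data from Excel workbooks. It cannot write; writing needs the xlwt module.
It reads the xls format; xlsx files are supported only by xlrd versions before 2.0. CSV files are not supported and can be handled with Python's built-in csv module.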

import xlrd
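data = xlrd.open_workbook('info.xlsx')  # open an xlsx workbook and get a Book object (xlsx needs xlrd < 2.0)
print(data.sheets()[0])  # get a sheet object by its position in the list of sheets
print(data.sheet_by_index(1))  # get a sheet object by index (here the second sheet)
print(data.sheet_by_name("Sheet1"))  # get a sheet object by name
print(data.sheet_names())  # names of all the sheets in the workbook
s = data.sheets()[0]  # select the first sheet
print(s.name)  # sheet name, e.g. sheet1
print(s.nrows)  # number of rows, e.g. 10
print(s.ncols)  # number of columns, e.g. 4

# sheet.row_values(0) returns the values of the first row; for merged cells the first cell holds the value and the others are empty
# sheet.row(0) returns the cells of the row, each with its type and value
# sheet.row_types(0) returns the cell types of the row
print(s.row_values(0))  # the whole first row: ['Host', 'User', 'PassWord', 'Port']
print(s.row_values(0, 0, 3))  # first row, columns 1 to 3
print(s.row(0))  # [text:'Host', text:'User', text:'PassWord', text:'Port']
print(s.row_types(1))  # array('B', [1, 1, 2, 2])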


print(s.row_values(0, 6, 10))   # row 1, columns 7 to 10
print(s.col_values(0, 0, 5))    # column 1, rows 1 to 5
print(s.row_slice(2, 0, 2))    # types and values of the cells in row 3, columns 1 to 2

# Reading a specific cell
print(s.cell_value(1, 2))  # 123.0, the value at row 2, column 3
print(s.cell(1, 2).value)  # 123.0, the value at row 2, column 3
print(s.row(1)[2].value)  # 123.0

print(s.cell(1, 2).ctype)  # 2, the type of the value at row 2, column 3
print(s.cell_type(1, 2))  # 2
print(s.row(1)[2].ctype)  # 2

print(s.nrows)  # number of valid rows in the sheet
print(s.row(1))  # type and value of each cell in the second row
# [text:'192.168.1.1', text:'root', number:123.0, number:21.0]
print(s.row_slice(1))  # same usage as row()
print(s.row_values(0, 0))  # values of the given row, starting from the given column
print(s.row_len(0))  # number of valid cells in the given row

# Get the values of one column (here the first) with a for loop
for i in range(s.nrows):
    print(s.row_values(i)[0])
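
For the CSV case mentioned at the start of this section, the standard library's csv module covers similar ground without any extra install. A minimal sketch: the sample rows mirror the header and first data row shown in the xlrd output above, and an in-memory string stands in for a real file.

```python
import csv
import io

# Illustrative data; with a real file, use open('info.csv', newline='') instead
raw = 'Host,User,PassWord,Port\n192.168.1.1,root,123,21\n'

reader = csv.DictReader(io.StringIO(raw))   # first row supplies the column names
rows = list(reader)
hosts = [row['Host'] for row in rows]       # one column, like the for loop above

print(reader.fieldnames)  # ['Host', 'User', 'PassWord', 'Port']
print(hosts)              # ['192.168.1.1']
print(rows[0]['Port'])    # 21 (csv always returns strings)
```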

Source: https://www.cnblogs.com/shijiekuaile/p/14488472.html
