Searching binlog files with Python
Published: 2019-06-12


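I often need to look up log entries in binlog files, so I wrote a simple script. On very large binlog files it may be slow, because it reads the full mysqlbinlog output of each file into a list; I have not thought of a better approach yet.

See the code for details: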

#!/usr/bin/python
# 2016-04-12
# search string in the binlogs
# usage:
# put this file into binlog-dir, exec as:
# "python test.py 111 123 update" or
# "python test.py 111 123 update and insert" or
# "python test.py 111 123 update or delete"
# the nums are the binlog-num.
# note: give the keywords in lowercase, because every line is lowercased before comparison.
import sys
import subprocess


def find_str(files):
    for file in files:
        comm = "mysqlbinlog {0}".format(file)
        # universal_newlines=True makes the pipe return text lines on Python 3 as well
        proc = subprocess.Popen(comm, shell=True, stdout=subprocess.PIPE,
                                universal_newlines=True)
        lines = proc.stdout.readlines()
        for line in lines:
            line = line.lower()
            if len(sys.argv) == 4:
                if sys.argv[3] + ' ' in line:
                    yield line
            elif len(sys.argv) == 6 and sys.argv[4] == "and":
                if sys.argv[3] + ' ' in line and sys.argv[5] + ' ' in line:
                    yield line
            elif len(sys.argv) == 6 and sys.argv[4] == "or":
                if sys.argv[3] + ' ' in line or sys.argv[5] + ' ' in line:
                    yield line


if __name__ == "__main__":
    start = sys.argv[1]
    end = sys.argv[2]
    # binlog files are named updatelog.000111, updatelog.000112, ...
    files = ["updatelog.{0:06d}".format(i) for i in range(int(start), int(end)+1)]
    f = find_str(files)
    for i in f:
        print(i)
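
The introduction mentions that the script can be slow on very large binlog files because the lines are held in a list. Below is a minimal sketch of one possible alternative: iterate over the pipe directly instead of calling readlines(), so only one line is in memory at a time. The names stream_lines and grep_stream are hypothetical and not part of the original script. The sketch assumes the command's output decodes cleanly with the locale's encoding, and I have not measured whether it is actually faster; it mainly avoids buffering a whole file's output.

```python
import subprocess


def stream_lines(cmd):
    """Yield the output lines of cmd one at a time.

    cmd is an argument list, for example ["mysqlbinlog", "updatelog.000111"].
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    try:
        # Iterating over the pipe reads line by line instead of building
        # the full list that readlines() would return.
        for line in proc.stdout:
            yield line
    finally:
        proc.stdout.close()
        proc.wait()


def grep_stream(cmd, keyword):
    """Yield the output lines of cmd that contain keyword, ignoring case."""
    keyword = keyword.lower()
    for line in stream_lines(cmd):
        if keyword in line.lower():
            yield line.rstrip("\n")
```

A call such as grep_stream(["mysqlbinlog", "updatelog.000111"], "update") would then be consumed with an ordinary for loop, just like find_str() above.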

 

Second, improved version. Because py2 does not support the yield from statement, gen_concatenate() may look a little roundabout; see the code for details:

#!/usr/bin/python
# 2016-04-12
# search string in the binlogs
# usage:
# put this file into binlog-dir, exec as
# "python test.py 111 123 update" or
# "python test.py 111 123 update and insert" or
# "python test.py 111 123 update or delete"
# the nums are the binlog-num.
# note: give the keywords in lowercase, because every line is lowercased before comparison.
import sys
import subprocess


def find_str(files):
    print(sys.argv)
    for file in files:
        comm = "mysqlbinlog {0}".format(file)
        # universal_newlines=True makes the pipe return text lines on Python 3 as well
        proc = subprocess.Popen(comm, shell=True, stdout=subprocess.PIPE,
                                universal_newlines=True)
        lines = proc.stdout.readlines()
        yield lines  # yields one list of lines per file; find_str() itself returns a generator


def gen_concatenate(lines):
    # flatten the per-file lists into a single stream of lines
    for i in lines:
        for it in i:
            yield it


def gen_grep(lines):
    for line in lines:
        line = line.lower()
        if len(sys.argv) == 4:
            if sys.argv[3] + ' ' in line:
                yield line
        elif len(sys.argv) == 6 and sys.argv[4] == "and":
            if sys.argv[3] + ' ' in line and sys.argv[5] + ' ' in line:
                yield line
        elif len(sys.argv) == 6 and sys.argv[4] == "or":
            if sys.argv[3] + ' ' in line or sys.argv[5] + ' ' in line:
                yield line


if __name__ == "__main__":
    start = sys.argv[1]
    end = sys.argv[2]
    files = ["updatelog.{0:06d}".format(i) for i in range(int(start), int(end)+1)]
    f = find_str(files)
    lines = gen_concatenate(f)
    greplines = gen_grep(lines)
    for i in greplines:
        print(i)
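
For comparison, here is a small sketch of what gen_concatenate() could look like where yield from is available (Python 3.3 and later). The second function, gen_concatenate_chain, is a hypothetical name for an alternative built on itertools.chain.from_iterable, which also exists in Python 2.6 and later. Neither is the original author's code; both only illustrate the flattening step.

```python
from itertools import chain


def gen_concatenate(groups):
    """Flatten an iterable of iterables into one stream of items."""
    for group in groups:
        # yield from delegates to the inner iterable,
        # replacing the explicit inner for loop.
        yield from group


def gen_concatenate_chain(groups):
    """Same result using the standard library instead of a hand-written loop."""
    return chain.from_iterable(groups)
```

Both variants stay lazy: they pull the next per-file list from find_str() only when the previous one is exhausted.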

 

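Script 1 is easier to follow, because it wraps all of the logic in a single function; the second version splits reading, concatenating, and filtering into separate generators.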

 

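Update:

This version matches with re regular expressions, because table names in the log are sometimes wrapped in backticks, such as `user`, and the earlier "keyword followed by a space" test misses them. See the code: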

#!/usr/bin/python
# 2016-04-27
# search string in the binlogs
# usage:
# put this file into binlog-dir, exec as
# "python test.py 111 123 update" or
# "python test.py 111 123 update and insert" or
# "python test.py 111 123 update or delete"
# the nums are the binlog-num.
import sys
import subprocess
import re


def find_str(files):
    print(sys.argv)
    for file in files:
        comm = "mysqlbinlog {0}".format(file)
        # universal_newlines=True makes the pipe return text lines on Python 3 as well
        proc = subprocess.Popen(comm, shell=True, stdout=subprocess.PIPE,
                                universal_newlines=True)
        lines = proc.stdout.readlines()
        yield lines


def gen_concatenate(lines):
    for i in lines:
        for it in i:
            yield it


def gen_grep(lines):
    for line in lines:
        line = line.lower()
        if len(sys.argv) == 4:
            # if sys.argv[3] + ' ' in line:
            #     yield line
            str1 = sys.argv[3]
            str1_match = re.search(str1, line, re.I)
            if str1_match is not None:
                yield line
        elif len(sys.argv) == 6 and sys.argv[4] == "and":
            # if sys.argv[3] + ' ' in line and sys.argv[5] + ' ' in line:
            #     yield line
            str1 = sys.argv[3]
            str2 = sys.argv[5]
            if re.search(r'\b{}\b'.format(str1), line, re.I) is not None and re.search(r'\b{}\b'.format(str2), line, re.I) is not None:
                yield line
        elif len(sys.argv) == 6 and sys.argv[4] == "or":
            # if sys.argv[3] + ' ' in line or sys.argv[5] + ' ' in line:
            #     yield line
            str1 = sys.argv[3]
            str2 = sys.argv[5]
            # fixed: this branch must combine the two matches with "or", not "and"
            if re.search(r'\b{}\b'.format(str1), line, re.I) is not None or re.search(r'\b{}\b'.format(str2), line, re.I) is not None:
                yield line


if __name__ == "__main__":
    start = sys.argv[1]
    end = sys.argv[2]
    files = ["updatelog.{0:06d}".format(i) for i in range(int(start), int(end)+1)]
    f = find_str(files)
    lines = gen_concatenate(f)
    greplines = gen_grep(lines)
    for i in greplines:
        print(i)
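
To illustrate why the word-boundary pattern helps with backticked table names, here is a small sketch. The helper contains_word is a hypothetical name, not part of the script above. Unlike the script, it passes the keyword through re.escape, which assumes the keyword is meant as literal text rather than as a regular expression.

```python
import re


def contains_word(keyword, line):
    """Return True if keyword occurs in line as a whole word, ignoring case."""
    # \b matches between a word character and a non-word character,
    # so a backtick, a space, or a parenthesis all count as boundaries.
    pattern = r"\b{}\b".format(re.escape(keyword))
    return re.search(pattern, line, re.I) is not None


backticked = "UPDATE `user` SET name='a'"
print(contains_word("user", backticked))                  # whole-word regex match
print("user " in backticked.lower())                      # the older substring test
print(contains_word("user", "update users set x=1"))      # longer identifier
```

Because the underscore counts as a word character, a name such as user_log does not match the keyword user either.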

 

posted on 2016-04-12 22:21

Reposted from: https://www.cnblogs.com/bvac/p/5384756.html
