How to quickly get the line counts of a large number of text files

Posted by a user  Posted: 2022-04-25 20:55

2 answers

懂视网  Time: 2022-05-10 09:12

The task: get the line count of a large file (hundreds of thousands of lines) in Python.

The simple approach:

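def file_len(fname):
    i = -1  # so that an empty file yields 0 instead of raising NameError
    with open(fname) as f:
        # enumerate numbers the lines from 0; the last index + 1 is the count
        for i, l in enumerate(f):
            pass
    return i + 1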
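
Since the question is about many files, the same loop can be applied to each file in turn. The following is only a minimal sketch, assuming Python 3; `count_lines_in_files` is a hypothetical helper name that does not appear in the original answer, and the files are opened in binary mode so that no text decoding is needed.

```python
def file_len(fname):
    """Count lines by iterating over the file object."""
    count = 0
    with open(fname, "rb") as f:
        # start=1 makes the last index equal to the number of lines seen
        for count, _ in enumerate(f, start=1):
            pass
    return count


def count_lines_in_files(paths):
    """Hypothetical helper: map each path to its line count."""
    return {path: file_len(path) for path in paths}
```

Called with a list of paths, it returns a dict from path to line count; an empty file maps to 0.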

A more efficient approach (buffered-read strategy):

First, the timing results (average seconds per call over 5 runs):

mapcount : 0.471799945831
simplecount : 0.634400033951
bufcount : 0.468800067902
opcount : 0.602999973297

So on Windows with Python 2.6, the buffered-read strategy (bufcount) appears to be the fastest, with mapcount nearly tied.

Here is the code:

from __future__ import with_statement  # only required on Python 2.5
import time
import mmap
from collections import defaultdict


def mapcount(filename):
    # Memory-map the file and count successful readline() calls.
    f = open(filename, "r+")
    buf = mmap.mmap(f.fileno(), 0)
    lines = 0
    readline = buf.readline
    while readline():
        lines += 1
    return lines


def simplecount(filename):
    # Iterate over the file object line by line.
    lines = 0
    for line in open(filename):
        lines += 1
    return lines


def bufcount(filename):
    # Read 1 MiB chunks and count the newline characters in each.
    f = open(filename)
    lines = 0
    buf_size = 1024 * 1024
    read_f = f.read  # loop optimization
    buf = read_f(buf_size)
    while buf:
        lines += buf.count('\n')
        buf = read_f(buf_size)
    return lines


def opcount(fname):
    # Same as the simple approach shown earlier.
    with open(fname) as f:
        for i, l in enumerate(f):
            pass
    return i + 1


# Run each function 5 times and report the average time per call.
counts = defaultdict(list)
for i in range(5):
    for func in [mapcount, simplecount, bufcount, opcount]:
        start_time = time.time()
        assert func("big_file.txt") == 1209138
        counts[func].append(time.time() - start_time)

for key, vals in counts.items():
    print("%s : %s" % (key.__name__, sum(vals) / float(len(vals))))
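
For readers on Python 3, the buffered-read idea can be sketched as follows. This is an illustrative variant, not the code that produced the timings above: it assumes binary mode (so the chunks are `bytes` and no decoding takes place) and exposes `buf_size` as a parameter, which the original does not.

```python
def bufcount(filename, buf_size=1024 * 1024):
    """Count newline bytes by reading fixed-size binary chunks."""
    lines = 0
    with open(filename, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for chunk in iter(lambda: f.read(buf_size), b""):
            lines += chunk.count(b"\n")
    return lines
```

Like the original bufcount, this counts newline characters, so a final line that lacks a trailing newline is not included in the total.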

Helpful user  Time: 2022-05-10 06:20


Write the text to a file, read each line with Line Input, and count the lines by checking their first letter.
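
The idea described here (read line by line, test the first letter, keep a count) can be sketched in Python as below. This is only an illustration under stated assumptions: `count_lines_starting_with` is a hypothetical name, the file is assumed to be UTF-8 encoded, and the comparison is case-sensitive.

```python
def count_lines_starting_with(path, letter, encoding="utf-8"):
    """Count the lines in a text file whose first character is `letter`."""
    matches = 0
    with open(path, encoding=encoding) as f:
        for line in f:
            # startswith is safe on empty lines, unlike indexing line[0]
            if line.startswith(letter):
                matches += 1
    return matches
```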
====Latest reply=====
You need three text boxes (Text1, Text2, Text3) and one button, Command1.
The code is below, for reference:
Private Sub Form_Load()
    Text1.Text = "Enter the path of the file to process here"
    Text2.Text = "Enter the first letter to search for here"
    Text3.Text = "The content of each line of the file is shown here"
End Sub

Private Sub Command1_Click()
    Dim everyline() As String
    Dim n As Integer
    Call dealtext(Text1.Text, everyline(), Text2.Text, n)
