
Speeding up conversion of RTF to plain text

2019-11-02 12:07:26  Source: Internet

Tags: optimization, richtextbox, c#, .net-4.0


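I have to convert a large amount of text that is stored in a database as RTF into plain text. I am using the method described in this MSDN article, but I think I have hit a roadblock (and I do not believe it is in my code, but in the .NET Framework itself).

I have the following function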

    //convert RTF text to plain text
    public static string RtfTextToPlainText(string FormatObject)
    {
        System.Windows.Forms.RichTextBox rtfBox = new System.Windows.Forms.RichTextBox();
        rtfBox.Rtf = FormatObject;
        FormatObject = rtfBox.Text; //This is line 494 for later reference for the stack traces.
        rtfBox.Dispose();

        return FormatObject;
    }

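It should be completely self-contained and should not block on anything. The project I am working on has millions of records to process, so I split the work into batches and process them in parallel using Tasks. It was still running slowly, so I broke into the code and found the following.

Here is the call stack of the waiting task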

[In a sleep, wait, or join] 
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.CreateHandle(System.Windows.Forms.CreateParams cp) + 0x242 bytes 
System.Windows.Forms.dll!System.Windows.Forms.Control.CreateHandle() + 0x2b2 bytes  
System.Windows.Forms.dll!System.Windows.Forms.TextBoxBase.CreateHandle() + 0x54 bytes   
System.Windows.Forms.dll!System.Windows.Forms.RichTextBox.Rtf.set(string value) + 0x68 bytes    
>CvtCore.dll!CvtCore.StandardFunctions.Str.RtfTextToPlainText(object Expression) Line 494   C#

Here is the call stack of thread 816

[Managed to Native Transition]  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DefWndProc(ref System.Windows.Forms.Message m) + 0x9e bytes  
System.Windows.Forms.dll!System.Windows.Forms.Control.WmWindowPosChanged(ref System.Windows.Forms.Message m) + 0x39 bytes   
System.Windows.Forms.dll!System.Windows.Forms.Control.WndProc(ref System.Windows.Forms.Message m) + 0x51b bytes 
System.Windows.Forms.dll!System.Windows.Forms.RichTextBox.WndProc(ref System.Windows.Forms.Message m) + 0x5c bytes  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DebuggableCallback(System.IntPtr hWnd, int msg, System.IntPtr wparam, System.IntPtr lparam) + 0x15e bytes    
[Native to Managed Transition]  
[Managed to Native Transition]  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DefWndProc(ref System.Windows.Forms.Message m) + 0x9e bytes  
System.Windows.Forms.dll!System.Windows.Forms.Control.WmCreate(ref System.Windows.Forms.Message m) + 0x1c bytes 
System.Windows.Forms.dll!System.Windows.Forms.Control.WndProc(ref System.Windows.Forms.Message m) + 0x50b bytes 
System.Windows.Forms.dll!System.Windows.Forms.RichTextBox.WndProc(ref System.Windows.Forms.Message m) + 0x5c bytes  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DebuggableCallback(System.IntPtr hWnd, int msg, System.IntPtr wparam, System.IntPtr lparam) + 0x15e bytes    
[Native to Managed Transition]  
[Managed to Native Transition]  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.CreateHandle(System.Windows.Forms.CreateParams cp) + 0x44c bytes 
System.Windows.Forms.dll!System.Windows.Forms.Control.CreateHandle() + 0x2b2 bytes  
System.Windows.Forms.dll!System.Windows.Forms.TextBoxBase.CreateHandle() + 0x54 bytes   
System.Windows.Forms.dll!System.Windows.Forms.RichTextBox.Rtf.set(string value) + 0x68 bytes    
>CvtCore.dll!CvtCore.StandardFunctions.Str.RtfTextToPlainText(object Expression) Line 494   C#

Why is task 2 blocked on task 4 at line 494? Shouldn't the two be completely independent of each other?

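Note

I captured these stack traces and screenshots in release mode; I cannot seem to pause at the right moment to catch the same thing happening in debug mode. Could this also be the cause of the slowness? The profiler says my program spends 83.2% of its time in System.Windows.Forms.RichTextBox.set_Rtf(string) (a child function called from line 494).

Any suggestions on how to speed up stripping the RTF formatting would be greatly appreciated.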

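P.S.

I am currently rewriting this so that each thread has its own text box that is never disposed, instead of creating a new text box on every call to the function. I hope that will speed it up considerably, and I will update this post with details once I have done it.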

Update

I solved my own problem (see the answer below), but here is how I start my tasks

    //create/start consumer tasks
    for (int i = 0; i < ThreadsPreProducer; i++)
    {
        //create worker and task
        WorkerObject NewWorkerObject = new WorkerObject(colSource, FormatObjectEvent, UpdateModule);
        Task WorkerTask = new Task(NewWorkerObject.DoWork);
        WorkerTasks.Add(WorkerTask);
        WorkerTask.Start();
    }


    //create/start producer task
    ProducerObject NewProducerObject = new ProducerObject(colSource, SourceQuery, ConnectionString, PreProcessor, UpdateModule, RowNameIndex);
    Task ProducerTask = new Task(NewProducerObject.DoWork);
    WorkerTasks.Add(ProducerTask);
    ProducerTask.Start();


    //block while producer runs
    ProducerTask.Wait();

    //create/start post-producer consumer tasks
    for (int i = 0; i < ThreadsPostProducer; i++)
    {
        //create worker and task
        WorkerObject NewWorkerObject = new WorkerObject(colSource, FormatObjectEvent, UpdateModule);
        Task WorkerTask = new Task(NewWorkerObject.DoWork);
        WorkerTasks.Add(WorkerTask);
        WorkerTask.Start();
    }

    //block until all tasks are done
    Task.WaitAll(WorkerTasks.ToArray());

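In my case it uses a producer/consumer model with 1 producer and 4 consumers (2 start when the producer starts, and 2 more start after the producer has finished, to speed up the work once the producer has released its system resources).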
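
For readers unfamiliar with the pattern, the following is a minimal sketch of a one-producer, several-consumer pipeline. It is not the code used above: the class name `ProducerConsumerSketch`, the `Process` stand-in, and the use of `BlockingCollection<string>` as the shared queue are all assumptions made for illustration, and for simplicity every consumer is started up front rather than in two waves.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ProducerConsumerSketch
{
    // Hypothetical stand-in for the real per-record work (e.g. RTF to plain text).
    static string Process(string record)
    {
        return record.ToUpperInvariant();
    }

    public static List<string> Run(IEnumerable<string> records, int consumerCount)
    {
        // Bounded queue: the producer blocks when consumers fall too far behind.
        var queue = new BlockingCollection<string>(boundedCapacity: 1000);
        var results = new ConcurrentBag<string>();
        var tasks = new List<Task>();

        // Start the consumers first so they can drain the queue as it fills.
        for (int i = 0; i < consumerCount; i++)
        {
            tasks.Add(Task.Factory.StartNew(() =>
            {
                // Blocks while the queue is empty; ends once CompleteAdding
                // has been called and every queued item has been taken.
                foreach (string record in queue.GetConsumingEnumerable())
                {
                    results.Add(Process(record));
                }
            }, TaskCreationOptions.LongRunning));
        }

        // Single producer: enqueue every record, then signal completion.
        tasks.Add(Task.Factory.StartNew(() =>
        {
            try
            {
                foreach (string record in records)
                {
                    queue.Add(record);
                }
            }
            finally
            {
                // Always release the consumers, even if enumeration throws.
                queue.CompleteAdding();
            }
        }));

        Task.WaitAll(tasks.ToArray());
        return new List<string>(results);
    }

    public static void Main()
    {
        List<string> output = Run(new[] { "a", "b", "c" }, 4);
        Console.WriteLine(output.Count);
    }
}
```

The order of the results is not guaranteed, since the consumers run concurrently.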

Solution:

Change the function to

    //requires: using System.Threading; using System.Windows.Forms;
    //one RichTextBox per thread, created lazily on first use and then reused
    static ThreadLocal<RichTextBox> rtfBox = new ThreadLocal<RichTextBox>(() => new RichTextBox());
    //convert RTF text to plain text
    public static string RtfTextToPlainText(string FormatObject)
    {
        rtfBox.Value.Rtf = FormatObject;
        FormatObject = rtfBox.Value.Text;
        rtfBox.Value.Clear();

        return FormatObject;
    }

My run time went from minutes to seconds.
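
A plausible reading of the speedup is that the expensive object is now constructed once per thread rather than once per call. The sketch below illustrates only that effect; `ExpensiveResource` is a hypothetical stand-in for the control (so the example runs without Windows Forms), and `ThreadLocalReuseSketch` and `CountCreations` are names invented for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadLocalReuseSketch
{
    // Hypothetical stand-in for an object that is costly to construct.
    sealed class ExpensiveResource
    {
        public static int Created;

        public ExpensiveResource()
        {
            Interlocked.Increment(ref Created);
        }

        public string Convert(string input)
        {
            return input.Trim();
        }
    }

    // The factory runs at most once on each thread that reads Value.
    static readonly ThreadLocal<ExpensiveResource> Resource =
        new ThreadLocal<ExpensiveResource>(() => new ExpensiveResource());

    public static string Convert(string input)
    {
        return Resource.Value.Convert(input);
    }

    // Makes the given number of calls in parallel and returns how many
    // ExpensiveResource instances have been constructed so far.
    public static int CountCreations(int calls)
    {
        Parallel.For(0, calls, i => Convert("  record " + i + "  "));
        return ExpensiveResource.Created;
    }

    public static void Main()
    {
        int created = CountCreations(10000);
        Console.WriteLine("calls: 10000, instances created: " + created);
    }
}
```

The number of instances depends on how many threads the scheduler uses, so it varies between runs, but it cannot exceed the number of calls and is normally far smaller.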

I do not dispose of these objects, because they will be used for the entire lifetime of the program.
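
If cleanup were wanted in a shorter-lived process, one possible approach is sketched below. It assumes .NET Framework 4.5 or later, where `ThreadLocal<T>` gained the `trackAllValues` constructor argument and the `Values` property; neither is available in .NET 4.0, which this question is tagged with. `DisposableResource` and `RunAndDispose` are hypothetical names. Window handles have thread affinity, so real controls may need to be cleaned up on the thread that created them; this sketch ignores that.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadLocalCleanupSketch
{
    // Hypothetical stand-in for a disposable per-thread object.
    sealed class DisposableResource : IDisposable
    {
        public bool Disposed { get; private set; }

        public void Touch()
        {
            // Placeholder for real work done with the resource.
        }

        public void Dispose()
        {
            Disposed = true;
        }
    }

    // Uses a per-thread resource for the given number of calls, then
    // disposes every instance and returns how many were disposed.
    public static int RunAndDispose(int calls)
    {
        int disposed = 0;

        // trackAllValues: true keeps every per-thread instance reachable
        // through Values (assumption: .NET Framework 4.5 or later).
        using (var local = new ThreadLocal<DisposableResource>(
            () => new DisposableResource(), trackAllValues: true))
        {
            Parallel.For(0, calls, i => local.Value.Touch());

            // All parallel work has finished, so enumerating is safe here.
            foreach (DisposableResource resource in local.Values)
            {
                resource.Dispose();
                disposed++;
            }
        }

        return disposed;
    }

    public static void Main()
    {
        Console.WriteLine("instances disposed: " + RunAndDispose(1000));
    }
}
```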

Source: https://codeday.me/bug/20191102/1991258.html
