中国Linux公社论坛 (China Linux Commune forum) Archiver

peterdocter posted on 2012-8-17 09:28

How can I delete content that matches the following format?

Input to process:
<h2 class="def-header">
<span>Definition of <em>-ABILITY</em>
</span>
</h2>



<span class="ssens"> <strong>:</strong> capacity, fitness, or tendency to act or be acted on in a (specified) way <span class="vi">&lt;agglutin<em>ability</em>&gt;</span> </span>

<h2>
<span>Browse</span>
</h2>

<span>Next Word in the Dictionary: <a href="/dictionary/Ahom">Ahom</a>
</span>
<br />
<span>Previous Word in the Dictionary: <a href="/dictionary/aholehole">aholehole</a>
</span>
<br />
<span>All Words Near: <a href="/browse/dictionary/-aholic">-aholic</a>
</span>


<h2>
<span>

<span class="ssens"> <strong>:</strong> replacing carbon especially in a ring <span class="vi">&lt;az<em>a-</em>&gt;</span> </span>



<h2>
<span>Browse</span>
</h2>

<span>Next Word in the Dictionary: <a href="/dictionary/algic acid">algic acid</a>
</span>
<br />
<span>Previous Word in the Dictionary: <a href="/dictionary/algeroba">algeroba</a>
</span>
<br />
<span>All Words Near: <a href="/browse/dictionary/-algia">-algia</a>
</span>


<h2>
<span>

<h2>-ability</h2> <span class="main-fl"> <em xmlns:mwref="http://www.m-w.com/mwref">noun suffix</em> </span>

Note:
The block to delete must begin with:
<h2>
<span>Browse</span>
</h2>
and must end with:
<h2>
<span>
Delete the content only when both of these conditions are met.

Result:
After deleting the content in the format above, the text becomes:
<h2 class="def-header">
<span>Definition of <em>-ABILITY</em>
</span>
</h2>


<span class="ssens"> <strong>:</strong> capacity, fitness, or tendency to act or be acted on in a (specified) way <span class="vi">&lt;agglutin<em>ability</em>&gt;</span> </span>

<span class="ssens"> <strong>:</strong> replacing carbon especially in a ring <span class="vi">&lt;az<em>a-</em>&gt;</span> </span>



<h2>-ability</h2> <span class="main-fl"> <em xmlns:mwref="http://www.m-w.com/mwref">noun suffix</em> </span>

I usually use sed. If sed cannot do this, please write it in awk or perl instead.
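
One possible approach, sketched in Python rather than the sed, awk, or perl asked for above: read the whole text as one string and remove each block with a multi-line, non-greedy regular expression. The function name `strip_browse_blocks` and the short `sample` string are made up for illustration. The sketch assumes that the start marker is exactly the three lines `<h2>`, `<span>Browse</span>`, `</h2>`, and that the end marker is a bare `<h2>` line followed by a bare `<span>` line. A Browse heading with no later end marker is left untouched, which matches the "only when both conditions are met" rule.

```python
import re

# Assumption: both markers sit on lines of their own, as in the sample input.
BLOCK = re.compile(
    r"^<h2>[ \t\r]*\n<span>Browse</span>[ \t\r]*\n</h2>[ \t\r]*\n"  # start marker
    r".*?"                                    # block body, as short as possible
    r"^<h2>[ \t\r]*\n<span>[ \t\r]*(?:\n|\Z)",  # end marker: bare <h2>, bare <span>
    re.DOTALL | re.MULTILINE,
)


def strip_browse_blocks(text: str) -> str:
    """Delete every span from a Browse heading through the next bare <h2>/<span> pair."""
    return BLOCK.sub("", text)


# Hypothetical, shortened input in the same shape as the post's sample.
sample = (
    '<span class="ssens">A</span>\n'
    "\n"
    "<h2>\n"
    "<span>Browse</span>\n"
    "</h2>\n"
    "\n"
    '<span>Next Word in the Dictionary: <a href="/dictionary/Ahom">Ahom</a>\n'
    "</span>\n"
    "\n"
    "\n"
    "<h2>\n"
    "<span>\n"
    "\n"
    '<span class="ssens">B</span>\n'
)

result = strip_browse_blocks(sample)
print(result)
```

The sketch does not normalize blank lines: the blank lines that surrounded a deleted block remain, so the output can contain more empty lines than the expected result shown above. The same whole-file pattern idea should carry over to perl when the file is read in slurp mode (`perl -0777`), though that variant is not shown or tested here.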
