C++ Regular Expressions for Various Numeric Types

2021-01-17 21:58:45  Views: 659  Source: Internet

Tags: 10, numbers, regular expressions, pattern, character, C++, 89, same, string


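C++11 added regular expression support to the standard library. My compiler is TDM-GCC 4.9.2 64-bit, so C++11 has to be enabled by hand: open the menu "Tools --> Compiler Options -> Compiler" and add "-std=c++11" to the text box for commands that are added when compiling.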

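I have only just started with regular expressions and have picked up no more than the basics, so these are my notes. Straight to the code: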

#include <iostream>
#include <string>
#include <vector>
#include <regex>
using namespace std;

int regexSplit(const string&, const string&, vector<string>&, int);

int main(void)
{
    vector<string> vect;
    string str = "(12.3e+10-0.018e-5)+(11.006-7.)+.89";
    string reg[11] = {
        "(\\d+)",                   /* integers, including ones with leading zeros */
        "([1-9]\\d*)",              /* wrong: integers, but cannot match 0 */
        "(0|[1-9]\\d*)",            /* all integers */
        "(\\d+\\.\\d+)",            /* decimals only, no integers */
        "(\\d*\\.?\\d+)",           /* wrong: integers or decimals, but also matches .89 */
        "(\\d+\\.?\\d*)",           /* wrong: integers or decimals, but also matches 7. */
        "(\\d+|\\d+\\.\\d+)",       /* integers or decimals */
        "-?(\\d+|\\d+\\.\\d+)",     /* positive or negative numbers */
        "\\([^()]*\\)",             /* a matched pair of parentheses (innermost only) */
        "-?(\\d+\\.\\d+)e[+-]\\d+", /* scientific notation */
        "-?((\\d+|\\d+\\.\\d+)|(\\d+\\.\\d+)e[+-]\\d+)" /* real numbers */
    };
    cout << str << endl << "--------------" << endl;
    for (const auto& a : reg) {
        regexSplit(str, a, vect, 0);
        cout << "pattern:" << a << endl << "string: ";
        for (const auto& v : vect) cout << v << " ";
        vect.clear();
        cout << endl << "==============" << endl;
    }
    return 0;
}

// Appends the tokens found in str to vect and returns how many were appended.
// pos = 0 collects the matched text; pos = -1 collects the text between matches.
// With pos = 0, a return value of 0 means that nothing matched.
int regexSplit(const string& str, const string& str_reg, vector<string>& vect, int pos)
{
    int i = 0;
    if (pos != -1) pos = 0;   // any value other than -1 falls back to 0
    regex Pattern(str_reg);
    sregex_token_iterator it(str.begin(), str.end(), Pattern, pos);
    sregex_token_iterator end;
    for (; it != end; ++it, i++) vect.push_back(*it);
    return i;
}

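Output (as recorded with TDM-GCC 4.9.2; for the patterns that contain \d+|\d+\.\d+, a standard library that follows the ECMAScript left-to-right alternation rule matches 12 and 3 separately instead of 12.3):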

(12.3e+10-0.018e-5)+(11.006-7.)+.89
--------------
pattern:(\d+)
string: 12 3 10 0 018 5 11 006 7 89
==============
pattern:([1-9]\d*)
string: 12 3 10 18 5 11 6 7 89
==============
pattern:(0|[1-9]\d*)
string: 12 3 10 0 0 18 5 11 0 0 6 7 89
==============
pattern:(\d+\.\d+)
string: 12.3 0.018 11.006
==============
pattern:(\d*\.?\d+)
string: 12.3 10 0.018 5 11.006 7 .89
==============
pattern:(\d+\.?\d*)
string: 12.3 10 0.018 5 11.006 7. 89
==============
pattern:(\d+|\d+\.\d+)
string: 12.3 10 0.018 5 11.006 7 89
==============
pattern:-?(\d+|\d+\.\d+)
string: 12.3 10 -0.018 -5 11.006 -7 89
==============
pattern:\([^()]*\)
string: (12.3e+10-0.018e-5) (11.006-7.)
==============
pattern:-?(\d+\.\d+)e[+-]\d+
string: 12.3e+10 -0.018e-5
==============
pattern:-?((\d+|\d+\.\d+)|(\d+\.\d+)e[+-]\d+)
string: 12.3e+10 -0.018e-5 11.006 -7 89
==============


Appendix:

Special characters:

characters  description              matches
.           not newline              any character except line terminators (LF, CR, LS, PS).
\t          tab (HT)                 a horizontal tab character (same as \u0009).
\n          newline (LF)             a newline (line feed) character (same as \u000A).
\v          vertical tab (VT)        a vertical tab character (same as \u000B).
\f          form feed (FF)           a form feed character (same as \u000C).
\r          carriage return (CR)     a carriage return character (same as \u000D).
\cletter    control code             a control code character whose code unit value is the same as the remainder of dividing the code unit value of letter by 32. For example: \ca is the same as \u0001, \cb the same as \u0002, and so on...
\xhh        ASCII character          a character whose code unit value has a hex value equivalent to the two hex digits hh. For example: \x4c is the same as L, or \x23 the same as #.
\uhhhh      unicode character        a character whose code unit value has a hex value equivalent to the four hex digits hhhh.
\0          null                     a null character (same as \u0000).
\int        backreference            the result of the submatch whose opening parenthesis is the int-th (int shall begin by a digit other than 0). See groups below for more info.
\d          digit                    a decimal digit character
\D          not digit                any character that is not a decimal digit character
\s          whitespace               a whitespace character
\S          not whitespace           any character that is not a whitespace character
\w          word                     an alphanumeric or underscore character
\W          not word                 any character that is not an alphanumeric or underscore character
\character  character                the character character as it is, without interpreting its special meaning within a regex expression. Any character can be escaped except those which form any of the special character sequences above. Needed for: ^ $ \ . * + ? ( ) [ ] { } |
[class]     character class          the target character is part of the class
[^class]    negated character class  the target character is not part of the class

Quantifiers:

characters  times                effects
*           0 or more            The preceding atom is matched 0 or more times.
+           1 or more            The preceding atom is matched 1 or more times.
?           0 or 1               The preceding atom is optional (matched either 0 times or once).
{int}       int                  The preceding atom is matched exactly int times.
{int,}      int or more          The preceding atom is matched int or more times.
{min,max}   between min and max  The preceding atom is matched at least min times, but not more than max.

Groups:

characters      description    effects
(subpattern)    Group          Creates a backreference.
(?:subpattern)  Passive group  Does not create a backreference.

Other:

characters  description        condition for match
^           Beginning of line  Either it is the beginning of the target sequence, or follows a line terminator.
$           End of line        Either it is the end of the target sequence, or precedes a line terminator.
|           Separator          Separates two alternative patterns or subpatterns.

 

Source: https://blog.csdn.net/boysoft2002/article/details/112758172
