  1. RE2 regular expression syntax reference
  2. ---------------------------------------
  3. Single characters:
  4. . any character, possibly including newline (s=true)
  5. [xyz] character class
  6. [^xyz] negated character class
  7. \d Perl character class
  8. \D negated Perl character class
  9. [[:alpha:]] ASCII character class
  10. [[:^alpha:]] negated ASCII character class
  11. \pN Unicode character class (one-letter name)
  12. \p{Greek} Unicode character class
  13. \PN negated Unicode character class (one-letter name)
  14. \P{Greek} negated Unicode character class
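
The single-character forms above can be tried out with a short sketch. It assumes Go's standard regexp package as the engine (its documentation describes the accepted syntax as RE2's, except for «\C»); firstMatch is a hypothetical helper written for this example, not a library function.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if there is none.
// Hypothetical helper for this sketch.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Printf("%q\n", firstMatch(`[xyz]`, "abcxd"))            // character class
	fmt.Printf("%q\n", firstMatch(`[^a-c]+`, "abcxd"))          // negated character class
	fmt.Printf("%q\n", firstMatch(`\d+`, "room 42"))            // Perl character class
	fmt.Printf("%q\n", firstMatch(`\D+`, "room 42"))            // negated Perl character class
	fmt.Printf("%q\n", firstMatch(`[[:alpha:]]+`, "42 apples")) // ASCII character class
	fmt.Printf("%q\n", firstMatch(`\p{Greek}+`, "abc αβγ"))     // Unicode character class
	fmt.Printf("%q\n", firstMatch(`\pN+`, "x²"))                // one-letter Unicode class name
	fmt.Printf("%q\n", firstMatch(`a.b`, "a\nb"))               // «.» does not match newline by default
	fmt.Printf("%q\n", firstMatch(`(?s)a.b`, "a\nb"))           // s=true lets «.» match newline
}
```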
  15. Composites:
  16. xy «x» followed by «y»
  17. x|y «x» or «y» (prefer «x»)
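
A minimal sketch of the "prefer «x»" rule for alternation, again assuming Go's regexp package as a stand-in engine with RE2 syntax; firstMatch is a hypothetical helper. The left alternative wins whenever it allows an overall match, and the right one is used only when the left one cannot lead to a match.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if there is none.
// Hypothetical helper for this sketch.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Printf("%q\n", firstMatch(`ab`, "cab"))          // concatenation: «a» followed by «b»
	fmt.Printf("%q\n", firstMatch(`a|ab`, "ab"))         // left alternative «a» is preferred
	fmt.Printf("%q\n", firstMatch(`ab|a`, "ab"))         // reordering changes the preferred match
	fmt.Printf("%q\n", firstMatch(`(?:a|ab)c`, "abc"))   // «a» cannot be followed by «c» here, so «ab» is used
}
```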
  18. Repetitions:
  19. x* zero or more «x», prefer more
  20. x+ one or more «x», prefer more
  21. x? zero or one «x», prefer one
  22. x{n,m} «n» or «n»+1 or ... or «m» «x», prefer more
  23. x{n,} «n» or more «x», prefer more
  24. x{n} exactly «n» «x»
  25. x*? zero or more «x», prefer fewer
  26. x+? one or more «x», prefer fewer
  27. x?? zero or one «x», prefer zero
  28. x{n,m}? «n» or «n»+1 or ... or «m» «x», prefer fewer
  29. x{n,}? «n» or more «x», prefer fewer
  30. x{n}? exactly «n» «x»
  31. x{} (== x*) NOT SUPPORTED vim
  32. x{-} (== x*?) NOT SUPPORTED vim
  33. x{-n} (== x{n}?) NOT SUPPORTED vim
  34. x= (== x?) NOT SUPPORTED vim
  35. Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}»
  36. are rejected when they would create a minimum or maximum repetition count above 1000.
  37. Unlimited repetitions are not subject to this restriction.
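
The difference between "prefer more" and "prefer fewer", and the repetition-count restriction, can be sketched as follows. The sketch assumes Go's regexp package, which applies the same limit of 1000; firstMatch and compiles are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if there is none.
// Hypothetical helper for this sketch.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// compiles reports whether the engine accepts pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	patterns := []string{
		`a*`, `a*?`, `a+`, `a+?`, `a?`, `a??`,
		`a{2,3}`, `a{2,3}?`, `a{2,}`, `a{2,}?`, `a{2}`,
	}
	for _, p := range patterns {
		// Each pattern is matched against the same four-character input.
		fmt.Printf("%-8s %q\n", p, firstMatch(p, "aaaa"))
	}

	// Counts above 1000 are rejected; unlimited repetition is not.
	fmt.Println(compiles(`a{1000}`), compiles(`a{1001}`), compiles(`a{1001,}`), compiles(`a*`))
}
```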
  38. Possessive repetitions:
  39. x*+ zero or more «x», possessive NOT SUPPORTED
  40. x++ one or more «x», possessive NOT SUPPORTED
  41. x?+ zero or one «x», possessive NOT SUPPORTED
  42. x{n,m}+ «n» or ... or «m» «x», possessive NOT SUPPORTED
  43. x{n,}+ «n» or more «x», possessive NOT SUPPORTED
  44. x{n}+ exactly «n» «x», possessive NOT SUPPORTED
  45. Grouping:
  46. (re) numbered capturing group (submatch)
  47. (?P<name>re) named & numbered capturing group (submatch)
  48. (?<name>re) named & numbered capturing group (submatch)
  49. (?'name're) named & numbered capturing group (submatch) NOT SUPPORTED
  50. (?:re) non-capturing group
  51. (?flags) set flags within current group; non-capturing
  52. (?flags:re) set flags during re; non-capturing
  53. (?#text) comment NOT SUPPORTED
  54. (?|x|y|z) branch numbering reset NOT SUPPORTED
  55. (?>re) possessive match of «re» NOT SUPPORTED
  56. re@> possessive match of «re» NOT SUPPORTED vim
  57. %(re) non-capturing group NOT SUPPORTED vim
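
A small sketch of numbered, named, and non-capturing groups, assuming Go's regexp package as the engine. The pattern, the variable dateRE, and the helper group are made up for illustration; the sketch uses the «(?P<name>re)» spelling.

```go
package main

import (
	"fmt"
	"regexp"
)

// dateRE has two named (and numbered) groups and one non-capturing group.
var dateRE = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})(?:-\d{2})?`)

// group returns the text captured by the named group, or "" if text
// does not match or the name is unknown. Hypothetical helper.
func group(text, name string) string {
	m := dateRE.FindStringSubmatch(text)
	idx := dateRE.SubexpIndex(name)
	if m == nil || idx < 0 {
		return ""
	}
	return m[idx]
}

func main() {
	fmt.Println(dateRE.NumSubexp())   // the non-capturing group is not counted
	fmt.Println(dateRE.SubexpNames()) // index 0 is the whole match and has no name
	fmt.Println(group("released 2024-03-15", "year"))
	fmt.Println(group("released 2024-03-15", "month"))
}
```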
  58. Flags:
  59. i case-insensitive (default false)
  60. m multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
  61. s let «.» match «\n» (default false)
  62. U ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
  63. Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).
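
The flags and the set/clear syntax can be illustrated with a short sketch, assuming Go's regexp package as the engine; matches and firstMatch are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text. Hypothetical helper.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// firstMatch returns the leftmost match of pattern in text, or "" if there is none.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Println(matches(`(?i)hello`, "HeLLo"))       // i: case-insensitive
	fmt.Println(matches(`^b$`, "a\nb\nc"))           // default: ^ and $ match only at text boundaries
	fmt.Println(matches(`(?m)^b$`, "a\nb\nc"))       // m: ^ and $ also match at line boundaries
	fmt.Printf("%q\n", firstMatch(`.+`, "a\nb"))     // default: «.» stops at newline
	fmt.Printf("%q\n", firstMatch(`(?s).+`, "a\nb")) // s: «.» matches newline
	fmt.Printf("%q\n", firstMatch(`(?U)a+`, "aaa"))  // U: «a+» now prefers fewer
	fmt.Printf("%q\n", firstMatch(`(?U)a+?`, "aaa")) // U: «a+?» now prefers more
	fmt.Println(matches(`(?i:a)b`, "AB"))            // flag applies only inside the group
	fmt.Println(matches(`(?i)a(?-i)b`, "AB"))        // «-i» clears the flag for the rest of the group
	fmt.Println(matches(`(?is-m)a.b$`, "A\nB"))      // set «i» and «s», clear «m»
}
```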
  64. Empty strings:
  65. ^ at beginning of text or line («m»=true)
  66. $ at end of text (like «\z» not «\Z») or line («m»=true)
  67. \A at beginning of text
  68. \b at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
  69. \B not at ASCII word boundary
  70. \G at beginning of subtext being searched NOT SUPPORTED pcre
  71. \G at end of last match NOT SUPPORTED perl
  72. \Z at end of text, or before newline at end of text NOT SUPPORTED
  73. \z at end of text
  74. (?=re) before text matching «re» NOT SUPPORTED
  75. (?!re) before text not matching «re» NOT SUPPORTED
  76. (?<=re) after text matching «re» NOT SUPPORTED
  77. (?<!re) after text not matching «re» NOT SUPPORTED
  78. re& before text matching «re» NOT SUPPORTED vim
  79. re@= before text matching «re» NOT SUPPORTED vim
  80. re@! before text not matching «re» NOT SUPPORTED vim
  81. re@<= after text matching «re» NOT SUPPORTED vim
  82. re@<! after text not matching «re» NOT SUPPORTED vim
  83. \zs sets start of match (= \K) NOT SUPPORTED vim
  84. \ze sets end of match NOT SUPPORTED vim
  85. \%^ beginning of file NOT SUPPORTED vim
  86. \%$ end of file NOT SUPPORTED vim
  87. \%V on screen NOT SUPPORTED vim
  88. \%# cursor position NOT SUPPORTED vim
  89. \%'m mark «m» position NOT SUPPORTED vim
  90. \%23l in line 23 NOT SUPPORTED vim
  91. \%23c in column 23 NOT SUPPORTED vim
  92. \%23v in virtual column 23 NOT SUPPORTED vim
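
A hedged sketch of the supported empty-string assertions, and of how an unsupported one surfaces as a compile error. It assumes Go's regexp package as the engine; matchStart, matches, and compiles are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchStart returns the byte offset of the leftmost match, or -1 if there is none.
func matchStart(pattern, text string) int {
	loc := regexp.MustCompile(pattern).FindStringIndex(text)
	if loc == nil {
		return -1
	}
	return loc[0]
}

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the engine accepts pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(matchStart(`\bcat\b`, "concat cat")) // only the standalone word
	fmt.Println(matchStart(`\Bcat`, "concat cat"))   // only the «cat» inside «concat»
	fmt.Println(matches(`a$`, "a\n"))                // «$» behaves like «\z», not «\Z»
	fmt.Println(matches(`(?m)a$`, "a\nb"))           // with m=true, «$» also matches at end of line
	fmt.Println(matches(`(?m)\Ab`, "a\nb"))          // «\A» is not affected by m
	fmt.Println(matches(`\bt\b`, "été"))             // «\b» is ASCII-only: «é» counts as «\W»

	// Lookaround and «\Z» are NOT SUPPORTED, so compilation fails.
	fmt.Println(compiles(`foo(?=bar)`), compiles(`foo\Z`), compiles(`foo\z`))
}
```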
  93. Escape sequences:
  94. \a bell (== \007)
  95. \f form feed (== \014)
  96. \t horizontal tab (== \011)
  97. \n newline (== \012)
  98. \r carriage return (== \015)
  99. \v vertical tab character (== \013)
  100. \* literal «*», for any punctuation character «*»
  101. \123 octal character code (up to three digits)
  102. \x7F hex character code (exactly two digits)
  103. \x{10FFFF} hex character code
  104. \C match a single byte even in UTF-8 mode
  105. \Q...\E literal text «...» even if «...» has punctuation
  106. \1 backreference NOT SUPPORTED
  107. \b backspace NOT SUPPORTED (use «\010»)
  108. \cK control char ^K NOT SUPPORTED (use «\013» for ^K, «\001» for ^A, etc)
  109. \e escape NOT SUPPORTED (use «\033»)
  110. \g1 backreference NOT SUPPORTED
  111. \g{1} backreference NOT SUPPORTED
  112. \g{+1} backreference NOT SUPPORTED
  113. \g{-1} backreference NOT SUPPORTED
  114. \g{name} named backreference NOT SUPPORTED
  115. \g<name> subroutine call NOT SUPPORTED
  116. \g'name' subroutine call NOT SUPPORTED
  117. \k<name> named backreference NOT SUPPORTED
  118. \k'name' named backreference NOT SUPPORTED
  119. \lX lowercase «X» NOT SUPPORTED
  120. \ux uppercase «x» NOT SUPPORTED
  121. \L...\E lowercase text «...» NOT SUPPORTED
  122. \K reset beginning of «$0» NOT SUPPORTED
  123. \N{name} named Unicode character NOT SUPPORTED
  124. \R line break NOT SUPPORTED
  125. \U...\E uppercase text «...» NOT SUPPORTED
  126. \X extended Unicode sequence NOT SUPPORTED
  127. \%d123 decimal character 123 NOT SUPPORTED vim
  128. \%xFF hex character FF NOT SUPPORTED vim
  129. \%o123 octal character 123 NOT SUPPORTED vim
  130. \%u1234 Unicode character 0x1234 NOT SUPPORTED vim
  131. \%U12345678 Unicode character 0x12345678 NOT SUPPORTED vim
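
The supported escapes can be exercised with a brief sketch, assuming Go's regexp package as the engine. That package is documented as rejecting «\C», so «\C» is not shown. The helpers matches and compiles are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text. Hypothetical helper.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the engine accepts pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Hex (two digits), octal, and braced hex all denote single characters.
	fmt.Println(matches(`^\x41\101\x{1F600}$`, "AA\U0001F600"))
	// «\Q...\E» turns punctuation into literal text.
	fmt.Println(matches(`^\Q1+1=2\E$`, "1+1=2"))
	// Without quoting, «+» is a repetition operator, so the same text does not match.
	fmt.Println(matches(`^1+1=2$`, "1+1=2"))
	// Backslash before punctuation yields the literal character.
	fmt.Println(matches(`^\*\.\?$`, "*.?"))
	// «\e» is rejected; the octal form «\033» is the suggested replacement.
	fmt.Println(compiles(`\e`), matches(`^\033$`, "\x1b"))
	// Backreferences are NOT SUPPORTED.
	fmt.Println(compiles(`(a)\1`), compiles(`(?P<n>a)\k<n>`))
}
```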
  132. Character class elements:
  133. x single character
  134. A-Z character range (inclusive)
  135. \d Perl character class
  136. [:foo:] ASCII character class «foo»
  137. \p{Foo} Unicode character class «Foo»
  138. \pF Unicode character class «F» (one-letter name)
  139. Named character classes as character class elements:
  140. [\d] digits (== \d)
  141. [^\d] not digits (== \D)
  142. [\D] not digits (== \D)
  143. [^\D] not not digits (== \d)
  144. [[:name:]] named ASCII class inside character class (== [:name:])
  145. [^[:name:]] named ASCII class inside negated character class (== [:^name:])
  146. [\p{Name}] named Unicode property inside character class (== \p{Name})
  147. [^\p{Name}] named Unicode property inside negated character class (== \P{Name})
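
A sketch of how the elements listed above combine inside one bracket expression, assuming Go's regexp package as the engine; firstMatch is a hypothetical helper and the input strings are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if there is none.
// Hypothetical helper for this sketch.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	// Range, single character, and Perl class in one class.
	fmt.Printf("%q\n", firstMatch(`[A-Fx\d]+`, "0xCAFE9 z"))
	// ASCII classes mixed with a single character.
	fmt.Printf("%q\n", firstMatch(`[[:alpha:]_][[:alnum:]_]*`, "9 foo_bar1!"))
	// Unicode class mixed with a Perl class.
	fmt.Printf("%q\n", firstMatch(`[\p{Greek}\d]+`, "x=α1β2;"))
	// Double negation: «[^\D]» is the same as «\d».
	fmt.Printf("%q\n", firstMatch(`[^\D]+`, "ab123"))
	// Negated class containing a named ASCII class.
	fmt.Printf("%q\n", firstMatch(`[^[:alpha:]]+`, "abc123def"))
	// Negated class containing a Unicode class and a Perl class.
	fmt.Printf("%q\n", firstMatch(`[^\p{Latin}\s]+`, "abc привет"))
}
```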
  148. Perl character classes (all ASCII-only):
  149. \d digits (== [0-9])
  150. \D not digits (== [^0-9])
  151. \s whitespace (== [\t\n\f\r ])
  152. \S not whitespace (== [^\t\n\f\r ])
  153. \w word characters (== [0-9A-Za-z_])
  154. \W not word characters (== [^0-9A-Za-z_])
  155. \h horizontal space NOT SUPPORTED
  156. \H not horizontal space NOT SUPPORTED
  157. \v vertical space NOT SUPPORTED
  158. \V not vertical space NOT SUPPORTED
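
Because the Perl classes are ASCII-only, non-ASCII digits and letters fall outside «\d» and «\w». The sketch below shows this under the assumption that Go's regexp package is the engine; matches, firstMatch, and compiles are hypothetical helpers, and U+FF13 (fullwidth digit three) is just one convenient non-ASCII digit.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text. Hypothetical helper.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// firstMatch returns the leftmost match of pattern in text, or "" if there is none.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// compiles reports whether the engine accepts pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fullwidthThree := "\uFF13" // a decimal digit outside ASCII

	fmt.Println(matches(`^\d$`, "7"))                   // ASCII digit
	fmt.Println(matches(`^\d$`, fullwidthThree))        // not matched by «\d»
	fmt.Println(matches(`^\p{Nd}$`, fullwidthThree))    // matched by the Unicode class
	fmt.Printf("%q\n", firstMatch(`\w+`, "naïve_1 x"))  // stops at the non-ASCII letter
	fmt.Printf("%q\n", firstMatch(`\S+`, "  two words")) // first run of non-whitespace
	fmt.Printf("%q\n", firstMatch(`\W+`, "a, b"))       // run of non-word characters
	fmt.Println(compiles(`\h`))                         // «\h» is NOT SUPPORTED
}
```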
  159. ASCII character classes:
  160. [[:alnum:]] alphanumeric (== [0-9A-Za-z])
  161. [[:alpha:]] alphabetic (== [A-Za-z])
  162. [[:ascii:]] ASCII (== [\x00-\x7F])
  163. [[:blank:]] blank (== [\t ])
  164. [[:cntrl:]] control (== [\x00-\x1F\x7F])
  165. [[:digit:]] digits (== [0-9])
  166. [[:graph:]] graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
  167. [[:lower:]] lower case (== [a-z])
  168. [[:print:]] printable (== [ -~] == [ [:graph:]])
  169. [[:punct:]] punctuation (== [!-/:-@[-`{-~])
  170. [[:space:]] whitespace (== [\t\n\v\f\r ])
  171. [[:upper:]] upper case (== [A-Z])
  172. [[:word:]] word characters (== [0-9A-Za-z_])
  173. [[:xdigit:]] hex digit (== [0-9A-Fa-f])
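
A few of the ASCII classes in use, including the one place where «[[:space:]]» and «\s» differ (vertical tab). This is a sketch assuming Go's regexp package as the engine; firstMatch and matches are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if there is none.
// Hypothetical helper for this sketch.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	fmt.Printf("%q\n", firstMatch(`[[:xdigit:]]+`, "#DEADbeef!"))
	fmt.Printf("%q\n", firstMatch(`[[:punct:]]+`, "wow!? ok"))
	fmt.Printf("%q\n", firstMatch(`[[:upper:]][[:lower:]]+`, "hello World"))
	fmt.Printf("%q\n", firstMatch(`[[:^digit:]]+`, "123abc456")) // negated ASCII class

	// «[[:space:]]» includes vertical tab; «\s» does not.
	fmt.Println(matches(`[[:space:]]`, "\v"), matches(`\s`, "\v"))

	// «[[:print:]]» is «[[:graph:]]» plus the space character.
	fmt.Println(matches(`^[[:print:]]+$`, "a b~"), matches(`^[[:graph:]]+$`, "a b~"))

	// The classes are ASCII-only.
	fmt.Println(matches(`[[:alpha:]]`, "é"))
}
```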
  174. Unicode character class names--general category:
  175. C other
  176. Cc control
  177. Cf format
  178. Cn unassigned code points NOT SUPPORTED
  179. Co private use
  180. Cs surrogate
  181. L letter
  182. LC cased letter NOT SUPPORTED
  183. L& cased letter NOT SUPPORTED
  184. Ll lowercase letter
  185. Lm modifier letter
  186. Lo other letter
  187. Lt titlecase letter
  188. Lu uppercase letter
  189. M mark
  190. Mc spacing mark
  191. Me enclosing mark
  192. Mn non-spacing mark
  193. N number
  194. Nd decimal number
  195. Nl letter number
  196. No other number
  197. P punctuation
  198. Pc connector punctuation
  199. Pd dash punctuation
  200. Pe close punctuation
  201. Pf final punctuation
  202. Pi initial punctuation
  203. Po other punctuation
  204. Ps open punctuation
  205. S symbol
  206. Sc currency symbol
  207. Sk modifier symbol
  208. Sm math symbol
  209. So other symbol
  210. Z separator
  211. Zl line separator
  212. Zp paragraph separator
  213. Zs space separator
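
The general-category names plug into «\p{...}» and «\P{...}» (or «\pX» for one-letter names). A minimal sketch, assuming Go's regexp package as the engine; firstMatch and matches are hypothetical helpers and the sample strings are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if there is none.
// Hypothetical helper for this sketch.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	// Lu followed by Ll: a capitalized word, not limited to ASCII.
	fmt.Printf("%q\n", firstMatch(`\p{Lu}\p{Ll}+`, "élan Émile"))
	// L: any letter.
	fmt.Printf("%q\n", firstMatch(`\pL+`, "naïve café"))
	// Sc (currency symbol) followed by Nd (decimal number).
	fmt.Printf("%q\n", firstMatch(`\p{Sc}\p{Nd}+`, "price: €42"))
	// Negated one-letter class: anything that is not a letter.
	fmt.Printf("%q\n", firstMatch(`\PL+`, "abc123def"))
	// P: punctuation.
	fmt.Printf("%q\n", firstMatch(`\pP+`, "hello, world"))
	// Zs: space separator, which includes the no-break space U+00A0.
	fmt.Println(matches(`^\p{Zs}$`, "\u00A0"))
}
```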
  214. Unicode character class names--scripts:
  215. Adlam
  216. Ahom
  217. Anatolian_Hieroglyphs
  218. Arabic
  219. Armenian
  220. Avestan
  221. Balinese
  222. Bamum
  223. Bassa_Vah
  224. Batak
  225. Bengali
  226. Bhaiksuki
  227. Bopomofo
  228. Brahmi
  229. Braille
  230. Buginese
  231. Buhid
  232. Canadian_Aboriginal
  233. Carian
  234. Caucasian_Albanian
  235. Chakma
  236. Cham
  237. Cherokee
  238. Chorasmian
  239. Common
  240. Coptic
  241. Cuneiform
  242. Cypriot
  243. Cypro_Minoan
  244. Cyrillic
  245. Deseret
  246. Devanagari
  247. Dives_Akuru
  248. Dogra
  249. Duployan
  250. Egyptian_Hieroglyphs
  251. Elbasan
  252. Elymaic
  253. Ethiopic
  254. Georgian
  255. Glagolitic
  256. Gothic
  257. Grantha
  258. Greek
  259. Gujarati
  260. Gunjala_Gondi
  261. Gurmukhi
  262. Han
  263. Hangul
  264. Hanifi_Rohingya
  265. Hanunoo
  266. Hatran
  267. Hebrew
  268. Hiragana
  269. Imperial_Aramaic
  270. Inherited
  271. Inscriptional_Pahlavi
  272. Inscriptional_Parthian
  273. Javanese
  274. Kaithi
  275. Kannada
  276. Katakana
  277. Kawi
  278. Kayah_Li
  279. Kharoshthi
  280. Khitan_Small_Script
  281. Khmer
  282. Khojki
  283. Khudawadi
  284. Lao
  285. Latin
  286. Lepcha
  287. Limbu
  288. Linear_A
  289. Linear_B
  290. Lisu
  291. Lycian
  292. Lydian
  293. Mahajani
  294. Makasar
  295. Malayalam
  296. Mandaic
  297. Manichaean
  298. Marchen
  299. Masaram_Gondi
  300. Medefaidrin
  301. Meetei_Mayek
  302. Mende_Kikakui
  303. Meroitic_Cursive
  304. Meroitic_Hieroglyphs
  305. Miao
  306. Modi
  307. Mongolian
  308. Mro
  309. Multani
  310. Myanmar
  311. Nabataean
  312. Nag_Mundari
  313. Nandinagari
  314. New_Tai_Lue
  315. Newa
  316. Nko
  317. Nushu
  318. Nyiakeng_Puachue_Hmong
  319. Ogham
  320. Ol_Chiki
  321. Old_Hungarian
  322. Old_Italic
  323. Old_North_Arabian
  324. Old_Permic
  325. Old_Persian
  326. Old_Sogdian
  327. Old_South_Arabian
  328. Old_Turkic
  329. Old_Uyghur
  330. Oriya
  331. Osage
  332. Osmanya
  333. Pahawh_Hmong
  334. Palmyrene
  335. Pau_Cin_Hau
  336. Phags_Pa
  337. Phoenician
  338. Psalter_Pahlavi
  339. Rejang
  340. Runic
  341. Samaritan
  342. Saurashtra
  343. Sharada
  344. Shavian
  345. Siddham
  346. SignWriting
  347. Sinhala
  348. Sogdian
  349. Sora_Sompeng
  350. Soyombo
  351. Sundanese
  352. Syloti_Nagri
  353. Syriac
  354. Tagalog
  355. Tagbanwa
  356. Tai_Le
  357. Tai_Tham
  358. Tai_Viet
  359. Takri
  360. Tamil
  361. Tangsa
  362. Tangut
  363. Telugu
  364. Thaana
  365. Thai
  366. Tibetan
  367. Tifinagh
  368. Tirhuta
  369. Toto
  370. Ugaritic
  371. Vai
  372. Vithkuqi
  373. Wancho
  374. Warang_Citi
  375. Yezidi
  376. Yi
  377. Zanabazar_Square
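
Script names are used the same way as general categories. The sketch below assumes Go's regexp package as the engine; which of the newer scripts in the list are recognized depends on the Unicode version the engine was built with, so only long-established scripts are used. firstMatch and compiles are hypothetical helpers, and «Klingon» is a deliberately invalid name.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if there is none.
// Hypothetical helper for this sketch.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// compiles reports whether the engine accepts pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	text := "Tokyo 東京 とうきょう"
	fmt.Printf("%q\n", firstMatch(`\p{Latin}+`, text))
	fmt.Printf("%q\n", firstMatch(`\p{Han}+`, text))
	fmt.Printf("%q\n", firstMatch(`\p{Hiragana}+`, text))
	fmt.Printf("%q\n", firstMatch(`\p{Cyrillic}+`, "hello привет"))

	// Digits belong to the Common script, so they are not Latin.
	fmt.Printf("%q\n", firstMatch(`\P{Latin}+`, "abc123"))

	// A name that is not in the list is a compile error.
	fmt.Println(compiles(`\p{Klingon}`))
}
```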
  378. Vim character classes:
  379. \i identifier character NOT SUPPORTED vim
  380. \I «\i» except digits NOT SUPPORTED vim
  381. \k keyword character NOT SUPPORTED vim
  382. \K «\k» except digits NOT SUPPORTED vim
  383. \f file name character NOT SUPPORTED vim
  384. \F «\f» except digits NOT SUPPORTED vim
  385. \p printable character NOT SUPPORTED vim
  386. \P «\p» except digits NOT SUPPORTED vim
  387. \s whitespace character (== [ \t]) NOT SUPPORTED vim
  388. \S non-white space character (== [^ \t]) NOT SUPPORTED vim
  389. \d digits (== [0-9]) vim
  390. \D not «\d» vim
  391. \x hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
  392. \X not «\x» NOT SUPPORTED vim
  393. \o octal digits (== [0-7]) NOT SUPPORTED vim
  394. \O not «\o» NOT SUPPORTED vim
  395. \w word character vim
  396. \W not «\w» vim
  397. \h head of word character NOT SUPPORTED vim
  398. \H not «\h» NOT SUPPORTED vim
  399. \a alphabetic NOT SUPPORTED vim
  400. \A not «\a» NOT SUPPORTED vim
  401. \l lowercase NOT SUPPORTED vim
  402. \L not lowercase NOT SUPPORTED vim
  403. \u uppercase NOT SUPPORTED vim
  404. \U not uppercase NOT SUPPORTED vim
  405. \_x «\x» plus newline, for any «x» NOT SUPPORTED vim
  406. Vim flags:
  407. \c ignore case NOT SUPPORTED vim
  408. \C match case NOT SUPPORTED vim
  409. \m magic NOT SUPPORTED vim
  410. \M nomagic NOT SUPPORTED vim
  411. \v verymagic NOT SUPPORTED vim
  412. \V verynomagic NOT SUPPORTED vim
  413. \Z ignore differences in Unicode combining characters NOT SUPPORTED vim
  414. Magic:
  415. (?{code}) arbitrary Perl code NOT SUPPORTED perl
  416. (??{code}) postponed arbitrary Perl code NOT SUPPORTED perl
  417. (?n) recursive call to regexp capturing group «n» NOT SUPPORTED
  418. (?+n) recursive call to relative group «+n» NOT SUPPORTED
  419. (?-n) recursive call to relative group «-n» NOT SUPPORTED
  420. (?C) PCRE callout NOT SUPPORTED pcre
  421. (?R) recursive call to entire regexp (== (?0)) NOT SUPPORTED
  422. (?&name) recursive call to named group NOT SUPPORTED
  423. (?P=name) named backreference NOT SUPPORTED
  424. (?P>name) recursive call to named group NOT SUPPORTED
  425. (?(cond)true|false) conditional branch NOT SUPPORTED
  426. (?(cond)true) conditional branch NOT SUPPORTED
  427. (*ACCEPT) make regexps more like Prolog NOT SUPPORTED
  428. (*COMMIT) NOT SUPPORTED
  429. (*F) NOT SUPPORTED
  430. (*FAIL) NOT SUPPORTED
  431. (*MARK) NOT SUPPORTED
  432. (*PRUNE) NOT SUPPORTED
  433. (*SKIP) NOT SUPPORTED
  434. (*THEN) NOT SUPPORTED
  435. (*ANY) set newline convention NOT SUPPORTED
  436. (*ANYCRLF) NOT SUPPORTED
  437. (*CR) NOT SUPPORTED
  438. (*CRLF) NOT SUPPORTED
  439. (*LF) NOT SUPPORTED
  440. (*BSR_ANYCRLF) set \R convention NOT SUPPORTED pcre
  441. (*BSR_UNICODE) NOT SUPPORTED pcre
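
In practice, NOT SUPPORTED means the pattern is rejected when it is compiled, not that the construct is silently ignored. The sketch below checks a few of the constructs from the Magic list, assuming Go's regexp package as the engine; compileError is a hypothetical helper, and the exact wording of the error messages is engine-specific.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileError returns the engine's error message for pattern,
// or "" if the pattern compiles. Hypothetical helper.
func compileError(pattern string) string {
	if _, err := regexp.Compile(pattern); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	patterns := []string{
		`(?R)`,         // recursive call to entire regexp
		`(?1)`,         // recursive call to capturing group 1
		`(?P=name)`,    // named backreference
		`(?(1)a|b)`,    // conditional branch
		`(?{code})`,    // arbitrary Perl code
		`(*FAIL)`,      // backtracking control verb
		`(?:a|b)`,      // supported, for contrast
	}
	for _, p := range patterns {
		if msg := compileError(p); msg != "" {
			fmt.Printf("%-12s rejected: %s\n", p, msg)
		} else {
			fmt.Printf("%-12s accepted\n", p)
		}
	}
}
```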