[FIXED] Regex in Java: match groups until first symbol occurrence

Issue

My string looks like this:

"Chitkara DK, Rawat DJY, Talley N. The epidemiology of childhood recurrent abdominal pain in Western countries: a systematic review. Am J Gastroenterol. 2005;100(8):1868-75. DOI."

What I want is to get the uppercase letters (as separate words only) up to the first dot, giving: DK DJY N. Nothing after that dot, such as J or DOI, should match.

Here is the pattern I currently pass to Java's Pattern class:

\\b[A-Z]{1,3}\\b

Is there a general option in regex to stop matching after a certain character?

Solution

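You can make use of continuous matching with \G and extract your desired matches from the first capturing group. \G anchors each match at the end of the previous one, and [^.]+? cannot cross a dot, so no match can start once the first dot has been reached: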

(?:\\G|^)[^.]+?\\b([A-Z]{1,3})\\b
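
As a rough illustration, the pattern above could be used from Java like this. The class name InitialsExtractor and the method name extract are made up for this sketch and are not part of the original answer; only the pattern string is taken from it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate the pattern from the answer.
class InitialsExtractor {
    // \G ties every match to the end of the previous one (or to the start of
    // the input on the first attempt). [^.]+? skips over non-dot characters
    // only, so once the next character is a dot no further match can start.
    private static final Pattern INITIALS =
            Pattern.compile("(?:\\G|^)[^.]+?\\b([A-Z]{1,3})\\b");

    // Collects group 1 of every consecutive match.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = INITIALS.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String citation = "Chitkara DK, Rawat DJY, Talley N. The epidemiology of "
                + "childhood recurrent abdominal pain in Western countries: "
                + "a systematic review. Am J Gastroenterol. 2005;100(8):1868-75. DOI.";
        System.out.println(extract(citation)); // prints [DK, DJY, N]
    }
}
```

Because [^.]+? has to consume at least one character, an uppercase word at the very start of the input would not be picked up by this pattern. That does not matter for the sample string, which starts with a surname.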

To apply this to multiline input, compile the pattern with the MULTILINE flag so that ^ restarts matching at the beginning of each line. If your input is always a single line, you can drop the |^ alternative from the pattern.
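
A minimal sketch of the multiline case, assuming one citation per line. The two-line sample input and the names MultilineInitialsExtractor and extract are invented for illustration; the pattern is the same one as above, compiled with Pattern.MULTILINE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate the MULTILINE variant.
class MultilineInitialsExtractor {
    // With MULTILINE, ^ also matches right after a line terminator, so the
    // chain of \G matches can start over at the beginning of every line.
    private static final Pattern INITIALS =
            Pattern.compile("(?:\\G|^)[^.]+?\\b([A-Z]{1,3})\\b", Pattern.MULTILINE);

    // Collects group 1 of every match across all lines.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = INITIALS.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample: one citation per line.
        String citations = "Chitkara DK, Rawat DJY. Am J Gastroenterol.\n"
                + "Smith AB, Jones C. Some title. DOI.";
        System.out.println(extract(citations)); // prints [DK, DJY, AB, C]
    }
}
```

Note that [^.] also matches line breaks, so a line containing no dot at all would let the matching run on into the following line.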

See https://regex101.com/r/JXIu21/3

Note that regex101 uses a PCRE pattern, but all features used are also available in Java regex.

Answered By – Sebastian Proske

Answer Checked By – Marie Seifert (Easybugfix Admin)
