Hey Mastodon, is there anything analogous to regex for CJK character sets? How do people, say, filter Korean text by the initial consonant of each Hangul syllable? Or find all characters in a Chinese text that contain a certain radical?
Do people just use regular expressions plus some helper libraries that handle the text encodings and carry some language-specific information?
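For the Hangul case, here is a minimal sketch of what "regex plus a little language-specific information" could look like, using only Python's standard library. It relies on the fact that precomposed Hangul syllables sit in one Unicode block (U+AC00 to U+D7A3) ordered by initial consonant, so each initial maps to a contiguous run of 588 code points. The names `initial_of` and `initial_pattern` are made up for illustration, and the sketch assumes the text uses precomposed (NFC) syllables rather than separate jamo. The radical question is a different matter: it needs an external lookup table such as the Unihan database's `kRSUnicode` field, which is not covered here.

```python
import re

HANGUL_BASE = 0xAC00              # first precomposed syllable, "가"
HANGUL_LAST = 0xD7A3              # last precomposed syllable, "힣"
SYLLABLES_PER_INITIAL = 21 * 28   # 21 vowels x 28 finals (including "no final")
# The 19 initial consonants in Unicode order, written as compatibility jamo.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"


def initial_of(ch):
    """Return the initial consonant of one precomposed syllable, else None."""
    cp = ord(ch)
    if not HANGUL_BASE <= cp <= HANGUL_LAST:
        return None
    return INITIALS[(cp - HANGUL_BASE) // SYLLABLES_PER_INITIAL]


def initial_pattern(initial):
    """Compile a regex matching any syllable that starts with `initial`."""
    start = HANGUL_BASE + INITIALS.index(initial) * SYLLABLES_PER_INITIAL
    end = start + SYLLABLES_PER_INITIAL - 1
    return re.compile(f"[{chr(start)}-{chr(end)}]")


text = "한국어 정규식"
# Initial consonant of every Hangul syllable in the text (the space is skipped).
print([initial_of(ch) for ch in text if initial_of(ch)])
# All syllables whose initial consonant is ㄱ.
print(initial_pattern("ㄱ").findall(text))  # ['국', '규']
```

So for this particular question an ordinary regex engine is enough once the code point ranges are known. Text stored as decomposed jamo would need normalizing first, for example with `unicodedata.normalize("NFC", text)`.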