1w6 uRPG

https://sn.1w6.org/file/cdxiao-20170712T092803-2yckkt6.html

Hey Mastodon, is there anything analogous to regex for CJK character sets? How do people, say, filter a Korean text based on the initial sound of each Hangul character? Or find all characters in a Chinese text that contain a certain radical?

Do people just use regular expressions plus some helper libraries that sort out the text encodings and contain some language-specific information?
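For the Hangul part of the question, one possible approach is sketched below. It relies on the fact that precomposed Hangul syllables occupy U+AC00 to U+D7A3 and are ordered by initial consonant, then vowel, then final, so every initial corresponds to one contiguous range that an ordinary regex character class can express. The function names (`initial_of`, `initial_class`, `words_starting_with`) are hypothetical helpers made up for illustration, and the sketch assumes the text uses precomposed syllables rather than separate jamo. The radical question is not covered here; it would presumably need an external data table such as the Unihan database.

```python
import re

HANGUL_BASE = 0xAC00               # first precomposed syllable, "가"
HANGUL_LAST = 0xD7A3               # last precomposed syllable, "힣"
SYLLABLES_PER_INITIAL = 21 * 28    # 21 vowels x 28 finals (including "no final")

# The 19 initial consonants, in the order Unicode uses for syllable layout.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"


def initial_of(ch):
    """Return the initial consonant of a precomposed Hangul syllable, else None."""
    code = ord(ch)
    if HANGUL_BASE <= code <= HANGUL_LAST:
        return INITIALS[(code - HANGUL_BASE) // SYLLABLES_PER_INITIAL]
    return None


def initial_class(initial):
    """Build a regex character class matching every syllable with that initial."""
    start = HANGUL_BASE + INITIALS.index(initial) * SYLLABLES_PER_INITIAL
    end = start + SYLLABLES_PER_INITIAL - 1
    return "[{}-{}]".format(chr(start), chr(end))


def words_starting_with(text, initial):
    """Hypothetical helper: whitespace-separated words whose first syllable has the given initial."""
    pattern = re.compile(initial_class(initial) + r"\S*")
    return [word for word in text.split() if pattern.fullmatch(word)]


if __name__ == "__main__":
    print(initial_of("한"))
    print(words_starting_with("한국어 글자 가나다", "ㄱ"))
```

So at least for this case, plain regular expressions plus a small amount of language-specific arithmetic may be enough, as long as the input is normalized to precomposed form (NFC) first.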

#unicode #regex

Notices where this attachment appears

  1. cdxiao, Wednesday, 12-Jul-17 05:13:14 UTC

1w6 uRPG is a microblogging service brought to you by Arne (Drak) Babenhauserheide. It runs the StatusNet microblogging software, version 1.1.1-release, available under the GNU Affero General Public License. The running version includes the patches from draketo.de/proj/statusnet-patches.

All 1w6 uRPG content and data are available under the Creative Commons Attribution 3.0 license.