1) Model-based indicators (machine learning)

Toxicity/Aggression: Use Jigsaw’s Perspective API to obtain probabilities for “TOXICITY,” “INSULT,” “PROFANITY,” “IDENTITY_ATTACK,” etc. Japanese is supported (see the official language table).

Politeness: Stanford’s research established a framework for estimating “politeness” from markers such as request forms, hedges, and honorifics. It is English-centric, but the methodology can be adapted to Japanese. The R package politeness is also useful.

2) Lightweight rules for Japanese (highly interpretable)

Honorific/hedge rate: Ratios of “です/ます,” “〜でしょうか,” “お手数ですが,” “お願いします,” and the like.

Presence of slurs/derogatory terms: A custom NG-word list (including figurative or euphemistic forms).

Imperatives/strong assertions: Frequency of “〜しろ” and “〜に決まってる,” heavy use of exclamation marks, ALL-KATAKANA bursts, etc.

Consideration/evidence markers: Signs of a dialogic, verifiable style such as “根拠:”, “出典:”, and “もし〜なら.”

3) ...
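As a rough illustration of the lightweight rules in section 2, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function name `tone_features`, the marker lists, the placeholder NG words, and the 8-character threshold for a katakana burst. It uses plain substring and regex matching, so it will over- and under-count in places; a morphological analyzer would likely be more robust.

```python
import re

# Hypothetical marker lists for illustration; extend them for your own corpus.
POLITE_MARKERS = ["です", "ます", "でしょうか", "お手数ですが", "お願いします"]
NG_WORDS = ["バカ", "アホ"]  # placeholder NG-word list (assumption)
ASSERTIVE_PATTERNS = [r"しろ", r"に決まってる"]
EVIDENCE_PATTERNS = [r"根拠[:：]", r"出典[:：]", r"もし.+?なら"]

# 8 or more consecutive katakana characters; the threshold is arbitrary.
KATAKANA_BURST = re.compile(r"[ァ-ヶー]{8,}")
# Naive sentence splitter on Japanese and ASCII terminators.
SENTENCE_END = re.compile(r"[。！？!?\n]+")


def split_sentences(text):
    """Split text into non-empty sentence fragments."""
    return [s for s in SENTENCE_END.split(text) if s.strip()]


def tone_features(text):
    """Return simple, interpretable tone indicators for a Japanese text."""
    sentences = split_sentences(text)
    n = max(len(sentences), 1)  # avoid division by zero on empty input
    polite = sum(1 for s in sentences if any(m in s for m in POLITE_MARKERS))
    return {
        # Share of sentences containing at least one polite/hedge marker.
        "polite_rate": polite / n,
        "ng_word_hits": sum(text.count(w) for w in NG_WORDS),
        "assertive_hits": sum(len(re.findall(p, text)) for p in ASSERTIVE_PATTERNS),
        "exclamations": len(re.findall(r"[!！]", text)),
        "katakana_bursts": len(KATAKANA_BURST.findall(text)),
        "evidence_hits": sum(len(re.findall(p, text)) for p in EVIDENCE_PATTERNS),
    }


if __name__ == "__main__":
    print(tone_features("お手数ですが、確認をお願いします。出典: 公式ドキュメントです。"))
    print(tone_features("さっさとしろ！！バカに決まってる！"))
```

Because each value is a plain count or ratio, a surprising score can be traced back to the exact marker that produced it, which is the main appeal of rules like these next to model-based scores.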
 