keyword extraction
- Related: keyphrase extraction.
- This term is used when we want to stress that the result need not stop at a single word.
- I'd like phrases such as "100 Personnel Systems for 100 People" or "Evaluating by Harmony, Generalists are Chosen" to be extracted.
- Even what is usually called "key phrase extraction" is often restricted to patterns such as "noun sequence" or "adjective* noun+".
- noun phrase approach
- It's not possible to extract key phrases like the above with such a constrained method.
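As a rough illustration of the "adjective* noun+" restriction, the sketch below runs that pattern over tokens that already carry part-of-speech tags. The tag names (`ADJ`, `NOUN`, ...) and the hand-tagged sample sentence are assumptions made for this example; a real pipeline would get the tags from a tagger.

```python
import re

# Hand-tagged tokens (an assumption for illustration; a real tagger would produce these).
TAGGED = [
    ("generalists", "NOUN"), ("are", "VERB"), ("chosen", "VERB"),
    ("by", "ADP"), ("the", "DET"), ("new", "ADJ"),
    ("personnel", "NOUN"), ("system", "NOUN"),
]

# One letter per tag so the pattern can be an ordinary regex: A = adjective, N = noun.
TAG_CODE = {"ADJ": "A", "NOUN": "N"}


def noun_phrase_candidates(tagged):
    """Return every maximal token run that matches adjective* noun+."""
    # Any tag other than ADJ/NOUN becomes "x", which breaks a run.
    codes = "".join(TAG_CODE.get(tag, "x") for _, tag in tagged)
    phrases = []
    for match in re.finditer(r"A*N+", codes):
        words = [word for word, _ in tagged[match.start():match.end()]]
        phrases.append(" ".join(words))
    return phrases


candidates = noun_phrase_candidates(TAGGED)
print(candidates)
```

With this pattern a clause such as "generalists are chosen" can never come out as one candidate, because the verbs break the run.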
- Sometimes you want to use as a key phrase a string that does not appear verbatim in the text.
- For example, when both "information sharing" and "sharing information" are used, I want them connected through a single "information sharing" link.
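One possible way to get such a shared link is to map each phrase to a key that ignores word order and function words. This is only a sketch: `link_key`, `group_by_link`, and the tiny `FUNCTION_WORDS` list are hypothetical names invented for this example, and the approach does not handle inflection (for instance "shared information").

```python
from collections import defaultdict

# Tiny function-word list; an assumption for this example only.
FUNCTION_WORDS = {"of", "the", "a", "an"}


def link_key(phrase):
    """Map a phrase to an order-insensitive key made of its content words."""
    words = [w for w in phrase.lower().split() if w not in FUNCTION_WORDS]
    return tuple(sorted(words))


def group_by_link(phrases):
    """Group surface phrases that share the same key."""
    groups = defaultdict(list)
    for phrase in phrases:
        groups[link_key(phrase)].append(phrase)
    return dict(groups)


groups = group_by_link([
    "information sharing",
    "sharing information",
    "sharing of information",
    "information hiding",
])
print(groups)
```

The key itself, such as `("information", "sharing")`, need not appear in the text in that form, which is the point of the example.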
technique
- An approach that does not use linguistic knowledge
- Simple word frequency.
- Requires a [stopword] list.
- Discards word-order information.
- The "General Manager's Association" problem, where a compound word is split apart.
- Synonyms are treated as different words.
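A minimal sketch of the simple word-frequency approach, showing the weaknesses listed above. The stopword set and the sample text are assumptions made for illustration.

```python
import re
from collections import Counter

# Illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "of", "is", "and", "to", "in"}


def word_counts(text):
    """Count lowercase word tokens, dropping stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


text = ("The general manager's association met. "
        "The association of the general managers is large.")
counts = word_counts(text)
print(counts.most_common(3))
```

The compound "general manager's association" never appears as a unit in `counts`; it is split into "general", "manager's", and "association". Surface variants such as "manager's" and "managers" are also counted separately, just as synonyms would be.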
- co-occurrence
- collocation
- N-grams, etc.
- intra-window co-occurrence
- intra-document co-occurrence
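The N-gram and intra-window variants can be sketched as follows. The function names and the choice of counting each pair of token positions once are assumptions of this example, not a fixed definition.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def window_cooccurrence(tokens, window=3):
    """Count unordered word pairs whose positions fit inside `window` consecutive tokens."""
    pairs = Counter()
    for i, left in enumerate(tokens):
        # Pair token i with the following tokens that share a window with it.
        for right in tokens[i + 1:i + window]:
            if left != right:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs


tokens = ["information", "sharing", "improves", "team", "information", "sharing"]
bigram_counts = Counter(ngrams(tokens, 2))
pair_counts = window_cooccurrence(tokens, window=3)
print(bigram_counts.most_common(1))
print(pair_counts.most_common(3))
```

Unlike bigrams, the window counts ignore order, so "information sharing" and "sharing information" would add to the same pair.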
- concentration
- tf-idf
- An approach that assigns real-valued scores, whereas a stopword list makes a 0/1 decision.
- 'The less frequently it appears in other texts, the more appropriate it is to characterize this text.'
- A word can be frequent on its own and still form an important key phrase as part of a compound.
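A minimal tf-idf sketch, assuming one common variant: term frequency normalized by document length, multiplied by an unsmoothed log(N / df). Libraries use differing formulas, and the three toy documents are made up for this example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {word: score} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in tf.items()
        })
    return scores


docs = [
    "the system evaluates the generalists".split(),
    "the system shares information".split(),
    "the team shares the information".split(),
]
scores = tf_idf(docs)
for word, score in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.3f}")
```

A word that appears in every document, such as "the", gets a score of exactly 0 here, so it behaves like a stopword without anyone having to list it.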
- RAKE
- graph-based
- Build a graph of word adjacencies and choose the words with the highest rank.
- Use PageRank
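The graph-based idea can be sketched with a plain power-iteration PageRank over an undirected word-adjacency graph. The damping factor 0.85, the fixed iteration count, and the toy token list are assumptions of this example.

```python
from collections import defaultdict


def adjacency_graph(tokens):
    """Undirected graph linking each word to the words directly next to it."""
    graph = defaultdict(set)
    for left, right in zip(tokens, tokens[1:]):
        if left != right:
            graph[left].add(right)
            graph[right].add(left)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power iteration; every node has a neighbor, so there are no dangling nodes."""
    nodes = list(graph)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbor passes on its rank split evenly over its own edges.
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


tokens = ["teams", "share", "information", "managers", "hide",
          "information", "systems", "store", "information"]
rank = pagerank(adjacency_graph(tokens))
for word, score in sorted(rank.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{word}: {score:.3f}")
```

In this toy input "information" is adjacent to the most distinct words, so it ends up with the highest rank.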
This page is auto-translated from /nishio/キーワード抽出 using DeepL. If something looks interesting but the auto-translated English is not good enough to understand, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.