Vector search for /plurality-japanese
prev
reading
$ git clone https://github.com/nishio/omoikane-embed-core plurality-japanese-embed
$ pip install -r requirements.txt
ModuleNotFoundError: No module named 'distutils'
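Ensure distutils is Installed: distutils is included with the standard library for Python versions prior to 3.10. For Python 3.10 and later, distutils has been deprecated and is not included by default. If you're using Python 3.10 or later, consider using setuptools instead for package management and distribution.
Ah, I see. I was on Python 3.10 when I built this, but now I'm on 3.12. (Strictly speaking, distutils was only deprecated in 3.10 and was removed from the standard library in 3.12, which is why it breaks now.)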
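A quick way to see whether the interpreter in use is affected is to compare its version against 3.12, the release that dropped distutils. This is only a minimal sketch; the function name `stdlib_has_distutils` is made up for illustration and is not part of omoikane-embed-core.

```python
import sys

# distutils was deprecated in Python 3.10 (PEP 632) and removed in 3.12.
DISTUTILS_REMOVED_IN = (3, 12)


def stdlib_has_distutils(version_info=None):
    """Return True if this Python version still ships distutils in the stdlib.

    version_info is a (major, minor, ...) tuple; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) < DISTUTILS_REMOVED_IN


if __name__ == "__main__":
    if not stdlib_has_distutils():
        # On 3.12+, installing setuptools (or upgrading the dependency that
        # imports distutils) is the usual way out.
        print("distutils is not in the standard library on this Python")
```

On 3.12 the practical fix is usually to install setuptools or to upgrade whichever pinned dependency still imports distutils.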
After various fixes, it ran:
% python make_vecs_from_json/main.py
processing 769 pages
100%|██████████████████████████| 769/769 [00:05<00:00, 139.02it/s]
total tasks: 7470, 0.0% was cached
processing 7470 tasks in 150 batches
100%|██████████████████████████| 150/150 [06:29<00:00, 2.60s/it]
upload :
% python upload_vecs/main.py
uploading plurality-japanese.pickle
100%|██████████████████████████| 74/74 [00:24<00:00, 3.06it/s]
OK
before/after
Experiment with blocksize=100
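For reference, fixed-size chunking can be sketched roughly as below. This is an assumption-laden illustration, not the code in make_vecs_from_json: the log does not say whether blocksize counts tokens, characters, or something else, and `split_into_blocks` is a hypothetical helper name.

```python
def split_into_blocks(tokens, blocksize):
    """Split a sequence into consecutive blocks of at most `blocksize` items.

    The last block may be shorter. What a "token" is (word, character,
    tokenizer token) is left open here because the log does not specify it.
    """
    if blocksize <= 0:
        raise ValueError("blocksize must be positive")
    return [tokens[i:i + blocksize] for i in range(0, len(tokens), blocksize)]


# Hypothetical input: 250 dummy tokens split with blocksize=100.
blocks = split_into_blocks(list(range(250)), 100)
block_lengths = [len(b) for b in blocks]
```

A smaller blocksize produces more blocks per page, so more embedding tasks, which matches the jump in task count in the run that follows.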
Result:
% python make_vecs_from_json/main.py
processing 769 pages
100%|██████████████████████████| 769/769 [00:03<00:00, 224.82it/s]
total tasks: 19866, 13.4% was cached
processing 17205 tasks in 345 batches
100%|██████████████████████████| 345/345 [12:19<00:00, 2.14s/it]
% python upload_vecs/main.py
uploading plurality-japanese.pickle
100%|██████████████████████████| 239/239 [01:18<00:00, 3.05it/s]
OK
The run with the smaller chunks cost about $0.36.
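The numbers in the two logs are consistent with each other, which can be checked with a little arithmetic. A batch size of 50 is inferred from the logs (7470 tasks in 150 batches, 17205 tasks in 345 batches); it is an assumption, not something read from the source code.

```python
import math

BATCH_SIZE = 50  # assumption: inferred from the logs, not confirmed in the code


def batches_needed(tasks, batch_size=BATCH_SIZE):
    """Number of batches needed to process `tasks` items."""
    return math.ceil(tasks / batch_size)


def cached_percent(total_tasks, processed_tasks):
    """Share of tasks served from cache, as a percentage rounded to one decimal."""
    return round(100 * (total_tasks - processed_tasks) / total_tasks, 1)


first_run_batches = batches_needed(7470)      # first log: 150 batches
second_run_batches = batches_needed(17205)    # second log: 345 batches
second_run_cached = cached_percent(19866, 17205)  # second log: 13.4% was cached
```

So the second run skipped 19866 - 17205 = 2661 tasks thanks to the cache and still needed a bit more than twice as many batches as the first run.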
% git clone https://github.com/nishio/omoikane-vecsearch plurality-vecsearch-ja
% npm install
% npm run dev
% git remote rename origin upstream
% git remote add origin https://github.com/nishio/plurality-vecsearch-ja.git
% git branch -M main
% git push -u origin main
Open the Vercel dashboard
Build and deploy succeeded, but the project to search against hasn't been configured yet
before / after
after
Hmm
Well, improving this part can wait
Released!
What I did today, continuing from /plurality-japanese/ベクトル検索の改善
2024-04-04
build
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-python@v4. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
actions/checkout: Action for checking out a repo
actions/setup-python: Set up your GitHub Actions workflow with a specific version of Python
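The warning goes away once the workflow references the Node.js 20 based major versions of both actions. As a rough sketch, the bump can be scripted like this. The mapping to v4 and v5 is my understanding of which major versions run on Node.js 20, and `bump_actions` and the sample workflow text are hypothetical.

```python
import re

# Assumed mapping from the deprecated (Node.js 16) majors to Node.js 20 majors.
NODE20_VERSIONS = {
    "actions/checkout@v3": "actions/checkout@v4",
    "actions/setup-python@v4": "actions/setup-python@v5",
}


def bump_actions(workflow_text):
    """Replace deprecated action references in workflow YAML text.

    The negative lookahead avoids touching pinned versions such as @v3.5.0.
    """
    for old, new in NODE20_VERSIONS.items():
        workflow_text = re.sub(re.escape(old) + r"(?![\w.])", new, workflow_text)
    return workflow_text


# Hypothetical workflow fragment, not copied from the actual repository.
example = """\
steps:
  - uses: actions/checkout@v3
  - uses: actions/setup-python@v4
"""
updated = bump_actions(example)
```

For a single workflow file, editing the two `uses:` lines by hand is just as good.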
:
The conflict is caused by:
The user requested protobuf==5.26.1
grpcio-tools 1.62.1 depends on protobuf<5.0dev and >=4.21.6
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
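The conflict itself is easy to reproduce by hand: the pinned protobuf 5.26.1 falls outside the range that grpcio-tools 1.62.1 accepts. The sketch below is a simplified check using plain tuples. It treats `<5.0dev` as `<5.0` and does not handle pre-release version strings; the helper names are hypothetical, and the real resolver logic lives in pip.

```python
def parse_version(text):
    """Parse a plain dotted version like '5.26.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_grpcio_tools(protobuf_version):
    """Check protobuf>=4.21.6,<5.0 (simplified from '<5.0dev and >=4.21.6')."""
    version = parse_version(protobuf_version)
    return (4, 21, 6) <= version < (5, 0)


pinned_ok = satisfies_grpcio_tools("5.26.1")   # the version pinned in requirements
older_ok = satisfies_grpcio_tools("4.25.3")    # an illustrative 4.x version string
```

Either loosening the protobuf pin to something in the 4.x range or dropping the pin and letting pip choose would satisfy this constraint.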